Anonymous
Not applicable

Load multiple CSV files using NoConcatenate after loading

I'm trying to load all CSV files from a folder and then apply NoConcatenate.

I need this so I can do a JOIN after loading other CSV files from a second folder.

The steps I want to follow:

     1. Load all CSV files into one table. Only after loading do I want to apply NoConcatenate; if I put it inside the loop, the script loads each file into a separate table. So do I need to load all files first and only then add the NoConcatenate line?

     2. Load the other CSV files. Here too I want to load all files first and then use a NoConcatenate line to avoid concatenation with the first table.

     3. Finally, join these two tables.

Any help? I also tried reloading the table with NoConcatenate and dropping the old one, but the problem is still not resolved.

This is the script I used; it starts by loading all files from the first folder:

// First folder
for each file in filelist('lib://IMPORT_CSV\*.csv')

    [join_ImportError_files]:
    LOAD
        "timestamp",
        timestamp as ts_import_join,
        "day" as day_join,
        "time" as time_join,
        "hour" as hour_join,
        requestor_lab,
        production_lab,
        last_timestamp,
        id_log,
        request_number as request_number_join,
        request_number as request_number_import,
        sample_number as sample_number_join,
        universal_test_id
    from [$(file)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file;

[join_ImportError]:
NoConcatenate
LOAD
    "timestamp",
    ts_import_join,
    day_join,
    time_join,
    hour_join,
    requestor_lab,
    production_lab,
    last_timestamp,
    id_log,
    request_number_join,
    request_number_import,
    sample_number_join,
    universal_test_id
Resident join_ImportError_files;

DROP TABLE [join_ImportError_files];

[join_ImportError_filter_unique]:
inner keep
LOAD
    request_number_join,
    MAX(timestamp) as ts_import_join,
    MAX(day_join) as day_join,
    MAX(time_join) as time_join,
    MAX(hour_join) as hour_join,
    sample_number_join
resident join_ImportError
Group By request_number_join, sample_number_join;

drop table join_ImportError_filter_unique;

// The second folder (DB)
for each file in filelist('lib://DB_CSV\*.csv')

  [join_DB_SnapWroklistSample_files]:
  // NoConcatenate
  LOAD
      SAMPLECODE as sample_number_join,
      SAMPLECODE2,
      REQUESTCODE as request_number_join,
      REQUESTCODE as request_number_db,
      TIMEVALIDATED,
      TIMERESULT,
      DATETIMEREQ,
      DATETIMEREQ_TS,
      REQ_DATE,
      REQ_HOUR,
      TIMESAMPLEIN,
      TIMESAMPLEIN_TS as ts_db_join,
      SAMPLEIN_DATE,
      SAMPLEIN_HOUR
  from [$(file)]
  (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

[join_DB_SnapWroklistSample]:
NoConcatenate
LOAD
    sample_number_join,
    SAMPLECODE2,
    request_number_join,
    request_number_db,
    TIMEVALIDATED,
    TIMERESULT,
    DATETIMEREQ,
    DATETIMEREQ_TS,
    REQ_DATE,
    REQ_HOUR,
    TIMESAMPLEIN,
    ts_db_join,
    SAMPLEIN_DATE,
    SAMPLEIN_HOUR
Resident join_DB_SnapWroklistSample_files;

DROP TABLE [join_DB_SnapWroklistSample_files];

[join_SnapWroklistSample_filter_unique]:
inner keep
LOAD
    sample_number_join,
    request_number_join,
    MIN(ts_db_join) as ts_db_join
resident join_DB_SnapWroklistSample
Group By request_number_join, sample_number_join;

drop table join_SnapWroklistSample_filter_unique;

[join_DB_Import]:
NoConcatenate
Load request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join
resident [join_ImportError];

join
Load request_number_join, sample_number_join, ts_db_join
resident [join_DB_SnapWroklistSample];

[join_DB_Import2]:  // add_diff_field
NoConcatenate
LOAD request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join,
    ts_db_join, Num#(Interval(ts_db_join - ts_import_join, 'S'), '0.0') as diff3
RESIDENT [join_DB_Import];

DROP TABLE [join_ImportError];
DROP TABLE [join_DB_SnapWroklistSample];
DROP TABLE [join_DB_Import];

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

I resolved the problem. This is the solution.

The steps are the same:

     1. Load all the CSV files into one table (without NoConcatenate: we cannot use it inside the loop, because then the script would generate one table per file). After that, reload that table with a Resident load using NoConcatenate. Then drop the first table, the one loaded without NoConcatenate (see the simplified sketch right after this list).

     2. Load the other CSV files with the same procedure as step 1.

     3. Finally, join these two tables.
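In general terms the pattern looks like this. It is only a simplified sketch: the connection name (lib://MY_FOLDER), the table names (Temp_Files, Final_Table) and the fields (field_a, field_b) are placeholders, not the real ones from my app.

// 1) Load every CSV into ONE temporary table. There is no NoConcatenate inside the loop,
//    so Qlik auto-concatenates the files because they all share the same field list.
for each vFile in filelist('lib://MY_FOLDER\*.csv')   // placeholder folder connection
    [Temp_Files]:
    LOAD
        field_a,        // placeholder fields
        field_b
    from [$(vFile)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);
next vFile;

// 2) Copy the temporary table into the final table with NoConcatenate,
//    so that later loads with the same field list are not appended to it.
[Final_Table]:
NoConcatenate
LOAD * Resident [Temp_Files];

// 3) Drop the temporary table so only Final_Table stays in the data model.
DROP TABLE [Temp_Files];

The full script follows below.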

// Load all files from the first folder (Import):
for each file in filelist('lib://IMPORT_CSV\*.csv')

    [join_ImportError_files]:
    LOAD
        "timestamp",
        timestamp as ts_import_join,
        "day" as day_join,
        "time" as time_join,
        "hour" as hour_join,
        requestor_lab,
        production_lab,
        last_timestamp,
        id_log,
        request_number as request_number_join,
        request_number as request_number_import,
        sample_number as sample_number_join,
        universal_test_id
    from [$(file)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file;

// Reload the table with NoConcatenate
[join_ImportError]:
NoConcatenate
LOAD
    "timestamp",
    ts_import_join,
    day_join,
    time_join,
    hour_join,
    requestor_lab,
    production_lab,
    last_timestamp,
    id_log,
    request_number_join,
    request_number_import,
    sample_number_join,
    universal_test_id
Resident join_ImportError_files;

// Drop the first table (the one loaded without NoConcatenate)
DROP TABLE [join_ImportError_files];

// I use this group-by filter to keep only the latest row per request/sample -> you can skip this step
[join_ImportError_filter_unique]:
inner keep
LOAD
    request_number_join,
    MAX(timestamp) as ts_import_join,
    MAX(day_join) as day_join,
    MAX(time_join) as time_join,
    MAX(hour_join) as hour_join,
    sample_number_join
resident join_ImportError
Group By request_number_join, sample_number_join;

drop table join_ImportError_filter_unique;

// Do the same steps for the second folder of CSV files
for each file2 in filelist('lib://DB_CSV\*.csv')

  [join_DB_SnapWroklistSample_files]:
  LOAD
      SAMPLECODE as sample_number_join,
      SAMPLECODE2,
      REQUESTCODE as request_number_join,
      REQUESTCODE as request_number_db,
      TIMEVALIDATED,
      TIMERESULT,
      DATETIMEREQ,
      DATETIMEREQ_TS,
      REQ_DATE,
      REQ_HOUR,
      TIMESAMPLEIN,
      TIMESAMPLEIN_TS as ts_db_join,
      SAMPLEIN_DATE,
      SAMPLEIN_HOUR
  from [$(file2)]
  (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file2;

[join_DB_SnapWroklistSample]:
NoConcatenate
LOAD
    sample_number_join,
    SAMPLECODE2,
    request_number_join,
    request_number_db,
    TIMEVALIDATED,
    TIMERESULT,
    DATETIMEREQ,
    DATETIMEREQ_TS,
    REQ_DATE,
    REQ_HOUR,
    TIMESAMPLEIN,
    ts_db_join,
    SAMPLEIN_DATE,
    SAMPLEIN_HOUR
Resident join_DB_SnapWroklistSample_files;

DROP TABLE [join_DB_SnapWroklistSample_files];

[join_SnapWroklistSample_filter_unique]:
inner keep
LOAD
    sample_number_join,
    request_number_join,
    MIN(ts_db_join) as ts_db_join
resident join_DB_SnapWroklistSample
Group By request_number_join, sample_number_join;

drop table join_SnapWroklistSample_filter_unique;

// Here we do the join
[join_DB_Import]:
NoConcatenate
Load request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join
resident [join_ImportError];

join
Load request_number_join, sample_number_join, ts_db_join
resident [join_DB_SnapWroklistSample];

// I use this to compute the difference between the two timestamps -> you can skip
[join_DB_Import2]:  // add_diff_field
NoConcatenate
LOAD request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join,
    ts_db_join, Num#(Interval(ts_db_join - ts_import_join, 'S'), '0.0') as diff3
RESIDENT [join_DB_Import];

DROP TABLE [join_ImportError];
DROP TABLE [join_DB_SnapWroklistSample];
DROP TABLE [join_DB_Import];
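If you want to check that the loops really ended up in a single final table, you can log the row count after the last drop. This is just a quick sketch; the table name must match the one in your own script:

// Optional sanity check: write the row count of the final table to the reload log.
LET vRows = NoOfRows('join_DB_Import2');
TRACE join_DB_Import2 has $(vRows) rows;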


2 Replies
Anil_Babu_Samineni

After running this script, what does your data model look like?
