Anonymous
Not applicable

Load multiple CSV files using NoConcatenate after loading

I'm trying to load all CSV files from a folder and then apply NoConcatenate.

I need this so I can do a JOIN after loading other CSV files from a second folder.

The steps I want to follow:

     1. Load all CSV files into one table. Only after loading do I want to apply NoConcatenate; if I put it inside the loop, the script loads each file into a separate table. So do I need to load all files first and only then add the NoConcatenate line?

     2. Load the other CSV files. Here too I want to load all files first and then use a NoConcatenate line to avoid concatenation with the first table.

     3. Finally, join these two tables.

Any help? I also tried reloading the table with NoConcatenate and dropping the old one, but the problem is still not resolved.

This is the script I used; it starts by loading all files from the first folder:

// First folder
for each file in filelist('lib://IMPORT_CSV\*.csv')

    [join_ImportError_files]:
    LOAD
        "timestamp",
        timestamp as ts_import_join,
        "day" as day_join,
        "time" as time_join,
        "hour" as hour_join,
        requestor_lab,
        production_lab,
        last_timestamp,
        id_log,
        request_number as request_number_join,
        request_number as request_number_import,
        sample_number as sample_number_join,
        universal_test_id
    from [$(file)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file;

[join_ImportError]:
NoConcatenate
LOAD
    "timestamp",
    ts_import_join,
    day_join,
    time_join,
    hour_join,
    requestor_lab,
    production_lab,
    last_timestamp,
    id_log,
    request_number_join,
    request_number_import,
    sample_number_join,
    universal_test_id
Resident join_ImportError_files;

DROP TABLE [join_ImportError_files];

[join_ImportError_filter_unique]:
inner keep
LOAD
    request_number_join,
    MAX(timestamp) as ts_import_join,
    MAX(day_join) as day_join,
    MAX(time_join) as time_join,
    MAX(hour_join) as hour_join,
    sample_number_join
resident join_ImportError
Group By request_number_join, sample_number_join;

drop table join_ImportError_filter_unique;

// The second folder (DB)
for each file in filelist('lib://DB_CSV\*.csv')

  [join_DB_SnapWroklistSample_files]:
  // NoConcatenate
  LOAD
      SAMPLECODE as sample_number_join,
      SAMPLECODE2,
      REQUESTCODE as request_number_join,
      REQUESTCODE as request_number_db,
      TIMEVALIDATED,
      TIMERESULT,
      DATETIMEREQ,
      DATETIMEREQ_TS,
      REQ_DATE,
      REQ_HOUR,
      TIMESAMPLEIN,
      TIMESAMPLEIN_TS as ts_db_join,
      SAMPLEIN_DATE,
      SAMPLEIN_HOUR
  from [$(file)]
  (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

[join_DB_SnapWroklistSample]:
NoConcatenate
LOAD
    sample_number_join,
    SAMPLECODE2,
    request_number_join,
    request_number_db,
    TIMEVALIDATED,
    TIMERESULT,
    DATETIMEREQ,
    DATETIMEREQ_TS,
    REQ_DATE,
    REQ_HOUR,
    TIMESAMPLEIN,
    ts_db_join,
    SAMPLEIN_DATE,
    SAMPLEIN_HOUR
Resident join_DB_SnapWroklistSample_files;

DROP TABLE [join_DB_SnapWroklistSample_files];

[join_SnapWroklistSample_filter_unique]:
inner keep
LOAD
    sample_number_join,
    request_number_join,
    MIN(ts_db_join) as ts_db_join
resident join_DB_SnapWroklistSample
Group By request_number_join, sample_number_join;

drop table join_SnapWroklistSample_filter_unique;

[join_DB_Import]:
NoConcatenate
Load request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join
resident [join_ImportError];

join
Load request_number_join, sample_number_join, ts_db_join
resident [join_DB_SnapWroklistSample];

[join_DB_Import2]:  // add_diff_field
NoConcatenate
LOAD request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join,
    ts_db_join, Num#(Interval(ts_db_join - ts_import_join, 'S'), '0.0') as diff3
RESIDENT [join_DB_Import];

DROP TABLE [join_ImportError];
DROP TABLE [join_DB_SnapWroklistSample];
DROP TABLE [join_DB_Import];

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

I resolved the problem. This is the solution.

The steps are the same:

     1. Load all the CSV files into one table (without NoConcatenate: we cannot use it inside the loop, because then the script would generate one table per file). After that, reload that table with a Resident load using NoConcatenate. Then drop the first table, the one loaded without NoConcatenate (see the simplified sketch right after this list).

     2. Load the other CSV files with the same procedure as step 1.

     3. Finally, join these two tables.
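In general terms the pattern looks like this. It is only a simplified sketch: the connection name (lib://MY_FOLDER), the table names (Temp_Files, Final_Table) and the fields (field_a, field_b) are placeholders, not the real ones from my app.

// 1) Load every CSV into ONE temporary table. There is no NoConcatenate inside the loop,
//    so Qlik auto-concatenates the files because they all share the same field list.
for each vFile in filelist('lib://MY_FOLDER\*.csv')   // placeholder folder connection
    [Temp_Files]:
    LOAD
        field_a,        // placeholder fields
        field_b
    from [$(vFile)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);
next vFile;

// 2) Copy the temporary table into the final table with NoConcatenate,
//    so that later loads with the same field list are not appended to it.
[Final_Table]:
NoConcatenate
LOAD * Resident [Temp_Files];

// 3) Drop the temporary table so only Final_Table stays in the data model.
DROP TABLE [Temp_Files];

The full script follows below.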

// Load all files from the first folder (Import):
for each file in filelist('lib://IMPORT_CSV\*.csv')

    [join_ImportError_files]:
    LOAD
        "timestamp",
        timestamp as ts_import_join,
        "day" as day_join,
        "time" as time_join,
        "hour" as hour_join,
        requestor_lab,
        production_lab,
        last_timestamp,
        id_log,
        request_number as request_number_join,
        request_number as request_number_import,
        sample_number as sample_number_join,
        universal_test_id
    from [$(file)]
    (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file;

// Reload the table with NoConcatenate
[join_ImportError]:
NoConcatenate
LOAD
    "timestamp",
    ts_import_join,
    day_join,
    time_join,
    hour_join,
    requestor_lab,
    production_lab,
    last_timestamp,
    id_log,
    request_number_join,
    request_number_import,
    sample_number_join,
    universal_test_id
Resident join_ImportError_files;

// Drop the first table (the one loaded without NoConcatenate)
DROP TABLE [join_ImportError_files];

// I use this group-by filter to keep only the latest row per request/sample -> you can skip this step
[join_ImportError_filter_unique]:
inner keep
LOAD
    request_number_join,
    MAX(timestamp) as ts_import_join,
    MAX(day_join) as day_join,
    MAX(time_join) as time_join,
    MAX(hour_join) as hour_join,
    sample_number_join
resident join_ImportError
Group By request_number_join, sample_number_join;

drop table join_ImportError_filter_unique;

// Do the same steps for the second folder of CSV files
for each file2 in filelist('lib://DB_CSV\*.csv')

  [join_DB_SnapWroklistSample_files]:
  LOAD
      SAMPLECODE as sample_number_join,
      SAMPLECODE2,
      REQUESTCODE as request_number_join,
      REQUESTCODE as request_number_db,
      TIMEVALIDATED,
      TIMERESULT,
      DATETIMEREQ,
      DATETIMEREQ_TS,
      REQ_DATE,
      REQ_HOUR,
      TIMESAMPLEIN,
      TIMESAMPLEIN_TS as ts_db_join,
      SAMPLEIN_DATE,
      SAMPLEIN_HOUR
  from [$(file2)]
  (txt, codepage is 28591, embedded labels, delimiter is ',', msq);

next file2;

[join_DB_SnapWroklistSample]:
NoConcatenate
LOAD
    sample_number_join,
    SAMPLECODE2,
    request_number_join,
    request_number_db,
    TIMEVALIDATED,
    TIMERESULT,
    DATETIMEREQ,
    DATETIMEREQ_TS,
    REQ_DATE,
    REQ_HOUR,
    TIMESAMPLEIN,
    ts_db_join,
    SAMPLEIN_DATE,
    SAMPLEIN_HOUR
Resident join_DB_SnapWroklistSample_files;

DROP TABLE [join_DB_SnapWroklistSample_files];

[join_SnapWroklistSample_filter_unique]:
inner keep
LOAD
    sample_number_join,
    request_number_join,
    MIN(ts_db_join) as ts_db_join
resident join_DB_SnapWroklistSample
Group By request_number_join, sample_number_join;

drop table join_SnapWroklistSample_filter_unique;

// Here we do the join
[join_DB_Import]:
NoConcatenate
Load request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join
resident [join_ImportError];

join
Load request_number_join, sample_number_join, ts_db_join
resident [join_DB_SnapWroklistSample];

// I use this to compute the difference between the two timestamps -> you can skip
[join_DB_Import2]:  // add_diff_field
NoConcatenate
LOAD request_number_join, sample_number_join, ts_import_join, day_join, time_join, hour_join,
    ts_db_join, Num#(Interval(ts_db_join - ts_import_join, 'S'), '0.0') as diff3
RESIDENT [join_DB_Import];

DROP TABLE [join_ImportError];
DROP TABLE [join_DB_SnapWroklistSample];
DROP TABLE [join_DB_Import];
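If you want to check that the loops really ended up in a single final table, you can log the row count after the last drop. This is just a quick sketch; the table name must match the one in your own script:

// Optional sanity check: write the row count of the final table to the reload log.
LET vRows = NoOfRows('join_DB_Import2');
TRACE join_DB_Import2 has $(vRows) rows;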


2 Replies
Anil_Babu_Samineni

After running this script, what does your data model look like?
