Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
aj0031724
Partner - Creator
Partner - Creator

Load only new file also support overwrite if data changes

Dear Team,

I need your advice .

I have  a requirement to load files with filename elike  (RNC_date*.txt)  having 4 colmns WITH DATA like

DATE RNC HOUR(HH:MM)  VALUE

12.08.2015   R1 13:00       10

from specific folder(C:\STORAGE\NETWORK) per hour .

so each date there will be max 24 files if all are new or more if any existing hour value changes.

1)Situation-->

we need to add only new files each hour.

2) Also check if any xisting VALUE changes for unqiue combination of (DATE,RNC,HOUR) then remove the old one and insert the new data for that combination.

We have  a existing code that load all files from this folder .

So is there any ANY by which I can ensure that I load only new files and also update the old data if there is any changes.

Pleas suggest.

6 Replies
sunny_talwar

Have you looked at Incremental load?

You can a lot of sources here: Incremental Load in Qlikview - Sources

Incremental Load Scenarios.pdf

Not applicable

Hi Avinash,

It's hard to provide a working source code.

You can easily build a list of new files. Below are some highlights which should help

Get a list of files in your directory

For each file in FileList('C:\STORAGE\NETWORK\RNC_date*.txt')

  

TMP_FILE:

LOAD

subfield('$(file)','\',-1) as Filename

autogenerate 1;

  

Next file;

Compare it with a list you have already loaded. You can keep it in a qvd

Build a list of files which are in list1 and not in list2

I didn't get this requirement

2) Also check if any xisting VALUE changes for unqiue combination of (DATE,RNC,HOUR) then remove the old one and insert the new data for that combination.

Does it mean you need to check it in old files?

You can keep a hash for every file and compare it, if it doesn't match -> reload file.

Yury

aj0031724
Partner - Creator
Partner - Creator
Author

Dear Yury,

For point 2  I mean to say that I need to update BH values for any old data which got changed for combination of (DATE,RNC,HOUR).

for example-->

today when I load data it has values like

DATE RNC hour    BHVALUES

16/11/2015  R1    13:00    10.

NEXT DAY I got the modified value for the same combination like :

DATE RNC hour   BHVALUES

16/11/2015  R1  13:00  15.

so in my qvd file I need to remove the old data and insert the modified data for unqiue combination of (DATE,RNC,BH).

Can we handle this during load?

aj0031724
Partner - Creator
Partner - Creator
Author

Dear Yury,

For point 2  I mean to say that I need to update BH values for any old data which got changed for combination of (DATE,RNC,HOUR).

for example-->

today when I load data it has values like

DATE RNC hour    BHVALUES

16/11/2015  R1    13:00    10.

NEXT DAY I got the modified value for the same combination like :

DATE RNC hour   BHVALUES

16/11/2015  R1  13:00  15.

so in my qvd file I need to remove the old data and insert the modified data for unqiue combination of (DATE,RNC,BH).

Can we handle this during load?

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

In this scenario, it's easiest to just load everything that is available and not try to determine if a data value has changed. I'm assuming that the directories will get cleaned out on a regular basis, but there is no harm in loading the same data twice. The Exists() test in the QVD load below will prevent duplication.

1. Form a unique key for each record as: DATE & RNC & Hour as ID.

2. Load all data in the directory.

3. Concatenate Load your QVD with: WHERE NOT Exists(ID).

-Rob

http://masterssummit.com

http://qlikviewcookbook.com

Not applicable

Hi Avinash,

Please find my prototype.

You need to improve it for your case of course.

// Get a list of files from the source foulder

For each file in FileList('C:\Users\x\Desktop\191409\*.txt') 

    

TMP_FILE: 

LOAD 

subfield('$(file)','\',-1) as Filename 

,date(FileTime('$(file)'), 'DD/MM/YYYY hh:mm:ss')  as FileTime

autogenerate 1; 

    

Next file; 

// If it's not the very first run then read a list of files already loaded before

if not isnull(QVDCreateTime('C:\Users\x\Desktop\191409\last_load_file_list.qvd')) then

  left join(TMP_FILE)

  LOAD

     Filename

    ,LastLoadFileTime;

    LOAD

      Filename

     ,FileTime  as LastLoadFileTime

    FROM C:\Users\x\Desktop\191409\last_load_file_list.qvd (qvd);

   

  LOAD_FILES: 

  LOAD

     Filename

    ,FileTime

    ,if(FileTime <> LastLoadFileTime, 1, 0)  as LoadFlag

  RESIDENT TMP_FILE; 

  DROP TABLE TMP_FILE;

   

else    

   store TMP_FILE into C:\Users\x\Desktop\191409\last_load_file_list.qvd (qvd);

end if 

/* Files in LOAD_FILES are the once that should be reloaded

Please note that reload flag is based on FileTime

There are 2 cases:

1. Files have been changed, as a result olf filetime doesn't match to new flletime and file has to be reloaded (I DON'T CHECK FILE'S BODY, IT CAN BE CRITCAL IN YOUR CASE TO DO SO)

2. New files, in that case LastLoadFileTime is NULL and all new files will be marked to load

*/

/// HERE YOU LOAD YOUR FILES

// Save files time for next run

   store LOAD_FILES into C:\Users\x\Desktop\191409\last_load_file_list.qvd (qvd);