rittermd
Master

Load Question

I am getting a csv file placed into a folder each night from our HR system. The files contain employee data; each day's file has the same layout, and most of the data in today's file is the same as what came in yesterday. There can be some new records, and some fields can change.

Currently the load script is set up to simply add today's records to what was already there, so each day my qvd file grows with mostly redundant data.

I need to find the best process in the script to load only the new records and updated fields each day instead of adding all of the records again.

If you need more information or if I didn't explain this clearly, please let me know.

4 Replies
sunny_talwar

Is there a unique identifier that can help identify the existing records? If there is one, you can load only the data that doesn't already exist by using a Where Not Exists() condition. The logic you need is an incremental load... Check these links to see how it will work for you; a minimal sketch follows them:

https://www.resultdata.com/three-types-of-qlikview-incremental-loads/

Incremental Load in QlikView – Part1 – Learn QlikView

https://www.analyticsvidhya.com/blog/2014/09/qlikview-incremental-load/
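
For example, a minimal sketch of the insert-only variant (EmployeeID and the file names below are placeholders, not taken from your data):

// Insert-only incremental load sketch; all names here are assumptions,
// and it assumes the qvd already exists from an earlier run.
Employees:
LOAD EmployeeID,
     Name,
     Department
FROM Employees.qvd (qvd);                  // history accumulated so far

Concatenate (Employees)
LOAD EmployeeID,
     Name,
     Department
FROM Employees_new.csv (txt, utf8, embedded labels, delimiter is ',')
Where Not Exists(EmployeeID);              // keep only keys not already loaded

STORE Employees INTO Employees.qvd (qvd);

Note that this variant only picks up brand-new keys; handling rows that were updated for an existing key needs the insert-and-update variant.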

jagan
Luminary Alumni

Hi,

To do this you need the following (a rough sketch follows the list):

- A date field that identifies when a record was modified (something like a Modified/Updated Date field)

- A primary key to identify each record

- An understanding of the data pattern:

     - Are we deleting any records, or just inserting and updating records?
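
Assuming such a modified/updated date field and a primary key exist (the field, file and variable names below are only placeholders), the insert-and-update pattern looks roughly like this:

// Rough sketch only; ModifiedDate, EmployeeID and the file names are assumed,
// and vLastReload would hold the timestamp of the previous successful reload.
Changes:
LOAD EmployeeID,
     Name,
     ModifiedDate
FROM Employees_today.csv (txt, utf8, embedded labels, delimiter is ',')
Where ModifiedDate >= '$(vLastReload)';      // only rows changed since the last run

Concatenate (Changes)
LOAD EmployeeID,
     Name,
     ModifiedDate
FROM Employees.qvd (qvd)
Where Not Exists(EmployeeID);                // unchanged rows carried over from the qvd

STORE Changes INTO Employees.qvd (qvd);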

You can check the sample script and examples in the QlikView help file.

Check this topic:

Using QVD Files for Incremental Load

You will get a very good explanation of all the scenarios.

Regards,

Jagan.

rittermd
Master
Author

I don't have direct access to the SQL data, and there is no modified date.

The back-end system generates a csv file each day. The file name includes the date it was created.

So currently the task is set up to reload every csv file in the folder. There are 280 of them now, and eventually there will be over 1000.

There is a unique key (employee id).

Every new csv file contains the new hires as well as a record for each employee that was in the prior day's csv file. Many of those records are identical; some might have updates. With our current process there is no real way to tell except to compare every field and update if a change is found.

Is there a good way to do this when you can't access the data directly and are dealing with a daily csv file?
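
Would it even work to point the task at just the newest csv, since the date is in the file name? Something roughly like this, assuming names along the lines of Employees_YYYYMMDD.csv so the date part sorts as plain text (the folder and mask are just examples):

// Find the newest daily file by name; folder, mask and the YYYYMMDD
// date format are assumptions about the setup.
LET vLatestFile = '';
FOR EACH vFile in FileList('C:\HRDrop\Employees_*.csv')
    IF Len('$(vLatestFile)') = 0 OR '$(vFile)' > '$(vLatestFile)' THEN
        LET vLatestFile = '$(vFile)';
    END IF
NEXT vFile

Employees:
LOAD *
FROM [$(vLatestFile)] (txt, utf8, embedded labels, delimiter is ',');

That still would not catch anyone who drops out of the feed, so presumably it would have to be merged against the existing qvd anyway.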

sunny_talwar

Well, assuming that the new file's record for an employee id carries all the up-to-date information, I would drop any existing occurrence of that employee id and keep the new record:

NewFile:
LOAD empID,
     ...
FROM NewEmpId;

ExistingData:
LOAD empID,
     ...
FROM ExistingQVD
Where not Exists(empID);

Concatenate (ExistingData)
LOAD *
Resident NewFile;
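
After the concatenation the temporary NewFile table is no longer needed, and the merged result still has to be written back so the qvd stops regrowing; a completion along these lines (the qvd name is a placeholder):

DROP Table NewFile;                           // its rows now live in ExistingData
STORE ExistingData INTO Employees.qvd (qvd);  // overwrite the qvd with the merged data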