Anonymous
Not applicable

Incremental load

Hello,

I am new to QlikView. In my project we are doing an incremental load, and we receive the data as QVX files.

QVX files arrive daily and contain only the incremental data. My script converts that incremental data into a QVD, but it does not append the incremental data to the existing QVD.

Inserts:
LOAD *
FROM $(Vqvxpath)packets_inserts.qvx (qvx);

date_packets:
LOAD Max(packet_record_updated_datetime) as packets_Maxdate
Resident Inserts;

LET VDate_packets = Peek('packets_Maxdate', 0, 'date_packets');
DROP Table date_packets;

STORE Inserts INTO $(Vqvdpath)packets.QVD (qvd);
DROP Table Inserts;

Updates:
LOAD *
FROM $(Vqvxpath)packets_updates.qvx (qvx)
Where packet_record_updated_datetime > '$(VDate_packets)';

Concatenate
LOAD *
FROM $(Vqvdpath)packets.qvd (qvd)
Where Not Exists(team_key);

STORE Updates INTO $(Vqvdpath)packets.QVD (qvd);

Please help with this issue.

Regards,

Kumar

19 Replies
Peter_Cammaert
Partner - Champion III

Yes, it's perfectly possible to do what you want (even with separate inserts and updates). Have a look at Jonathan's suggestion below, because his script change implements what you're after. Taking into account what Jd Sreeni said, the basic mechanism is as follows:

Every day you'll have three sets of rows:

  • A history file called packets.qvd: the complete data set created during yesterday's reload.
  • An "inserts" file called packets_inserts.qvx: the new records added to the data source since the last reload.
  • An "updates" file called packets_updates.qvx: the updates made to existing records since the last reload.

The incremental load consists of the following steps:

  • Load all inserts into memory. Let's call this table NewData.
  • Add all updates to NewData that are more recent than yesterday's history file
  • Add all records to NewData from yesterday's history file that do not already exist in NewData
  • Store NewData to packets.qvd. We just created an up-to-date reference file that tomorrow will become the history file.

To make this work, you'll need to decide on a Primary Key that identifies each record in your QVD in a unique way. Using this primary key, you can decide whether a record already exists in NewData and should not be loaded again from anywhere else.
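The four steps above can be sketched as a single script. This is only a sketch: it assumes the file names used elsewhere in this thread, that team_key is the primary key, and that the variable VDate_packets already holds the latest timestamp found in the history file.

```qlikview
// Step 1: load all inserts into memory.
NewData:
LOAD *
FROM [$(Vqvxpath)packets_inserts.qvx] (qvx);

// Step 2: add the updates that are more recent than yesterday's history.
Concatenate (NewData)
LOAD *
FROM [$(Vqvxpath)packets_updates.qvx] (qvx)
Where packet_record_updated_datetime > '$(VDate_packets)';

// Step 3: add the history rows whose primary key is not already in NewData.
Concatenate (NewData)
LOAD *
FROM [$(Vqvdpath)packets.qvd] (qvd)
Where Not Exists(team_key);

// Step 4: a single STORE at the end; today's QVD becomes tomorrow's history.
STORE NewData INTO [$(Vqvdpath)packets.qvd] (qvd);
```

Note that Exists(team_key) checks against every team_key value loaded so far in the script, which is why the history load must come last.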

Best,

Peter

Anonymous
Not applicable
Author

Hi Peter,

I am still confused. Please bear with me.

In the full load I got packets_inserts.QVX at 400 MB and packets_updates.QVX at 300 MB. From these I created packets.qvd.

On the second day we did an incremental load and got packets_inserts.QVX at 100 MB and packets_updates.qvx at 30 MB, and these files override the existing QVX files. So I only have the fresh data, and from it I want a QVD containing both the full-load and the incremental data.

With my script, packets.qvd is created from the incremental data only and does not contain the previous (full-load) data. But the QVD has to contain the full-load data as well as the incremental data.

Regards,

Kumar

Anonymous
Not applicable
Author

Hi Jonathan,

Thanks for your support. I used your script, but it still creates the QVD from the incremental data only; the previous data is not in that QVD. I want both the full and the incremental data.

I think this is because the QVX files are being overridden, so the QVD is created from that data only.

Regards,

Kumar

Peter_Cammaert
Partner - Champion III

No problem, I'll just repeat what I said before about the Incremental Load steps:

  • Load all inserts into memory. Let's call this table NewData.
  • Add all updates to NewData that are more recent than yesterday's history file
  • Add all records to NewData from yesterday's history file that do not already exist in NewData
  • Store NewData to packets.qvd. We just created an up-to-date reference file that tomorrow will become the history file.

You see, there are no STORE statements in between these steps, only a single one at the end. The idea is that you assemble your final table in memory from the different local data sources: the Inserts QVX, the Updates QVX and the history QVD. The three thrown together are stored as the new reference QVD. At that point, this QVD contains records from all three sources, without duplicates.

Adjust your script accordingly. Start by eliminating the extra STORE statements, as otherwise you'll be overwriting your QVD again and again.

Best,

Peter

[Edit] The third step is the one that adds the full-load records that haven't changed. The history file is just yesterday's version of packets.qvd; in the last step, we overwrite that file with today's (new) version.

Anonymous
Not applicable
Author

Hi Peter,

Thanks for your patience.

Inserts:
LOAD *
FROM $(Vqvxpath)packets_inserts.qvx (qvx);

date_packets:
LOAD Max(packet_record_updated_datetime) as packets_Maxdate
Resident Inserts;

LET VDate_packets = Peek('packets_Maxdate', 0, 'date_packets');
DROP Table date_packets;

Concatenate (Inserts)
LOAD *
FROM $(Vqvxpath)packets_updates.qvx (qvx)
Where packet_record_updated_datetime > '$(VDate_packets)';

Concatenate (Inserts)
LOAD *
FROM $(Vqvdpath)packets.qvd (qvd)
Where Not Exists(team_key);

This script creates packets.QVD. During the incremental load it has to check the already-created packets.QVD and then append the incremental data to that existing QVD. But in our script we are not able to check the already-created packets.QVD, like below:

maxdate:
LOAD Max(date) as maxdate
FROM old.qvd (qvd);

Please help with this.

Regards,

Kumar
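One way to do the check Kumar is asking for is to read the latest timestamp out of the existing QVD itself before loading the deltas. A minimal sketch, reusing the field and path names from this thread:

```qlikview
// Read the newest update timestamp from yesterday's packets.qvd.
tmpMaxDate:
LOAD Max(packet_record_updated_datetime) as maxdate
FROM [$(Vqvdpath)packets.qvd] (qvd);

LET VDate_packets = Peek('maxdate', 0, 'tmpMaxDate');
DROP Table tmpMaxDate;

// VDate_packets can now filter the incoming QVX deltas, e.g.:
// Where packet_record_updated_datetime > '$(VDate_packets)'
```

This way the cut-off date comes from the stored history rather than from the fresh inserts, so it survives the QVX files being overridden each day.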

Anonymous
Not applicable
Author

Hi,

It is a primary key and it has unique values.

Regards,

Kumar

Anonymous
Not applicable
Author

Hi Sreeni,

Can you help me with where I have to place these flags in the above script?

Regards,
Kumar

Peter_Cammaert
Partner - Champion III

There are two ways to go from here:

  1. Do you want to keep only one version of every row with a unique team_key? The effect is that you keep the updated row from the Updates QVX and throw away the old row from packets.qvd.
  2. Or do you want to keep all previous versions of rows with the same team_key value? The effect is that all rows from the history file are kept, and all inserts and updates are simply added to the existing QVD. This way, you can go back to previous versions by date of change.

I think you want the first solution. But better be sure about this, so please make your selection.

Peter

Peter_Cammaert
Partner - Champion III

If you select the first approach, try with this script:

// Every day, we have fresh Inserts and fresh Updates in QVX files
NewPackets:
LOAD *
FROM [$(Vqvxpath)packets_updates.qvx] (qvx); // Updates may be on Inserts

CONCATENATE (NewPackets)
LOAD *
FROM [$(Vqvxpath)packets_inserts.qvx] (qvx)
WHERE Not Exists(team_key); // Add Inserts that do not exist yet

CONCATENATE (NewPackets)
LOAD *
FROM [$(Vqvdpath)packets.qvd] (qvd)
WHERE Not Exists(team_key); // Add history that hasn't been updated

STORE NewPackets INTO [$(Vqvdpath)packets.qvd] (qvd); // Overwrite history

Best,

Peter

Anonymous
Not applicable
Author

Hi Peter,

I want to maintain all versions of data including old and new.

Regards,

Kumar
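Keeping all versions (the second option above) needs no Exists() filter at all: the deltas are simply appended to the history, so old rows survive alongside their updates. A sketch, under the same file-name assumptions as the rest of the thread:

```qlikview
// Start from the full history: every version kept so far.
AllPackets:
LOAD *
FROM [$(Vqvdpath)packets.qvd] (qvd);

// Append today's new rows.
Concatenate (AllPackets)
LOAD *
FROM [$(Vqvxpath)packets_inserts.qvx] (qvx);

// Append today's changed rows as new versions; the old versions stay.
Concatenate (AllPackets)
LOAD *
FROM [$(Vqvxpath)packets_updates.qvx] (qvx);

STORE AllPackets INTO [$(Vqvdpath)packets.qvd] (qvd);
```

If the same QVX files could ever be reloaded twice, a Where filter on packet_record_updated_datetime would guard against appending the same deltas again.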