Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
gerhardl
Creator II
Creator II

Insert and Replace in QVD

Hi there,

I load the following files into a QV Master File:

Daily Account Extract (about 500,000 records, growing daily). I replace this daily with a NEW file.

Daily Transaction Extracts. I load all historical extracts. Currently about 3 million records. Older extracts tored as QVD, to which I concatenate new daily files. Then I update the QVD file every month or so to keep my script load time short.

Daily ApplicationExtracts. I load all historical extracts. Currently about 500,000records. Older extracts tored as QVD, to which I concatenate new daily files. Then I update the QVD file every month or so to keep my script load time short.

The QVD's run quickly - the most of the script run time is spent on the account extract.

My question:

Can I somehow store a QVD of the account extract, then replace the extract daily, but when I run the script, it does an insert and replace type thing.

So it will add all new records (new accounts) that are NOT in the QVD, and it will replace all fields that are different to that of the QVD file.

Possible?

Thanks.

Gerhard

6 Replies
Jason_Michaelides
Luminary Alumni
Luminary Alumni

Hi,

This is normally done with a LastModifiedDate and ID in your data:

//first load modified records from the DB - this will get new and updated rows

Data:

LOAD

  ID,

  Blah,

  Blah2,

  Blah3,

; SQL....

WHERE LastModifiedDate > $(vLastUpdated);

//now load data from your QVD file that isn't in the above load

Concatenate (Data)

LOAD * FROM Data.qvd(qvd) WHERE NOT EXISTS(ID);

//wrote back to the qvd file

STORE Data INTO Data.qvd;

Hope this helps,

Jason

gerhardl
Creator II
Creator II
Author

Hi Jason, thanks for the response.

Don't think I explained properly though:

I don't load directly from a database. I am still trying to get the entire company on board with QlikView.

Our system does a batch run every morning and generates reports, which are mostly text files.

The Account Extract, Transaction and Application extracts are text files, which I then copy to my local drive and my script loads from there.

So the account extract is basically a daily snapshot of our client base. For old exisiting accounts, the data might change daily (obviously the account number and most of the customer detail will remain constant, but a lot of fields will change daily, like "last payment date" and "amount in arrears" etc.). And accounts opened the previous day will be new records in the extract.

I replace the account extract every day and load the newest one - so I don't really have a "last update" or "last modified" to look at.

Maybe I'm misunderstanding you though - I'm still learning all this.

magavi_framsteg
Partner - Creator III
Partner - Creator III

Hi gerhardl.

Then I think you need to do as Jason pointed out.

Since the Account details change you need to run it every day.

Maybe, if possible, you can load only the changed Accounts and load them into QV first.

ie.

Accounts:

LOAD ChangedAccounts;

CONCATENATE (Accounts)

LOAD AllAccounts Where NOT Exists (AccountID);

Or maybe it's just quicker to do a total load every day, you'll have to verify that.

And also, my understanding is that you're doing a proof of concept kind of application, so maybe it's not cruical if the load takes a bit longer. In the real world, when loading from database, it is very convenient to implement an incremental load just as Jason pointed out.

Kind regards

Magnus Åvitsland

BI Consultant

Framsteg Business Intelligence Corp.

gerhardl
Creator II
Creator II
Author

Hi Magnus,

I have no way of identifying the changed accounts myself.

Okay - I think I will stick to loading the full, replaced file every day for now.

Thank you both

Jason_Michaelides
Luminary Alumni
Luminary Alumni

There's always a way....

I have worked with this type of situation before - where records change but I have no way of knowing which ones have.  So, I built a concatenated string of every field that is likely to change, wrapped it in Hash128() to keep it smaller (you may need to use Hash192() as your recordset is quite large) and used this is my marker for the EXISTS() function:

//This is your INITIAL, FULL load - just done once to create your initial QVD file

Data:

LOAD

  Hash128(ChangingField1 & '/' & ChangingField2 & '/' & ChangingField3 & '/'...etc)     AS     KEY,

  ID,

  Blah,

  Blah2,

  Blah3,

FROM your text file;

//write to the qvd file

STORE Data INTO Data.qvd;

Now onto the daily work...

//first, load the QVD file into memory

Data:

LOAD * FROM Data.qvd(qvd);

//Then, load NEW and UPDATED records from the text file

Concatenate (Data)

LOAD

  Hash128(ChangingField1 & '/' & ChangingField2 & '/' & ChangingField3 & '/'...etc)     AS     KEY,

  ID,

  Blah,

  Blah2,

  Blah3,

  1               AS     NewOrUpdatedFlag

FROM your text file

WHERE NOT EXISTS(KEY,Hash128(ChangingField1 & '/' & ChangingField2 & '/' & ChangingField3 & '/'...etc));

//so, now you have all yesterday's records, plus any new and changed records .

//HOWEVER,changed records now exist in both their old AND new incarnations.

//So, we must now delete the old records

<This bit I haven't worked out yet and have run out of time! Will complete later if you haven't worked it out before me!  Trying to avoid another RESIDENT load with ORDER BY as that will be slow...>

//wrote back to the qvd file

STORE Data INTO Data.qvd;

Hope this helps,

Jason

gerhardl
Creator II
Creator II
Author

I like the optimism!

That looks insanely complicated though... especially with the size of my script as it is.

It looks like this:

//Account Extract:

Load <about 150 fields>

From <my text file which I replace daily with the newest file. It is possible that ANY field can change, except for the Account No> ;

//Transaction Extracts:

TranQVD:

Load * From TranQVD.qvd(qvd);

Concatenate(TranQVD)

LOAD <about 30 fields>

FROM <a folder full of text files>

Where exists( [Account No]);

//Funeral Policy Holder Extracts:

LOAD <about 20 fields>

FROM <a text file I also replace every day - small file so I'm happy to do full load daily>;

//Application Extracts:

AppsQVD:

Load * From AppsQVD.qvd (qvd);

concatenate(AppsQVD)

LOAD <about 90 fields>

FROM <a folder full of text files>;

If you figure it out ot would really help me if you can edit the above to show what it should look like - I'm going to see how far I get with your above comments.

In the Account Extract, Transaction Extract and FPH Extract the [Account No] will be KEY field.

In the Account Extract and Application Extract the [Appl No] is a KEY field.