I'm trying to load 2 tables (TRANS, ADVM) from a Visual FoxPro database, and they need to be refreshed hourly. Since both tables hold a lot of transactions and it is our production database, I tried an hourly incremental load on each of the tables, storing them into a QVD and using the QVD for the application.
TRANS:
trans_ID,
time,
date,
operator,
trans_type
ADVM:
trans_ID,
event,
item,
soldon,
price
One table has date and time, but the other one has only the date, so I tried filtering by trans_ID, using a variable to hold the last known trans_ID. It was working well, but apparently the ID numbers don't go in chronological order, and I was losing a few transactions.
So I need a different way of getting the information.
I'm testing on the QVDs now. I tried doing a join and filtering by date and time on table TRANS, but that loads all 4 million records from ADVM. Does anyone have any ideas?
Transactions:
LOAD trans_ID,
time,
date,
operator,
trans_type
FROM
(qvd)
where date(date)>='5/7/2015' and time>'6:00:00';
left join(Transactions)
LOAD trans_act,
event,
item,
soldon,
price
FROM
(qvd);
If you want to do an incremental load you should be using Concatenate instead of Left Join. If you have a trans_ID that is unique for each transaction, you can use WHERE NOT Exists:
Transactions:
LOAD trans_ID,
time,
date,
operator,
trans_type
FROM
(qvd);
Concatenate(Transactions)
LOAD trans_act,
event,
item,
soldon,
price,
trans_ID
FROM
(qvd)
where not exists (trans_ID,trans_ID);
Here is more info on incremental loads with QVDs:
Incremental load is a very common task in relation to databases. It is defined as loading nothing but new or changed records from the database. All other data should already be available, in one way or another. With QVD files it is possible to perform incremental load in most cases.
The basic process is described below:
1. Load the new data from Database table (a slow process, but loading a limited number of records).
2. Load the old data from QVD file (loading many records, but a much faster process).
3. Create a new QVD file.
4. Repeat the procedure for every table loaded.
The complexity of the actual solution depends on the nature of the source database, but the following basic cases can be identified:
1) Case 1: Append Only (typically log files)
2) Case 2: Insert Only (No Update or Delete)
3) Case 3: Insert and Update (No Delete)
4) Case 4: Insert, Update and Delete
Below you will find outlined solutions for each of these cases. The reading of QVD files can be done in either optimized mode or standard mode. (The method employed is automatically selected by the QlikView script engine depending on the complexity of the operation.) Optimized mode is (very approximately) about 10x faster than standard mode or about 100x faster than loading the database in the ordinary fashion.
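As a rough illustration of the optimized/standard distinction (the file and field names below are made up, not from the thread): an optimized QVD read allows little more than renaming fields and a single-field Exists() filter, while any calculation or other WHERE clause drops the load to standard mode.

Fast:
LOAD trans_ID,
     trans_type as Type          // renames are allowed in optimized mode
FROM Trans.qvd (qvd)
WHERE Exists(trans_ID);          // single-field Exists keeps the load optimized

Slow:
LOAD trans_ID,
     Upper(trans_type) as Type   // any transformation forces standard mode
FROM Trans.qvd (qvd)
WHERE date >= '5/7/2015';        // as does any other WHERE condition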
The simplest case is the one of log files; files in which records are only appended and never deleted. The following conditions apply:
Script Example:
Buffer (Incremental) Load * From LogFile.txt (ansi, txt, delimiter is '\t', embedded labels);
If the data resides in a database other than a simple log file the case 1 approach will not work. However, the problem can still be solved with minimum amount of extra work. The following conditions apply:
Script Example:
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(BeginningThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;
STORE QV_Table INTO File.QVD;
(The hash signs in the SQL WHERE clause define the beginning and end of a date. Check your database manual for the correct date syntax for your database.)
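The script above assumes the variables LastExecTime and BeginningThisExecTime already exist. A minimal sketch of how they might be maintained (the variable names match the script above; the surrounding structure is an assumption):

// Capture the window boundary before reloading.
Let BeginningThisExecTime = Now();

// ... run the SQL SELECT / Concatenate / STORE steps shown above ...

// Only advance the low-water mark once the reload succeeded,
// so a failed run does not skip records.
If ScriptErrorCount = 0 then
    Let LastExecTime = BeginningThisExecTime;
End If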
The next case is applicable when data in previously loaded records may have changed between script executions. The following conditions apply:
Script Example:
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT Exists(PrimaryKey);
STORE QV_Table INTO File.QVD;
The most difficult case to handle is when records are actually deleted from the source database between script executions. The following conditions apply:
Script Example:
Let ThisExecTime = Now( );
QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT EXISTS(PrimaryKey);
Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;
If ScriptErrorCount = 0 then
STORE QV_Table INTO File.QVD;
Let LastExecTime = ThisExecTime;
End If
That is a very good answer. I tried it, but it's not working because the field trans_ID is not unique in table ADVM, so when I load it with the WHERE NOT Exists clause it only loads one record when there should be more. Do you have any other ideas?
Do you have a way to create a unique key, maybe by combining a key with another value? If not, then you might be able to insert the values based on a last-updated date.
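For example (a sketch only — which combination of fields actually makes an ADVM row unique is an assumption here), a composite key can be built on both sides and checked with the two-argument form of Exists():

// Assumption: trans_ID + soldon + item together identify an ADVM row.
ExistingKeys:
LOAD trans_ID & '|' & soldon & '|' & item as AdvmKey
FROM Advm.qvd (qvd);             // keys already present in the stored QVD

NewAdvm:
LOAD trans_act, event, item, soldon, price, trans_ID
FROM Advm_new.qvd (qvd)
WHERE Not Exists(AdvmKey, trans_ID & '|' & soldon & '|' & item);

DROP Table ExistingKeys;         // helper table is no longer needed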
Try this:
Transactions:
LOAD trans_ID,
trans_ID as LookUp_trans_ID,
time,
date,
operator,
trans_type
FROM
(qvd);
Concatenate(Transactions)
LOAD trans_act,
event,
item,
soldon,
price,
trans_ID
FROM
(qvd)
where not exists ( LookUp_trans_ID,trans_ID);
DROP FIELD LookUp_trans_ID;
I found another field that I wasn't loading before that has unique values, and I was able to apply your technique to that. It works perfectly!
Thank you so much!