Not applicable

QVDs creation?

Hi All,

What's the best way of creating QVDs and maintaining (reloading) them? I searched the forum but I'm still not sure, so I'm really looking for suggestions. I have sales data that is dumped every day, and the records run into the thousands each day. I planned to have a monthly QVD that is created every month (not sure how) and then a daily QVD that just reloads and appends the data to the existing monthly QVD, since reloading the monthly sales and orders QVD itself takes more than 30 minutes.

Can someone please suggest the best way of maintaining this? My client wants current + 2 years of data, which is more than 100 million rows, so I'd also welcome any alternative approach to creating QVDs for data at this volume...

My actual question is: what's the best way to maintain QVDs for data that is dumped every day? I can also send the names of the source files if someone can help me further.

Mama

7 Replies
suniljain
Master

Using QVD files for incremental load

Incremental load is a very common task in relation to databases. It is defined as loading nothing but new or changed records from the database; all other data should already be available, in one way or another. QVD files make it possible to perform incremental load in most cases.

The basic process is described below:

1. Load the new data from the database table (a slow process, but loading only a limited number of records).

2. Load the old data from the QVD file (loading many records, but a much faster process).

3. Create a new QVD file.

4. Repeat the procedure for every table loaded.

The complexity of the actual solution depends on the nature of the source database, but the following basic cases can be identified:

1) Append only (Logfiles).

2) Insert only (No Update or Delete).

3) Insert and Update (No Delete).

4) Insert, Update and Delete.

Below you will find outlined solutions for each of these cases. The reading of QVD files can be done in either "super-fast" mode or "standard" mode. (The method employed is selected automatically by the QlikView script engine depending on the complexity of the operation.) "Super-fast" mode is (very approximately) about 10x faster than "standard" mode, or about 100x faster than loading from the database in the ordinary fashion.

Case 1: Append only

The simplest case is the one of log files, which are only appended. The following conditions apply:

  • The database must be a log file which is contained in a text file (no ODBC/OLE DB).

  • QlikView keeps track of the number of records that have been previously read and loads only records added at the end of the file.

Script example:

Buffer (Incremental) Load * From LogFile.txt (ansi, txt, delimiter is '\t', embedded labels);

Case 2: Insert only. (No Update or Delete)

If the data resides in a database other than a simple log file, the case 1 approach will not work. However, the problem can still be solved with a minimum amount of extra work. The following conditions apply:

  • The data source can be any database.

  • QlikView loads records inserted in the database after the last script execution.

  • A field ModificationDate (or similar) is required for QlikView to recognize which records are new.

Script example:

QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(BeginningThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;
STORE QV_Table INTO File.QVD;
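
The two time variables referenced above are not defined in the snippet itself. A common pattern (a sketch only; the variable names are just the ones used in the example) is to capture the start time at the top of the script and move the watermark forward only after a successful store:

// Beginning of the script: remember when this execution started.
Let BeginningThisExecTime = Now();

// ... the Case 2 SELECT, Concatenate and STORE go here ...

// End of the script: only advance the watermark once the new QVD
// has been stored without errors.
If ScriptErrorCount = 0 then
Let LastExecTime = BeginningThisExecTime;
End If

Script variables persist inside the QVW document, so LastExecTime survives between reloads as long as the document is saved after the reload.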

Case 3: Insert and Update. (No Delete)

The next case is applicable when data in previously loaded records may have changed between script executions. The following conditions apply:

  • The data source can be any database.

  • QlikView loads records inserted into the database or updated in the database after the last script execution.

  • A field ModificationDate (or similar) is required for QlikView to recognize which records are new.

  • A primary key field is required for QlikView to sort out updated records from the QVD file.

  • This solution will force the reading of the QVD file to "standard mode" (rather than "super-fast mode"), which is still considerably faster than loading the entire database.

Script example:

QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT Exists(PrimaryKey);
STORE QV_Table INTO File.QVD;
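
To see why Where Not Exists(PrimaryKey) keeps only the non-superseded rows, here is a small self-contained illustration with made-up inline data (not part of the original example):

// Keys 1 and 2 arrive fresh from the database first...
Updated:
LOAD * INLINE [
PrimaryKey, X
1, new
2, new
];
// ...so when the old QVD copies are concatenated, Exists(PrimaryKey)
// is already true for keys 1 and 2 and those rows are skipped.
// Key 3 was never loaded in this run, so its old row is kept.
Concatenate
LOAD * INLINE [
PrimaryKey, X
1, old
3, old
] Where Not Exists(PrimaryKey);
// Resulting table: (1, new), (2, new), (3, old)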

Case 4: Insert, Update and Delete

The most difficult case to handle is when records are actually deleted from the source database between script executions. The following conditions apply:

  • The data source can be any database.

  • QlikView loads records inserted into the database or updated in the database after the last script execution.

  • QlikView removes records deleted from the database after the last script execution.

  • A field ModificationDate (or similar) is required for QlikView to recognize which records are new.

  • A primary key field is required for QlikView to sort out updated records from the QVD file.

  • This solution will force the reading of the QVD file to "standard mode" (rather than "super-fast mode"), which is still considerably faster than loading the entire database.

Script example:

Let ThisExecTime = Now();

QV_Table:
SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
WHERE ModificationTime >= #$(LastExecTime)#
AND ModificationTime < #$(ThisExecTime)#;
Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
WHERE NOT Exists(PrimaryKey);
Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

If ScriptErrorCount = 0 then
STORE QV_Table INTO File.QVD;
Let LastExecTime = ThisExecTime;
End If

Not applicable
Author

Hi,

When you load data you obviously have some kind of date, say a posted date or uploaded date.

When your script starts, load the data from the QVD and extract the max date from the loaded data.

Then load the data from the source database with a where condition: WHERE posted/uploaded date > $(currentmaxdate).

It will load only the new records; obviously you need a Concatenate.

Reg

Shubhu
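
Shubhu's approach can be sketched like this (a sketch only; the table, field, and file names are placeholders, and it assumes a date field called PostedDate):

// 1. Load the history from the QVD.
Sales:
LOAD * FROM Sales.qvd (qvd);

// 2. Extract the max date already loaded.
MaxDate:
LOAD Max(PostedDate) As MaxPostedDate RESIDENT Sales;
Let vCurrentMaxDate = Peek('MaxPostedDate', 0, 'MaxDate');
DROP TABLE MaxDate;

// 3. Fetch only the newer records and concatenate them.
// (The date literal format in the WHERE clause depends on your database.)
Concatenate (Sales)
SQL SELECT * FROM SourceTable
WHERE PostedDate > '$(vCurrentMaxDate)';

// 4. Store the combined result back to the QVD.
STORE Sales INTO Sales.qvd (qvd);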

Not applicable
Author

Hi All,

Thank you for the detailed responses. Here is what I have now: a file NEW_ONVDWH_DAILY_SALES_LINES.QVD which holds the data up to yesterday, and another file TODAY_ONVDWH_DAILY_SALES_LINES.QVD, which I just created and loaded, and which shows me the data for today only.

Now, when I create a new QVW to show this data, can you please help me write a load statement that joins them together?

Thanks in Advance

peter_turner
Partner - Specialist

Hi Mady,

If your QVD files have the same structure and fields, QlikView will automatically concatenate them into one table when you load them into the QVW.

Or you can just define which table they should be loaded into by using:

Table_Name_Here:
load new qvd etc
load today qvd etc.
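
Using the file names from the question, that could look like this (a sketch; it assumes both QVDs have identical field lists, which is what makes the second load concatenate into the same table):

SalesLines:
LOAD * FROM NEW_ONVDWH_DAILY_SALES_LINES.QVD (qvd);
// Same fields, so QlikView would concatenate this automatically;
// the Concatenate prefix just makes the intent explicit.
Concatenate (SalesLines)
LOAD * FROM TODAY_ONVDWH_DAILY_SALES_LINES.QVD (qvd);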

suniljain
Master

During the load you have to handle insert, update, and delete of data.

Not applicable
Author

That's not always true. You may only need to append. This is a consideration to take into account when architecting how to handle historic data alongside new data.

In general I do not recommend creating daily files; instead I recommend weekly or monthly files, or, if the data allows, yearly files! The reason is performance: if you have a lot of QVDs to load from, you risk making your reload slower than it needs to be because of the disk I/O cost of every QVD file open. To mitigate this I/O as much as possible, have fewer QVDs with more data in each wherever possible.

To take a simple example: a system produces a few thousand lines of transactions every day. If you load this into a daily QVD you will find it takes longer to load a whole year's worth (365 tiny files) than reading 52 (weekly), 12 (monthly) or just 1 file, as the volume is low.
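
For example, with monthly files named by period (the naming convention here is just an assumption), a single wildcard load reads them all in one statement, and fewer, larger files keep the number of file opens down:

// Hypothetical naming: Sales_201101.qvd, Sales_201102.qvd, ...
Sales:
LOAD * FROM Sales_20*.qvd (qvd);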

Hope this helps.

suniljain
Master

It depends on how data movement happens in the ERP.

BI architecture is always built on the movement of data in the ERP. If you don't consider that part, it creates inconsistency in the data.