Anonymous
Not applicable

Concatenate Data to existing QVD File

Hi Team,

Good Afternoon,

I want to load data from an MS SQL database (208 million records) into a QVD file. I am trying to load the data in an incremental way, i.e. if the file doesn't exist, the load should start from the beginning. If data was previously loaded from the database into the QVD file, then only those records which are new in the database should be loaded.

I want to create only one script which does the job for me. I don't want to first load 100 records into a QVD file manually and then load the remaining data in an incremental fashion. All of this should happen in one shot (i.e. one script) when I execute it. The script should be capable of detecting whether data has been pulled previously or not, and start loading accordingly. The QVD file name will be predefined.

I want to know whether it is possible to load such a huge volume of data into a QVD file.

If yes, then how? I would appreciate a sample script that does the job.

I have an int field which is unique, and I am trying to get this done based on that.

DB structure: ID int, FirstName nvarchar(50), LastName nvarchar(50), dtCreatedDate datetime, IsActive bit

My logic is:

Step 1: Check whether the QVD file exists.

Step 2: If not, get the total number of records from the DB and store it in a variable totaldbrowcount, and initialize a variable qvdrowcount as 0.

Step 3: Use a do..loop which first pulls 100,000 records from the database into my QVD file, then increments qvdrowcount by 100,000 and gets the last ID from the QVD file.

Step 4: Check whether qvdrowcount < totaldbrowcount. If yes, pull the next set of 100,000 records where the ID is greater than the last ID from the QVD file and save them to the same QVD file.

Step 5: This continues until the while expression "qvdrowcount < totaldbrowcount" returns false and the loop breaks.

Step 6: In case data was pulled previously into a QVD file, get the record count and the max ID from the QVD file and then follow steps 3 to 5. A rough sketch of this is below.
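
In script terms, what I have in mind is roughly this (MyDSN, dbo.MyTable and MyTable.qvd are just placeholder names, not my real ones):

ODBC CONNECT TO MyDSN;                       // placeholder connection

LET vQvdFile = 'MyTable.qvd';                // predefined QVD file name
LET vLastID = 0;
LET vHaveData = 0;

IF not isnull(filesize('$(vQvdFile)')) THEN
    // QVD already exists: resume behind the highest ID stored so far
    Data:
    LOAD * FROM [$(vQvdFile)] (qvd);
    MaxID:
    LOAD max(ID) as MaxID RESIDENT Data;
    LET vLastID = peek('MaxID', 0, 'MaxID');
    LET vHaveData = 1;
    DROP TABLE MaxID;
END IF

DO
    // NoConcatenate keeps the chunk separate from Data while it is inspected
    Chunk:
    NoConcatenate
    LOAD ID, FirstName, LastName, dtCreatedDate, IsActive;
    SQL SELECT TOP 100000 ID, FirstName, LastName, dtCreatedDate, IsActive
    FROM dbo.MyTable
    WHERE ID > $(vLastID)
    ORDER BY ID;

    // a select that returns no rows creates no table at all
    LET vRows = NoOfRows('Chunk');
    EXIT DO WHEN len('$(vRows)') = 0;

    LET vLastID = peek('ID', -1, 'Chunk');   // highest ID in this chunk

    IF $(vHaveData) = 0 THEN
        RENAME TABLE Chunk TO Data;          // first chunk of a fresh load
        LET vHaveData = 1;
    ELSE
        CONCATENATE (Data) LOAD * RESIDENT Chunk;
        DROP TABLE Chunk;
    END IF
LOOP

STORE Data INTO [$(vQvdFile)] (qvd);         // write once at the end
DROP TABLE Data;

(Here I store once at the end rather than per chunk, since STORE overwrites the whole file each time.)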

Kindly suggest whether this approach is an optimal solution or not. If not, then kindly guide me in achieving this task using QlikView.

Thanks in advance for the help.

Regards,

Krunal. Shaholia

2 Replies
marcus_sommer

I would do it a bit differently, using an initial load from the db and storing the results into a qvd:

[Table]:
select * from db;
store [Table] into [Table].qvd (qvd);
drop tables [Table];

Alternatively, into several qvd's by looping over a time-field, maybe in this way:

for i = 2015 to year(today())

     [Table_$(i)]: select * from db where year(dtCreatedDate) = $(i);

     store [Table_$(i)] into [Table_$(i)].qvd (qvd);
     drop tables [Table_$(i)];

next

but this approach needs additional steps later, because the newly incrementally loaded data must also be stored in such a periodical qvd.
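
For example, at the end of the daily script described below, the current year's qvd could be rebuilt from the in-memory table, something like this (a sketch, assuming all new records fall into the current year):

let vYear = year(today());

// rebuild this year's periodical qvd from the already updated table
[Table_$(vYear)]:
NoConcatenate
load * resident [Table] where year(dtCreatedDate) = $(vYear);

store [Table_$(vYear)] into [Table_$(vYear)].qvd (qvd);
drop tables [Table_$(vYear)];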

After this initial load I would comment it out. Of course you could use filesize() within an if-statement, or maybe ERRORMODE or something similar, to check whether the qvd or any other data exists, but it's not mandatory for a working incremental load approach. And the more conditions you want to check, the more branches you need for the various statements.
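
Such a check could look like this (just a sketch):

// filesize() returns null for a file which doesn't exist
if isnull(filesize('Table.qvd')) then
    trace no qvd found - do the initial full load here;
else
    trace qvd found - do the incremental load here;
end if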

The next step in the daily business is to load the qvd and to get the max. date, for example:

[Table]:
load * from [Table].qvd (qvd);

MaxdtCreatedDate:

load max(dtCreatedDate) as MaxdtCreatedDate;

load fieldvalue('dtCreatedDate', recno()) as dtCreatedDate autogenerate fieldvaluecount('dtCreatedDate');

let vMaxdtCreatedDate = peek('MaxdtCreatedDate', 0, 'MaxdtCreatedDate');

and then loading only the newest data from the db with:

select * from db where dtCreatedDate > '$(vMaxdtCreatedDate)';

which should be auto-concatenated to the table above (same table structure), and then again:

store [Table] into [Table].qvd (qvd);

The most important aspect is to limit the queries against the db, because this will quite probably be the slowest part. Loading optimized from a qvd is very fast; with just a few fields and a modern machine + ssd storage it needs maybe a minute or two.

More optimization would be possible by splitting the timestamp into a date- and a time-field, see: The Importance Of Being Distinct.
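
A sketch of that split (CreatedDate and CreatedTime are just example names):

// fewer distinct values per field mean better compression in the qvd
// and less RAM usage
[Table]:
load *,
     date(floor(dtCreatedDate)) as CreatedDate,
     time(frac(dtCreatedDate)) as CreatedTime;
select ID, FirstName, LastName, dtCreatedDate, IsActive from db;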

- Marcus

Anonymous
Not applicable
Author

Hi Marcus,

Good Morning / Afternoon.

Thank you for all your support. Your code snippet has helped me achieve what I wanted with a sample file of 150K records, where I loop to limit the rows pulled in one shot to 10,000 and exit the loop when the counter is greater than 150K. However, I will now try to pull the 208 million records, check whether I get any errors, and get back to you.

Regards,

Krunal. Shaholia