Skip to main content
Announcements
Global Transformation Awards! Applications are now open. Submit Entry
cancel
Showing results for 
Search instead for 
Did you mean: 
arixooo123
Creator III
Creator III

Incremental Load on monthly Partitioned QVDs

Hi Guys,

I have created some monthly partitioned QVD files, now I want to create an incremental loading procedure to load the data.

I assume the following should be done but I have difficulties in writing the script for that:

1- select the last qvd file (based on Year-Month)

2- finding Max (Date)

3- selecting rows of data greater than MaxDate

4- storing the new rows in: a) the same qvd file if the date were in the same month  b) storing them in a new file if date was in a new month

My QVD files in below format:

TableName_2017-Sep.qvd

TableName_2016-Jan.qvd

and ....

Many thanks in advance

Ramon

9 Replies
arixooo123
Creator III
Creator III
Author

I wrote this script, but unfortunately, It's not working

// Selecting the last QVD file based on the date in FileName    

vMaxDate = 0;

vFinalFile ='';

For Each vFile in FileList('lib://QlikData/MyTable/*.QVD')

  vDate = Date#(TextBetween(vFile, 'MyTable_', '.QVD'), 'yyyy-MMM');

  vFinalFile = If(vDate > vMaxDate, vFile, vFinalFile);

  vMaxDate = RangeMax(vMaxDate, vDate);

Next

MyTable:  

Load * from [$(vFinalFile)] (qvd);

Set vMaxDate =;

Set vFile =;

Set vFinalFile  =;

    

//Finding the maximum date of Edited or Created columns

ConsDate:

Load RangeMax(max(Created),max(Edited)) as MaxDate

resident MyTable;

Let ConsDate=Peek('MaxDate',0,'ConsDate');

Drop table MyTable;

//Selecting the new data

IncrementalCons:

Load *;

SQL

select * from

DB.dbo.MyTable where Created>$(ConsDate) or Edited>$(ConsDate);

// storing the new data in its related qvd - partitioning

YearData:

LOAD

DISTINCT (Year(Created)&'-'& Month(Created))  AS YrMonth // Formatting the Date

Resident IncrementalCons;

FOR i = 0  to NoOfRows('YearData')-1;

LET vYrMonth = Peek('YrMonth',$(i),'YearData');

NoConcatenate

FinalTable:

LOAD *

Resident IncrementalCons

WHERE (Year(Created)&'-'& Month(Created)) = '$(YrMonth)';

If (vYrMonth=vMaxDate)then

Concatenate

Load * from [$(vFinalFile)](qvd) where not Exists(Id);

STORE FinalTable into '$(vPathStore)/MyTable/MyTable_$(vYrMonth).QVD'(QVD);

else

STORE FinalTable into '$(vPathStore)/MyTable/MyTable_$(vYrMonth).QVD'(QVD);

end if;

DROP Table FinalTable;

next i;

arixooo123
Creator III
Creator III
Author

Any suggestion?

rahulpawarb
Specialist III
Specialist III

Hello ar eiuiew,

Could you please elaborate more on this? What is present output generated by the script?

Observation: Current script is loading only those records which has timestamp greater than present in the QVD repository, it will always update the present month QVD file to append the records.

Please correct me if I am wrong.

Regards!

Rahul

arixooo123
Creator III
Creator III
Author

Hi Rahul

I really appreciate your help

the problem is in writing the slice of data in its related QVD files. The part I wrote down an IF. The selected data could be related to different QVDs (Since data in last months could be edited) but we store it in QVD based on Created data. So, imagine a scenario in which the selected data in the last window is related to the last 3 months QVDs. We should update the first 2 QVDs and create another QVD file for the current month (let's assume the month just changed and now we have a new month)


-----------------------------------------------------------------------------------------------------------

Let me give you an example:

Id          Fact Columns               Edited                            Created

100       xxxxxxxxxxxx            2017-06-01                     2017-05-05

200       xxxxxxxxxxxx            2017-06-01                     2017-04-05

300       xxxxxxxxxxxx            2017-06-01                     2017-03-05

400       xxxxxxxxxxxx                    -                              2017-06-01


The first 3 rows (100,200,300) must be updated into their related QVD files, but the last row( 400) should be added to a new qvd file for month of June. (assuming this is the first time we do incremental loading in June and there is no qvd file created for this month before)

arixooo123
Creator III
Creator III
Author

jontydkpi‌ 

Hi Jonathan

I'll be grateful if you could advise.

arixooo123
Creator III
Creator III
Author

Dear jontydkpi‌  rahulpawarb

Any suggestion?

I am still stuck in this thing.

When storing the selected data in their related QVD files based on Created date, it mess up the whole thing and loop through all qvd files and replace their data with only the selected rows ( but I need to concatenate the data for prior months and for the new month create a new qvd file)

I really appreciate your feedback on this

kaushiknsolanki
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi,

What I feel is it will be too much of code when you want to update the records in respective QVD's.

Instead other way is

You concatenate all QVDS first.

Find the Max Date

Get the new data;

Perform insert/update/delete

Find the numbers of Months in the data.

run the loop for those many times

store the qvds for each month;

This way the manageability of the code and debugging will become easy, with less code.

Regards,

Kaushik Solanki

Please remember to hit the 'Like' button and for helpful answers and resolutions, click on the 'Accept As Solution' button. Cheers!
arixooo123
Creator III
Creator III
Author

Hi Kaushik,

Thanks for your response

The point of partitioning QVDs is actually performance tuning and decreasing the incremental load time. If I concatenate all the qvds and find the maximum there, it will take much more time and consume more memory.

I actually wrote the code, but I guess the part that it stores old month data in their respective QVDs is not working. maybe because of Concatenate or No concatenate clauses. It's all new to me and I am still learning.

Anonymous
Not applicable

hi, if you still want to display the information to user for all the users, dont worry about the memory. Because it will be in memory.

I think Mr. Kaushik has suggested the right approach in this case.

Recently i had also tried the same thing on one of my use case but finally decided for a single QVD.