Not applicable

Incremental Load Problem!

Hi All,

I tried to set up an incremental load using the document at the link below.

http://community.qlik.com/media/p/125837.aspx

But as far as I can tell it is not working. Please let me know what changes I need to make so that it works as expected.

An explanation of the problem is given below:

~~~~~~~~

I tried the incremental load application with my own data and ran into a few issues; it is not working as expected. I have explained the problem in the screenshots below. I hope you can look into it and suggest a solution.

[Two screenshots: images failed to load]

I have attached the document I was working with.

~~~~~~~~

Thanks and Regards,

Rikab Kothari

44 Replies
suniljain
Master

Dear Rikab,

Uncomment the following part and set the date from which you want to update records in the QVD.

MakeDate(2005,01,01) (change the date here)

Regards

Sunil Jain.

Not applicable
Author

Hi Haneesh,

I am not sure where it is loading from. Are you sure this is what we should expect from an incremental load? Please clarify.

Until now I thought an incremental load meant loading only the changed data (inserted, updated, and deleted records). In other words, it would not load records that already exist; it would load only the changed data.

If you say this is the expected behaviour, then:

When I load the data directly from the source it takes only 3 seconds, but when I load it from the QVD it takes 4 seconds. One of the basic purposes of an incremental load is to reduce loading time, yet here it takes longer than loading from the source. Please clarify, as I don't have much experience with incremental loads.

Not applicable
Author

Hi Haneesh,

Can you please recheck with the attached document and let me know where it is loading from: the source or the QVD?

Not applicable
Author

Hi Rikab,

We have to load the data that already exists in the QVD in order to concatenate it with the new data and store the result back into the same QVD.

Also, you mentioned that it takes only 3 sec to load the DB data whereas the QVD takes 4 sec. How many records are retrieved in each load? I guess the QVD has more rows.

Assume a production scenario where the fact table has 2,000,000 rows and around 20,000 records are added every day. If we implement incremental load today, the first run will load all 2,000,000 rows from the DB, which might take 45 min. Tomorrow, when the incremental load runs, it will fetch only the 20,000 new rows from the fact table, which will take less than 5 min, plus around 10 min to load the QVD data, so the entire load will be done within 20 min.

If we do a full reload every day, the load time will be more than 50 min, and it will increase as the volume grows. Hence we use the incremental load approach to save time.

Please note that all the reload times mentioned here are assumptions. Let me know if you are still unclear on any point.
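The estimate above can be worked through as simple arithmetic. The sketch below is plain Python, used only for illustration (it is not QlikView script). The figures are the assumed ones from this post, not measurements, and the function name is hypothetical.

```python
# Assumed figures from the scenario above; none of these are measurements.
FULL_DB_LOAD_MIN = 45   # first run: all 2,000,000 rows from the database
DELTA_DB_LOAD_MIN = 5   # upper bound for ~20,000 new rows from the database
QVD_LOAD_MIN = 10       # reading the previously stored rows back from the QVD


def incremental_reload_minutes(delta_min, qvd_min):
    """An incremental run fetches new rows from the DB, then reads the rest from the QVD."""
    return delta_min + qvd_min


incremental = incremental_reload_minutes(DELTA_DB_LOAD_MIN, QVD_LOAD_MIN)
saved = FULL_DB_LOAD_MIN - incremental
print(f"incremental run: about {incremental} min, saving about {saved} min per reload")
```

Under these assumptions the two reads add up to about 15 min, which leaves some headroom inside the 20 min total for storing the QVD again.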

Regards,

Haneesh

Not applicable
Author

Hi Rikab

My document works at daily granularity. If you want it to work hourly or by the minute, you must format the dates as timestamps. After that you will see that the document works.

Just do these three steps.

1-Make all the date\timestamp formats the same, at minute or hour granularity.

2-Delete the QVD file and reload the document.

3-Add new lines to the Excel file, but don't forget to add the modified date column. Then reload the document again.
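To see why the timestamp format matters, here is a small Python sketch (illustration only, not QlikView script). It assumes the reload keeps rows whose modified time is strictly later than the last reload time; that comparison, the sample rows, and the function name are assumptions, not taken from the attached document.

```python
from datetime import datetime


def changed_since(rows, last_load, truncate_to_day):
    """Return rows modified after the last load, optionally comparing whole days only."""
    def norm(ts):
        if truncate_to_day:
            return ts.replace(hour=0, minute=0, second=0, microsecond=0)
        return ts
    return [r for r in rows if norm(r["modified"]) > norm(last_load)]


# Hypothetical data: the last reload ran at 13:03, and row 2 changed at 13:30 the same day.
last_load = datetime(2010, 8, 13, 13, 3)
rows = [
    {"id": 1, "modified": datetime(2010, 8, 12, 9, 0)},
    {"id": 2, "modified": datetime(2010, 8, 13, 13, 30)},
]

daily = changed_since(rows, last_load, truncate_to_day=True)
by_minute = changed_since(rows, last_load, truncate_to_day=False)
print(len(daily), len(by_minute))  # 0 1
```

With day-level dates the same-day change to row 2 is missed; with full timestamps it is picked up on the next reload.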

I'm in consultancy today, so I won't be available to answer, but at lunch I'll try to answer your other questions.

Not applicable
Author

Hi Rikab,

Yes, it is loaded from the QVD when run the second time. I ran it for the first time and it loaded from Excel. I ran it again and it loaded from the QVD. Please refer to the attached log files. The 'incrementalload.qvw.2010_08_13_13_03_39.log' file was generated for the first run and 'incrementalload.qvw.2010_08_13_13_04_02.log' for the second. You can find the details of the data loaded there.

In this case, since the source is Excel, there might not be a huge difference in reload time. But when the source is a DB on another machine, there will be a huge time difference between the DB load and the QVD load. I hope this clears up your doubt.

-Haneesh

Not applicable
Author

Attaching the next log file

jonathandienst
Partner - Champion III

Rikab

I think your problem may be that you are misreading the script execution dialog. As I see it, on your first run, 65,535 records were loaded from the spreadsheet as there was no QVD file. On the second run, nothing was loaded from the spreadsheet (Sheet1$ 0 lines fetched), and 65,535 lines were loaded from the QVD file (DSC 65,535 lines fetched).

If you change a line in the source data (change one cell in the column stockist_modified_on to a future date), you should see 1 record loaded from the Sheet1$ and 65,534 records loaded from the QVD file (DSC).

This is the correct behaviour for an incremental load. Haneesh's log files should show the same thing, and if you examine each of the log files of your own run from Rob's example, you should see the same. (Don't forget to rename the log file after the first run, as it will be overwritten on the second run.)

Are you perhaps misunderstanding what is meant by incremental load? By incremental load we mean that we reduce the number of records fetched from the original source data by storing records in a QVD after loading. On subsequent loads, we need to load only new/changed records from the source, and we load the unchanged data from the QVD. The same amount of data is loaded, only some of it comes from a QVD file rather than the original data source.
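The logic described above can be sketched in a few lines of Python (illustration only, not QlikView script). The field names `id` and `modified`, the integer "modified" values, and the function name are all hypothetical; in a real script the first filter would normally run in the source query rather than in memory.

```python
def incremental_load(source_rows, stored_rows, last_load):
    """Fetch new/changed rows from the source, then fill in the rest from the stored copy."""
    # 1. Only rows modified after the last load are taken from the original source.
    fresh = [row for row in source_rows if row["modified"] > last_load]
    fresh_keys = {row["id"] for row in fresh}
    # 2. Unchanged rows come from the stored copy (the QVD), skipping keys already fetched.
    kept = [row for row in stored_rows if row["id"] not in fresh_keys]
    # 3. The merged result is what would be stored back for the next run.
    return fresh, kept, fresh + kept


# 65,535 stored rows; one row is then changed in the source after the last load.
stored = [{"id": i, "modified": 1} for i in range(65535)]
source = [dict(row) for row in stored]
source[0]["modified"] = 2

fresh, kept, merged = incremental_load(source, stored, last_load=1)
print(len(fresh), len(kept), len(merged))  # 1 65534 65535
```

The total is still 65,535 rows; only the split between the source and the stored copy changes.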

Hope this helps

Jonathan

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
Not applicable
Author


Sunil Jain wrote:
I checked your application. It is extracting the full data from the QVD for comparison with the CSV.
The incremental load checks against the full data of the QVD because there is no limit on backdated updates. The same happened in my case when I developed an incremental load application with SAP R/3.
One solution: if your data is fixed for a particular period, you can make a separate QVD for that period and compare the key field for the limited period only.


Many thanks for your reply! I cannot make a separate QVD for a particular period, as the data is not fixed.

By the way, please let me know which method is more reliable, Rob's method or this one, and why.

suniljain
Master

Rob's method is faster because it works on the basis of indexing on the primary key.

It takes less time to update and insert records in an existing QVD. I tested this logic on 6 crore (60 million) records and am happy with the performance of Rob's application.

Regards

Sunil Jain