Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
MalcolmCICWF
Creator III
Creator III

Incrimental Load with an Update Date

We have a bunch of qvds set up with incremental loads at our company. Most of our qvds that load from SQL have a WHERE clause  grabbing any record where the CREATE_DATE is greater or equal to a variable date that is created and stored each reload. It then concatenates any records found with the qvd if a primary key does not exist (I believe this is a normal incremental load).

The issue I am seeing is that some of the SQL tables have an UPDATE_DATE where records can be updated if need be. This is written into some of the qvd loads, but because the primary key from this updated record exists in the qvd already, it does not get added. We are not adding updated records to the qvd, only those with new CREATE_DATE.

Anyone have an incremental script that works with two different dates fields?

Here is the current script (I removed the fields to shorten it):

////////////////////////////////////////////////////////////////////////////////////////

MaxDate:
load * from max_source_record.qvd (qvd);
let maxdate= text(date(peek('temp1',0,MaxDate),'YYYYMMDD'));
drop table pointless;

//////////////////////////////////////////////////////////////////////////////////////////

SOURCE_RECORD:
Load
   fields
;
SQL SELECT 
    fields
FROM dbo.table with (nolock)

/*comment for full reload*/ where (CONVERT(VARCHAR,CREATE_DATE,112) >= '$(maxdate)') or CONVERT(VARCHAR,CHANGE_DATE,112) >= '$(maxdate)';
/*comment for full reload*/ CONCATENATE LOAD * FROM SOURCE_RECORD.qvd (qvd) where not exists(PRIMARY_KEY);
;
STORE * FROM SOURCE_RECORD INTO SOURCE_RECORD.qvd (qvd);

DROP TABLE SOURCE_RECORD;

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

currentmax:
load *;
sql select max(case when CHANGE_DATE > CREATE_DATE then CHANGE_DATE else CREATE_DATE end) as temp1
from dbo.table;
STORE * FROM currentmax INTO max_source_record.qvd (qvd);
DROP TABLE currentmax;

1 Solution

Accepted Solutions
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi Jeremy - the WHERE NOT EXISTS constraint is on the load from the QVD, and it is rows from the old QVD that are not loaded.  When the combined QVD is then written the latest version of updated records is kept, and the previous version is effectively overwritten.

The logic in your original script was largely correct.

View solution in original post

12 Replies
Anonymous
Not applicable

Does each row have a unique key ?

If so then have a look at the section Free Form Data With Modification of Existing Rows of this blog post http://www.quickintelligence.co.uk/qlikview-incremental-load/

The important bit is WHERE NOT EXISTS (IDField)

MalcolmCICWF
Creator III
Creator III
Author

Yes the Primary_key is the unique field for each row, and I agree, the WHERE NOT EXISTS is what is preventing the updated records from being stored because the unique key exists.

I will look at your link, thanks for a starting place, I wasn't sure how to word this question or search for this online.


I read through and though it was a really good article, it still does not solve my problem. I have a unique ID, this is the PRIMARY_KEY, but that record has a CREATE_DATE and an UPDATE_DATE. I have an example below showing what I would have in the qvd already, and what SQL pulls the next day. You can see I now have a 4th record with a new CREATE_DATE that will concatenate to the qvd, but I also have record 1 that has been updated, but when it goes to concatenate, the PRIMARY_KEY exists and will not be updated. (At least how my script has it.

does anyone know of a workaround in this situation that I could adapt?

   

In current QVD on 7/13
PRIMARY_KEYfield changed exampleCREATE_DATEUPDATE_DATE
ABC1NO7/13/2015null
ABC2NO7/13/2015null
ABC3NO7/13/2015null
SQL pulled 7/14 and needs to concatenate into QVD
PRIMARY_KEYfield changed exampleCREATE_DATEUPDATE_DATE
ABC1YES7/13/20157/14/2015
ABC2NO7/13/2015null
ABC3NO7/13/2015null
ABC4NO7/14/2015null
Anonymous
Not applicable

In essence :

  • Load from database if UPDATE_DATE is after last load and if not then load if CREATE_DATE is after last load otherwise discard

  • Concatenate onto table from above by loading qvd from last load using  WHERE NOT EXISTS ( [PrimaryKey] )
    • [You should notice that it is still an optimised load, which is one of the great things about WHERE NOT EXISTS]
stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Jeremy,

What you have there looks like it should work.  As long as you pull the SQL first and then append the data from the QVD you should always have the latest rows.

The only thing I can see is that you reference CHANGE_DATE in your code, but UPDATE_DATE in your example.  Have you got the wrong field?

Other observations; I would get the max date from the SQL data before loading the rest of the data.  As it stands it is possible for rows to be written / updated between the start of the main SQL load and the pull of the max date.  These could fall between the cracks.  Putting the get date first could cause the same row to come from SQL twice, but the WHERE EXISTS would stop duplicates finding their way into the final QVD.

Also, be careful of data types, in both SQL and QlikView.  In the SQL code you are converting dates to VARCHAR before comparing them.  This may not always pull the write values due to string compare not being the same as a date one - you should instead convert the variable (which will be a string) to a date.  It may be that the format of having year, month then day means that this isn't actually an issue.

Hope that helps.

Steve

MalcolmCICWF
Creator III
Creator III
Author

I'm not sure what you mean here

MalcolmCICWF
Creator III
Creator III
Author

I modified the script to read better, so I changed to Update Date.

This does not work because the PRIMARY_KEY on the record which was updated already exists in the QVD, so it will not concatenate, therefore it still has the original created record, not the updated. The updated record does not qualify to concatenate to the qvd because the primary key would already exist.

as far as your observation of the stored max date going before the SQL load, I agree, I did not write any of these. The next time the QVD is reloaded, it should get those that fell between the crack the day before in the load as those records do not exist and they max date is equal to or greater than the CREATE_DATE.

stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi Jeremy,

The primary key that is already in the QVD should not stop the row being brought in from SQL.  If the query is correct then the updated SQL row should be brought into memory.  The concatenate from the existing QVD should then bring in old rows, but not ones that have been updated (i.e. are already in memory).

If the original ABC1 is still there after the merge, then I think it most likely that the SQL statement failed to retrieve the modified record.

You can test for this in a couple of different ways.  One would be to put an EXIT SCRIPT statement after the SQL load, and take a look at what is in the table at that point.  You would need to ensure that you don't store the latest date file if you do this (as the new QVD will not have those rows due to the script being aborted).

Another way you can test is to write the content of the SOURCE_RECORD to a separate QVD in between the SQL load and the QVD concatenate.  This way you can see what the SQL turned up.

Alternatively, you can check your SQL code by issuing it directly against the database using the database IDE.  The best way to grab the code (with the variables inserted) is to go into Debug mode in the load script and step through the code a line at a time.

Hope that helps,
Steve

MalcolmCICWF
Creator III
Creator III
Author

The primary key that is already in the QVD should not stop the row being brought in from SQL. (yep I realize this)  If the query is correct then the updated SQL row should be brought into memory.(Correct)  The concatenate from the existing QVD should then bring in old rows, but not ones that have been updated (i.e. are already in memory).

If the original ABC1 is still there after the merge, then I think it most likely that the SQL statement failed to retrieve the modified record

----------------------------------------------------------------------------------------------------------------------------------------------------------

I think maybe I am just misunderstanding what the WHERE NOT EXISTS is doing. So you are telling me the WHERE NOT EXISTS is excluding the record from the qvd during the concatenation because it exist in the SQL pull? As I understood it the records that are new in the SQL pull are concatenated to the existing qvd. So, if the record (because of the primary key) existed in the qvd the record form the SQL pull is not appended into the qvd. (i.e. SQL code pulls a record that has been updated but it exists in the qvd, it is then not concatenated to the qvd because that primary key exists... I could be wrong).

stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

Hi Jeremy - the WHERE NOT EXISTS constraint is on the load from the QVD, and it is rows from the old QVD that are not loaded.  When the combined QVD is then written the latest version of updated records is kept, and the previous version is effectively overwritten.

The logic in your original script was largely correct.