Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Loading new rows into one table

Hi,

I am every day reloading table from SAP HR.

Date are without date.

How can i check which new rows were appended?

For example QV load 1000 in 21.11.2017.

Next day QV will reload and in QVD file will be 1500 row.

How can i identify these 500 hundreds new rows?

I was trying to do incremental load but problem is that i do not know which rows are new.

Best Wishes,

Jacek Antek

1 Solution

Accepted Solutions
petter
Partner - Champion III
Partner - Champion III

A common technique to find out which rows are new or changed if you don't have a unique ID or a changed/updated timestamp is to use what you suggest a combination of fields. Furthermore to optimize this - since a combined field can become big - you can do a hash function on the combined fields.  This returns a hash-value that is still unique that is much smaller and it has constant size.

Then determining what has changed or been added is a matter of comparing your current data's hashes with your new data's hashes and those that have no match are rows that have changed or have been updated.

View solution in original post

10 Replies
jonathandienst
Partner - Champion III
Partner - Champion III

Does the field have some sort of unique ID? Or would you be able create one at load time?

If there is no unique identifier or modified date, then it will not be possible to do an incremental load.

Logic will get you from a to b. Imagination will take you everywhere. - A Einstein
Anonymous
Not applicable
Author

Thank you,

No, but i can create Unique ID with combining fields.

Employee_Number and his company shortcut and his department_shortcut create different rows for one employee.

For example:

1001,                ISP,    Department1 ,                  Person1

1001,                ISP,     Department2,                  Person1

1002,                ICL,    Department1,                   Person1

1003,                OGG,  Depertment1                 Person1

I think I can create unique ID combining column 1, second and third.

And if I had this what I should do next ?

Best wishes,

Jacek

Anonymous
Not applicable
Author

Anyone ?

Jacek

petter
Partner - Champion III
Partner - Champion III

A common technique to find out which rows are new or changed if you don't have a unique ID or a changed/updated timestamp is to use what you suggest a combination of fields. Furthermore to optimize this - since a combined field can become big - you can do a hash function on the combined fields.  This returns a hash-value that is still unique that is much smaller and it has constant size.

Then determining what has changed or been added is a matter of comparing your current data's hashes with your new data's hashes and those that have no match are rows that have changed or have been updated.

Anonymous
Not applicable
Author

Thank you Petter,

So if i have the combination of fields further I can use Exists function and search for newest records ?

  Best Wishes,
Jacek

petter
Partner - Champion III
Partner - Champion III

Well - the hash-function will not relate to newest or oldest. It will only uniquely identify a record. You can make a hash out of all the fields/columns of a record and then have a fingerprint (the hash) which can be stored in a QVD. Then you compare during the next reload the hashes from the QVD with the incoming records to see if anything has changed. This represents the worst case scenario if you need to hash all the fields.

Anonymous
Not applicable
Author

Thank you Petter,

unfortunately it has to be the worst possible scenario in my case.

Then you compare during the next reload the hashes from the QVD with the incoming records to see if anything has changed.

So here i can use exissts function or maybe other?

Best Wishes,

Jacek

petter
Partner - Champion III
Partner - Champion III

Yes you can. The hash is stored in a single field and then it is easy to use Exists() function.

petter
Partner - Champion III
Partner - Champion III

Here is the link to the most reliable hash function in the help: Hash256 - script and chart function ‒ QlikView