Qlik Community

QlikView App Development

Discussion Board for collaboration related to QlikView App Development.

Announcements
Leverage your QlikView investment to modernize BI – see how! Join Group
cancel
Showing results for 
Search instead for 
Did you mean: 
Highlighted
Specialist
Specialist

Incremental load: how to avoid duplicate rows?

What is the surefire way of avoiding duplicate rows when an incremental load refreshes more than once a day? I am currently using this algorithm but noticed it's not always working:

If qvdcreatetime('QVDlocation') Then

Table1:

LOAD *;

FROM [QVDlocation] qvd;

else

Table1:

LOAD

Autonumberhash25(Order_ID,DATETIME) as primarykey,

Date,

Order,

etc.

FROM SQL statement

end if

Concatenate (Table1)

LOAD DISTINCT

Autonumberhash256(Order_ID,DATETIME) as primarykey,

Date,

Order,

etc.

FROM SQL statement WHERE DATETIME>sysdate

Here, DATETIME is an Oracle datatype "Date with timestamp", in the format "DDMMYYYY HH:MM:SS"

Shouldn't the autonumberhash256 work to prevent duplicate rows?

3 Replies
MVP & Luminary
MVP & Luminary

Re: Incremental load: how to avoid duplicate rows?

autonumberhash256 merely creates a hash value. It does not check for existing values. For that you need a where clause.


concatenate(Table1)

LOAD Autonumberhash256(Order_ID,DATETIME) as primarykey, ...etc...

WHERE Not Exists(primarykey,Autonumberhash256(Order_ID,DATETIME));

SELECT .... FROM ... WHERE DATETIME>sysdate;


If QVDlocation does not exist then records are loaded from an sql source. The next statement will concatenate records to that. If those records come from the same sql source then you'll likely be loading some records twice.


talk is cheap, supply exceeds demand
Highlighted
Specialist
Specialist

Re: Incremental load: how to avoid duplicate rows?

Thanks Gysbert, you mentioned:

If those records come from the same sql source then you'll likely be loading some records twice.


For my case, the same SQL source is being used. Are you saying some records will load twice even with the WHERE Not Exists(primarykey) clause?


EDIT: I'm also getting an SQL error saying "Field not found: primarykey"

Highlighted
Specialist
Specialist

Re: Incremental load: how to avoid duplicate rows?

Just tested this again and can see that if I use some other field, the SQL error doesn't happen, e.g. WHERE NOT EXISTS (ORDER_ID).

But using primarykey where primarykey = autonumberhash256(ORDER_ID,DATE_TIME) still causes the SQL error "field not found: primary key". Any ideas?