Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
yelin_nyu
Creator
Creator

how to not load in 'duplicate' data?

i get a rolling 2 week data history on daily basis, since orders get updated from time to time, so i want to load in the updated 'historical' data. There is a set of unqiue keys based on item ordered, and order date. If someone makes a change to qty, the order date will remain the orignal order date. currently i am using inner joins to itself to find the max(load id)  group by item, order date. It works good, but the problem is that since we are getting more data overtime, and i am not doing incremental load at this point. i want to know ways i can avoid pull in data using perhaps not exist functions.

Thanks

2 Replies
yelin_nyu
Creator
Creator
Author

ok i want to add something.

order id      item      qty      order date     load date

1                   A         5                 1/1                    1/1

2                   C          3                 1/2                    1/5              

3                    B          2                 1/3                    1/4

3                    B          100               1/5                    1/6

so if i do below, only the order w qty 2 get pulled in.

order:

load * where not Exists (key);

LOAD [order id],

     [order date],

     item,

     qty,

     [load date],

     [order id] & item as key

FROM

tbl1

;

i want the final table to look like

1                   A         5                 1/1                    1/1

2                   C          3                 1/2                    1/5              

3                    B          100               1/5                 1/6

stevedark
Partner Ambassador/MVP
Partner Ambassador/MVP

You appear to be on the right track.  You need to use an ORDER BY statement on the initial load to ensure that the row you wish to keep is loaded first.  You need to ensure that your date field is a proper date in order to sort on it correctly.

Hope that helps,

Steve