Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
userid128223
Creator
Creator

Update, Append, Purge, Composite key

I have following data that has source QVD and daily qvd. Source needs to get modified on daily basis via updates from daily.qvd.

  • Uniq key is composed of:  EmpID+EmpName+CreateDate
  • Daily file will have duplicate data where only updateDate will indicate latest recods which needs to be kept.
  • Data that is similar in daily.qvd needs to be purged out based on latest Updatdate.

Daily.qvd

EmpIDEmpNameAmountTerminalNoCreateDateUpdateDate
123Joe$200201/6/20141/6/2014
123Joe$100,000201/6/20141/7/2014
123Joe$500,000201/6/20141/9/2014
123Joe$400401/8/20141/8/2014
235Doe$55201/8/20141/22/2014
235Doe$200,000201/8/20142/1/2014
235Doe$500501/15/20141/15/2014
345Kim$456561/9/20141/9/2014
724Mele$25201/18/20141/18/2014
887Duwane$15301/22/20141/22/2014

Source.QVD

EmpIDEmpNameAmountTerminalNoCreateDateUpdateDate
123Joe$200201/6/20141/6/2014
123Joe$400401/8/20141/8/2014
235Doe$55201/8/20141/22/2014
235Doe$200,000201/8/20142/1/2014
345Kim$456561/9/20141/9/2014

FINAL RESULTS:

EmpIDEmpNameAmountTerminalNoCreateDateUpdateDate
123Joe$500,000201/6/20141/9/2014
123Joe$400401/8/20141/8/2014
235Doe$200,000201/8/20142/1/2014
345Kim$456561/9/20141/9/2014
724Mele$25201/18/20141/18/2014
887Duwane$15301/22/20141/22/2014
1 Solution

Accepted Solutions
Nicole-Smith

This should do the trick:

Final:

LOAD * FROM Daily.qvd (qvd);

CONCATENATE (Final) LOAD * FROM Source.qvd (qvd);

INNER JOIN (Final) LOAD EmpID, CreateDate, max(UpdateDate) as UpdateDate RESIDENT Final GROUP BY EmpID, CreateDate;

Example files also attached.

View solution in original post

20 Replies
Nicole-Smith

This should do the trick:

Final:

LOAD * FROM Daily.qvd (qvd);

CONCATENATE (Final) LOAD * FROM Source.qvd (qvd);

INNER JOIN (Final) LOAD EmpID, CreateDate, max(UpdateDate) as UpdateDate RESIDENT Final GROUP BY EmpID, CreateDate;

Example files also attached.

userid128223
Creator
Creator
Author

Thanks Nicole for quick reply.

2 follow-up questions.

  1. Does the uniq key in this whole scenerio matters, i dont see any thing to compute uniq key or use it in code.
  2. Will this always fetch the new updated records even if they exist in Source.qvd.
  3. Can you please explain how your code works. I was following you up until Concatenate. Please explain below.
    1. INNER JOIN (Final) LOAD EmpID, CreateDate, max(UpdateDate) as UpdateDate RESIDENT Final GROUP BY EmpID, CreateDate;
Nicole-Smith

1.  I don't need to create a key because we're not linking the tables.  We're joining and concatenating on fields that exist in both files and that are named the same, so there is no need to concatenate the three fields together.

2.  It should.

3.  We start by just concatenating all data together (as you said you understand).  Then the inner join is used to keep everything that doesn't have the max(UpdateDate) for each Employee/CreateDate combination.

userid128223
Creator
Creator
Author

What does Group by EmpID, CreateDate do?

userid128223
Creator
Creator
Author

Nicole I tried your solution and it works great  However I still have hard time understanding it. I will try below and you correct me or fill in the gaps.

Final:

LOAD * FROM Daily.qvd (qvd);    // Load all the data from Daily.qvd

CONCATENATE (Final) LOAD * FROM Source.qvd (qvd);   // Load all the data to above tables and append any new found records.

At this point we have this table.

EmpIDEmpNameAmountTerminalNoCreateDateUpdateDate
123Joe$200.00201/6/20141/6/2014
123Joe$400.00401/8/20141/8/2014
123Joe$100,000.00201/6/20141/7/2014
123Joe$500,000.00201/6/20141/9/2014
235Doe$55.00201/8/20141/22/2014
235Doe$500.00501/15/20141/15/2014
235Doe$200,000.00201/8/20142/1/2014
345Kim$456.00561/9/20141/9/2014
724Mele$25.00201/18/20141/18/2014
887Duwane$15.00301/22/20141/22/2014

I am lost after this......

After this INNER JOIN (FINAL)   // how does it look here at this point.

LOAD EmpID, CreateDate, max(UpdateDate) as UpdateDate RESIDENT Final GROUP BY EmpID, CreateDate;

Nicole-Smith

The Group By clause is needed because max() is an aggregation function.

We are loading EmpID, CreateDate, and max(UpdateDate) grouped by EmpID and CreateDate since you want the newest UpdateDate for each EmpId and CreateDate.  By inner joining only the newest UpdateDate, EmpId, and CreateDate to the old table with all the data, it gets rid of the ones that don't have the newest UpdateDate.

userid128223
Creator
Creator
Author

Thanks Nicole.

I need training in basics i guess. where do you go to learn these stuff.

Nicole-Smith

For me, it's been a combination of learning at work when I hit a problem as well as just reading posts on the community.

userid128223
Creator
Creator
Author

For me its more of sql i think then qlikview. i think SQL is 50% of qlikview.