Skip to main content
Announcements
Introducing a new Enhanced File Management feature in Qlik Cloud! GET THE DETAILS!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Avoiding dups in my load

T_BDE_DAILY:
LOAD
[Loan Number]

FROM $(qvd_layer_2_path)T_BDE_DAILY.qvd(qvd)
where [Loss Mit Status Code] = 'A' and [Servicing Sold ID] <> 'NSM120110' and [Referral Code] <> 'CARC' ;  /*produces 3000 records*/

left join(T_BDE_DAILY)
LOAD distinct [Loan Number] ,
[Carrying Value],
[BI Data As Of Date]
FROM  $(qvd_layer_2_path)HDD_MRTG_DATASHARE_RDX.BSLN_MAIN.qvd (qvd
) /*produces about 10,000 records*/

Heres my issue.  Despite invoking distinct LOAN, when I do a left join to the T_BDE_DAILY I am getting duplicate records from the BSLN_MAIN table.  There should only be 3000 records.  How can I modify the BSLN_MAIN to pull in the LOAN_NUMBER, CARRYING_VALUE based on the max BI DATA AS OF DATE only.   For example   if LOAN NUMBER 1111 is listed 5 times I only want the most recent BI DATA AS OF DATE and left join it back to T_BDE_DAILY

1 Solution

Accepted Solutions
sunny_talwar

Here are the answers

1) It does

2) I did mean to put a negative sign in front of the date so that you get the Carrying value associated with the max date. Without the negative sign, you will get Carrying Value associated with the min date.

3) Group By slows down the reload and is resource intensive, but since you are looking to get aggregated data, every possible solution will be resource extensive.

View solution in original post

4 Replies
sunny_talwar

Try this:

T_BDE_DAILY:
LOAD [Loan Number]

FROM $(qvd_layer_2_path)T_BDE_DAILY.qvd(qvd)
Where [Loss Mit Status Code] = 'A' and [Servicing Sold ID] <> 'NSM120110' and [Referral Code] <> 'CARC' ;  /*produces 3000 records*/

Left Join (T_BDE_DAILY)
LOAD [Loan Number] ,
    FirstSortedValue([Carrying Value], -[BI Data As Of Date]) as [Carrying Value],
    Max([BI Data As Of Date]) as [BI Data As Of Date]
FROM  $(qvd_layer_2_path)HDD_MRTG_DATASHARE_RDX.BSLN_MAIN.qvd (qvd)

Group By [Loan Number];

Not applicable
Author

I am running it with the new code now.  1. Does this code scan through the entire BSLN_MAIN table to get the firstsorted and locate the maxdate  2.  Did you mean to put a - before the BI DATA AS OF DATE or is this just a typo.  Are you attempting to show the last most carry value based on the last BI DATA AS OF DATE?  3.Do these types of queries generally take longer to process?

sunny_talwar

Here are the answers

1) It does

2) I did mean to put a negative sign in front of the date so that you get the Carrying value associated with the max date. Without the negative sign, you will get Carrying Value associated with the min date.

3) Group By slows down the reload and is resource intensive, but since you are looking to get aggregated data, every possible solution will be resource extensive.

Not applicable
Author

Thank you very much.  I learned something today.  That piece of code took about an hr or so to get through however the dups issue is gone.  I will save the code and your notes.