Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi
I have the following
FINALCAC:
LOAD DISTINCT
CAC_CLIENTNUMBER,
CAC_CURRENCYCODE,
CAC_SEQUENCENO,
CAC_INVOICECODE,
CAC_COVERJOIN,
CAC_INVOICEDATE,
CAC_TOTALDUE,
JNL_SEQUENCENUMBER,
DebtorType,
Source,
RCP_BATCHNUMBER,
TLG_RUNNUMBER
RESIDENT CAC;
Through the wonderful design of 2 of the tables I have previously left joined additional information I have duplicate rows even when I load distinct.
A combination of CAC_INVOICECODE and CAC_SEQUENCENO are 100% unique. The information that I left joined into this table being TLG_RUNNUMBER creates duplicate rows and being blunt I don't care which line(s) disappear.
Thanks in advance for any help you can offer.
Rob
Not quite clear to me, what you are intending to do.
Are you able to post an example?
Peter
Hello,
- joining in QV means to merge two table together into one table. As I understood your question, you joined before the distinct load you are showing above. This it is to late for your distinct because the it includes every field of the load statement and as you said: "A combination of CAC_INVOICECODE and CAC_SEQUENCENO are 100% unique".
- so take a look at the left join itself: I assume that the two tables got a keyfield with a common name. Producing duplicates rows with a left join then is a strong sign that your second (right) table has duplicate values in your keyfield. This is the point to look at.
Regards, Roland
Hi Roland,
You are on the right track with your logic, my problem is that they left joined tables "CAC" and "JNL" do not have a unique join nor for want of trying can I make one.
I have duplicates and I just want to get rid of them and from what you are saying LOAD DISTINCT is the wrong path to go down.
Do I have any other options for cleansing my data, I am thinking along the lines of MAX and GROUP BY?
Thanks
Rob
If I understand correctly then then TLG_RUNNUMBER in the load prevents the distinct doing its thing in the load you show. You therefore need to revisit the data before doing the left join and do the distinct there. The left join must be matching on some columns?
To answer your question directly though you could create an autonumber hash on the two columns that define uniqueness and then use it, something like:
FINALCAC:
Load ...
autonumberhash128(CAC_INVOICECODE & '-' & CAC_SEQUENCENO) as hashkey
resident CAC
where not exists(autonumberhash128(CAC_INVOICECODE & '-' & CAC_SEQUENCENO));
drop field hashkey;
which means only the first row with this unique combination would be loaded.
Regards,
Gordon
Regards,
Gordon
Hello Rob,
I understand your approach. But in my experience it is always a better praxis to solve a problem where it is generated. Since I do not know your data I can't tell you exactly the correct load-statements. But I think you should look deeply into your joining. Try to use an inner join or inner keep to load your data into this one table.
RR
Hi Roland,
First of all thanks for your time much appreciated,
I'll rewind a little and try and give you a little more tools and hopefully help you understand our clunky database.
CACJNL:
LOAD
CAC_CLIENTNUMBER,
CAC_CURRENCYCODE,
CAC_SEQUENCENO,
CAC_TRANIDENT&CAC_INVOICENO as CAC_INVOICECODE,
CAC_TRANIDENT&CAC_INVOICENO&CAC_TOTALDUE&CAC_INVOICEDATE as CAC_INVOICECODEAMT,
CAC_COVERNUMBER&'/'&CAC_VERNO as CAC_COVERJOIN,
CAC_INVOICEDATE,
CAC_TOTALDUE,
'CLT' as DebtorType,
'JNL' as Source
FROM
(qvd)
LEFT JOIN (CACJNL)
LOAD
JNL_TRANIDENT&JNL_INVOICENO&JNL_AMOUNT&JNL_DATEDONE as CAC_INVOICECODEAMT,
JNL_SEQUENCENUMBER,
JNL_RUNNUMBER as xJNL_RUNNUMBER
Resident JNL1;
Hi Rob,
so you did some heroisch things to gernerate a appropriate key for both tables. Nevertheless you are not responsible for the design of your source database.
To get rid of your duplicate rows there are now two possibilities: Let's assume there are still duplicates in your resident table JNL1. Then use the distinct within join - load like "left join (CAJNL) load distinct JNL.... resident JNL1". If you have it in your table CACJNL use similar "load distinct .. from ... (qvd);". If you still have after that duplicates the only way is to create a temp. unique number (a rowid like in oracle is not availible in QV). See therefor at the post above from Gordon.
To give you a view behind left/right keep and join I attached a small sample app. Start playing around with different loads while sliding the tabs.
RR
Hi Rob,
sorry now with the correct application.
RR