Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi guys,
I have a problem I have encountered a few times already. I would like to make a left join and at the same time be sure I don't expand the number of rows of the base table. The purpose is to get one fieldA for each join_key in the BaseTable, it does not matter whether it is AA1, AA2 or AA3.
BaseTable | ||
join_key | field1 | field3 |
abc | john | williams |
def | paul | jones |
ghi | david | terry |
JoinTable | |
join_key | fieldA |
abc | AA1 |
abc | AA2 |
abc | AA3 |
So what I would like to do the following join and make sure I will only get one value for fieldA for each join_key.
LEFT JOIN( BaseTable) LOAD * Resident JoinTable;
DROP TABLE JoinTable;
My current approach is to make an aggregation of the JoinTable in order to ensure the join_key is unique. With numbers, I use a min() or max() function. With text fields, I use min(autonumber(join_key & fieldA)) as an aggregation formula.
I think I am making it more complicated than is necessary, so do you know a better way?
Hi,
Since it is one to many mapping it will increase the number of rows. Can you please let me know what exactly should be your output ??
Hi,
In this case it does not matter which field is selected, as long as it is field related to the join_key. So possible output:
table 1 | |||
join_key | field1 | field3 | |
abc | john | williams | AA3 |
def | paul | jones | - |
ghi | david | terry | - |
Try this.
LEFT JOIN( BaseTable)
First 1
LOAD * Resident JoinTable;
DROP TABLE JoinTable;
Thanks for the quick replies.
@Jagan, this is working for this case, but I have a JoinTable with 2,5 million records and multiple join_keys... Do you have another solution?
Hi,
First you need to create unique values against each join key. It looks like you need the latest record against each join key. You should load first join table and use aggregation function (Min,Max) as per your requirement and then join it with transaction table.
Is there any field in you join table which can defined the latest row ??
regards
Vijay
Since it is one to many we can't control i think so..
The best way in my mind to make a Left Join and being sure that the number of records will not change, is to use ApplyMap instead of join. It will generate exactly what you want.
The Applymap function is a type of lookup function that maps one value with another. You must then first load a translation table, where the first column is the From field and the second is the To field. Then you can call this translation table in subsequent tables. A trivial example:
MapCountryToContinent:
Mapping Load * inline
[Country, Continent
Germany, Europe
Sweden, Europe
US, North America
Canada, North America] ;
Data:
Load
Applymap('MapCountryToContinent', Country) as Continent
from
Hi Henric,
Here mapping logic can not work because ther are multiple values for same key.
Vijay
You could try this ...
Left Join (BaseTable)
load join_key, subfield(concat(fieldA,';'),';',count(fieldA)) as vals resident JoinTable group by join_key;
... it groups the Join_key and creates a string of values which you can then optionally select first or last (as my code) value from. I am guessing it may be slow for 2.5million records though.
flipside