Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello everybody !
I would like, in one of my project, to join two tables.
I extract from a DB a field (Site_index), which is an index field. We can get from this field the linked values (Country & Region), but this is from a flat file.
I have this model:
Data:
LOAD ....,
Site_index,
.............
SQL SELECT * FROM maDB;
Site_index:
LOAD Site_index,
Country,
Region
FROM [Data/Mapping/Site_index.xls,Sheet is...]
LEFT JOIN(Site_index)
LOAD Site_index,
Country,
Region
RESIDENT Site_index;
DROP TABLE Site_index;
I tried using the JOIN function but i don't achieve what i expected.
I had from the DB 33000rows and, after the left JOIN, about 83000 rows..
Can someone help me please ?
Thanks !!!
Franck,
it's a common problem... There are only 2 possible causes:
To get any clues, load the two tables separately and then count the number of distinct values of the "Common Field" in both tables. Compare the total number of rows in the tables to the number of unique key values. If you keys are, in fact, unique, the numbers should match. If any of the numbers are different, then your keys are not unique.
If you'd like to post your script, we could take a look...
Oleg
Hello Franck,
I don't exactly understand what you want to achieve but I think you just want to add the country and region from the flatfile to the Data table:
Data:
LOAD ....,
Site_index,
.............
SQL SELECT * FROM maDB;
LEFT JOIN (Data)
LOAD Site_index,
Country,
Region
FROM [Data/Mapping/Site_index.xls,Sheet is...]
In your script you are joining the Site_index table with the same table, do you have a reason for that or is it a mistake?
regards Mark
Mark wrote:
Hello Franck,
I don't exactly understand what you want to achieve but I think you just want to add the country and region from the flatfile to the Data table:
<blockquote><pre>
Data:
LOAD ....,
Site_index,
.............
SQL SELECT * FROM maDB;
LEFT JOIN (Data)
LOAD Site_index,
Country,
Region
FROM [Data/Mapping/Site_index.xls,Sheet is...]
In your script you are joining the Site_index table with the same table, do you have a reason for that or is it a mistake?
regards Mark
Yes, Thanks, Mark,
I think i need to admit it's a mistake...
Currently, my issue is the following:
if i only use the db, i've about 30.000 rows. If i use the JOIN, i've got after almost 80.000 rows.
Does anybody know why i got more results using the left join than if i don't ?
It seems to me that :
LEFT JOIN(table1)
LOAD commonField,fieldToAdd1,fieldToAdd2
Resident table2
So, normally, using it, we preserve the number of rows of the first table (if there are maximum one result thru the JOIN)....
Can someone help me ?
Thanks 😉
Franck,
it's a common problem... There are only 2 possible causes:
To get any clues, load the two tables separately and then count the number of distinct values of the "Common Field" in both tables. Compare the total number of rows in the tables to the number of unique key values. If you keys are, in fact, unique, the numbers should match. If any of the numbers are different, then your keys are not unique.
If you'd like to post your script, we could take a look...
Oleg
I think it should be something like this:
if Site_index is your key you should join it with that field.
Data:
LOAD ....,
Site_index,
.............
SQL SELECT * FROM maDB;
LEFT JOIN(Data)
LOAD Site_index,
Country,
Region
FROM [Data/Mapping/Site_index.xls,Sheet is...]
Rey-man
Many and many thanks Oleg, i just reviewed my "mapping" file and there was redundant data....