Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi, why are there duplicate rows? I concatenated tables from different sources and 'itemnum' is primary key. The itemnum appears in all the tables and has data for all columns but still duplicates, for example, it has data for all the columns from one table but is null for the other columns in other tables. So there is duplicate like below:
itemnum | site | name | balance | class | code | Group |
---|---|---|---|---|---|---|
1009 | A | D | - | B | - | - |
1009 | - | - | 45 | - | FG | H |
How can I combine everything to one row (distinct itemnum)?
itemnum | site | name | balance | class | code | Group |
---|---|---|---|---|---|---|
1009 | A | D | 45 | B | FG | H |
Thanks.
What's the script to join on primary key? cos all I did was outer join(tablename)
ItemClassification:
LOAD *, text(ITEMNUM) as itemnum;
LOAD ITEMNUM, .
.
.
.(other fields)
.
.
.
FROM
<filename>
(biff, embedded labels, table is Sheet1$);
outer Join(ItemClassification)
LOAD *, ltrim(text(ITEMNUM)) as itemnum;
LOAD ITEMNUM, .
.
.
.
.
.
FROM
<filename>
(biff, embedded labels, table is Sheet1$);
outer Join(ItemClassification)
LOAD *, ltrim(text(ITEMNUM)) as itemnum;
LOAD ITEMNUM,.
.
.
.
.
.
FROM
<filename>
(ooxml, embedded labels, table is (filename));
Outer join (Table), shouldn't be giving u duplicate rows.
Can u paste sample data
Just a guess
Try ltrim on first table as well and let me know if that causes the issue
Is there another way to match the itemnum?
Sure,
if you can't fix your original join, you can use a resident load with a group by:
LOAD
itemnum,
only(site),
only(name),
only(balance),
only(class),
only(code),
only(Group)
RESIDENT YourTable
GROUP BY itemnum;
edit: I think your issue is that primary keys itemnum don't match properly, check the values returned on length, existing white spaces etc. Your data should return numbers for the primary keys, right?
but how do I use this for multiple tables? the fields come from different tables.
check below script
T1:
LOAD * INLINE [
No, site, name, class
1, A, XYZ, D
2, B, ABC, A
3, A, PQR, A
];
Outer join (T1)
LOAD * INLINE [
No, Code, Grp
1, 1a, AA
2, 1b, ABC, DD
3, 2a, PQR, AB
];
I convert to text because some itemnum have letters
I can't, there are 10,000 over records...