Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How can I join 2 tables while discarding dupes?

Hello,

I have 2 basic tables which I would like to join. In case of dupes, I would like to keep only 1 occurrence, as illustrated below:

GroupByRegion:
Load * inline
[Region,        GroupName,
France,        North
Argentina,    South
Australia,     South
NewZeland, South
Germany,    North
Argentina,    North                                           /* THIS REGION IS A DUPE, I WOULD LIKE TO IGNORE IT */
];

Facts:
Load * inline
[Nb, Region,
1,  France
2,  Australia
3,  Argentina                                                  
4,  Germany
5,  NewZeland
];

LEFT JOIN
LOAD *
Resident GroupByRegion;


DROP Table GroupByRegion;

==>

I obtain this table (with 2 occurrences for Nb 3 , one with Argentina in Group South, and another one with Argentina in Group North):

   

whereas I would like to have this table (i.e. 1 occurrence for Nb 3 with Argentina in Group South):

It would very much appreciate your help on this. Many thanks. Annick

1 Solution

Accepted Solutions
swuehl
MVP
MVP

Use a MAPPING approach instead:

GroupByRegion:

MAPPING
Load * inline
[Region,        GroupName,
France,        North
Argentina,    South
Australia,     South
NewZeland, South
Germany,    North
Argentina,    North                                           /* THIS REGION IS A DUPE, I WOULD LIKE TO IGNORE IT */
];

Facts:
Load *, ApplyMap('GroupByRegion', Region) as GroupName

inline
[Nb, Region,
1,  France
2,  Australia
3,  Argentina                                                  
4,  Germany
5,  NewZeland
];

View solution in original post

3 Replies
swuehl
MVP
MVP

Use a MAPPING approach instead:

GroupByRegion:

MAPPING
Load * inline
[Region,        GroupName,
France,        North
Argentina,    South
Australia,     South
NewZeland, South
Germany,    North
Argentina,    North                                           /* THIS REGION IS A DUPE, I WOULD LIKE TO IGNORE IT */
];

Facts:
Load *, ApplyMap('GroupByRegion', Region) as GroupName

inline
[Nb, Region,
1,  France
2,  Australia
3,  Argentina                                                  
4,  Germany
5,  NewZeland
];

sunny_talwar

I agree with Stefan, in this case Mapping Load makes much more sense to use than ApplyMap. Look here for more details on the comparison between the two: Don't join - use Applymap instead

swuehl
MVP
MVP

I forgot to mention that only the first occurence (in load order) of your Region in the mapping table will be used for mapping region to group.