Hi,
We are loading data from multiple sources. How can we avoid duplicates while loading the raw data?
Below is the sample data:
ID Name
0111 IBM
0111 VIBM
0111 IBM CORPORATION
0422 V-SS&C TECHNOLOGIES INC
0422 V-SSC TECHNOLOGIES INC
0171 V-V-RECORDS MANAGEMENT INC
3674 V-EMC CORPORATION
3199 V- AMERICA INTERNATIONAL CORPORATION
3789 V-AIRTEL INC
3789 V-V-AIRTEL INC
How would you decide which Name to load? For example, when ID = 0111, which Name should be loaded: IBM, VIBM, or IBM CORPORATION?
Have a look at
In your case, create a table with the customer-agreed Name for each ID, then map the values (e.g. with a MAPPING table) when loading your source data.
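A minimal sketch of that mapping approach, assuming the agreed names are kept in a small table. Only the 0111 -> IBM row comes from this thread; the table name NameMap is made up for illustration, and Source is just a placeholder for the real data source:

```
// Mapping table: one agreed Name per ID.
// Only the 0111 -> IBM row is taken from this thread; add the other agreed rows here.
NameMap:
MAPPING LOAD
     Text(ID) AS ID,      // keep the leading zeros as text
     Name
INLINE [
ID, Name
0111, IBM
];

Data:
LOAD ID,
     // Use the agreed Name; fall back to the source Name if the ID is not mapped
     ApplyMap('NameMap', Text(ID), Name) AS Name
FROM Source;              // placeholder for the real source
```

IDs without an entry in the mapping table keep their original Name, so the mapping can be filled in step by step.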
I have 13,656 rows of data. Do I have to create this table manually?
In this case, IBM is the root Name that should be loaded.
We need a consistent rule. If it is always going to be the first Name loaded for each ID, then you can use the FirstValue() function. Something like this:
LOAD ID,
FirstValue(Name) as Name
FROM Source
Group By ID;
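FirstValue() picks the first value in load order, so the result depends on which source is loaded first. A rough sketch of one way to control that, assuming there are two sources and one of them should win. LeadingSource and OtherSource are hypothetical placeholders, not names from this thread:

```
// Load the preferred source first so its Name is the first one seen per ID.
Raw:
LOAD ID, Name FROM LeadingSource;    // hypothetical placeholder
CONCATENATE (Raw)
LOAD ID, Name FROM OtherSource;      // hypothetical placeholder

// NoConcatenate stops this table from being merged into Raw,
// which would otherwise happen because both have the same fields.
Names:
NOCONCATENATE
LOAD ID,
     FirstValue(Name) AS Name
RESIDENT Raw
GROUP BY ID;

DROP TABLE Raw;
```

Whether this gives the right Name still depends on the preferred source actually holding the agreed value for every ID.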
Kishore Kumar Karakavalasa wrote:
I have 13,656 rows of data. Do I have to create this table manually?
Hopefully, there is a table with the agreed Name for each ID somewhere in your systems (call it a master dimension table or the leading data table for that dimension), so you can just load this table as a dimension table in QV.
Or, to keep it simple: decide which one of your sources is "leading". That will probably be the data source that:
Solutions to merge master data tables from different systems rely either on
Peter