Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Talend Cloud AWS EU Scheduled Outage: Starting Tues 26 May 21:00 CEST with expected completion Wed 27 May 01:00 CEST
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Unique IDs with singular related records

Dear QV developers,

I would highly appreciate your advice on the following issue.

Let there is a table with a list of IDs and Names. Due to wrong (misspelled) Names IDs are non-unique. For instance, there are entries like these:

[ID],    [Name]
11,      "Tom"
11,      "TOM"
23,      "Janny"
23,      "Ja_ny"
33,      "John"
43,      "Michael"

The aim is to get a table with unique IDs and singular Names. And it does not matter if grammatically right or wrong name is assigned to a particular ID.

I have found a solution, which deals the issue.

A clip from my code:

[Table1]:
Load * INLINE [
11,      "Tom"
11,      "TOM"
23,      "Janny"
23,      "Ja_ny"
33,      "John"
43,      "Michael"
]
;

[Table2]:
Load
Distinct(ID) AS [Dist_ID]
resident [Table1]
;

For v_i = 0 to (NoOfRows('Table2')-1)
Let vLine = $(v_i);
Let vID = Peek('Dist_ID',vLine,'Table2');
First 1
LOAD [ID]             AS [NewID],
     
[Name]         AS [NewName]
resident [Table1] WHERE ID=$(vID);
Next v_i

However, if the IDs are non-integers:

[ID],    [Name]
aa11,    "Tom"
aa11,    "TOM"
bb23,    "Janny"
bb23,    "Ja_ny"
ba33,    "John"
ab43,    "Michael"

The code does not work. May you, please, advise me on this issue? Perhaps there exists a better solution, than the one I have found?

Thanks in advance!

1 Solution

Accepted Solutions
Bjorn_Wedbratt
Employee
Employee

Hi Benas,

If we assume that you want the first name found for the ID (and don't care if this is the "right" name), then simply use a where exists() statement, something like:

tmp:

Load * INLINE [

ID,    Name

aa11,    Tom

aa11,    TOM

bb23,    Janny

bb23,    Ja_ny

ba33,    John

ab43,    Michael

];

Load

     ID as newID, Name as newName

Resident tmp

where not exists(newID,ID);

Btw, your code works as well if you change

resident [Table1] WHERE ID=$(vID);


to


resident [Table1] WHERE ID='$(vID)';

View solution in original post

5 Replies
datanibbler
Champion
Champion

Hi Benas,

just vary the code very slightly and you'll have it:

Name_IDs_v1:

LOAD ID,
    
B as Name
FROM
Sample_20140903.xlsx
(
ooxml, embedded labels, table is Sheet1);

Name_IDs_v2:
LOAD
   
Distinct ID as Dist_ID,
   
Name as Dist_name
RESIDENT Name_IDs
;
DROP TABLE Name_IDs;

Voila, there is a unique record with every distinct ID and a name attached. Basically your code with just one line added.

Best regards,

DataNibbler

Bjorn_Wedbratt
Employee
Employee

Hi Benas,

If we assume that you want the first name found for the ID (and don't care if this is the "right" name), then simply use a where exists() statement, something like:

tmp:

Load * INLINE [

ID,    Name

aa11,    Tom

aa11,    TOM

bb23,    Janny

bb23,    Ja_ny

ba33,    John

ab43,    Michael

];

Load

     ID as newID, Name as newName

Resident tmp

where not exists(newID,ID);

Btw, your code works as well if you change

resident [Table1] WHERE ID=$(vID);


to


resident [Table1] WHERE ID='$(vID)';

maxgro
MVP
MVP

tmp:

Load * INLINE [

ID,    Name

aa11,    Tom

aa11,    TOM

bb23,    Janny

bb23,    Ja_ny

ba33,    John

ab43,    Michael

];

tmp2:

NoConcatenate load

ID,

MaxString(Name) as Name,

concat(Name, '---') as AllNames

Resident tmp group by ID;

DROP Table tmp;

Not applicable
Author

Björn,

Thanks a lot!

Not applicable
Author

To get the result that I need it still requires where not exists([Dist_ID],ID) condition. Thanks, though!