Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Arturo
Contributor
Contributor

Script concat distinct returns duplicate values when using sort by

Hello Community surprisingly cant find an answer any where for this puzzle. 

I am trying to concatenate some text that has duplicates which i need to remove, but i also need the concatenate text sorted in a specific order. 

1. concat( DISTINCT text) - eliminates the duplicates but defaults sort the text value 

2. Concat( DISTINCT text, delimiter, sort_weight) - Sorts correctly but does not keep DISTINCT text values.

Below are the two examples 

#1

COMMENTTEMP:
LOAD * INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
];

COMMENT:
Load DISTINCT
CUSTOMER_NUMBER,
Concat( DISTINCT COMMENT.Comment,'| ') AS COMMENT.Comment

Resident COMMENTTEMP
Group by CUSTOMER_NUMBER
;

#1 OUTPUT - Duplicates removed but incorrectly ordered

CUSTOMER_NUMBERCOMMENT.Comment
A1a| b| c
B2a| b
B3a| b| c| d

 

#2

COMMENTTEMP:
LOAD * INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
];

COMMENT:
Load DISTINCT
CUSTOMER_NUMBER,
Concat( DISTINCT COMMENT.Comment,'| ',[COMMENT.Line_No_]) AS COMMENT.Comment

Resident COMMENTTEMP
Group by CUSTOMER_NUMBER;

#2 Output - Correctly ordered but duplicates remain

CUSTOMER_NUMBERCOMMENT.Comment
A1a| c| b
B2b| b| a 
B3a| c| b| c| d

 

Required Output

CUSTOMER_NUMBERCOMMENT.Comment
A1a| c| b
B2b| a (the duplicate "b" removed)
B3a| c| b| d (the duplicate "c" removed)

 

Thanks for your input and feedback

1 Solution

Accepted Solutions
rubenmarin

Hi, you can try removing the duplicates before the concat:

COMMENTTEMP:
LOAD *, CUSTOMER_NUMBER &'_'& COMMENT.Comment as chkKey INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
]
Where not Exists('chkKey', CUSTOMER_NUMBER &'_'& COMMENT.Comment);

 

AutoNumberHash128(CUSTOMER_NUMBER,COMMENT.Comment) maybe gives better performance than CUSTOMER_NUMBER &'_'& COMMENT.Comment

View solution in original post

2 Replies
rubenmarin

Hi, you can try removing the duplicates before the concat:

COMMENTTEMP:
LOAD *, CUSTOMER_NUMBER &'_'& COMMENT.Comment as chkKey INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
]
Where not Exists('chkKey', CUSTOMER_NUMBER &'_'& COMMENT.Comment);

 

AutoNumberHash128(CUSTOMER_NUMBER,COMMENT.Comment) maybe gives better performance than CUSTOMER_NUMBER &'_'& COMMENT.Comment

Arturo
Contributor
Contributor
Author

Nice one. Thanks for that. Works as needed.