Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello Community surprisingly cant find an answer any where for this puzzle.
I am trying to concatenate some text that has duplicates which i need to remove, but i also need the concatenate text sorted in a specific order.
1. concat( DISTINCT text) - eliminates the duplicates but defaults sort the text value
2. Concat( DISTINCT text, delimiter, sort_weight) - Sorts correctly but does not keep DISTINCT text values.
Below are the two examples
#1
COMMENTTEMP:
LOAD * INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
];
COMMENT:
Load DISTINCT
CUSTOMER_NUMBER,
Concat( DISTINCT COMMENT.Comment,'| ') AS COMMENT.Comment
Resident COMMENTTEMP
Group by CUSTOMER_NUMBER
;
#1 OUTPUT - Duplicates removed but incorrectly ordered
CUSTOMER_NUMBER | COMMENT.Comment |
A1 | a| b| c |
B2 | a| b |
B3 | a| b| c| d |
#2
COMMENTTEMP:
LOAD * INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
];
COMMENT:
Load DISTINCT
CUSTOMER_NUMBER,
Concat( DISTINCT COMMENT.Comment,'| ',[COMMENT.Line_No_]) AS COMMENT.Comment
Resident COMMENTTEMP
Group by CUSTOMER_NUMBER;
#2 Output - Correctly ordered but duplicates remain
CUSTOMER_NUMBER | COMMENT.Comment |
A1 | a| c| b |
B2 | b| b| a |
B3 | a| c| b| c| d |
Required Output
CUSTOMER_NUMBER | COMMENT.Comment |
A1 | a| c| b |
B2 | b| a (the duplicate "b" removed) |
B3 | a| c| b| d (the duplicate "c" removed) |
Thanks for your input and feedback
Hi, you can try removing the duplicates before the concat:
COMMENTTEMP:
LOAD *, CUSTOMER_NUMBER &'_'& COMMENT.Comment as chkKey INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
]
Where not Exists('chkKey', CUSTOMER_NUMBER &'_'& COMMENT.Comment);
AutoNumberHash128(CUSTOMER_NUMBER,COMMENT.Comment) maybe gives better performance than CUSTOMER_NUMBER &'_'& COMMENT.Comment
Hi, you can try removing the duplicates before the concat:
COMMENTTEMP:
LOAD *, CUSTOMER_NUMBER &'_'& COMMENT.Comment as chkKey INLINE [
CUSTOMER_NUMBER, COMMENT.Comment, COMMENT.Line_No_
A1, a, 1
A1, c, 2
A1, b, 3
B2, b, 1
B2, b, 2
B2, a, 3
B3, a, 1
B3, c, 2
B3, b, 3
B3, c, 4
B3, d, 5
]
Where not Exists('chkKey', CUSTOMER_NUMBER &'_'& COMMENT.Comment);
AutoNumberHash128(CUSTOMER_NUMBER,COMMENT.Comment) maybe gives better performance than CUSTOMER_NUMBER &'_'& COMMENT.Comment
Nice one. Thanks for that. Works as needed.