Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
RC_121985
Contributor III
Contributor III

Creating Identifier or grouping the records that are Similar or Shuffled

Hello Experts,

 

Got held back by strange challenge. I have used strange subject above. But will use simple example to simplify.

FIELD1
A|B|C|D
B|C|D|A
D|C|B|A
A|B
B|A
C
D
D|C
A|C|B|D

 

Looking for the Output as below any one FIELD2 or FIELD3 or FIELD4.

FIELD1 FIELD2 FIELD2 FIELD3
A|B|C|D A|B|C|D 1 ABCD
B|C|D|A A|B|C|D 1 ABCD
D|C|B|A A|B|C|D 1 ABCD
A|B A|B 2 AB
B|A A|B 2 AB
C C 3 C
D D 4 D
D|C D|C 5 DC
A|C|B|D A|B|C|D 1 ABCD

 

Any one of the result is fine. I just want to group similar values that might have shuffled within.

Thanks

Labels (2)
2 Solutions

Accepted Solutions
Sayed_Mannan
Creator II
Creator II

here is my solution,

 

Test:
Load * INLINE [
FIELD1
A|B|C|D
B|C|D|A
D|C|B|A
A|B
B|A
C
D
D|C
A|C|B|D
];


t:
LOAD
FIELD1,
SubField(FIELD1, '|') as SubField
RESIDENT Test;

TempTable:
LOAD
FIELD1,
SubField as SortedSubField
RESIDENT t
ORDER BY FIELD1, SubField;

NoConcatenate
YourTable:
LOAD
FIELD1,
Concat(SortedSubField, '|') as SortedField
RESIDENT TempTable
GROUP BY FIELD1;

DROP TABLES TempTable,t,Test;

 

// Drop Table Test;

EXIT SCRIPT;

 

 

if is works for you, mark this as a solution.

View solution in original post

RC_121985
Contributor III
Contributor III
Author

Thank You @Sayed_Mannan for the solution. During the testing of your query, I found it don't work when there is space in between the text. It works good for single character separated by "|". 

So I tested with below sample.

Test:
Load * INLINE [
FIELD1
A B|C D,
E F|G,
C D,
A B|E F|G|H I|C D,
A B|H I|C D|E F|G,
A B|E F|C D|G|H I,
A B|E F|G|H I|C D,
A B
];

Result was as below;

rvc_121985_0-1717215893631.png

To handle this, i used Distinct in 

NoConcatenate
YourTable:
LOAD
FIELD1,
Concat(Distinct SortedSubField, '|') as SortedField
RESIDENT TempTable
GROUP BY FIELD1;

I see the result is matching to my expectation.

rvc_121985_1-1717216024116.png

Thank you very much for your Solution. I accept this as solution 🙂

View solution in original post

3 Replies
Sayed_Mannan
Creator II
Creator II

here is my solution,

 

Test:
Load * INLINE [
FIELD1
A|B|C|D
B|C|D|A
D|C|B|A
A|B
B|A
C
D
D|C
A|C|B|D
];


t:
LOAD
FIELD1,
SubField(FIELD1, '|') as SubField
RESIDENT Test;

TempTable:
LOAD
FIELD1,
SubField as SortedSubField
RESIDENT t
ORDER BY FIELD1, SubField;

NoConcatenate
YourTable:
LOAD
FIELD1,
Concat(SortedSubField, '|') as SortedField
RESIDENT TempTable
GROUP BY FIELD1;

DROP TABLES TempTable,t,Test;

 

// Drop Table Test;

EXIT SCRIPT;

 

 

if is works for you, mark this as a solution.

Clement15
Partner - Creator III
Partner - Creator III

Hello, this answers your problem


TempTable:
load
RowNo() as IDTEMP,
FIELD1,
SubField(FIELD1,'|',1) as F1,
SubField(FIELD1,'|',2) as F2,
SubField(FIELD1,'|',3) as F3,
SubField(FIELD1,'|',4) as F4
inline [
FIELD1
A|B|C|D
B|C|D|A
D|C|B|A
A|B
B|A
C
D
D|C
A|C|B|D
];
 
 
TransposeTable:
LOAD 
IDTEMP,
    FIELD1,
    F1 as FIELD2
RESIDENT TempTable;
CONCATENATE(TransposeTable)
LOAD 
IDTEMP,
    FIELD1,
    F2 as FIELD2
RESIDENT TempTable;
CONCATENATE(TransposeTable)
LOAD 
IDTEMP,
    FIELD1,
    F3 as FIELD2
RESIDENT TempTable;
CONCATENATE(TransposeTable)
LOAD 
IDTEMP,
    FIELD1,
    F4 as FIELD2
RESIDENT TempTable;
 
 
DROP TABLE TempTable;
 
TEST2:
load *,
AutoNumber(FIELD4) as FIELD3,
Replace(FIELD4,' ','|') as FIELD2;
load
 
    FIELD1,
    RTrim(chr(min(ord(FIELD2),1)) & ' '& chr(min(ord(FIELD2),2)) & ' '& chr(min(ord(FIELD2),3)) & ' '& chr(min(ord(FIELD2),4))) as FIELD4 
resident TransposeTable
group by IDTEMP,FIELD1
;
 
DROP table TransposeTable;
RC_121985
Contributor III
Contributor III
Author

Thank You @Sayed_Mannan for the solution. During the testing of your query, I found it don't work when there is space in between the text. It works good for single character separated by "|". 

So I tested with below sample.

Test:
Load * INLINE [
FIELD1
A B|C D,
E F|G,
C D,
A B|E F|G|H I|C D,
A B|H I|C D|E F|G,
A B|E F|C D|G|H I,
A B|E F|G|H I|C D,
A B
];

Result was as below;

rvc_121985_0-1717215893631.png

To handle this, i used Distinct in 

NoConcatenate
YourTable:
LOAD
FIELD1,
Concat(Distinct SortedSubField, '|') as SortedField
RESIDENT TempTable
GROUP BY FIELD1;

I see the result is matching to my expectation.

rvc_121985_1-1717216024116.png

Thank you very much for your Solution. I accept this as solution 🙂