Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Good Day,
I am trying to compare contents of one string value in one table against the contents of another string value in a different table. Based on the matched values it must pick the best matched value.
For example TABLE 1 has multiple values (String 1) separated by ','
TABLE 1 | |
Key | String 1 |
0001 | 428,459,5A1,5AC,5AL,5DM,609,650,688,6AC,6AE,6AK,6AM,6AN,6AP,6CP,6NS,710 |
0002 | 428,459,481,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS,715 |
0003 | 428,431,459,481,498,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS |
0004 | 428,432,453,482,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS |
A second table with some string values also...
TABLE 2 | |
String 2 | Value |
428 | 20 |
428,459 | 22 |
428,459,710 | 25 |
428,431,459,710 | 30 |
428,431,493 | 33 |
428,431,459 | 38 |
What I am trying to accomplish is to compare the 2 string fields in some way, I have tried breaking it down using subfield, so there are multiple records and then use the subfield values as key, but it does not return what I want as it links it to all associated string values in table 2. I have tried the use of Match, Wildmatch ect, but probably use it inefficiently.
What must happen is the string value in TABLE 1 must compare all the available options in the string field of TABLE 2 and choose the best matched value in TABLE 2.
Example of the results of above table contents.
RESULT | |||
Key | String 1 | String 2 | Value |
0001 | 428,459,5A1,5AC,5AL,5DM,609,650,688,6AC,6AE,6AK,6AM,6AN,6AP,6CP,6NS,710 | 428,459,710 | 25 |
0002 | 428,459,481,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS,715 | 428,459 | 22 |
0003 | 428,431,459,481,498,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS | 428,431,459 | 38 |
0004 | 428,432,453,482,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS | 428 | 20 |
Hoping that someone can assist or guide me in some direction, or that my requirement at least make some sense. hehe
Thanks and Regards,
Kanon
If I haven understood right, you can do this in script. I have given a quick try, please find below the script.
Table1:
Load * Inline [
Key, "String 1"
0001, "428,459,5A1,5AC,5AL,5DM,609,650,688,6AC,6AE,6AK,6AM,6AN,6AP,6CP,6NS,710"
0002, "428,459,481,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS,715"
0003, "428,431,459,481,498,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
0004, "428,432,453,482,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
];
Join
Table2:
Load *
,SubField([String 2], ',') as SepString2;
Load * Inline [
"String 2", Value
"428", 20
"428,459", 22
"428,459,710", 25
"428,431,459,710", 30
"428,431,493", 33
"428,431,459", 38
];
NoConcatenate
Final:
Load
[String 1],
FirstSortedValue(DISTINCT Key, -len([String 2])) as Key,
FirstSortedValue(DISTINCT [String 2], -len([String 2])) as [String 2],
FirstSortedValue(DISTINCT Value, -len([String 2])) as Value
Where bestMatchFlag = 'PossibleBestMatch' Group By [String 1];
Load
Key,
[String 1],
[String 2],
Value,
If(Count(if(matchFlag=1,1))=SubStringCount([String 2],',')+1, 'PossibleBestMatch', 'NO') as bestMatchFlag
Group By [String 2],[String 1],Key, Value;
Load
Key,
[String 1],
[String 2],
Value,
If(Index([String 1],SepString2)>0,1,0) as matchFlag
Resident Table1;
DROP Table Table1;
Better solution could be there though. Not invested much time in search of one.😋
Howdy...
Tried a few things again, but still not 100% success, will something like this even be possible within Qlik?
I will continue trying to find a solution though... hehe
Good Day,
Thank you so much for your reply...
Unfortunately this is the similar results that I am getting which is not yet correct.
With the results below from your example, the matching in theory is correct, however if you look for example at Key 0004 with the value of "428,493" and then look at TempTable2, there is no string value of "428,493". There is however a "428,431,493", which for this instance would be incorrect as well, as the "431" was not present in Table1.
Therefore due to 431 not being available for that record, looking at TempTable2, the only available match would be the 1st entry... "428" with the value of 20 as all the other entries miss one or more string code.
Best Regards,
Kanon
wouldn't it be possible to add an "if no match found, then ..." logic after the the step we have done so far?
I don't think so, as the table gets broken down with subfield, so there is no relation to match against. It's almost like I have to add additional key to associate each record on TempTable2 before subfield.
Then find the best logical match, if one of the values in TempTable2 row entry is not present in one of the records of Table1, it is not a match, and then look for another logical match in TempTable2, as in the one example row where the only logical match was "428" on its own, even though some of the other entries were found, not all was present for that record against the record in TempTable2.
Hehe, feels so close yet sooooooo...
Good Day,
If this is not possible in load script, would it be possible to do this with set analysis?
Applying some of the logic suggested, this is an example of one of the results.
Only one value must be returned, in the 1st example, the 1st record is the correct one, as the second record has a "823" value in string 2, which is not present in string 1.
The second example even though the value 3AC is present in the 1st record, the 2nd record is more accurate as the value "823" is present in string 1 as well, therefore the 1st record should be ignored.
Regards
If I haven understood right, you can do this in script. I have given a quick try, please find below the script.
Table1:
Load * Inline [
Key, "String 1"
0001, "428,459,5A1,5AC,5AL,5DM,609,650,688,6AC,6AE,6AK,6AM,6AN,6AP,6CP,6NS,710"
0002, "428,459,481,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS,715"
0003, "428,431,459,481,498,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
0004, "428,432,453,482,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
];
Join
Table2:
Load *
,SubField([String 2], ',') as SepString2;
Load * Inline [
"String 2", Value
"428", 20
"428,459", 22
"428,459,710", 25
"428,431,459,710", 30
"428,431,493", 33
"428,431,459", 38
];
NoConcatenate
Final:
Load
[String 1],
FirstSortedValue(DISTINCT Key, -len([String 2])) as Key,
FirstSortedValue(DISTINCT [String 2], -len([String 2])) as [String 2],
FirstSortedValue(DISTINCT Value, -len([String 2])) as Value
Where bestMatchFlag = 'PossibleBestMatch' Group By [String 1];
Load
Key,
[String 1],
[String 2],
Value,
If(Count(if(matchFlag=1,1))=SubStringCount([String 2],',')+1, 'PossibleBestMatch', 'NO') as bestMatchFlag
Group By [String 2],[String 1],Key, Value;
Load
Key,
[String 1],
[String 2],
Value,
If(Index([String 1],SepString2)>0,1,0) as matchFlag
Resident Table1;
DROP Table Table1;
Better solution could be there though. Not invested much time in search of one.😋
Thank you so much, I have so much to learn still. I have tested your code and the results is perfect, I am implementing the code in my application for further testing.
I managed to get it working with below code, please don't laugh, then again you might as well... hehe, was going to ask if doing so many resident loads is optimal, but guess my lack of knowledge is probably not the best way to ask. 🙂
TempTable2:
Load * Inline [
"String 2", Value
"428", 20
"428,459", 22
"428,459,710", 25
"428,431,459,710", 30
"428,431,493", 33
"428,431,459", 38
];
[TABLE]:
load *,
text([String 2]) as String2,
text(SubField([String 2],',')) as StrKey,
substringcount(trim([String 2]), ',') + 1 as Count_Str2
resident TempTable2;
drop field [String 2];
drop table TempTable2;
TempTable1:
Load * Inline [
ID, "String 1"
0001, "428,459,5A1,5AC,5AL,5DM,609,650,688,6AC,6AE,6AK,6AM,6AN,6AP,6CP,6NS,710"
0002, "428,459,481,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS,715"
0003, "428,431,459,481,498,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
0004, "428,432,453,482,493,4AW,4ML,4NE,4U0,4UR,534,548,552,5A1,5AC,6AP,6CP,6NS"
];
left join([TABLE])
Load *,
text([String 1]) as String1,
text(SubField([String 1],',')) as StrKey
Resident [TempTable1];
drop field [String 1];
drop field StrKey;
drop table TempTable1;
[TEMP]:
load *,
if(len(trim([String1]))<>0,text(SubField([String1],','))) AS [String_1],
if(len(trim([String2]))<>0,text(SubField([String2],','))) AS [String_2]
Resident TABLE;
drop table TABLE;
[MATCH]:
load *,
if([String_1] = [String_2],1,0) as MATCHED
resident TEMP order by ID, [String_1];
Drop table TEMP;
NoConcatenate
[Grr]:
load distinct *,
Text(ID & String2) as KEY Resident [MATCH] where MATCHED > 0;
drop table [MATCH];
Right Join
load KEY,
sum(MATCHED) as Please
resident [Grr]
group by KEY;
NoConcatenate
[Dude]:
load distinct * resident Grr where Please = Count_Str2
order by ID, Please desc;
drop table Grr;
[AiAiAi]:
load *,
IF(Rowno() = 1,1,
IF(ID = Previous(ID),0,1)) as FirstValue
resident Dude;
drop table Dude;
NoConcatenate
[ThisIsIt]:
load distinct ID,String1,String2,Value resident [AiAiAi] where FirstValue > 0;
drop table AiAiAi;
Again thanks for your time in helping here, much appreciated.
Kanon