Solved: Re: Whilst is equal… - Qlik Community

HonestToad · ‎2022-03-11

This is an example in MatLab which I am trying to replicate in Qlik. I have lots of random length, random sub-fielded strings that I would like to remove the preceding text that is common to all

% The data

DriftEL = 'ta_tsc_tsc_mcs_hk_atc_gyro_drift_el';

DriftXEL = 'ta_tsc_tsc_mcs_hk_atc_gyro_drift_xel';

DriftLOS = 'ta_tsc_tsc_mcs_hk_atc_gyro_drift_los';

% The algorithm [Strip off first character if shared by all others.]

while isequal(DriftEL(1),DriftXEL(1),DriftLOS(1))

DriftEL(1) = [];

DriftXEL(1) = [];

DriftLOS(1) = [];

end

% Result

DriftEL(1) = el;

DriftXEL(1) = xel;

DriftLOS(1) = los;

I am sure there is more than one methodology to do this in Qlik, I would welcome and pointers

HonestToad · ‎2022-03-17

Having thought about it all weekend and bouncing ideas off a colleague, I have come up with a solution that works. It may not be the most elegant. Here is my solution:

//Sample Data
[Sample]:
LOAD * INLINE
[
Grouping,Fields
Example, ta_tsc_tsc_mcs_hk_atc_gyro_drift_el
Example, ta_tsc_tsc_mcs_hk_atc_gyro_drift_xel
Example, ta_tsc_tsc_mcs_hk_atc_gyro_drift_lo
Album, Now that's what I call Music 7
Album, Now that's what I call Music 8
Album, Now that's what I call Music 9
](delimiter is ',');

//Sub routine to work through each grouping in a table and splitting the common fields from a string
Sub Prefix(Table,Group,Field)

//There are a small number of typos that can impact the splitting - change them all to a space usinging a mapping load
Cleanup:
Mapping Load *
Inline [
AttCode, Attribute
" ", " "
"-", " "
":", " "
];

//Looping through each grouping item in turn
FOR Each v in FieldValueList('$(Group)')

//Create a temp table that just has one single group item present
Temp:
Load Distinct
$(Group) as g,
MapSubString('Cleanup', $(Field)) as f
Resident $(Table)
where $(Group)='$(v)';

//Calculate the number of fields within the Grouping - used to understand when we have a 100% match
LET vRows=NoOfRows('Temp');

//When working through the string character by character, we need to stop when the shortest string length has been consummed
Len:
Load
g,
Min(Len(f)) as l
Resident Temp
Group By g;

//Assign the length of the shortest string to a variable
LET vLen = num(Peek('l',0,'Len'));
Drop Table Len;

//Slowly add an new character from the left until there is no longer a 100% match or the shortest string length as been reached

Set a=1;
Set vCheck=1;

Do while vCheck=1 and a<=($(vLen)+1)

//Return the text value of the shortest string, or the first shortest value if there is multiple values the same length
Short:
Load Distinct
FirstValue(f) as s
Resident Temp
where Len(f)=num($(vLen));

//Assign the shortest text value to a variable
LET vMatch = text(Peek('s',0,'Short'));
Drop table Short;

//With each interation the slowly collect another character from the left
Let vPart = left('$(vMatch)',$(a));

//Count the number of fields that match the current interations left string
Match:
Load
g,
count(f) as p
Resident Temp
where WildMatch(f,'$(vPart)*')>0
Group By g;

//If the left interation count matches the groups row count then we know that there is a 100% match
LET vCheck = if(num(Peek('p',0,'Match'))=$(vRows),1,0);
Drop Table Match;

//Move onto the next character
Let a=a+1;
Loop;

//Due to the way the loop finishes with the first false value, the final counters need to be wound back slightly
Let a = a-2;
Let vPart=left('$(vMatch)',$(a));

//Creating a new output table which has the original field split into two sections
Output:
Load Distinct
$(Group),
left($(Field),$(a)) as $(Field)1,
right($(Field),len($(Field))-$(a)) as $(Field)2
Resident $(Table)
where $(Group)='$(v)';

Drop Table Temp;

//Move onto the next table grouping
NEXT v

//Drop the original data table
Drop Table $(Table);

//Rename the output table to the same name as the original
Rename Table Output to $(Table);

End Sub

Call Prefix('Sample','Grouping','Fields');

View solution in original post

anthonyj · ‎2022-03-14

Hi @HonestToad ,

I'm sorry I haven't used MATLAB and I'm not sure if this is overly simplistic for your requirements, but based on the example you've provided and assuming these are values in a column then you can use this function to return the nth string value based on the underscores as separators.

subfield([Column Name], '_', 9 )

If there are further complexities to your requirements, let us know and I'm sure we can work them out.

Regards

Anthony

HonestToad · ‎2022-03-17