
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Matching substring in column with another column
Here I have two 3 columns states1, states2 and MYstate -
Whatever states under state1 and state2,
MY State should have the same state then result as Yes or else no
States1 | States2 | MY state | Result |
WA, NY, CA | WA, NY | CA,WA | Yes |
WA, NY, CA | WA, NY | CA | Yes |
WA, NY, CA | WA, NY | No | |
WA, NY, CA | WA, NY | NZ | No |
I am wondering how can I do this with set Analysis .... any thoughts or examples??
Any help is appreciated....

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If(substringcount(States1, [MY state]) or substringcount(States2, [MY state]), 'Yes','No')
talk is cheap, supply exceeds demand

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I think SUBSTRINGCOUNT will only match the entire content of MyState - rather than individual elements e.g. 'CA,WA' in the first record, will not match to States1
You could possibly look into the SUBFIELD function - i haven't used it, but it appears to split a string out according to a delimiter. QlikView String Function cheat sheet
Other than that, the only way i can think is to change your Load Script to create a single line for each element of MyState

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
so I created two functions in the load script:
SET vSplit = IF(LEN(SUBFIELD($1,',',$2))=0,'',SUBFIELD($1,',',$2));
SET vCheck = IF($(vSplit($2,$3))='',0,SUBSTRINGCOUNT($1,$(vSplit($2,$3))));
Then created my calculated expression in the front end as:
IF(($(vCheck(States1,MyState,1)) + $(vCheck(States1,MyState,2)) + $(vCheck(States2,MyState,1)) + $(vCheck(States2,MyState,2)))>0,'Yes','No')
only problem is - it is currently limited to 2 elements in MyState (you can extend it my adding $(vCheck(States1,MyState,3)) to the IF
There might be a way of creating a more complex function which loops through all elements of the MyState - but also something i have never done before!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Andrew,
Thanks but let me tell know here... the state counts may increase to 'n' number... so I can't do that .. let me know if you have any other solution.
Let me give one more example :
MY State | States1 | States2 |
- | ,BC,CA,FL,GA,Guangdong,IL,NJ,NULL,ON,TX | CA |
- | BC,CA,FL,GA,IL,MD,NC,NJ,ON,TX | NC |
CA,TX,VA,FL | CA,FL,VA | VA |
- | AB,AL,AZ,BC,CA,CO,CT,FL,GA,HI,IA,IL ,IN,KS,KY,LA,MA,MD,ME,MI,MN,MO,MS,MT,NC,ND,NE,NJ,NM,NV,NY,OH,OK,ON,OR,PA,SC,TN,TX,UT,VA,VT,WA,WI | TX |
Here let me say my state is a super set .... if states in sate 1 + states2 are matching with MY state then Yes or No
So technically if my states have 10 states and states1 have 3 + states 2 have 3 i.e. union of states1 + states2 is 6 if all those 6 states are matching with my state i.e. 6/10 are matching then yes ...
right now I am deriving this logic from sql it self... but I am wondering if I can do this using set analysis

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I don't think its available in Set Analysis, in the current load format.
The only way I could find to make it work with n states in MyState, was to change the load script to make use of SUBFIELD() to split MyStates into single records e.g.
LOAD
MyState,
SUBFIELD(MyState,',') AS SubState
RESIDENT <MyStateSource>;
You can then use SUM(SUBSTRINGCOUNT(State1,SubState))>0 to determine whether there is a match on State1 and SUM(SUBSTRINGCOUNT(State2,SubState))>0 to determine a match on State2.
LEN(KEEPCHAR(MyState,','))+1 will tell you how many entries are in MyState - SUM(SUBSTRINGCOUNT()) will count how many of those match.
so:
(SUM(SUBSTRINGCOUNT(State1,SubState)) + SUM(SUBSTRINGCOUNT(State2,SubState))) / (LEN(KEEPCHAR(MyState,',')) + 1) would give you the (e.g.) 6/10

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
See attached example
talk is cheap, supply exceeds demand

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
hahaha i'm an idiot.
used SUBSTRINGCOUNT several times, then reverted to LEN(KEEPCHAR()) to work out how many times ',' appears!
