7 Replies Latest reply: Mar 18, 2016 6:06 AM by Andrew Mein RSS

    Matching substring in column with another column

    Jithu J

      Here I have two 3 columns states1, states2 and MYstate -

       

      Whatever states under state1 and state2,

       

      MY State should have the same state then result as Yes or else no

       

         

      States1States2MY stateResult
      WA, NY, CAWA, NYCA,WAYes
      WA, NY, CAWA, NYCAYes
      WA, NY, CAWA, NY No
      WA, NY, CAWA, NYNZNo

       

      I am wondering how can I do this with set Analysis .... any thoughts or examples??

      Any help is appreciated....

        • Re: Matching substring in column with another column
          Gysbert Wassenaar

          If(substringcount(States1, [MY state]) or substringcount(States2, [MY state]), 'Yes','No')

            • Re: Matching substring in column with another column
              Andrew Mein

              I think SUBSTRINGCOUNT will only match the entire content of MyState - rather than individual elements e.g. 'CA,WA' in the first record, will not match to States1

               

              You could possibly look into the SUBFIELD function - i haven't used it, but it appears to split a string out according to a delimiter. QlikView String Function cheat sheet

               

              Other than that, the only way i can think is to change your Load Script to create a single line for each element of MyState

                • Re: Matching substring in column with another column
                  Andrew Mein

                  so I created two functions in the load script:

                   

                  SET vSplit = IF(LEN(SUBFIELD($1,',',$2))=0,'',SUBFIELD($1,',',$2));

                  SET vCheck = IF($(vSplit($2,$3))='',0,SUBSTRINGCOUNT($1,$(vSplit($2,$3))));

                   

                  Then created my calculated expression in the front end as:

                   

                  IF(($(vCheck(States1,MyState,1)) + $(vCheck(States1,MyState,2)) + $(vCheck(States2,MyState,1)) + $(vCheck(States2,MyState,2)))>0,'Yes','No')

                   

                  only problem is - it is currently limited to 2 elements in MyState (you can extend it my adding $(vCheck(States1,MyState,3)) to the IF

                   

                  There might be a way of creating a more complex function which loops through all elements of the MyState - but also something i have never done before!

                    • Re: Matching substring in column with another column
                      Jithu J

                      Hi Andrew,

                       

                                     Thanks but let me tell know here... the state counts may increase to 'n' number... so I can't do that .. let me know if you have any other solution.

                       

                       

                      Let me give one more example :

                       

                        

                      MY StateStates1States2
                      -,BC,CA,FL,GA,Guangdong,IL,NJ,NULL,ON,TXCA
                      -BC,CA,FL,GA,IL,MD,NC,NJ,ON,TXNC
                      CA,TX,VA,FLCA,FL,VAVA
                      -AB,AL,AZ,BC,CA,CO,CT,FL,GA,HI,IA,IL ,IN,KS,KY,LA,MA,MD,ME,MI,MN,MO,MS,MT,NC,ND,NE,NJ,NM,NV,NY,OH,OK,ON,OR,PA,SC,TN,TX,UT,VA,VT,WA,WITX

                       

                       

                      Here let me say my state is a super set .... if states in sate 1 + states2 are matching with MY state then Yes or No

                      So technically if my states have 10 states and states1 have 3 + states 2 have 3 i.e. union of states1 + states2 is 6 if all those 6 states are matching with my state i.e. 6/10 are matching then yes ...

                       

                      right now I am deriving this logic from sql it self... but I am wondering if I can do this using set analysis

                        • Re: Matching substring in column with another column
                          Andrew Mein

                          I don't think its available in Set Analysis, in the current load format.

                           

                          The only way I could find to make it work with n states in MyState, was to change the load script to make use of SUBFIELD() to split MyStates into single records e.g.

                          LOAD

                               MyState,

                               SUBFIELD(MyState,',') AS SubState

                          RESIDENT <MyStateSource>;

                           

                          You can then use SUM(SUBSTRINGCOUNT(State1,SubState))>0 to determine whether there is a match on State1 and SUM(SUBSTRINGCOUNT(State2,SubState))>0 to determine a match on State2.

                          LEN(KEEPCHAR(MyState,','))+1 will tell you how many entries are in MyState - SUM(SUBSTRINGCOUNT()) will count how many of those match.

                          so:

                          (SUM(SUBSTRINGCOUNT(State1,SubState)) + SUM(SUBSTRINGCOUNT(State2,SubState))) / (LEN(KEEPCHAR(MyState,',')) + 1) would give you the (e.g.) 6/10