Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
mellerbeck
Creator II
Creator II

Categorizing a Field

I was helping a guy here at work, and he had some data that was similar that he wanted to be categorized together.

i.e.  Big Dog, The Dog, would all have the category of DOG

I did it using an applymap and wildmatch. How would you do it?

I ran across this post http://www.qlikfix.com/2010/10/18/regular-expressions-in-the-load-script/ which is interesting. I do think it would be cool to have native regex in QV!

Category_Map:

MAPPING

LOAD * INLINE [

    x,y

    1,RED

    2,PINK

    3,BLUE

];

Initial_Data:

LOAD * INLINE [

    ID, Category

    123, Red Paint

    223, The Red Paint

    323, Pink

    423, Pink Paint

    523, Blue Paint

];

QUALIFY *;

Transformed_Data:

Load

          ID,

          // Use the applymap to classify the profession, 1000 is the default-value

          // if the applymap does not find a match

          applymap('Category_Map',Wildmatch(Category,'*red*','*pink*','*blue*'),'None') as Category

RESIDENT Initial_Data;

Drop table Initial_Data;

4 Replies
cheenu_janakira
Creator III
Creator III

Hi Michael,

Your approach seems to be one I would use. Below is a variant you might also want to consider, where the "wildmatch string" is actually generated on the fly rather than having to type it out in the applymap/wildmatch.

Check below. I have used this "technique" to allow users to make data enrichment and custom categorisations, as well string searching for information.

Hope this adds on a new layer you can use elsewhere - although my answer to your post is a few years "late" (i.e. I was looking to see if this 'technique' was worth blogging and came across your string).

cheerio,

C  ;o)

//Script start here

WildmatchStrings_Temp:

//2. Create a numerical equivalent for each line/string of data in table:

LOAD *, Rowno() as OrdinalPosition;

//1. Original strings against which you are trying to match:

LOAD * INLINE [

     Clean_Name_as_Category, WildMatchString

     Accenture, accenture

     Hitachi, hitachi

     US Department of Army, u*s*d*p*t*army

];

//Create a field with all strings which you will put in a variable and is sequentially 'ordered' as it is in above table:

Var_WildMatchString:

LOAD CHR(39) & '*' & Concat(DISTINCT WildMatchString, '*' & CHR(39) & ',' & CHR(39) & '*', OrdinalPosition) & '*' & CHR(39) as Var_WildMatchString RESIDENT WildmatchStrings_Temp;

//Insert value in var:

LET vVar_WildMatchString = PEEK('Var_WildMatchString', 0, 'Var_WildMatchString');

DROP TABLE Var_WildMatchString;

//Reload table as Mapping table:

WildMatch_Map:

MAPPING LOAD

     OrdinalPosition,

     Clean_Name_as_Category

RESIDENT WildmatchStrings_Temp;

DROP TABLE WildmatchStrings_Temp;

THE_FACT_TABLE:

LOAD

     *,

     ApplyMap('WildMatch_Map', WildMatchOrdinalPosition) as Clean_Name_as_Category

;

LOAD

     *,

     Wildmatch(AccountNames, $(vVar_WildMatchString)) as WildMatchOrdinalPosition

;

LOAD * INLINE [

    AccountNames

    Accenture Pvt Ltd

    Accenture Private Limited

    Accenture Consulting

    Hitachi Systems Ltd

    Hitachi System Limited

    Hitachi France

    Us Department of Army

    United States Department of Army

    us dept army

];

//end script here

vinieme12
Champion III
Champion III

This can be simplified using MapSubstring, but the only thing with this is that MapSubstring is case sensitive

as below

Category_Map:

MAPPING

LOAD * INLINE [

    y,x

    RED,<RED>

    PINK,<PINK>

    BLUE,<BLUE>

];

Initial_Data:

LOAD * ,textbetween(MapSubString('Category_Map', UPPER(Category)),'<','>') as Description

  INLINE [

    ID, Category

    123, RED Paint

    223, The Red Paint

    323, PINK

    423, Pink Paint

    523, Blue Paint

];

Vineeth Pujari
If a post helps to resolve your issue, please accept it as a Solution.
vinieme12
Champion III
Champion III

btw if you have less categories you don't need a Mapping table just Pick(Wildmatch( will do

Initial_Data:

LOAD * , Pick(Wildmatch(Category,'*red*','*pink*','*blue*')+1,'None','Red','Pink','Blue') as NEWCategory INLINE [

    ID, Category

    123, Red Paint

    223, The Red Paint

    323, Pink

    423, Pink Paint

    523, Blue Paint

];

Vineeth Pujari
If a post helps to resolve your issue, please accept it as a Solution.
cheenu_janakira
Creator III
Creator III

Cheers Vineeth,

Was unaware of the Mapsubstring() function.

Kr,

C  ;o)