
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Categorizing a Field
I was helping a guy here at work, and he had some data that was similar that he wanted to be categorized together.
i.e. Big Dog, The Dog, would all have the category of DOG
I did it using an applymap and wildmatch. How would you do it?
I ran across this post http://www.qlikfix.com/2010/10/18/regular-expressions-in-the-load-script/ which is interesting. I do think it would be cool to have native regex in QV!
Category_Map:
MAPPING
LOAD * INLINE [
x,y
1,RED
2,PINK
3,BLUE
];
Initial_Data:
LOAD * INLINE [
ID, Category
123, Red Paint
223, The Red Paint
323, Pink
423, Pink Paint
523, Blue Paint
];
QUALIFY *;
Transformed_Data:
Load
ID,
// Use the applymap to classify the profession, 1000 is the default-value
// if the applymap does not find a match
applymap('Category_Map',Wildmatch(Category,'*red*','*pink*','*blue*'),'None') as Category
RESIDENT Initial_Data;
Drop table Initial_Data;

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Michael,
Your approach seems to be one I would use. Below is a variant you might also want to consider, where the "wildmatch string" is actually generated on the fly rather than having to type it out in the applymap/wildmatch.
Check below. I have used this "technique" to allow users to make data enrichment and custom categorisations, as well string searching for information.
Hope this adds on a new layer you can use elsewhere - although my answer to your post is a few years "late" (i.e. I was looking to see if this 'technique' was worth blogging and came across your string).
cheerio,
C ;o)
//Script start here
WildmatchStrings_Temp:
//2. Create a numerical equivalent for each line/string of data in table:
LOAD *, Rowno() as OrdinalPosition;
//1. Original strings against which you are trying to match:
LOAD * INLINE [
Clean_Name_as_Category, WildMatchString
Accenture, accenture
Hitachi, hitachi
US Department of Army, u*s*d*p*t*army
];
//Create a field with all strings which you will put in a variable and is sequentially 'ordered' as it is in above table:
Var_WildMatchString:
LOAD CHR(39) & '*' & Concat(DISTINCT WildMatchString, '*' & CHR(39) & ',' & CHR(39) & '*', OrdinalPosition) & '*' & CHR(39) as Var_WildMatchString RESIDENT WildmatchStrings_Temp;
//Insert value in var:
LET vVar_WildMatchString = PEEK('Var_WildMatchString', 0, 'Var_WildMatchString');
DROP TABLE Var_WildMatchString;
//Reload table as Mapping table:
WildMatch_Map:
MAPPING LOAD
OrdinalPosition,
Clean_Name_as_Category
RESIDENT WildmatchStrings_Temp;
DROP TABLE WildmatchStrings_Temp;
THE_FACT_TABLE:
LOAD
*,
ApplyMap('WildMatch_Map', WildMatchOrdinalPosition) as Clean_Name_as_Category
;
LOAD
*,
Wildmatch(AccountNames, $(vVar_WildMatchString)) as WildMatchOrdinalPosition
;
LOAD * INLINE [
AccountNames
Accenture Pvt Ltd
Accenture Private Limited
Accenture Consulting
Hitachi Systems Ltd
Hitachi System Limited
Hitachi France
Us Department of Army
United States Department of Army
us dept army
];
//end script here


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
This can be simplified using MapSubstring, but the only thing with this is that MapSubstring is case sensitive
as below
Category_Map:
MAPPING
LOAD * INLINE [
y,x
RED,<RED>
PINK,<PINK>
BLUE,<BLUE>
];
Initial_Data:
LOAD * ,textbetween(MapSubString('Category_Map', UPPER(Category)),'<','>') as Description
INLINE [
ID, Category
123, RED Paint
223, The Red Paint
323, PINK
423, Pink Paint
523, Blue Paint
];
If a post helps to resolve your issue, please accept it as a Solution.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
btw if you have less categories you don't need a Mapping table just Pick(Wildmatch( will do
Initial_Data:
LOAD * , Pick(Wildmatch(Category,'*red*','*pink*','*blue*')+1,'None','Red','Pink','Blue') as NEWCategory INLINE [
ID, Category
123, Red Paint
223, The Red Paint
323, Pink
423, Pink Paint
523, Blue Paint
];
If a post helps to resolve your issue, please accept it as a Solution.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Cheers Vineeth,
Was unaware of the Mapsubstring() function.
Kr,
C ;o)
