Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
See why IDC MarketScape names Qlik a 2025 Leader! Read more
cancel
Showing results for 
Search instead for 
Did you mean: 
Crichter141
Creator
Creator

Why I hate Free Text - Text Analysis help please.

QLIKSENSE only please.

So our input platform has a lot of free text and not very many drop downs to standardize the data.  This has become a nightmare for certain dimensions that we need to report on.  

Example:

One Field:  Color  Seems simple enough.  We have over 1 million product color names.  Partly because the users put in various versions of the same name, but there's still a lot of actual individual names.

Example:  MOCHA-IVORY can be MOCHA-IVORY or IVORY-MOCHA or Fabric MOCHA-IVORY, etc.  You name it they can put it in the Color field.  And we have a lot of these types of fields (more than I want to write IF statements for).

Is there a way to analyze the text to look at the color field for all variations of a color (MOCHA-IVORY) and then replace it with a standard value for each one?

Can't be value by value...hoping for a code solution, not an extension.

More Valid Colors:

color1:  MOCHA-IVORY - could be reversed, could be with or without hyphen, could have other text

color2:  IVORY-IVORY

color3:  JAVA

color4:  WHITE

color 5:  ALMOND OVER GOLD

color 6:  ALMOND LACE

color 7:  AMETHYST

Thank you!

Labels (5)
1 Solution

Accepted Solutions
BrunPierre
Partner - Master II
Partner - Master II

Hello, you can replace the values in the source table by creating a table mapping with the standard values.

//----- Load mapping table of Valid Colors -----\\
MapValidColor:
mapping LOAD *
Inline [
ColorCode,Valid Color
color1,MOCHA-IVORY
color2,IVORY-IVORY
color3,JAVA
color4,WHITE
color5,ALMOND OVER GOLD
color6,ALMOND LACE
color7,AMETHYST
];

//----- Load Source table containing various versions of the same name.If the color code isn't listed in the mapping table then display no match-----\\
Data:
LOAD ColorCode,
Color,
ApplyMap('MapValidColor',ColorCode,'No match') as "Valid Color"

FROM SourceTable;

//----- Optional----- \\
DROP Field Color From Data;
RENAME Field "Valid Color" to Color;

View solution in original post

5 Replies
BrunPierre
Partner - Master II
Partner - Master II

Hello, you can replace the values in the source table by creating a table mapping with the standard values.

//----- Load mapping table of Valid Colors -----\\
MapValidColor:
mapping LOAD *
Inline [
ColorCode,Valid Color
color1,MOCHA-IVORY
color2,IVORY-IVORY
color3,JAVA
color4,WHITE
color5,ALMOND OVER GOLD
color6,ALMOND LACE
color7,AMETHYST
];

//----- Load Source table containing various versions of the same name.If the color code isn't listed in the mapping table then display no match-----\\
Data:
LOAD ColorCode,
Color,
ApplyMap('MapValidColor',ColorCode,'No match') as "Valid Color"

FROM SourceTable;

//----- Optional----- \\
DROP Field Color From Data;
RENAME Field "Valid Color" to Color;

hic
Former Employee
Former Employee

A more elaborate explanation of data cleansing using the above method can be found on https://community.qlik.com/t5/Design/Data-Cleansing/ba-p/1464362 

Crichter141
Creator
Creator
Author

This will help immensely, thank you both! 

If I may follow up with a bit more complexity.  What if (in this case) the color is in a field with a string - can mapping loads be used with a parsed out string or wildmatch?  example text.  'The color wanted is off-white, but our closest match is IVORY-IVORY'  Can I pull out just the color to put into a mapping load to standardize it?

Again, thank you!

hic
Former Employee
Former Employee

One way could be the following:
 
1) Define the colors in a mapping table:
 
Colors:
Mapping Load Color, '>>>' & Color & '<<<' as TaggedColor
Inline
[Color
MOCHA-IVORY
IVORY-MOCHA
IVORY-IVORY
JAVA
WHITE
ALMOND OVER GOLD
ALMOND LACE
AMETHYST];
 
2) Then you can use 
   TextBetween(MapSubstring('Colors',Field),'>>>','<<<') as Color
 
The MapSubstring() will insert the delimiters in the right place, and the TextBetween() will extract the color.
 
Alternatively, you can define the RGB colors already in the mapping table:
 
Mapping Load Color, '>>>' & RGB(R,G,B) & '<<<' as TaggedColor
Inline
[Color, R, G, B
MOCHA-IVORY, 100, 50, 50
...

 

Or
MVP
MVP

Careful with this sort of analysis - if you text match (case insensitive), it'll match white before it matches ivory-ivory, since white is also a legal color and is ahead of ivory-ivory in the text...