Qlik Community

QlikView Documents

QlikView documentation and resources.

Document boards are being consolidated, this board no longer allows NEW documents READ MORE

Find Key Words and Phrases in Free Text Fields

Showing results for 
Search instead for 
Did you mean: 
Luminary Alumni
Luminary Alumni

Find Key Words and Phrases in Free Text Fields


Sometimes data comes in a free text format, customer comments and feedback are a great example. Here we have a wealth of information a company could harvest and dashboard in QlikView although because it’s free text is becomes difficult to summarise.

This code reads each comment field and searches it for a key word or phrase and counts the occurrences.

Key Words / Phrases

Key Words.png

Free Text Comment Field

Free Text Comments.png

Output Table

Output Table.png

Due to the fact its reading each possible word or phrase and reloading the comments each time this process could take a while the more data you have although typically comment type fields are small volumes. I would also use an incremental QVD builder to ensure I’m only transforming new data once.

Happy Qliking

Find more blog posts at QlikCentral.com



Thanks the post. Great as usual.

Luminary Alumni
Luminary Alumni

Thankyou for your kind words Márcio


Thanks for sharing, this will prove very usefull.


Hello Richard,  got interested in the topic of your post and wanted to try an alternate version of this, that perhaps could expand to any set of fed data without requiring programmatic changes..  hope you don't mind my attempt, and apologies for jumping in, but I liked what you had done.


Luminary Alumni
Luminary Alumni

Morning Evan,

I'm all for making the code more automated and like what you have done with the idea!


Not applicable

Hi Richard,

I do something similar using wildmatch, which means I only need to read the data on the first pass

I use it to pattern match text in incident tickets raised to a support desk to identify common issues in various guises!

in you example I could do the following.


load ID,Comments

, if(wildmatch(Comments,'*frustrating*', '*frustrated*', '*frustration*' )>0,1,0) as CE_Negative,
if(wildmatch(Comments,'*efficient*', '*prompt*', '*satisfied*' )>0,1,0) as CE_Positive,
if(wildmatch(Comments,'*friendlier*', '*rude*', '*not pleasant*' )>0,1,0) as Staff_Poor


Load * Inline [
ID ,Comments
1 ,I felt really frustrated about my service
2 ,My experience was not pleasant. I going to switch
3 ,The install was efficient however the staff were rude and not pleasant.


Just a different perspective


Luminary Alumni
Luminary Alumni

Thanks Richard (hope you're well).

I'll confess I haven't used this yet, although I'm looking forward to an opportunityas I can really see these types of techniques adding value quickly (as other options which don't include QlikView are available). This design was developed to answer a community question and how there key words were stored drove the design.

It's good to see these alternative ways, looking at the code I see its ticking a box rather than performing a count - not that that's a bad thing! 🙂

Thanks for sharing


Not applicable

Hi Richard,

You are right in that is sets a flag (or ticks a box)  I should have explained that.  In our application we then use this to select all the "incident tickets" that matched the expression.  We can also produce charts etc. , based on counts.

Our need is slightly different as a match condition is binary, it either matches or it does not, in your example your comment could have both positive and negative conotations.

In one of my applications, I load incident descriptions and timestamp data into table "data" (Actually its a binary load of a larger app)  then I post process that data as.

join (data) load [Incident Number],wildmatch(Description,
'*oradata*filesystem is now*',
'Error(s) written to /var/adm/messages * please contact UNIX team',
'The Qlikview Server service terminated unexpectedly.*',
'Possible performance problem which may impact service please contact Support*',
'monitor Alarm for *',
'*Backup failed with exit code*',
'bad app DB2 deadlocks*',
'Pmon is unable to connect*',
'*filesystem is now*percent full*',
'*background dump is currently in use on Oracle database*',
'*M1*terminated unexpectedly*',
'*java.lang.OutOfMemoryError. Restart*',
'Problem with process *(Process may be down*',
'* has hung threads. Contact *' ,
'Event Description: *Report server could not write to socket*' )
as Pattern_match resident data;

join LOAD * INLINE [
Pattern_match, Pattern Description
0, N/A
1, Oracle filesystem full
2, Var/ADM messages
3, Qlikview service Terminated
4, PSD Performance Alert
5, moniotor Alerts
6, Backup Failures
7, Bad app Deadlocks
8, Patrol Unable to connect
9, Oracle filesystem full
10, Background Dump Issue
11, M1 failure
12, Java OOM
13, Websphere server down or hung
14, Websphere server down or hung
15, Bad app report Server

You can see that there are 2 pattern descriptions for "Oracle Filesystem full" as there are 2 types of alert generated.

I agree with you its a very interesting side to analysis, I would be interested in anyones thoughts on other ways to do this type of analysis.

Thanks for posting an interesting article.  It does prompt a lot  of thought.



Hi I was looking if there was something similar like this for QlikSense.

As I'm looking at how with the use of a few free text field able to detect or find most frequently used Keywords/Phrase (or at least using TF-IDF model) 

Version history
Last update:
‎2014-08-26 04:38 AM
Updated by: