Hello,
Is there a way to identify recurring terms (frequent word sequences) in text fields?
For example, I have 3 phrases, each one in a distinct field:
1 - Qlikview is the best BI Tool in the market
2 - A new BI Tool is about to be launched next year.
3 - Buying one BI Tool is the best solution for your work problems. Maybe next year, ok?
From the example above, it is clear that "BI Tool" appears in all three text fields, and "next year" in two of them. Given this, how can I create a new field holding these word sequences, each with a maximum of 3 words (or prepositions)? Here, the new field would have two values: "BI Tool", with 3 records, and "next year", with 2 records.
The idea is to build a "term cloud", which can show results in better context than a simple word cloud.
Thanks!

MarcoWedel
Hi,
maybe one solution could be something like:
// Sample data: one phrase per record, keyed by ID
tabPhrases:
LOAD RecNo() as ID, *
INLINE [
Phrase
Qlikview is the best BI Tool in the market
A new BI Tool is about to be launched next year.
"Buying one BI Tool is the best solution for your work problems. Maybe next year, ok?"
];

// Preceding-load chain: the three LOADs below are evaluated bottom-up
tabWordTuples:
// Step 3: count the words in each tuple and drop duplicate rows
LOAD Distinct
  *,
  SubStringCount(WordTuple,' ')+1 as WordCount;
// Step 2: strip punctuation and discard empty tuples
LOAD ID,
  WordStart,
  Trim(PurgeChar(WordTuple,'.,?')) as WordTuple
Where Len(Trim(WordTuple));
// Step 1: for each start word, emit the sequences of 1, 2 and 3 words beginning there
LOAD ID,
  Div(IterNo()-1,3)+1 as WordStart,
  Mid(Phrase,Index(' '&Phrase,' ',Div(IterNo()-1,3)+1),Index(' '&Phrase&' ',' ',Div(IterNo()-1,3)+Mod(IterNo()-1,3)+2)-Index(' '&Phrase,' ',Div(IterNo()-1,3)+1)-1) as WordTuple
Resident tabPhrases
While IterNo()<=(SubStringCount(Phrase,' ')+1)*3;
hope this helps
regards
Marco
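
To get from the tuples to the counts asked for in the question ("BI Tool" in 3 records, "next year" in 2), one option is to aggregate tabWordTuples by tuple. This is only a minimal sketch, assuming the tabWordTuples table from the script in this thread has already been loaded. The table name tabTermCounts, the field PhraseCount and both thresholds are illustrative assumptions, not part of the original answer.

```qlik
// Hypothetical follow-up step: count in how many distinct phrases each tuple occurs.
tabTermCounts:
// Keep only tuples shared by at least 2 phrases (the threshold is an assumption).
LOAD * Where PhraseCount >= 2;
LOAD WordTuple,
     Count(DISTINCT ID) as PhraseCount
Resident tabWordTuples
Where WordCount >= 2   // skip single words, which a plain word cloud already covers
Group By WordTuple;
```

Because tabTermCounts shares only the WordTuple field with tabWordTuples, the two tables should associate on that field, so PhraseCount could drive the size of each term in the cloud. The same figure could probably also be computed in the front end with a chart expression such as Count(DISTINCT ID) over the WordTuple dimension, without the extra table.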

sunny_talwar
Maybe a word cloud?

sunny_talwar
For Sense:

No. I've already created a word cloud here.
As I said, I'm looking for a "term cloud" rather than a word cloud (which is based on single words only).
Thanks

rwunderlich
Are you looking to find a set of predefined terms, or to auto-discover the terms? If you auto-discover, you're also going to get hits in your example above like "is the", "Tool is", "the best". You can deal with some of that by purging the common words.
-Rob
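
One possible way to purge the common words is to keep a small stop-word list and drop every tuple that starts or ends with one of them. The sketch below assumes a tabWordTuples table with the fields ID, WordStart, WordTuple and WordCount, as built by the tuple script in this thread. The stop-word list, the table names StopWords and tabTerms, and the "first or last word" rule are all assumptions chosen for illustration.

```qlik
// Hypothetical stop-word list; extend it to suit your own data.
StopWords:
LOAD * INLINE [
StopWord
a
is
the
in
to
be
for
];

// Keep only tuples whose first and last word are not stop words.
tabTerms:
NoConcatenate
LOAD ID, WordStart, WordTuple, WordCount
Resident tabWordTuples
Where Not Exists(StopWord, Lower(SubField(WordTuple, ' ', 1)))
  and Not Exists(StopWord, Lower(SubField(WordTuple, ' ', -1)));

// Drop the helper tables so the identical field sets do not create a synthetic key.
DROP Table tabWordTuples;
DROP Table StopWords;
```

Checking only the first and last word keeps a term such as "state of the art" style phrases possible while still removing hits like "is the" or "Tool is". Rejecting every tuple that contains a stop word anywhere would be a stricter alternative.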
 
					
				
		
MarcoWedel
Please close your thread if your question is answered:
Qlik Community Tip: Marking Replies as Correct or Helpful
thanks
regards
Marco

Hi Marco,
Are you able to help me implement the same method in my QlikView app?
Thanks!

MarcoWedel
I tried to.
See there.
regards
Marco

Keitaru
Hi MarcoWedel,
I've applied your idea in Qlik Sense Enterprise and it worked. However, my bag of words / word tuples also captured date and time stamps. How can I make the script exclude dates, timestamps and month names, as well as stop words?
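
One way this might be approached is to filter the tuples after they are built. The sketch below is not a tested answer, only a heuristic under stated assumptions: it treats any tuple containing a digit as a date or time fragment, and it looks for English month names as whole words. It assumes a tabWordTuples table with the fields ID, WordStart, WordTuple and WordCount from the script earlier in this thread; the table name tabCleanTuples is hypothetical.

```qlik
// Hypothetical clean-up step, run after the tabWordTuples script from this thread.
tabCleanTuples:
NoConcatenate
LOAD ID, WordStart, WordTuple, WordCount
Resident tabWordTuples
Where Len(KeepChar(WordTuple, '0123456789')) = 0   // assumption: any digit marks a date/time fragment
  and WildMatch(' ' & WordTuple & ' ',
        '* january *', '* february *', '* march *', '* april *',
        '* may *', '* june *', '* july *', '* august *',
        '* september *', '* october *', '* november *', '* december *') = 0;

// Drop the unfiltered table so the identical field sets do not create a synthetic key.
DROP Table tabWordTuples;
```

The spaces padded around WordTuple make the month patterns match whole words only, so "Maybe" is not mistaken for "may". The digit rule is deliberately blunt and would also remove terms such as product versions. The "may" pattern will also drop the verb, so that entry may need to go if it matters in your data. Stop words could be handled in the same Where clause by comparing the first and last word of each tuple against a stop-word list.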
