Qlik Community

QlikView App Development

How to identify recurrent terms in text fields?

Hello,

Is there a way to identify recurrent terms (frequent word sequences) in text fields?

For example, I have 3 phrases, each one in a separate field:

1 - Qlikview is the best BI Tool in the market

2 - A new BI Tool is about to be launched next year.

3 - Buying one BI Tool is the best solution for your work problems. Maybe next year, ok?

From the above example, it is clear that "BI Tool" appears in all three text fields, and "next year" in two of them. Given this, how can I create a new field that holds these specific word sequences, each with a maximum of 3 words or prepositions? Here, the new field would have two values: "BI Tool", with 3 records, and "next year", with 2 records.

The idea is to create a "term cloud", which can show results in better context than a simple word cloud.

Tks!

Accepted Solutions

Re: How to identify recurrent terms in text fields?

Hi,

maybe one solution could be something like:

[Three screenshots attached: QlikCommunity_Thread_253797_Pic1.JPG, QlikCommunity_Thread_253797_Pic2.JPG, QlikCommunity_Thread_253797_Pic3.JPG]

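// Sample phrases; RecNo() gives each phrase a numeric ID
tabPhrases:
LOAD RecNo() as ID, *
INLINE [
    Phrase
    Qlikview is the best BI Tool in the market
    A new BI Tool is about to be launched next year.
    "Buying one BI Tool is the best solution for your work problems. Maybe next year, ok?"
];

// The three stacked LOADs below form a preceding load and are evaluated bottom-up
tabWordTuples:
// Step 3: count the words in each tuple and drop duplicate rows
LOAD Distinct
     *,
     SubStringCount(WordTuple,' ')+1 as WordCount;
// Step 2: strip punctuation and discard empty tuples
LOAD ID,
     WordStart,
     Trim(PurgeChar(WordTuple,'.,?')) as WordTuple
Where Len(Trim(WordTuple));
// Step 1: for every word position, cut out the 1-, 2- and 3-word sequences starting there
LOAD ID,
     Div(IterNo()-1,3)+1 as WordStart,
     Mid(Phrase,Index(' '&Phrase,' ',Div(IterNo()-1,3)+1),Index(' '&Phrase&'  ',' ',Div(IterNo()-1,3)+Mod(IterNo()-1,3)+2)-Index(' '&Phrase,' ',Div(IterNo()-1,3)+1)-1) as WordTuple
Resident tabPhrases
While IterNo()<=(SubStringCount(Phrase,' ')+1)*3;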
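
To check what the load script produces, here is a minimal sketch of the same idea in Python rather than Qlik script (chosen only because it is easy to run and test outside Qlik). The function names `word_tuples` and `phrase_counts` are illustrative and not part of the script. Unlike the script, the sketch strips `.,?` before splitting each phrase, which should give the same tuples for this sample data.

```python
from collections import Counter

PHRASES = [
    "Qlikview is the best BI Tool in the market",
    "A new BI Tool is about to be launched next year.",
    "Buying one BI Tool is the best solution for your work problems. Maybe next year, ok?",
]


def word_tuples(phrase, max_words=3, purge=".,?"):
    """Return the set of 1..max_words word sequences found in one phrase."""
    # Remove the same punctuation characters the script passes to PurgeChar()
    cleaned = phrase.translate(str.maketrans("", "", purge))
    words = cleaned.split()
    tuples = set()
    for start in range(len(words)):
        for size in range(1, max_words + 1):
            if start + size <= len(words):
                tuples.add(" ".join(words[start:start + size]))
    return tuples


def phrase_counts(phrases, max_words=3):
    """Count, for every word tuple, how many phrases contain it."""
    counts = Counter()
    for phrase in phrases:
        # A set per phrase means each phrase contributes at most 1 per tuple
        counts.update(word_tuples(phrase, max_words))
    return counts


counts = phrase_counts(PHRASES)
print(counts["BI Tool"], counts["next year"])
```

For the three sample phrases this reports "BI Tool" in 3 phrases and "next year" in 2, the counts the question asks for. The comparison is case-sensitive, as written.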

hope this helps

regards

Marco

9 Replies

Re: How to identify recurrent terms in text fields?

Maybe a word cloud?

Word Cloud Object Extension

Re: How to identify recurrent terms in text fields?

No, I've already created a word cloud here.

As I said, I'm looking for a "TermCloud" instead of a word cloud (which is based on single words only).

Tks

MVP & Luminary

Re: How to identify recurrent terms in text fields?

Are you looking for a set of predefined terms, or for auto-discovery of the terms? If you auto-discover, you're also going to get hits in your example above like "is the", "Tool is", "the best". You can deal with some of that by purging the common words.
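
One way to read "purging the common words" is to discard any auto-discovered tuple that starts or ends with a common word. The sketch below shows that rule in Python (not Qlik script). The `STOP_WORDS` list and the name `is_candidate_term` are hypothetical, and the start-or-end rule is an assumption about what makes a useful term, not something stated in this thread.

```python
# Hypothetical stop-word list; a real one would be much longer
STOP_WORDS = {"a", "is", "the", "in", "to", "be", "for", "your", "one", "about"}


def is_candidate_term(word_tuple, stop_words=STOP_WORDS, min_words=2):
    """Keep multi-word tuples that neither start nor end with a stop word."""
    words = word_tuple.lower().split()
    if len(words) < min_words:
        return False
    return words[0] not in stop_words and words[-1] not in stop_words


tuples = ["BI Tool", "is the", "Tool is", "the best", "next year", "best BI Tool", "Tool"]
terms = [t for t in tuples if is_candidate_term(t)]
print(terms)
```

A stop word in the middle of a tuple is kept on purpose, so that a term such as "cost of ownership" would survive; tightening or loosening that rule is a judgment call.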

-Rob

Re: How to identify recurrent terms in text fields?

please close your thread if your question is answered:

Qlik Community Tip: Marking Replies as Correct or Helpful

thanks

regards

Marco

Re: How to identify recurrent terms in text fields?

Hi Marco,

Are you able to help me implement the same method in my QlikView app?

Thanks!

https://community.qlik.com/message/1244265#1244265

Re: How to identify recurrent terms in text fields?

I tried to.

See there.

regards

Marco

Keitaru
New Contributor III

Re: How to identify recurrent terms in text fields?

Hi,

I've applied your idea to what I'm doing in Qlik Sense Enterprise and it worked. However, my bag of words / word tuples also captured date/time stamps. How can I make the script exclude date, timestamp, and month information, as well as stop words?
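
One possible approach is to drop date-like tokens and stop words from each text before the word tuples are built. The sketch below illustrates that filtering logic in Python, not Qlik script, so it would need translating into the load script. The regular expressions, the `MONTHS` set, and the function names are assumptions chosen for illustration, and they only cover a few common formats such as `2017-03-01`, `01/03/2017`, and `09:15:00`.

```python
import re

# Month names and abbreviations (note: "may" also matches the ordinary word)
MONTHS = {
    "jan", "january", "feb", "february", "mar", "march", "apr", "april",
    "may", "jun", "june", "jul", "july", "aug", "august", "sep", "sept",
    "september", "oct", "october", "nov", "november", "dec", "december",
}

# Numeric dates such as 2017-03-01, 01/03/2017 or 1.3.17
DATE_RE = re.compile(r"^\d{1,4}[-/.]\d{1,2}[-/.]\d{1,4}$")
# Times such as 09:15, 09:15:00 or 9:15pm
TIME_RE = re.compile(r"^\d{1,2}:\d{2}(:\d{2})?([ap]m)?$", re.IGNORECASE)


def is_date_like(token):
    """Return True if the token looks like a date, a time, or a month name."""
    stripped = token.strip(".,?").lower()
    return (
        stripped in MONTHS
        or DATE_RE.match(stripped) is not None
        or TIME_RE.match(stripped) is not None
    )


def drop_noise_tokens(text, stop_words=frozenset()):
    """Remove date-like tokens and stop words, keeping the remaining order."""
    kept = []
    for token in text.split():
        if is_date_like(token):
            continue
        if token.strip(".,?").lower() in stop_words:
            continue
        kept.append(token)
    return " ".join(kept)


sample = "Ticket opened 2017-03-01 09:15:00 in March for the BI Tool"
print(drop_noise_tokens(sample))
print(drop_noise_tokens(sample, stop_words={"in", "for", "the"}))
```

Removing tokens before building tuples makes words adjacent that were not adjacent in the original text (here "opened" and "BI"). If that is a problem, the same checks can instead be applied afterwards, discarding any tuple that contains a date-like token.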