Hello everyone,
I'm supposed to analyse a document-term matrix using Qlik Sense. My rows represent documents and my columns represent words. The values of the table are the occurrences of each word in each document.
What I need is to find the words that appear most often in my corpus. To do that, I have to sum the occurrences of each word across all documents (rows) and take the maximum or, better, produce a ranking from the most frequent word to the least frequent.
I tried to do it on my own, but my capabilities in Qlik Sense are very limited, especially for the script part. Can someone help me find a solution?
Thanks a lot in advance.
Let's imagine we have two documents: "you are learning" and "they are understanding". The resulting matrix would be:
document   | you | are | learning | they | understanding
document_1 | 1   | 1   | 1        | 0    | 0
document_2 | 0   | 1   | 0        | 1    | 1
My objective is to order the words (which are in the columns) from the most frequent in the corpus to the least frequent. In this example, the word "are" should come first, with 2 occurrences.
What I did was use the CrossTable prefix in the load script to turn the word columns into rows, and then aggregate the counts per word. I don't know if there is a simpler way.
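To make the intended logic concrete outside of Qlik, here is a minimal Python sketch of the same two steps: unpivot the matrix into (document, word, count) rows, then sum per word and rank. It is not Qlik script and not necessarily how the load script should be written. The variable names (`words`, `matrix`, `long_rows`, `totals`, `ranking`) are made up for illustration, and the counts are derived from the two example sentences above.

```python
from collections import defaultdict

# Document-term matrix for the two example sentences
# "you are learning" and "they are understanding" (wide format).
words = ["you", "are", "learning", "they", "understanding"]
matrix = {
    "document_1": [1, 1, 1, 0, 0],
    "document_2": [0, 1, 0, 1, 1],
}

# Step 1: unpivot to one (document, word, count) row per cell,
# i.e. the long shape that a crosstable-style transformation produces.
long_rows = [
    (doc, word, count)
    for doc, counts in matrix.items()
    for word, count in zip(words, counts)
]

# Step 2: aggregate, summing the counts of each word across all documents.
totals = defaultdict(int)
for _doc, word, count in long_rows:
    totals[word] += count

# Step 3: rank from most to least frequent; ties are broken alphabetically.
ranking = sorted(totals.items(), key=lambda item: (-item[1], item[0]))

for rank, (word, total) in enumerate(ranking, start=1):
    print(rank, word, total)
```

With this data, "are" comes out on top with a total of 2 and every other word has a total of 1. In Qlik terms, the equivalent would presumably be a word dimension with a sum of the count field as the measure, sorted in descending order.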