Re: Sum and max through all columns - Qlik Community

Anonymous · ‎2018-12-06

Hello everyone,

I'm supposed to analyse a document-term matrix using Qlik Sense. My rows represent documents and my columns represent words. The values of the table are the occurrences of each word in each document.

What I need is to find the words that appear most often in my corpus. For that, I have to sum the occurrences of each word across all documents (rows) and take the maximum, or (better) rank the words from most to least frequent.

I tried to do it on my own, but my capabilities in Qlik Sense are very limited, especially for the script part. Can someone help me find a solution?

Thanks a lot in advance.

Anil_Babu_Samineni · ‎2018-12-06

Perhaps this?

Firstsortedvalue(aggr(sum(measure), document), -document)

Best Anil, When applicable please mark the correct/appropriate replies as "solution" (you can mark up to 3 "solutions"). Please LIKE threads if the provided solution is helpful.

Anonymous · ‎2018-12-08

Thanks for your answer, @Anil_Babu_Samineni. I don't understand what measure and document stand for, since I have many documents as rows and several words as columns. I don't have a single measure but more than 2000.

Anil_Babu_Samineni · ‎2018-12-08

Can you come back with an example and the expected result, so we can understand a little more?

Anonymous · ‎2018-12-08

Let's imagine we have two documents: "you are learning" and "they are understanding". The resulting matrix would be:

           | you | are | learning | they | understanding
document_1 | 1   | 1   | 1        | 0    | 0
document_2 | 0   | 1   | 0        | 1    | 1

My objective is to order the words (the columns) from most to least frequent in the corpus. In this example, the word "are" should come first, with 2 occurrences.
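To make the expected result concrete, here is a minimal sketch of the ranking outside Qlik, written in Python purely for illustration. It builds the matrix from the two example sentences, sums each word column over all documents, and sorts the totals in descending order. The names `build_matrix` and `rank_words` are hypothetical helpers, not anything from Qlik.

```python
# Illustration only: plain Python, not Qlik script.
documents = {
    "document_1": "you are learning",
    "document_2": "they are understanding",
}

def build_matrix(documents):
    """Return (vocabulary, matrix), where matrix maps each document name
    to its list of word counts, in vocabulary order."""
    vocabulary = []
    for text in documents.values():
        for word in text.split():
            if word not in vocabulary:
                vocabulary.append(word)  # keep order of first appearance
    matrix = {
        name: [text.split().count(word) for word in vocabulary]
        for name, text in documents.items()
    }
    return vocabulary, matrix

def rank_words(vocabulary, matrix):
    """Sum each column over all rows, then sort from most to least frequent."""
    totals = [sum(column) for column in zip(*matrix.values())]
    return sorted(zip(vocabulary, totals), key=lambda pair: pair[1], reverse=True)

vocabulary, matrix = build_matrix(documents)
ranking = rank_words(vocabulary, matrix)
for word, total in ranking:
    print(word, total)
```

Words with equal totals keep their column order here, because Python's sort is stable.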

What I did was use CrossTable and then aggregate the result. I don't know if there is a simpler way.
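For readers unfamiliar with that approach, the idea is to turn the wide matrix into long rows of (document, word, occurrences) and then sum per word. Below is a rough Python sketch of that reshaping, for illustration only: it is not Qlik script, the counts are the ones implied by the two example sentences, and `unpivot` and `total_by_word` are hypothetical names.

```python
from collections import defaultdict

# Wide layout: the first column is the document, the others are word counts.
header = ["document", "you", "are", "learning", "they", "understanding"]
rows = [
    ["document_1", 1, 1, 1, 0, 0],
    ["document_2", 0, 1, 0, 1, 1],
]

def unpivot(header, rows):
    """Turn every wide row into (document, word, occurrences) records."""
    long_rows = []
    for row in rows:
        document = row[0]
        for word, count in zip(header[1:], row[1:]):
            long_rows.append((document, word, count))
    return long_rows

def total_by_word(long_rows):
    """Sum occurrences per word and sort from most to least frequent."""
    totals = defaultdict(int)
    for _document, word, count in long_rows:
        totals[word] += count
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

long_rows = unpivot(header, rows)
ranking = total_by_word(long_rows)
print(ranking[0])
```

Once the data is in long form, the word becomes an ordinary field, so a single sum grouped by word replaces 2000 separate column measures.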