Anonymous
Not applicable

How to easily get the frequency of each value for one field (from the flat file)

I have 100 CSV files with the same schema.
In each file, I need to focus on one particular field and get the frequency of every value in that field.
For example, if the values of that field (Column1) in File001 are as follows:
Column1
A
A
A
B
B
C

We want the output to be:
Column1     Frequency
A           50%
B           33%
C           17%

The same process will run over all 100 files, and I will then collect the value with the highest frequency from each file. The final output will have 100 rows (one row per file).

Thanks!



Accepted Solutions
cmendels
Contributor III

I would try a job similar to this:

[Image: talend.JPG]

where the tAggregateRow counts Column1 and groups by Column1. The frequency code would then look like this:

[Image: javarow.JPG]

Note: I haven't tested this because I don't have a similar collection of files to use at the moment and I don't quite have the time to generate them.
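
The screenshot with the tJavaRow code is not preserved here, so what follows is only a plain-Java sketch of the calculation described above, not the code from the original post: count each value of Column1 (the role of the tAggregateRow step), then divide each count by the total number of rows. The class and method names (`FrequencySketch`, `countValues`, `toPercentages`) are made up for illustration; the sample values are the ones from the question.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of Talend or of the original answer.
final class FrequencySketch {

    // Counts occurrences of each value, keeping first-seen order
    // (count grouped by Column1, as tAggregateRow would do in the job).
    static Map<String, Integer> countValues(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    // Converts each count to a percentage of the total row count.
    static Map<String, Double> toPercentages(Map<String, Integer> counts) {
        int total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        Map<String, Double> percentages = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // Guard against an empty input to avoid dividing by zero.
            double pct = total == 0 ? 0.0 : 100.0 * e.getValue() / total;
            percentages.put(e.getKey(), pct);
        }
        return percentages;
    }

    public static void main(String[] args) {
        // Column1 of File001, as given in the question.
        List<String> column1 = List.of("A", "A", "A", "B", "B", "C");
        toPercentages(countValues(column1))
            .forEach((value, pct) -> System.out.printf("%s %.0f%%%n", value, pct));
    }
}
```

On the sample column this prints A 50%, B 33%, C 17%, which matches the output requested in the question. In a Talend job the same division would presumably sit in a tJavaRow or tMap expression, using the aggregated count and the total row count.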

 

 


2 Replies

Anonymous
Not applicable
Author

I created a similar job. Instead of tJavaRow, I used tMap to add the calculated field (frequency), then tSortRow (descending) and tSampleRow to keep only the top row.

Thanks!
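
For readers who want to see the last step outside Talend, here is a rough Java sketch of "sort by frequency descending, keep the first row", applied once per file so that the result has one row per file. It is an illustration under assumptions, not the job from this thread: `TopValueSketch`, `top`, and `topPerFile` are hypothetical names, the frequencies are assumed to be already computed, and the File002 values are invented for the demo.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; not part of Talend or of the job in this thread.
final class TopValueSketch {

    // Sorts (value, frequency) rows by frequency descending and returns the first row,
    // mirroring tSortRow (desc) followed by tSampleRow keeping row 1.
    static Map.Entry<String, Double> top(Map<String, Double> frequencies) {
        if (frequencies.isEmpty()) {
            throw new IllegalArgumentException("no rows to pick from");
        }
        List<Map.Entry<String, Double>> rows = new ArrayList<>(frequencies.entrySet());
        // List.sort is stable, so tied values keep their original order.
        rows.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        return rows.get(0);
    }

    // Applies top() to each file's frequencies: one output row per file.
    static Map<String, Map.Entry<String, Double>> topPerFile(
            Map<String, Map<String, Double>> frequenciesByFile) {
        Map<String, Map.Entry<String, Double>> result = new LinkedHashMap<>();
        frequenciesByFile.forEach((file, freqs) -> result.put(file, top(freqs)));
        return result;
    }

    public static void main(String[] args) {
        // File001 uses the rounded frequencies from the question.
        Map<String, Double> file001 = new LinkedHashMap<>();
        file001.put("A", 50.0);
        file001.put("B", 33.0);
        file001.put("C", 17.0);

        // File002 is invented sample data for this sketch.
        Map<String, Double> file002 = new LinkedHashMap<>();
        file002.put("A", 25.0);
        file002.put("B", 75.0);

        Map<String, Map<String, Double>> byFile = new LinkedHashMap<>();
        byFile.put("File001", file001);
        byFile.put("File002", file002);

        topPerFile(byFile).forEach((file, row) ->
            System.out.printf("%s %s %.0f%%%n", file, row.getKey(), row.getValue()));
    }
}
```

With 100 input files, `topPerFile` would return 100 entries, which corresponds to the 100-row final output described in the question.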