Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to create a calculated column populated with Most Common value from another field?

So I have a table with a ACCOUNT_ID column (being used as my primary key) and a ACCOUNT_NAME column.

The problem is that the account name is sometimes different for the same ID (IE: "John Smith - SF" versus "John Smith San Francisco")

I would like to create a third column "ACCOUNT_NAME_NORMALIZED" or something which would find the MOST COMMON name associated with an ID and populate it with that.

Anybody know how I can achieve this? Preferably at the script level, but I could also do it in chart/table expressions I suppose.

5 Replies
rustyfishbones
Master II
Master II

Try using the REPLACE FUNCTION

REPLACE (ACCOUNT_ID, 'John Smith -SF', 'John Smith San Francisco') AS ACCOUNT_NAME_NORMALISED

If you have more thsn one type just use the REPLACE a few times

Not applicable
Author

The problem here is I am dealing with hundreds of thousands of different ACCOUNT_ID's; most of which have multiple ACCOUNT_NAMES attached to them.

I need to fetch the name most commonly associated with the ID.

Your method would definitely work if it was an isolated incident, but will not work in this case.

Thanks though!!

Not applicable
Author

For reference here is what I have so far:

SQL SELECT max(`NameNormalized`)as `NameCount`, `Parent_Name`, `Parent_ID`

FROM(SELECT `Parent_ID`,

          `Parent_Name`,

    count(`Parent_Name`) as `NameNormalized`

FROM `C:\USERS\STEVE\DESKTOP\MYFOLDER`\`MYFILE.csv`

GROUP BY `Parent_ID`, `Parent_Name`)

GROUP BY `Parent_ID`, `Parent_Name`;

The problem is it's still bringing in EVERY parent name, they just all display the same max count...

rustyfishbones
Master II
Master II

Try

COUNT (AGGR (FIRSTSORTEDVALUE (Parent_Name,-Parent_ID), Parent_Name))

or something like that, let me know if that helps

Not applicable
Author

That did not get the desired result, I am not even sure what it ended up counting as the result was a very arbitrary number haha. Also I would really prefer to have it at the script level so I could then join on parent_id and use the new column across my entire deck.

And I am looking for the parent_name which has the highest count for each parent_id, NOT the value of the count itself....

Thanks for helping! We're bound to figure it out eventually...