Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik GA: Multivariate Time Series in Qlik Predict: Get Details
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Get the row with the highest value from a list of duplicates

Hi,

I need help to identify the row with the highest value from a excel list containing some duplicated "postnr".

The table below shows a sample from the excel list of swedish zip codes.

postnrpostortlänskodlänkommunkodkommunA-regionH-regionH-region info
15593NYKVARN1STOCKHOLM140NYKVARN1H1Stockholm/Södertälje A-region
15593NYKVARN4SÖDERMANLAND486STRÄNGNÄS7H4Mellanbygden
15594NYKVARN1STOCKHOLM140NYKVARN1H1Stockholm/Södertälje A-region
15594NYKVARN4SÖDERMANLAND461GNESTA5H3Större städer

The column postnr is joined with the customer table identifying how many customers lives in a certain zip code, and what sort of H-region that zip code has.

As you can see from the list above one unique postnr could have many H-regions.

What I want is to - in the loading script - only load unique postnr with the lowest H-region (going from H1 -> H9).

Thanks ahead!

6 Replies
wgonzalez
Partner - Creator
Partner - Creator

Hi,

You don't need to modify you data in the loading script.  You can do that by creating a pivot table having postnr and H-region as dimensions.  Create an expression having the of items per H-region count(fieldname).

Regards.

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

First load your excel file without any filtering. Then:

INNER JOIN (mytab)

LOAD postnr, minstring([H-region]) as [H-region]

RESIDENT mytab;

That will remove all but the lowest value h-region for each postnr.

-Rob

http://robwunderlich.com

nagaiank
Specialist III
Specialist III

If you want to use script, the following are the steps (taking into account the solutions suggested by Rob and wgonzalez):

Data:

load postnr,postort,lanskod,lan,kommunkod,kommun,A_region,H_Region,H_region_info from (your source);

Data2:

load postnr,Min(H_region) as Min_H_Region resident Data group by postnr;

left join (Data) load * resident Data2;

drop table Data2;

Final:

Noconcatenate load * resident Data where H_Region = Min_H_Region;

drop table Data;

drop field Min_H_Region from Data2;

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Krishna,

Are the extra steps needed? It seems to me that just the inner join is all that's necessary.

-Rob

nagaiank
Specialist III
Specialist III

Rob,

Thank you. I did not notice the INNER JOIN. Actually your script works and there is no need for the extra steps in my post and i verified it. The following script gives the desired result:

mytab:

load * Inline [

postnr,postort,länskod,län,kommunkod,kommun,A-region,H-region,"H-region info"

15593,NYKVARN,1,STOCKHOLM,140,NYKVARN,1,H1,Stockholm/Södertälje A-region

15593,NYKVARN,4,SÖDERMANLAND,486,STRÄNGNÄS,7,H4,Mellanbygden

15594,NYKVARN,1,STOCKHOLM,140,NYKVARN,1,H1,Stockholm/Södertälje A-region

15594,NYKVARN,4,SÖDERMANLAND,461,GNESTA,5,H3,Större städer

];

INNER JOIN (mytab)

LOAD postnr, minstring([H-region]) as [H-region] RESIDENT mytab Group By postnr;

rwunderlich
Partner Ambassador/MVP
Partner Ambassador/MVP

Thanks for the confirmation. I thought I may have been missing something.

-Rob