Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
qlikmark1990
Contributor III
Contributor III

Remove duplicate rows based on the highest value in another field

I have a dataset that contains patient data and its postcodes. The geographical coordinates are extracted from another table with postal code and coordinates data. By joining the tables together, there are some postal codes that has 2 or more possible latitude and longitude coordinates. My goal is to only keep the records that has the highest value in the field 'longitude and latitude'.

        

search Idpostal codelatitudelongitudelongitude and latitude
2864NL|1717KK51,558113325,0967112365155850967
2572NL|8888TN50,876299535,9791156935087659791
53NL|7777RD51,831334745,8106709785183158106
53NL|7777RD51,833219715,8131772685183358131
1047NL|2323MD51,067675165,8531320615106758531

I made the field 'longitude and latitude' as a combination of the fields 'latitude' and 'longitude' to use as a value to determine a max.

How can I process the data to remove only the row

53NL|7777RD51,831334745,8106709785183158106

that has the lowest value and keep the one with the highest value? The method also needs to work if there are 3 of more rows with same search Id and if double search Id's have the same longitude and latitude.

Using DISTINCT while loading does't make sense.

This data isn't real to keep the original data private.

1 Solution

Accepted Solutions
sunny_talwar

Here is a sample script

Table:

LOAD * INLINE [

    search Id, postal code, latitude, longitude, longitude and latitude,

    2864, NL|1717KK, "51,55811332", "5,096711236", 5155850967,

    2572, NL|8888TN, "50,87629953", "5,979115693", 5087659791

    53, NL|7777RD, "51,83133474", "5,810670978", 5183158106,

    53, NL|7777RD, "51,83321971", "5,813177268", 5183358131,

    1047, NL|2323MD, "51,06767516", "5,853132061", 5106758531

];


Right Join (Table)

LOAD [search Id],

[postal code],

Max([longitude and latitude]) as [longitude and latitude]

Resident Table

Group By [search Id], [postal code];

View solution in original post

2 Replies
sunny_talwar

May be this

Table:

LOAD [search Id],

     [postal code],

     latitude,

     longitude,

     [longitude and latitude]

FROM ...;

Right Join (Table)

LOAD [search Id],

     [postal code],

     Max([longitude and latitude]) as [longitude and latitude]

Resident Table

Group By [search Id], [postal code];

sunny_talwar

Here is a sample script

Table:

LOAD * INLINE [

    search Id, postal code, latitude, longitude, longitude and latitude,

    2864, NL|1717KK, "51,55811332", "5,096711236", 5155850967,

    2572, NL|8888TN, "50,87629953", "5,979115693", 5087659791

    53, NL|7777RD, "51,83133474", "5,810670978", 5183158106,

    53, NL|7777RD, "51,83321971", "5,813177268", 5183358131,

    1047, NL|2323MD, "51,06767516", "5,853132061", 5106758531

];


Right Join (Table)

LOAD [search Id],

[postal code],

Max([longitude and latitude]) as [longitude and latitude]

Resident Table

Group By [search Id], [postal code];