Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

QlikView script for Joining files and Deduplicating

Hello, I have a query I'm hoping someone can help with. I have two tables I want to join in QlikView on a one-to-many basis, but after the join I want to pick a specific one of the many records to create a one-to-one file. Here are the details:

Table A  - one record

IDValue1

1

10
212

Table B - many records

IDSeqValue2
119
1210
1311
218
2211
23

13

Then join them together on ID only where value1 is greater than or equal to value 2

IDSeqValueValue2
11109
121010
21128
221211

Then de-duplicate by picking the highest value2 record

IDSeqValueValue2
121010
221211

Hope that makes sense, does anyone know how to do it?

Many thanks.

2 Replies
flipside
Partner - Specialist II
Partner - Specialist II

Something like this ...

TableA:
Load * inline [
ID, Value1
1, 10
2, 12
]
;

left join (TableA)

//TableB:
Load * inline [
ID, Seq, Value2
1, 1, 9
1, 2, 10
1, 3, 11
2, 1, 8
2, 2, 11
2, 3, 13
]
;

TableC:
Noconcatenate Load ID, Value1 as Value, Seq, Value2 resident TableA where Value1 >= Value2;
Drop Table TableA;

inner join (TableC) Load ID, max(Value2) as Value2 Resident TableC Group By ID;

flipside

njmaehler
Partner - Creator
Partner - Creator

Hi James,

This should work (not sure if you need the sequence number too but if you do then the solution Dave Riley provided is great)

TableA:
Load * inline [
ID, Value1
1, 10
2, 12
];

LEFT JOIN (TableA)
Load * inline [
ID, Seq, Value2
1, 1, 9
1, 2, 10
1, 3, 11
2, 1, 8
2, 2, 11
2, 3, 13
];

RESULT:
LOAD ID,
      Value1 as Value,
      max(Value2) as Value2
RESIDENT TableA
where Value1 >= Value2
group by ID, Value1;

DROP TABLE TableA;

If for the 3rd table you wanted it where the max value is equal to or higher than Value1 (opposite way around - I know because I mis-read it the first time) then mapping will work here. Just adding my example below for this too because I find mapping so useful. Just remember because you want the max number then the order by clause (in descending order) is vital. Mapping is great because it saves re-reading tables when joining them and it only ever maps the first record in (hence the reason of the order by clause).

//this answer will get the max(Value2) where Value2 >= Value1
TableA:
Load * inline [
ID, Value1
1, 10
2, 12
];

TMP:
Load * inline [
ID, Seq, Value2
1, 1, 9
1, 2, 10
1, 3, 11
2, 1, 8
2, 2, 11
2, 3, 13
];

MAP_HIGHEST:
MAPPING Load ID,
Value2
RESIDENT TMP
order by ID, Value2 desc;

DROP TABLE TMP;

RESULT:
LOAD ID,
Value1,
ApplyMap('MAP_HIGHEST', ID, Value1) as Value2
RESIDENT TableA;

DROP TABLE TableA;