Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
quickqlik2
Contributor II
Contributor II

Qlik Sense DISTINCT on One Column

The table I have looks like this:

Product ID Date Year
PMX99 2022-02-29 2022
PMX99 2021-12-08 2021
ORA31 2022-05-05 2022
ORA31 2021-09-23 2021

 

I want to sort descending by Date, select only distinct Product ID, and return the Product ID and Year. The output would look like this (expected)

Product ID Year
PMX99 2022
ORA31 2022

 

The code I have is

TempTable:
LOAD
    "Product ID",  
    "Date",
    "Year"
FROM [lib://excel.xlsx (ABC)](ooxml, embedded labels, table is [Sheet1]);

MappingTable:
LOAD 
    DISTINCT "Product ID",
    "Year"
RESIDENT TempTable
ORDER BY "Product ID", "Date" DESC;

DROP TABLE TempTable;

 

Which produces (actual)

Product ID Year
PMX99 2022
PMX99 2021
ORA31 2022
ORA31 2021

 

Why are there duplicates in Product ID? How do I get the output I expect?

Labels (2)
5 Replies
zhaofeng
Partner - Creator
Partner - Creator

Hi

Distinct is used to decorate Load but not a specific Field.

In your case,each ID matched two different Date in two different year.That means for combination of ID and Year,your record is unique.

If one ID match several different dates in same year,you will got several same records after you load ProductID and Year without distinct.Only in this situation,you need add distinct to your code.

 

 

quickqlik2
Contributor II
Contributor II
Author

@zhaofeng Your example is very helpful to show why this is happening. Thank you. I'm not sure where to add DISTINCT to my code to correct the issue. Can you show me where to add DISTINCT?

zhaofeng
Partner - Creator
Partner - Creator

Distinct always added after Load.Your code is correct.

 

LOAD DISTINCT 
    "Product ID",
    "Year"
RESIDENT TempTable
ORDER BY "Product ID", "Date" DESC;

 

The  output is also correct.However it is different from your expection.I think you may want to load the latest Year for each ID.If that,you need to write the following code.

Load
    "Product ID",
    max(Year)  as Year
Resident TempTable
group by
    "Product ID"
;

 

quickqlik2
Contributor II
Contributor II
Author

@zhaofeng Thanks I'll try that. It's slightly more complicated though. In the actual table I need to take the MAX(Date) and return the Year from that row (along with the Product ID). In the actual data set the Year is inconsistent, but the Date is always correct. So I have to use the Date column as a MAX and then grab whatever Year is there (sometimes Year is "2023" sometimes it's "TwentyTwentyThree", etc.).

zhaofeng
Partner - Creator
Partner - Creator

Yep.The first thing you need to do is unify your data format.Such as modify "Twenty Twenty Three" to 2023.