Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik Open Lakehouse is Now Generally Available! Discover the key highlights and partner resources here.
cancel
Showing results for 
Search instead for 
Did you mean: 
joancasellasvega
Partner - Contributor III
Partner - Contributor III

Create a filter to identify duplicates and filter uniques ID base on the latest date

Hi all,

I'm loading two tables with Concatenate, where INSTANCIA is the unique Identifier.

ES:
LOAD
"Ticket Id",
INSTANCIA,
FECHA_CREACION_INSTANCIA,
ESTADO,
.

.

.

LastModified,

Grupo,
Agente
FROM [lib://Ficheros Qlik/output 2020_05_19.xlsx]
(ooxml, embedded labels, table is [Table 1])
Where LastModified='19/05/2020';

Concatenate

LOAD

"Ticket Id",
INSTANCIA,
FECHA_CREACION_INSTANCIA,
ESTADO,
.

.

.

LastModified,

Grupo,
Agente

FROM [lib://Ficheros Qlik/output 2020_05_20.xlsx]
(ooxml, embedded labels, table is [Table 1])
Where LastModified='20/05/2020';

Every data I append a new set of data. The unique identifier is the INSTANCIA, but an INSTANCIA can be manage in two different days. Therefore I found duplicates INSTANCIA’s with different LastModified date.

I would like to have a filter to keep unique INSTANCE base on the latest  LastModified date. Those that make sense. I don’t want to delete duplicates, but be able to filter on a table.

Any help will be really appreciate it.

Thanks in advance,

Joan

10 Replies
Gui_Approbato
Partner - Creator III
Partner - Creator III

Hello Joan,

I don't know if it is that simple but.. what if you create a column in both loads with the

Instancia & '-'& Fecha_creacion_instancia  ?

Or use rowno() to also concatenate with your instancia?

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Thanks for your replay @Gui_Approbato.

Below I have highlighted what the solution should be:

 

Ticket IdINSTANCIALastModifiedDuplicate
6195140157418326/05/20200
115740157418327/05/20201
6195240157439426/05/20201
5389740157442726/05/20201
7367840157453026/05/20201
6195340157454826/05/20201
6195440157460322/05/20200
6195540157460324/05/20201
6195640157487326/05/20201
6195740157489226/05/20201

 

I would like to have a filter called duplicates, so I can show the INSTANCIAS which were last modify, based on the column LastModified.

thanks,

Joan

Gui_Approbato
Partner - Creator III
Partner - Creator III

There are other ways to get the same result, and one I can think of is creating this key in your main table using INSTANCIA&LastModified , then you load only this column and count how many times it repeats..

So basically you would have:

ES:

Load  (all fields), 

          INSTANCIA&LastModified as Key from Source;

 

Load distinct INSTANCIA&LastModified as Key,

INSTANCIA,                          

Count(INSTANCIA) as Duplicate   Resident ES;

 

With this you will link both tables by the Key column and will have your Duplicate filter (but equals to or higher than 1).

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi ,
I'm trying to upload the data following your script. For each day (i have multiple tables) I'm using:

ES:
Load (all fields),
INSTANCIA&LastModified as Key from Source1;

Concatenate

Load (all fields),
INSTANCIA&LastModified as Key from Source2;

Concatenate

Load (all fields),
INSTANCIA&LastModified as Key from Source3;

Concatenate

.

.
.


When adding ….

Load distinct INSTANCIA&LastModified as Key,
INSTANCIA,
Count(INSTANCIA) as Duplicate Resident ES;

… at the end of the editor Qlik is throwing me an error. Any thoughts?
Thanks,
Joan

Gui_Approbato
Partner - Creator III
Partner - Creator III

Oh, I think we missed the 'group by' clause in the last table..

So the last table would be 


Load distinct INSTANCIA&LastModified as Key,
INSTANCIA,
Count(INSTANCIA) as Duplicate Resident ES Group by INSTANCIA, LastModified;

 

Try this

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi @Gui_Approbato 

I have attached an excel file with three tabs.

I would like to upload the three tabs as ES: 

and have a column that filter those unique by the latest LASTMODIFY date.

I hope this can help

Thanks in advance,

Joan

Gui_Approbato
Partner - Creator III
Partner - Creator III

Great, thanks for the file.

I don't know why this error shows up (if someone there knows please comment here), but I made a filter to solve this issue in the dashboard.

I believe it is what you need. Check this file and let me know.

Best,

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

hey @Gui_Approbato  thanks for your rapid response.

I think we are almost there 😀

I have attached another file, also with the expected result. Hope that help too.

I want to keep in a table only instances with the latest modified date.

thanks in advance

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi,

to identify duplicates based on date, I'm using the following scritp:

 

FinalTable:
LOAD *,
If(INSTANCIA = Previous(INSTANCIA), 'dup', 'unique') as Flag
Resident ES
Order By INSTANCIA, LastModified desc;
DROP Table ES;

 

Thanks,

Joan