Skip to main content
Announcements
Have questions about Qlik Connect? Join us live on April 10th, at 11 AM ET: SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
joancasellasvega
Partner - Contributor III
Partner - Contributor III

Create a filter to identify duplicates and filter uniques ID base on the latest date

Hi all,

I'm loading two tables with Concatenate, where INSTANCIA is the unique Identifier.

ES:
LOAD
"Ticket Id",
INSTANCIA,
FECHA_CREACION_INSTANCIA,
ESTADO,
.

.

.

LastModified,

Grupo,
Agente
FROM [lib://Ficheros Qlik/output 2020_05_19.xlsx]
(ooxml, embedded labels, table is [Table 1])
Where LastModified='19/05/2020';

Concatenate

LOAD

"Ticket Id",
INSTANCIA,
FECHA_CREACION_INSTANCIA,
ESTADO,
.

.

.

LastModified,

Grupo,
Agente

FROM [lib://Ficheros Qlik/output 2020_05_20.xlsx]
(ooxml, embedded labels, table is [Table 1])
Where LastModified='20/05/2020';

Every data I append a new set of data. The unique identifier is the INSTANCIA, but an INSTANCIA can be manage in two different days. Therefore I found duplicates INSTANCIA’s with different LastModified date.

I would like to have a filter to keep unique INSTANCE base on the latest  LastModified date. Those that make sense. I don’t want to delete duplicates, but be able to filter on a table.

Any help will be really appreciate it.

Thanks in advance,

Joan

10 Replies
Gui_Approbato
Creator III
Creator III

Hello Joan,

I don't know if it is that simple but.. what if you create a column in both loads with the

Instancia & '-'& Fecha_creacion_instancia  ?

Or use rowno() to also concatenate with your instancia?

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Thanks for your replay @Gui_Approbato.

Below I have highlighted what the solution should be:

 

Ticket IdINSTANCIALastModifiedDuplicate
6195140157418326/05/20200
115740157418327/05/20201
6195240157439426/05/20201
5389740157442726/05/20201
7367840157453026/05/20201
6195340157454826/05/20201
6195440157460322/05/20200
6195540157460324/05/20201
6195640157487326/05/20201
6195740157489226/05/20201

 

I would like to have a filter called duplicates, so I can show the INSTANCIAS which were last modify, based on the column LastModified.

thanks,

Joan

Gui_Approbato
Creator III
Creator III

There are other ways to get the same result, and one I can think of is creating this key in your main table using INSTANCIA&LastModified , then you load only this column and count how many times it repeats..

So basically you would have:

ES:

Load  (all fields), 

          INSTANCIA&LastModified as Key from Source;

 

Load distinct INSTANCIA&LastModified as Key,

INSTANCIA,                          

Count(INSTANCIA) as Duplicate   Resident ES;

 

With this you will link both tables by the Key column and will have your Duplicate filter (but equals to or higher than 1).

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi ,
I'm trying to upload the data following your script. For each day (i have multiple tables) I'm using:

ES:
Load (all fields),
INSTANCIA&LastModified as Key from Source1;

Concatenate

Load (all fields),
INSTANCIA&LastModified as Key from Source2;

Concatenate

Load (all fields),
INSTANCIA&LastModified as Key from Source3;

Concatenate

.

.
.


When adding ….

Load distinct INSTANCIA&LastModified as Key,
INSTANCIA,
Count(INSTANCIA) as Duplicate Resident ES;

… at the end of the editor Qlik is throwing me an error. Any thoughts?
Thanks,
Joan

Gui_Approbato
Creator III
Creator III

Oh, I think we missed the 'group by' clause in the last table..

So the last table would be 


Load distinct INSTANCIA&LastModified as Key,
INSTANCIA,
Count(INSTANCIA) as Duplicate Resident ES Group by INSTANCIA, LastModified;

 

Try this

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi @Gui_Approbato 

I have attached an excel file with three tabs.

I would like to upload the three tabs as ES: 

and have a column that filter those unique by the latest LASTMODIFY date.

I hope this can help

Thanks in advance,

Joan

Gui_Approbato
Creator III
Creator III

Great, thanks for the file.

I don't know why this error shows up (if someone there knows please comment here), but I made a filter to solve this issue in the dashboard.

I believe it is what you need. Check this file and let me know.

Best,

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

hey @Gui_Approbato  thanks for your rapid response.

I think we are almost there 😀

I have attached another file, also with the expected result. Hope that help too.

I want to keep in a table only instances with the latest modified date.

thanks in advance

joancasellasvega
Partner - Contributor III
Partner - Contributor III
Author

Hi,

to identify duplicates based on date, I'm using the following scritp:

 

FinalTable:
LOAD *,
If(INSTANCIA = Previous(INSTANCIA), 'dup', 'unique') as Flag
Resident ES
Order By INSTANCIA, LastModified desc;
DROP Table ES;

 

Thanks,

Joan