Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
BS
Contributor III
Contributor III

"where" clause to show the most recent record among duplicates

I have a series of Excels that I have to load in a QVD. These excels are extractions from a ticketing tool. This extraction is done every week and stored in different excels. The problem is that, if a ticket has been modified (for example, its Status changes from "In execution" to "Completed") in the extraction, this ticket returns with the same ID.

In the excels I also have a field called "Updated" that records the day of the modification. In my QVD I need to get ONLY the ticket that has been modified most recently. This is the code section where I load the excels:

PeticionesTareasTME:
LOAD *
FROM
[$(vPathCSVTLF)/*.xlsx]
(ooxml, embedded labels);

My idea was to make a "where" clause similar to the following:

where if (Exists (ID), max (Updated)) // If you find a repeated ID among the excels, return the one with the most recent "Updated" date field.

This sentence has not worked for me but that is what I want to do.

I hope you can help me.

15 Replies
BS
Contributor III
Contributor III
Author

sorry, i just checked it out. Yes, in my data there are identical duplicates in addition to those in which some field is modified and the Updated field is modified.

asinha1991
Creator III
Creator III

for identical, this will remove duplicates

PeticionesTareasTME:
LOAD *,rowNo() as RNO
FROM
[$(vPathCSVTLF)/*.xlsx]
(ooxml, embedded labels);

inner join(PeticionesTareasTME)

Load ID,max(updated)as updated Resident PeticionesTareasTME Group by ID;

inner join(PeticionesTareasTME)

Load ID,max(RNO)as RNO Resident PeticionesTareasTME Group by ID;

for non identical you will need to decide logic

 

 

BS
Contributor III
Contributor III
Author

now there are no duplicates but those duplicates with updated fields do not appear either ... 😞

asinha1991
Creator III
Creator III

 no other choice I can think of but group by if you want to keep non identicals

PeticionesTareasTME:
LOAD  field1,field2.....n(except updated), max(updated) as updated
FROM
[$(vPathCSVTLF)/*.xlsx]
(ooxml, embedded labels) group by field1,field2.....n(except updated);

inner join(PeticionesTareasTME)

Load ID,max(updated)as updated Resident PeticionesTareasTME Group by ID;

//field1,field2.....n(except updated) means all fieldnames  ',' separated except updated

asinha1991
Creator III
Creator III

  • try this too

PeticionesTareasTME:
LOAD  distinct

(all field names)

FROM
[$(vPathCSVTLF)/*.xlsx]
(ooxml, embedded labels) ;

inner join(PeticionesTareasTME)

Load ID,max(updated)as updated Resident PeticionesTareasTME Group by ID;

BS
Contributor III
Contributor III
Author

That worked!!! Thank you so so much!! 🙂