Christinedv
Contributor

Duplicates with multiple fields

Hi,

I am trying to remove duplicates from my data. The issue is that for a row to be considered a duplicate, it has to be the same across multiple fields. For example, another student making the same transfer doesn't count as a duplicate. I want to keep only one row when the StudentID, CurrentClass, NextClass, Date, and DurationDays are the same across rows — like rows 1 and 2 in the table below. I have other fields that I want to keep in my fact table, but they are not important for identifying duplicates.

TransferID  StudentID  CurrentClass  NextClass  Date        DurationDays  Teacher  Passed  OtherField1  OtherField2
1           S001       ABC           DEF        2025-01-01  10            T1       1       Value1       Value2
2           S001       ABC           DEF        2025-01-01  10            T1       1       Value1       Value2
3           S001       ABC           MTH        2025-01-02  15            T1       1       Value3       Value4
4           S002       MTH           SCI        2025-01-01  20            T2       0       Value5       Value6

 

How do I keep only one row from each set of duplicates? I want to do this in the load script.


Accepted Solutions
BPiotrowski
Partner - Creator

Hi, @Christinedv 

Add a key field as you load this table:
StudentID&'_'&CurrentClass&'_'&NextClass&'_'&Date&'_'&DurationDays as DuplicateKey

Then do a resident load distinct DuplicateKey ... into a tmp table, drop the old table, and rename tmp to the final name.
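
For illustration, here is a minimal sketch of how those steps might look in the load script, using the sample rows from the question as inline data. The table names Transfers and Tmp and the helper field SeenKey are assumptions made for this example, not names from the original post. The sketch also swaps in a WHERE NOT Exists() filter for the DISTINCT step: LOAD DISTINCT compares whole rows, and TransferID differs between otherwise identical rows, so DISTINCT over all fields would likely keep both.

```
// Sample data from the question, loaded inline so the sketch is self-contained.
// "Transfers" is an assumed table name.
Transfers:
LOAD
    *,
    StudentID & '_' & CurrentClass & '_' & NextClass & '_' & Date & '_' & DurationDays AS DuplicateKey
INLINE [
TransferID, StudentID, CurrentClass, NextClass, Date, DurationDays, Teacher, Passed, OtherField1, OtherField2
1, S001, ABC, DEF, 2025-01-01, 10, T1, 1, Value1, Value2
2, S001, ABC, DEF, 2025-01-01, 10, T1, 1, Value1, Value2
3, S001, ABC, MTH, 2025-01-02, 15, T1, 1, Value3, Value4
4, S002, MTH, SCI, 2025-01-01, 20, T2, 0, Value5, Value6
];

// Resident load into a temporary table, keeping only the first row seen
// for each DuplicateKey. SeenKey is a hypothetical helper field that
// collects the keys already loaded, so Exists() can test against it.
Tmp:
NoConcatenate
LOAD
    *,
    DuplicateKey AS SeenKey
RESIDENT Transfers
WHERE NOT Exists(SeenKey, DuplicateKey);

// Drop the old table and give the temporary table the final name.
DROP TABLE Transfers;
RENAME TABLE Tmp TO Transfers;

// The helper fields are only needed for de-duplication.
DROP FIELDS SeenKey, DuplicateKey;
```

With the sample data, TransferID 1 and 2 produce the same DuplicateKey, so only the first of the two should survive. Which row counts as "first" follows the load order of the resident table; if a specific row must win, adding an ORDER BY to the resident load is one option.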
