Skip to main content
Announcements
Qlik Connect 2024! Seize endless possibilities! LEARN MORE
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Remove duplicate row while loading the data into QV

Hi,

I have the sample data set. How could I write the script while I have the duplicate in the original data set and I would like to remove the duplicate while loading the data into QV?

Thanks so much!

1 Solution

Accepted Solutions
maxgro
MVP
MVP

if you want to choose the row with minimum Yearmonth2

source:

LOAD

  Yearmonth & '-' & [Percent of pass] as Field,

  Yearmonth,

     [Percent of pass],

     Grade,

     Teach,

     Yearmonth2,

     [Percent of pass2],

     Grade2,

     Teach2,

     Yearmonth3,

     [Percent of pass3],

     Grade3,

     Teach3,

     Yearmonth4,

     [Percent of pass4],

     Grade4,

     Teach4

FROM

SampleData.xlsx

(ooxml, embedded labels, table is Sheet1);

table:

NoConcatenate load

  *

Resident source

where peek(Field) <> Field

order by Yearmonth, [Percent of pass], Yearmonth2

;

DROP Table source;

another one: if you want to choose using different logic for different fields you can use group by

LOAD Yearmonth,

     [Percent of pass],

     MinString(Grade) as Grade,

     MinString(Teach) as Teach,

     min(Yearmonth2) as Yearmonth2,

     min([Percent of pass2]) as [Percent of pass2],

     MinString(Grade2) as Grade2,

     MinString(Teach2) as Teach2,

     max(Yearmonth3) as Yearmonth3,

     min([Percent of pass3]) as [Percent of pass3],

     MinString(Grade3) as Grade3,

     MinString(Teach3) as Teach3,

     min(Yearmonth4) as Yearmonth4,

     min([Percent of pass4]) as [Percent of pass4],

     MinString(Grade4) as Grade4,

     MinString(Teach4) as Teach4

FROM

SampleData.xlsx

(ooxml, embedded labels, table is Sheet1)

group by

Yearmonth,

     [Percent of pass]

;

View solution in original post

14 Replies
Not applicable
Author

Have you tried the DISTINCT statement?

Load distinct

         A, B, B

from .xls

regards,

Marcelo

Not applicable
Author

Yes I have tried this once.

But It seems that when the dataset is big and there is lots of filed, it could not work well...Is there any other method?

Thanks so much!

Not applicable
Author

It sounds very strange that  Distinct function is not working correctly.

You also could tried the group by, with the entire set or fields.

regards,

Marcelo


Not applicable
Author

For the distinct function, is there any restriction for the null value?

Not applicable
Author

If I would like to remove the duplicate by only one filed, how could I use the distinct function?

PrashantSangle

Hi,

In provided data which field is duplicate?

Regards,

Great dreamer's dreams never fulfilled, they are always transcended.
Please appreciate our Qlik community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
Not applicable
Author

The first two fileds: 'YearMonth' and 'percent of Pass'

PrashantSangle

Hi,

What issue you are getting while working with

Distinct

Or

Group By Clause.

See Enclosed File.

and check frequency of each value.

Regards,

Great dreamer's dreams never fulfilled, they are always transcended.
Please appreciate our Qlik community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂
maxgro
MVP
MVP

1.png

The first 2 rows have the same value for Yearmonth and Percent ofPass

What's your required output? 1 rows? If yes how do you choose which value to keep for other fields?