Solved: Re: Remove duplicate row while loading the data in... - Qlik Community

Report Inappropriate Content · ‎2014-05-04

Hi,

I have the sample data set. How could I write the script while I have the duplicate in the original data set and I would like to remove the duplicate while loading the data into QV?

Thanks so much!

maxgro · ‎2014-05-04

if you want to choose the row with minimum Yearmonth2

source:
LOAD 
  Yearmonth & '-' & [Percent of pass] as Field,
  Yearmonth, 
     [Percent of pass], 
     Grade, 
     Teach, 
     Yearmonth2, 
     [Percent of pass2], 
     Grade2, 
     Teach2, 
     Yearmonth3, 
     [Percent of pass3], 
     Grade3, 
     Teach3, 
     Yearmonth4, 
     [Percent of pass4], 
     Grade4, 
     Teach4
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Sheet1);


table:
NoConcatenate load
  *
Resident source
where peek(Field) <> Field
order by Yearmonth, [Percent of pass], Yearmonth2
;

DROP Table source;

another one: if you want to choose using different logic for different fields you can use group by

LOAD Yearmonth, 
     [Percent of pass], 
     MinString(Grade) as Grade, 
     MinString(Teach) as Teach, 
     min(Yearmonth2) as Yearmonth2, 
     min([Percent of pass2]) as [Percent of pass2], 
     MinString(Grade2) as Grade2, 
     MinString(Teach2) as Teach2, 
     max(Yearmonth3) as Yearmonth3, 
     min([Percent of pass3]) as [Percent of pass3], 
     MinString(Grade3) as Grade3, 
     MinString(Teach3) as Teach3, 
     min(Yearmonth4) as Yearmonth4, 
     min([Percent of pass4]) as [Percent of pass4], 
     MinString(Grade4) as Grade4, 
     MinString(Teach4) as Teach4
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Sheet1)
group by 
 Yearmonth, 
     [Percent of pass]
;

View solution in original post

Report Inappropriate Content · ‎2014-05-04

Have you tried the DISTINCT statement?

Load distinct

A, B, B

from .xls

regards,

Marcelo

Report Inappropriate Content · ‎2014-05-04

Yes I have tried this once.

But It seems that when the dataset is big and there is lots of filed, it could not work well...Is there any other method?

Thanks so much!

Report Inappropriate Content · ‎2014-05-04

It sounds very strange that Distinct function is not working correctly.

You also could tried the group by, with the entire set or fields.

regards,

Marcelo

Report Inappropriate Content · ‎2014-05-04

For the distinct function, is there any restriction for the null value?

Report Inappropriate Content · ‎2014-05-04

If I would like to remove the duplicate by only one filed, how could I use the distinct function?

PrashantSangle · ‎2014-05-04

Hi,

In provided data which field is duplicate?

Regards,

Great dreamer's dreams never fulfilled, they are always transcended.
Please appreciate our Qlik community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

Report Inappropriate Content · ‎2014-05-04

The first two fileds: 'YearMonth' and 'percent of Pass'

PrashantSangle · ‎2014-05-04

Hi,

What issue you are getting while working with

Distinct

Or

Group By Clause.

See Enclosed File.

and check frequency of each value.

Regards,

Great dreamer's dreams never fulfilled, they are always transcended.
Please appreciate our Qlik community members by giving Kudos for sharing their time for your query. If your query is answered, please mark the topic as resolved 🙂

maxgro · ‎2014-05-04

The first 2 rows have the same value for Yearmonth and Percent ofPass

What's your required output? 1 rows? If yes how do you choose which value to keep for other fields?

Remove duplicate row while loading the data into QV