Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I have the sample data set. How could I write the script while I have the duplicate in the original data set and I would like to remove the duplicate while loading the data into QV?
Thanks so much!
if you want to choose the row with minimum Yearmonth2
source:
LOAD
Yearmonth & '-' & [Percent of pass] as Field,
Yearmonth,
[Percent of pass],
Grade,
Teach,
Yearmonth2,
[Percent of pass2],
Grade2,
Teach2,
Yearmonth3,
[Percent of pass3],
Grade3,
Teach3,
Yearmonth4,
[Percent of pass4],
Grade4,
Teach4
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Sheet1);
table:
NoConcatenate load
*
Resident source
where peek(Field) <> Field
order by Yearmonth, [Percent of pass], Yearmonth2
;
DROP Table source;
another one: if you want to choose using different logic for different fields you can use group by
LOAD Yearmonth,
[Percent of pass],
MinString(Grade) as Grade,
MinString(Teach) as Teach,
min(Yearmonth2) as Yearmonth2,
min([Percent of pass2]) as [Percent of pass2],
MinString(Grade2) as Grade2,
MinString(Teach2) as Teach2,
max(Yearmonth3) as Yearmonth3,
min([Percent of pass3]) as [Percent of pass3],
MinString(Grade3) as Grade3,
MinString(Teach3) as Teach3,
min(Yearmonth4) as Yearmonth4,
min([Percent of pass4]) as [Percent of pass4],
MinString(Grade4) as Grade4,
MinString(Teach4) as Teach4
FROM
SampleData.xlsx
(ooxml, embedded labels, table is Sheet1)
group by
Yearmonth,
[Percent of pass]
;
Have you tried the DISTINCT statement?
Load distinct
A, B, B
from .xls
regards,
Marcelo
Yes I have tried this once.
But It seems that when the dataset is big and there is lots of filed, it could not work well...Is there any other method?
Thanks so much!
It sounds very strange that Distinct function is not working correctly.
You also could tried the group by, with the entire set or fields.
regards,
Marcelo
For the distinct function, is there any restriction for the null value?
If I would like to remove the duplicate by only one filed, how could I use the distinct function?
Hi,
In provided data which field is duplicate?
Regards,
The first two fileds: 'YearMonth' and 'percent of Pass'
Hi,
What issue you are getting while working with
Distinct
Or
Group By Clause.
See Enclosed File.
and check frequency of each value.
Regards,
The first 2 rows have the same value for Yearmonth and Percent ofPass
What's your required output? 1 rows? If yes how do you choose which value to keep for other fields?