Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
i want to load tables, but without duplicate records. The value in column Product + Customer must be unique. It should be load always the record with the highes SaleNumber.
How can i do it?
Example - the red marked records shouldn´t load:
Regards,
sam
You can do it in a single LOAD:
LOAD
Customer,
Product,
Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
Max(SaleNumber) AS SaleNumber
FROM
[lib://DATA/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
GROUP BY
Product, Customer
;
It does not depend on a certain sort order from the source data.
May be this:
Table:
LOAD Date,
Product,
SaleNumber,
Customer
FROM
[..\..\Downloads\mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1);
Right Join (Table)
LOAD Product,
Customer,
Max(SaleNumber) as SaleNumber
Resident Table
Group By Product, Customer;
Hi,
i will try it, thanks.
Is it possible to do that by only 1 load command, directly in the 1st step? Because i have a csv file and its very big (about 500 MB). I want to load less data from this csv. In your solution i would load all data.
Is it correct, that if i load this csv without duplicates like the description in my first thread here. that Qlik Sense need less time to load? Need less memory?
Regards,
sam
If the data is sorted in this manner
1st by Product,
2nd by Customer
3rd by SaleNumber descending order
then you can do it using a single load... else I don't think you will be able to do this another way
You can do it in a single LOAD:
LOAD
Customer,
Product,
Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
Max(SaleNumber) AS SaleNumber
FROM
[lib://DATA/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
GROUP BY
Product, Customer
;
It does not depend on a certain sort order from the source data.
Totally forgot about this
Wasn't sure myself so I had to try it out ...
Thanks a lot!
Why do you do this? Is it important?
Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
Because, i try it without this line...
LOAD
Product,
Customer,
//Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
Max(SaleNumber) AS SaleNumber
FROM [lib://concatenate/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
Group By Product, Customer
;
... and get the correct result. But perhaps it was a coincidence?!
If i try this code, too:
LOAD
Product,
Customer,
//Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
//Max(SaleNumber) AS SaleNumber
SaleNumber
FROM [lib://concatenate/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
Group By Product, Customer
;
and get an error:
But don´t really understand why.
Regards,
sam
You only need Group By clause when you are aggregating....
LOAD
Product,
Customer,
//Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
//Max(SaleNumber) AS SaleNumber
SaleNumber
FROM [lib://concatenate/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
Group By Product, Customer;
Since you commented out both your aggregations, you don't need Group By any more
LOAD
Product,
Customer,
//Date(FirstSortedValue( Date, -SaleNumber)) AS Date,
//Max(SaleNumber) AS SaleNumber
SaleNumber
FROM [lib://concatenate/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1);
Sorry, but i got the next problem and question.
Now i try to create some fields in the load script and use "Group by" for these fields and get an error.
LOAD
Date,
Product,
Customer,
Max(SaleNumber) AS SaleNumber,
month (Date) as sMonth,
SubstringCount(Customer, ' ')+1 as countProductWords
FROM [lib://concatenate/mehrere-spalten-vergleichen.xlsx]
(ooxml, embedded labels, table is Tabelle1)
Group By sMonth, countProductWords;
Is it possible to fix it?
regards,
sam