Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I would really appreciate your help on following task below.
I have straight table with ca 100k rows and 3 columns – ClientID, EMAIL and DATE. A single client may be associated with several different emails and different clients may have the same email.
Sample dataset (ordered by descending date):
ClientID | DATE | |
111 | aaaa@gmail.com | 01.01.2020 |
333 | aaaa@gmail.com | 31.12.2019 |
111 | bbb@gmail.com | 06.12.2019 |
222 | aaaa@gmail.com | 15.10.2019 |
222 | tt@gmail.com | 15.07.2019 |
333 | u@gmail.com | 26.05.2019 |
444 | bbb@gmail.com | 01.01.2018 |
The goal would be to acquire rows from the table in following conditions:
* Every ClientID and every EMAIL is taken only once
* The maximum number of clients and emails have been covered
* By DATE, the newest should be preferred
It should be done in straight table, not in load script.
Expected result:
ClientID | DATE | |
111 | aaaa@gmail.com | 01.01.2020 |
222 | tt@gmail.com | 15.07.2019 |
333 | u@gmail.com | 26.05.2019 |
444 | bbb@gmail.com | 01.01.2018 |
you can do something like below
In script,
Data:
Load Client,
Firstsortedvalue(Email,-Date) as Email
Firstsortedvalue(Date,-Date) as Date
FROM Table
Group By CLient;
or you can use the below expression in straight table
Email -Firstsortedvalue(Email,-Date)
Date - Firstsortedvalue(Date,-Date)
Unfortunately this solution does not work… or I don't quite understand. Could you please explain what exactly are the dimensions and what are the expressions in your solution?
load * inline [ |
|
Test:
Load * Inline
[
ClientID,EMAIL,DATE1
111,aaaa@gmail.com,01.01.2020
333,aaaa@gmail.com,31.12.2019
111,bbb@gmail.com,06.12.2019
222,aaaa@gmail.com,15.10.2019
222,tt@gmail.com,15.07.2019
333,u@gmail.com,26.05.2019
444,bbb@gmail.com,01.01.2018
];
NoConcatenate
Test1:
Load
ClientID,EMAIL,DATE1
resident Test
where not EMAIL=Peek(EMAIL) and not ClientID=Peek(ClientID) order by ClientID;
drop table Test;
Unfortunately that does not work out that way - email aaaa@gmail.com is present twice.
Also, I would really like to do it in straight table instead of load script.
Hi
The date format is not correct change the format using Date# and Date to get the correct format and try the logic what @Kushal_Chawda has suggested. It is working for me.
Hope this helps
Thanks
set your date format in script (In main tab) as per the format of Date in Data
SET DateFormat='DD.MM.YYYY';
then create a chart
Dimension:
Client
Expression:
For Email : FirstSortedValue(distinct Email, -Date)
For Date : FirstSortedValue(distinct Date, -Date)
Thank you for explanation - obviously I'm doing something wrong, as it does not work out for me. I would kindly appreciate your comments for attached .qvw
I don't have access to QlikView but I converted your qvw to qvf file and below expressions worked for me
I have created straight table with Dimension ClientID
=FirstSortedValue(distinct EMAIL, -DATE)
=FirstSortedValue(distinct DATE, -DATE)
Note : Make sure that you are passing correct field name in expressions
I tried and in that case, emails are not distinct (attached)
Any ideas? 🙂