Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us to spark ideas for how to put the latest capabilities into action. Register here!
cancel
Showing results for 
Search instead for 
Did you mean: 
UserID95139
Contributor
Contributor

Remove certain rows from straight table

Hi,

I would really appreciate your help on following task below. 

I have straight table with ca 100k rows and 3 columns – ClientID, EMAIL and DATE. A single client may be associated with several different emails and different clients may have the same email.

Sample dataset (ordered by descending date):

ClientIDEMAILDATE
111aaaa@gmail.com01.01.2020
333aaaa@gmail.com31.12.2019
111bbb@gmail.com06.12.2019
222aaaa@gmail.com15.10.2019
222tt@gmail.com15.07.2019
333u@gmail.com26.05.2019
444bbb@gmail.com01.01.2018


The goal would be to acquire rows from the table in following conditions:
* Every ClientID and every EMAIL is taken only once

* The maximum number of clients and emails have been covered
* By DATE, the newest should be preferred

It should be done in straight table, not in load script.

Expected result:

ClientIDEMAILDATE
111aaaa@gmail.com01.01.2020
222tt@gmail.com15.07.2019
333u@gmail.com26.05.2019
444bbb@gmail.com01.01.2018

 

9 Replies
Kushal_Chawda

you can do something like below

 

In script,

Data:
Load Client,
     Firstsortedvalue(Email,-Date) as Email
     Firstsortedvalue(Date,-Date) as Date
FROM Table
Group By CLient;

or you can use the below expression in straight table

Email -Firstsortedvalue(Email,-Date)
Date - Firstsortedvalue(Date,-Date)

 

 

UserID95139
Contributor
Contributor
Author

Unfortunately this solution does not work… or I don't quite understand. Could you please explain what exactly are the dimensions and what are the expressions in your solution?

load * inline [
ClientID, EMAIL, DATE
111, aaaa@gmail.com, 01.01.2020
333, aaaa@gmail.com, 31.12.2019
111, bbb@gmail.com, 06.12.2019
222, aaaa@gmail.com, 15.10.2019
222, tt@gmail.com, 15.07.2019
333, u@gmail.com, 26.05.2019
444, bbb@gmail.com, 01.01.2018
];




<-remove, EMAIL is taken above
<-remove, ClientID is taken above
<-remove, EMAIL is taken above




thannila
Creator
Creator

Test:
Load * Inline
[
ClientID,EMAIL,DATE1
111,aaaa@gmail.com,01.01.2020
333,aaaa@gmail.com,31.12.2019
111,bbb@gmail.com,06.12.2019
222,aaaa@gmail.com,15.10.2019
222,tt@gmail.com,15.07.2019
333,u@gmail.com,26.05.2019
444,bbb@gmail.com,01.01.2018
];

NoConcatenate
Test1:
Load
ClientID,EMAIL,DATE1
resident Test
where not EMAIL=Peek(EMAIL) and not ClientID=Peek(ClientID) order by ClientID;
drop table Test;

UserID95139
Contributor
Contributor
Author

Unfortunately that does not work out that way - email aaaa@gmail.com is present twice.

Also, I would really like to do it in straight table instead of load script. 

Kashyap_R
Partner - Specialist
Partner - Specialist

Hi 

The date format is not correct change the format using Date# and Date to get the correct format and try the logic what @Kushal_Chawda  has suggested. It is working for me.

Hope this helps 

Thanks

Thanks and Regards
Kashyap.R
Kushal_Chawda

set your date format in script (In main tab) as per the format of Date in Data

SET DateFormat='DD.MM.YYYY';

then create a chart

Dimension:

Client

Expression:

For Email : FirstSortedValue(distinct Email, -Date) 

For Date : FirstSortedValue(distinct Date, -Date) 

UserID95139
Contributor
Contributor
Author

Thank you for explanation - obviously I'm doing something wrong, as it does not work out for me. I would kindly appreciate your comments for attached .qvw

Kushal_Chawda

I  don't have access to QlikView but I converted your qvw to qvf file and below expressions worked for me

I have created straight table with Dimension ClientID

=FirstSortedValue(distinct EMAIL, -DATE)

=FirstSortedValue(distinct DATE, -DATE)

Note : Make sure that you are passing correct field name in expressions

 

UserID95139
Contributor
Contributor
Author

I tried and in that case, emails are not distinct (attached)

Any ideas? 🙂