Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
I am loading data where each row in the table indicates a login or logout for each user. The table looks like this:
username, action, date
timothy, login, 1/5/2014
timothy, logout, 4/5/2014
ben, login, 2/5/2014
ben, logout, 8/5/2014
I want to generate new rows in this table between a login and logout which will have the missing dates. So the final table in load script should look like this:
username, action, date
timothy, login, 1/5/2014
timothy, login, 2/5/2014
timothy, login, 3/5/2014
timothy, logout, 4/5/2014
ben, login, 2/5/2014
ben, login, 3/5/2014
ben, login, 4/5/2014
ben, login, 5/5/2014
ben, login, 6/5/2014
ben, login, 7/5/2014
ben, logout, 8/5/2014
I want to generate the underlined rows using a for or while loop.
What I've attempted so far is to load the original table, and then load a second table like below:
LOAD
upper(USER_NAME) as USER_NAME,
DATE-peek(DATE,-1,USLogin) as datecount
Resident USLogin
While IterNo() ? <--getting stuck here
But that's where I'm getting stuck. Essentially, for each username, I want to increment the date of a login until it reaches the date of a logout, and generate rows in between a login and logout with these incremented dates.
How can this be achieved? Thanks.
Hi,
I modified the script
Now only login actions are repeating.
PFA
Regards,
Sergey
Input:
Load * Inline [
username, action, datetimothy, login, 1/5/2014
timothy, logout, 4/5/2014
ben, login, 2/5/2014
ben, logout, 8/5/2014
];Join
Temp:
Load
username,
Max(date) as MaxDate
Resident Input Group By username;Final:
Load
username,
If(date+IterNo()-1=MaxDate, action, 'Login') as action,
date(date+IterNo()-1) as date
Resident Input while date+IterNo()-1<=MaxDate and action <>'logout'; // correctedDrop table Input;
Small correction to Tresesco's code which generates extra row on max date with login as action
Input:
Load * Inline [
username, action, date
timothy, login, 1/5/2014
timothy, logout, 4/5/2014
ben, login, 2/5/2014
ben, logout, 8/5/2014
];
Join
Temp:
Load
username,
Max(date) as MaxDate
Resident Input Group By username;
Final:
Load
username,
If(date+IterNo()-1=MaxDate, action, 'Login') as action,
date(date+IterNo()-1) as date
Resident Input while date+IterNo()-1<MaxDate Or (IterNo() = 1 And date=MaxDate);
Drop table Input;
Good catch anbu. The easier correction(as I did) would be : and action <>'logout' to avoid the max date load from resident table.
Hi
Did you try to add several intervals?
For example
timothy, login, 3/4/2014
timothy, logout, 4/4/2014
ben, login, 2/4/2014
ben, logout, 8/4/2014
timothy, login, 1/5/2014
timothy, logout, 4/5/2014
ben, login, 2/5/2014
ben, logout, 8/5/2014
As I see MaxDate will not work.
So. I tried to solve this situation with intervalmatch.
PFA
Sergey, You attached the wrong app
sorry
PFA
Hi Sergey, many thanks for your program. This is almost working - however, it's generating rows for the days that each user is not logged in, which is undesirable because the data has 1000+ users. How would you modify it such that it only generates rows for when the user is logged in? Thanks.
When it's loading, it works fine if I load just 5 or 6 users, however, loading hundreds of users causes 10 million+ lines to be generated and the load doesn't stop. Might there be something wrong with the joins?
Hi,
I modified the script
Now only login actions are repeating.
PFA
Regards,
Sergey