Skip to main content
Announcements
Live today at 11 AM ET. Get your questions about Qlik Connect answered, or just listen in. SIGN UP NOW
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Sort, parse and identify

Hi all,

i have the following (simplified) data set:

LogInIDUserIDLogInDateLeftPostAnsweredPost
1AA2015-01-1010
2AA2015-01-1210
3BB2015-01-1201
4AA2015-01-1701
5BB2015-01-2211
6CC2015-01-2201

What i would like to do for every UserID:

  • Add a flag "HasAnswerAttachedToPost" = 1 to every record in my data source that tells me if a LogIn has either
    • a LeftPost = 1 AND an AnsweredPost=1
    • OR
    • a LogIn L1 that has a LeftPost=1 AND an AnsweredPost=0 that is followed by a LogIn L2 that has LeftPost=0 AND AnsweredPost=1 within a timeframe of max 10 days after L1

This would result in:

LogInIDUserIDLogInDateLeftPostAnsweredPostHasAnswerAttachedToPost
1AA2015-01-1010
2AA2015-01-1210
3BB2015-01-1201
4AA2015-01-17011
5BB2015-01-22111
6CC2015-01-2201

How can this be done in a smart way ?

Thanks!

1 Solution

Accepted Solutions
Gysbert_Wassenaar

Maybe like this:

Temp:

LOAD * INLINE [

    LogInID, UserID, LogInDate, LeftPost, AnsweredPost

    1, AA, 2015-01-10, 1, 0

    2, AA, 2015-01-12, 1, 0

    3, BB, 2015-01-12, 0, 1

    4, AA, 2015-01-17, 0, 1

    5, BB, 2015-01-22, 1, 1

    6, CC, 2015-01-22, 0, 1

];

NOCONCATENATE

Result:

LOAD *, if(AnsweredPost=1 and ( LeftPost=1 or (LeftPost=0 and LogInDate - LastLeftPost <=10)),1,0) as HasAnswerAttachedToPost;

LOAD *, if(LeftPost=1,LogInDate, if(UserID=previous(UserID), peek(LastLeftPost))) as LastLeftPost

RESIDENT Temp

ORDER BY UserID, LogInDate

;

DROP TABLE Temp;


talk is cheap, supply exceeds demand

View solution in original post

3 Replies
Gysbert_Wassenaar

Maybe like this:

Temp:

LOAD * INLINE [

    LogInID, UserID, LogInDate, LeftPost, AnsweredPost

    1, AA, 2015-01-10, 1, 0

    2, AA, 2015-01-12, 1, 0

    3, BB, 2015-01-12, 0, 1

    4, AA, 2015-01-17, 0, 1

    5, BB, 2015-01-22, 1, 1

    6, CC, 2015-01-22, 0, 1

];

NOCONCATENATE

Result:

LOAD *, if(AnsweredPost=1 and ( LeftPost=1 or (LeftPost=0 and LogInDate - LastLeftPost <=10)),1,0) as HasAnswerAttachedToPost;

LOAD *, if(LeftPost=1,LogInDate, if(UserID=previous(UserID), peek(LastLeftPost))) as LastLeftPost

RESIDENT Temp

ORDER BY UserID, LogInDate

;

DROP TABLE Temp;


talk is cheap, supply exceeds demand
Not applicable
Author

Here you go - Have a look at the following load script

USERDATA:
LOAD * INLINE [
LogInID, UserID, LogInDate, LeftPost, AnsweredPost
1, AA, 2015-01-10, 1, 0
2, AA, 2015-01-12, 1, 0
3, BB, 2015-01-12, 0, 1
4, AA, 2015-01-17, 0, 1
5, BB, 2015-01-22, 1, 1
6, CC, 2015-01-22, 0, 1
]
;

NoConcatenate
RESULT:
Load
LogInID,
UserID,
LogInDate,
LeftPost,
AnsweredPost,
if(AnsweredPost=1 and LeftPost=0,1) as HasAnsweredAttachedToPost
Resident USERDATA;

Drop Table
USERDATA;

Have Fun!!

Not applicable
Author

Thank you very much Mr. Wassenaar!

After reviewing your code and trying it on my "real" data, i stumbled over two more challanges that i am struggeling with. If you take a look at this (closer to the real) data sample:

LogInIDUserIDLogInDateLeftPostAnsweredPostHasUploadedContent
1AA2015-01-10100
2AA2015-01-12100
3BB2015-01-12100
4AA2015-01-17011
5AA2015-01-18011
6BB2015-01-12100
7BB2015-01-15011

What i would like to achieve is:

For every LogIn of a user for one day where the user has left a post, i would like to determine if this user has also answered a post and / or uploaded content either on the day of the post or within 10 days after the post. (if these login records have a LeftPost flag = 0 - in these cases i consider the original post(s) to have an answer attached and / or an upload attached.

Following the example above the my result data set would look like this:

LogInIDUserIDLogInDateLeftPostAnsweredPostHasAnswerAttachedToPostHasUploadedContentHasContentAttachedToPost
1AA2015-01-1010100
2AA2015-01-1010101
3BB2015-01-1210100
4AA2015-01-1701Null()0Null()
5AA2015-01-2101Null()1Null()
6BB2015-01-1210111
7BB2015-01-1501Null()1Null()

I guess, the difficulties i have in particular are:

- Set the two new fields HasAnswerAttachedToPost and HasContentAttachedToPost properly in case a user has two or more LogInRecords with LeftPost=1, but AnsweredPost=0 and HasUploadedContent=0 on the same day. In that case i need to consider all of these records when checking for following LogIns of this user for potential Answer and / or Upload attachments where LeftPost = 0.

Hope, my problem is clear & thanks a lot in advance for any help, hints or suggestions!