
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Create session ID from data with timestamp
Hi everyone,
I'm working with the server Audit data, which shows user activity in my QV dashboards. Each action has its own row and a timestamp. It looks something like this:
Timestamp | Document | User | Message |
7/15/2015 11.19:52 | Analysis.qvw | John | Selected XYZ |
7/15/2015 11.21:46 | Analysis.qvw | John | Activated Sheet |
7/15/2015 11.21:52 | Analysis.qvw | John | Selected ABC |
7/15/2015 11.22:13 | Budget.qvw | John | Export to Excel |
7/15/2015 11.22:30 | Budget.qvw | John | asdasd |
7/15/2015 11.22:14 | Analysis.qvw | Mary | yj |
7/15/2015 11.22:25 | Analysis.qvw | Mary | ty |
7/15/2015 1.15:05 | Analysis.qvw | John | adfh |
7/15/2015 1.17:10 | Analysis.qvw | John | adfaer |
In the script, I want to use Timestamp and User to create a Session ID.
So John from 11.19:52 - 11.22:30 would be Session ID 1, (ignore him switching dashboards during session)
Mary is 11.22:14-11.22:25, 2
and then when John comes back later from 1.15:05-1.17:10 would be Session 3.
I think this can be done with a condition that says after 10 minutes from last user timestamp, it counts as a new session, so create a new session ID. But I don't know how to write this. Can anyone help me?
- « Previous Replies
-
- 1
- 2
- Next Replies »
Accepted Solutions


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Timestamp format is OK, but it's
ORDER BY User, Timestamp desc;
edit:
Attached the sample, now also with a resorted SessionID.


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Try
Set TimestampFormat = 'M/DD/YYYY hh.mm:ss';
INPUT:
LOAD Timestamp,
Document,
User,
Message
FROM
[https://community.qlik.com/thread/173771]
(html, codepage is 1252, embedded labels, table is @1);
RESULT:
LOAD *, if(peek(User) <> User or peek(Timestamp)-Timestamp > interval#('10','m'), rangesum(1, Peek(Session)), Peek(Session) ) as Session
RESIDENT INPUT
ORDER BY User, Timestamp desc;
DROP TABLE INPUT;


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Maybe this way worked or showed at least the direction:
Audit:
Load Timestamp, Document, User, Message From xyz;
left join
Load
Timestamp, User,
autonumber(min(Timestamp) & ' - ' & max(Timestamp) & ' - ' User) as SessionID
Resident Audit Group by Timestamp, User;
- Marcus

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hmm Swuehl this looked great but unfortunately gave me something like the following:
SessionID | Timestamp | User | Message |
7/15/2015 11.19:52 | John | Selected XYZ | |
7/15/2015 11.21:46 | John | Activated Sheet | |
7/15/2015 11.21:52 | John | Selected ABC | |
7/15/2015 11.22:13 | John | Export to Excel | |
1 | 7/15/2015 11.22:30 | John | asdasd |
7/15/2015 11.22:14 | Mary | yj | |
1 | 7/15/2015 11.22:25 | Mary | ty |
7/15/2015 1.15:05 | John | adfh | |
7/15/2015 1.15:10 | John | adfaer |

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Marcus,
Problem with this method is that it does not take into account that I want a new SessionID if more than 10 minutes interval from last user timestamp


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
How does you script look like? If you renamed Session to SessionID, you also need to rename the fields in the Peek() function...
Take care that the timestamp format code matches the format of timestamps of your input file.
If you run my exact script, using your sample data, it should show Session ids on all lines, don't you see?


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Maybe you could include something like the following to get the interval:
num(class(Timestamp, 1 / 24 / 6)
- Marcus

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Sorry, I changed TimestampFormat to input and made sure everything is Session. Now I get ID on every row.
However, the 10 minute interval rule does not seem to be working:
Session | Timestamp | User | Message |
1 | 7/15/2015 11.19:52 | John | Selected XYZ |
1 | 7/15/2015 11.21:46 | John | Activated Sheet |
1 | 7/15/2015 11.21:52 | John | Selected ABC |
1 | 7/15/2015 11.22:13 | John | Export to Excel |
1 | 7/15/2015 11.22:30 | John | asdasd |
2 | 7/15/2015 11.22:14 | Mary | yj |
2 | 7/15/2015 11.22:25 | Mary | ty |
1 | 7/15/2015 1.15:05 | John | adfh |
1 | 7/15/2015 1.15:10 | John | adfaer |
The last two rows should have ID 3.
I copied your script for it:
"
....
LOAD
if(peek(User)<> User or peek(Timestamp)-Timestamp > interval#('10','m'), rangesum(1, Peek(Session)), Peek(Session) ) as Session,
Timestamp,
..."


- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I believe your timestamp format is not parsed in correctly.
What is your input (default) format set in the script or OS?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I think you are right, I am messing up format somewhere.
Actually, my input is a text file. To prevent confusion, I am posting a sample QVW, with real data that I need to use. I just cleaned some sensitive material.
I think I have fixed all the formats, but I am still not getting the correct use of interval#.
Thank you in advance for looking at this!

- « Previous Replies
-
- 1
- 2
- Next Replies »