Hi,
We have a large table (40M records) containing the usage log of an app, and we'd like to detect anomalies over short periods of time, such as a user making more than 30 queries in 1 minute.
Suppose the table contains these columns (among them query_id, username, and query_timestamp), and each row corresponds to one query against a service made by a user at a specific timestamp.
How can we achieve that?
I believe you are looking for something like the app metadata analyzer app:
https://adminplaybook.qlik-poc.com/docs/tooling/app_metadata_analyzer.html#where-to-get-it
Note: you may not have everything required, but it can help you get started.
What if you transform your timestamp into a minute-level timestamp?
Timestamp(Floor(query_timestamp, 1/(24*60))) as minute_level_timestamp
Then you just need to count the rows per user and minute_level_timestamp to get the queries per minute per user.
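Something like this in the load script, as a rough sketch (I'm assuming the log is already loaded into a table called AppLog with the fields username, query_id and query_timestamp; adjust the names to yours):

// Floor every query timestamp down to the minute it belongs to
QueriesPerMinuteBase:
LOAD
    username,
    query_id,
    Timestamp(Floor(query_timestamp, 1/(24*60))) as minute_level_timestamp
Resident AppLog;

// One row per user and minute; the preceding load keeps only the anomalies (> 30 queries)
Anomalies:
LOAD * Where queries_in_minute > 30;
LOAD
    username,
    minute_level_timestamp,
    Count(query_id) as queries_in_minute
Resident QueriesPerMinuteBase
Group By username, minute_level_timestamp;

// Drop the helper table to avoid a synthetic key with the original log table
DROP TABLE QueriesPerMinuteBase;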
Interesting... so a row still represents a query, but I would have repeated "minute timestamps" and usernames for some users. Thanks @Vegar, I'll give it a try.
Hello again,
I finished my anomaly detection page, but I'm not completely satisfied, so I want to share what I did. Please share your thoughts, and let me know if I'm doing something wrong or if it could have been done differently.
LOAD statement (partial):
Date(Floor(query_timestamp), 'DD/MM/YYYY') as FECHA_CONSULTA,
Timestamp(Floor(query_timestamp, 1/(24*60))) as FECHA_CONSULTA_MINUTO,
Timestamp(Floor(query_timestamp, 1/24)) as FECHA_CONSULTA_HORA,
Date(query_timestamp, 'DD/MM/YYYY HH:MM:SS') as FECHA_HORA_CONSULTA,
But when displaying these columns in a table, shouldn't the minute part of FECHA_CONSULTA_MINUTO (3rd column) reflect the same minute as FECHA_HORA_CONSULTA (the first column)?
For the calculations, I've created master measures like this:
Aggr(Count(query_id), FECHA_CONSULTA_MINUTO, username)
to get the number of queries per user per minute.
The result appears to be correct.
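If it helps anyone, the same Aggr can also be wrapped to surface only the anomalous users; two sketch expressions (assuming username is a dimension of the chart, and using the 30-query threshold from the original question):

// Peak number of queries the user made within any single minute
Max(Aggr(Count(query_id), FECHA_CONSULTA_MINUTO, username))

// Number of minutes in which the user exceeded 30 queries
Sum(Aggr(If(Count(query_id) > 30, 1, 0), FECHA_CONSULTA_MINUTO, username))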