Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all !!!
I need to calculate the cache hit ratio.
Cache hit ratio is calculated as:
number of searches that hit cache / total searches.
To do so i need to create a flag in the table with true and false indication.
The system cache first response per unique scenario for 30 minutes.
If a request is made for the same unique scenario within 30 minutes the system serves the results from cache.
My original table has these data:
Unique scenario(Key) | Request (timestamp) |
---|---|
1 | 2016-04-19 06:23:39 |
1 | 2016-04-19 06:23:39 |
1 | 2016-04-19 06:23:50 |
1 | 2016-04-19 06:24:20 |
1 | 2016-04-19 06:24:50 |
1 | 2016-04-19 06:35:19 |
1 | 2016-04-19 07:13:05 |
1 | 2016-04-19 08:59:39 |
2 | 2016-04-19 06:28:39 |
2 | 2016-04-19 06:33:50 |
2 | 2016-04-19 06:39:50 |
2 | 2016-04-19 07:33:50 |
Modified table should look like this:
Unique scenario(Key) | Request (timestamp) | Flag |
---|---|---|
1 | 2016-04-19 06:23:39 | False |
1 | 2016-04-19 06:23:39 | True |
1 | 2016-04-19 06:23:50 | True |
1 | 2016-04-19 06:24:20 | True |
1 | 2016-04-19 06:24:50 | True |
1 | 2016-04-19 06:35:19 | True |
1 | 2016-04-19 07:13:05 | False |
1 | 2016-04-19 08:59:39 | False |
2 | 2016-04-19 06:28:39 | False |
2 | 2016-04-19 06:33:50 | True |
2 | 2016-04-19 06:39:50 | True |
2 | 2016-04-19 07:33:50 | False |
I want to do this in script as my table consists of millions of records.
Can anyone help?
Theofilas,
I've read your requirements a bit different than Sunny:
The table is order by Unique Key asc, created_at asc.
The first row represents the first request. This gets a false value as it is not cached by the system.
The next 5 records have true value as they are within 30 minutes from the 1st record and are served from the system cache.
Then a new request is made (record 7) and is not served from the system cache (the time frame of 30 minutes has passed).
The same applies and for the 8th record. Thats why they will get false value.
If record 8 was within 30 minutes from record 7, then the flag should be true.
The same logic is repeated in the rest of the records but for different unique key.
If I understood correctly, the 30 minutes starts ticking from the time the cache entry is created and is not reset with every new request that can be answered from the cache.
Table:
LOAD [Unique scenario(Key)],
[Request (timestamp)]
FROM
[https://community.qlik.com/thread/214323]
(html, codepage is 1252, embedded labels, table is @1);
FinalTable:
LOAD *,
If([Unique scenario(Key)] = Previous([Unique scenario(Key)])
and ([Request (timestamp)] - Alt(Peek('CacheTime'),[Request (timestamp)])) <= MakeTime(0,30),
Peek('CacheTime') ,
[Request (timestamp)] ) as CacheTime,
If([Unique scenario(Key)] = Peek('Unique scenario(Key)'),
If([Request (timestamp)] - Peek('CacheTime') <= MakeTime(0, 30), 'True', 'False'),'False') as Flag
Resident Table
Order By [Unique scenario(Key)], [Request (timestamp)];
DROP Table Table;
You won't see a difference between my version and Sunny's version given your sample records, but you should see a difference e.g. when the requests come in every 29 minutes.
Not entirely sure how you are getting the output you specified, would you be able to elaborate?
Got it, may be this:
Table:
LOAD [Unique scenario(Key)],
[Request (timestamp)]
FROM
[https://community.qlik.com/thread/214323]
(html, codepage is 1252, embedded labels, table is @1);
FinalTable:
LOAD *,
If([Unique scenario(Key)] = Peek('Unique scenario(Key)'), If([Request (timestamp)] - Peek('Request (timestamp)') <= MakeTime(0, 30), 'True', 'False'), 'False') as Flag
Resident Table
Order By [Unique scenario(Key)], [Request (timestamp)];
DROP Table Table;
@Sunny T
The table is order by Unique Key asc, created_at asc.
The first row represents the first request. This gets a false value as it is not cached by the system.
The next 5 records have true value as they are within 30 minutes from the 1st record and are served from the system cache.
Then a new request is made (record 7) and is not served from the system cache (the time frame of 30 minutes has passed).
The same applies and for the 8th record. Thats why they will get false value.
If record 8 was within 30 minutes from record 7, then the flag should be true.
The same logic is repeated in the rest of the records but for different unique key.
Did you got time to look at the solution I proposed? It matches the output you were looking for.
Theofilas,
I've read your requirements a bit different than Sunny:
The table is order by Unique Key asc, created_at asc.
The first row represents the first request. This gets a false value as it is not cached by the system.
The next 5 records have true value as they are within 30 minutes from the 1st record and are served from the system cache.
Then a new request is made (record 7) and is not served from the system cache (the time frame of 30 minutes has passed).
The same applies and for the 8th record. Thats why they will get false value.
If record 8 was within 30 minutes from record 7, then the flag should be true.
The same logic is repeated in the rest of the records but for different unique key.
If I understood correctly, the 30 minutes starts ticking from the time the cache entry is created and is not reset with every new request that can be answered from the cache.
Table:
LOAD [Unique scenario(Key)],
[Request (timestamp)]
FROM
[https://community.qlik.com/thread/214323]
(html, codepage is 1252, embedded labels, table is @1);
FinalTable:
LOAD *,
If([Unique scenario(Key)] = Previous([Unique scenario(Key)])
and ([Request (timestamp)] - Alt(Peek('CacheTime'),[Request (timestamp)])) <= MakeTime(0,30),
Peek('CacheTime') ,
[Request (timestamp)] ) as CacheTime,
If([Unique scenario(Key)] = Peek('Unique scenario(Key)'),
If([Request (timestamp)] - Peek('CacheTime') <= MakeTime(0, 30), 'True', 'False'),'False') as Flag
Resident Table
Order By [Unique scenario(Key)], [Request (timestamp)];
DROP Table Table;
You won't see a difference between my version and Sunny's version given your sample records, but you should see a difference e.g. when the requests come in every 29 minutes.
I see what you mean Stefan. I think you are right, but can we simplify the code a little by using Left Join with Min value for timestamp?
Table:
LOAD [Unique scenario(Key)],
[Request (timestamp)]
FROM
[https://community.qlik.com/thread/214323]
(html, codepage is 1252, embedded labels, table is @1);
Join (Table)
LOAD [Unique scenario(Key)],
Min([Request (timestamp)]) as Min
Resident Table
Group By [Unique scenario(Key)];
FinalTable:
LOAD *,
If([Request (timestamp)] - Min > 0 and [Request (timestamp)] - Min < MakeTime(0, 30), 'True', 'False') as Flag
Resident Table
Order By [Unique scenario(Key)], [Request (timestamp)];
DROP Table Table;
I would like to thank you all (Sunny T and swuehl).
@ swuehl
The solution you provided is the correct and it works.
If you have the time, is it possible to explain what you have done?
Thanks again.
Sunny, I don't think that these scripts are equivalent.
Look at the second record, it is now flagged as False instead of True.
And all requests for a scenario that come in after 30 mins after the very first are now flagged as False, while I think a caching should happen instead again.
"If record 8 was within 30 minutes from record 7, then the flag should be true."
You should see the difference clearly by adding some more sample records.
Regards,
Stefan
I see what you mean. Everytime a request is made, the things will be in Cache for next 30 mins and then as soon as it passes the 30 minutes limit the flag becomes False and then for the next 30 it again become true??
Stefan, I am learning, I will be there one day (with your help)
Best,
Sunny