Real-time analysis of new incoming transactions wi... - Qlik Community

ChristofSchwarz · ‎2014-02-15

An opening statement first: QlikView's Direct Discovery was initally made as one possible way to address Big Data by connecting to its storages. The aspect of real-time analysis was not the initial driver, so some basic assumptions were made. This article is about taking the current Direct Discovery feature behind it's concept and hence is a bit unorthodox. As newer versions will come out (>=QV11.2 SR6) i will revisit this article.

Direct Discovery 2.0 (this is the redesigned Direct Discovery since QV 11.2 SR5 only!) is separating fields from an external SQL table into DIMENSION and MEASURE fields. The help file explains, that QlikView will read the distinct values of dimension fields only(!) at the execution of the reload, NOT later. This means, if new dimension members will arrive after the reload, those will not show up.

This is good for new fact rows to arrive for already known dimensions, like a known MachineID and some incoming machine information. But I like to achieve more than this with Direct Discovery:

Let's assume new orders are coming in every other minute and I like to see those new transactions in QV in real-time (whenever I write about "new" orders I refer to such orders which have been created in the sql source table after the last QV reload; those are not in the memory data model). Yes, I need the unique OrderID as an dimension to work with it. Impossible or not? Well, I wouldn't have started this article, if there was no possibility ;-). Here are a number of steps required to get this work, but finally it will fly!

═►download "Endresult_Presentation_5min.swf" for a Flash-Video showing the final result of this thread in action

Prerequisite: One prerequisite at this point is that the new OrderIDs must be able to be "guessed". This is the case, when the OrderID is a unique incremental number, so that you know, if the latest OrderID for now is 1024, the next one will be 1025 and so on. If the ID of a new Order cannot be "guessed", I am afraid, then this article cannot help (however, the unique ID could be a new ID column with an increment. It does not need to be the Order ID from the system itself ...)

This is my simple order table "SalesOrderTable" on a SQL Server. OrderID is an auto-incremental key column. The CreationDate is system-filled with the timestamp when the record was inserted. CustomerID, ProductID are foreign keys.

Command to create this table on SQL Server is

CREATE TABLE SalesOrderTable

(

OrderID bigint IDENTITY(1000,1) NOT NULL,

CreationDate datetime NOT NULL,

CustomerID int NOT NULL,

ProductID int NOT NULL,

OrderValue money NULL,

CONSTRAINT PK_SalesOrderTable PRIMARY KEY CLUSTERED

( OrderID ASC ) ON [PRIMARY]

)

ALTER TABLE SalesOrderTable

ADD CONSTRAINT DF_SalesOrderTable_CreationDate

DEFAULT (getdate()) FOR CreationDate

Step 1: Get a text message about new orders and list all orders

The approach seemed clear to me, since I just want to count the new orders, I will load all fields (even the OrderID) as MEASURE fields, but Direct Discovery requires at least one DIMENSION field. I created a View "vw_SalesOrderTable" to show all columns plus one calulated column called "AllRecords", which always shows 1.

CREATE VIEW vw_SalesOrderTable AS

SELECT 1 AS AllRecords, * FROM SalesOrderTable

Now I load that view with DIRECT SELECT command and my only dimension is the AllRecords column. By setting a filter "1" on that field later in the GUI, I will automatically refer to every order in the view.

The data model has two separated tables, the memory orders and the DirectTable orders.

The script sets the Direct Discovery cache to expire every second with that line:

SET DirectCacheSeconds = 1;

═►download "DD Show New Orders-0.qvw" for this status of the game

There are only two objects, which can deal with new orders like this: A textbox and a table-object. A chart-object would not!

I created a table object, which contains the colum "AllRecords"
I created a button with the label "Refresh" which toggles the selection of value "1" in field "AllRecords" (=sets and removes 1 from that field)
I created a text object which displays a message as soon as (after clicking Refresh) new orders where found. Formula is

=If(Count(sql.OrderID) > Count(mem.OrderID), Count(sql.OrderID) - Count(mem.OrderID) & ' new order(s) found since last reload')

Clearly, this was not good enough yet. I wanted to select with one click in a field the new orders or the orders found in memory.

Step 2: Get a dimension "IsLoadedInQV" as selection box

The idea is like follows:

After loading all Orders into memory I will write back the creation timestamp of the "youngest" order (the one with the highest OrderID) into a separate SQL table "Last_QV_Load"
The SQL view, which is subject to the Direct Query, will get another calculated column, which will print "Mem" in case the record creation timestamp is equal or lower than the saved timestamp from the table "Last_QV_Load", otherwise it will print "New". In fact, not QlikView can tell the "new" from the "mem" Orders, the view itself does!
During reload, I will already add the value "New" to a memory table in QV, although at that very moment, there is no "New" in the view yet. But there will be, as soon as more orders will be created after the reload (again: "guessing" a dimension value in advance)

In this step I need two new objects in SQL: A new table "Last_QV_Load" and a stored procedure "sp_Set_QV_LoadTime":

CREATE TABLE Last_QV_Load

(

TableName nvarchar(50) NOT NULL,

Last_Loaded_Record datetime NOT NULL,

QV_Load_Time datetime NOT NULL,

CONSTRAINT PK_Last_QV_Load PRIMARY KEY CLUSTERED

( TableName ASC ) ON [PRIMARY]

)

CREATE PROCEDURE sp_Set_QV_LoadTime

@TableName NVARCHAR(50)

,@Cutofftime DATETIME = NULL

AS

BEGIN

DELETE FROM Last_QV_Load WHERE TableName = @TableName;

INSERT INTO Last_QV_Load

SELECT @TableName, @Cutofftime, GetDate();

END

I need to alter the view "vw_SalesOrderTable" as well and add the new column:

CREATE VIEW vw_SalesOrderTable AS

SELECT

1 AS AllRecords

,CASE WHEN o.CreationDate <= l.Last_Loaded_Record THEN 'Mem' ELSE 'New' END AS IsLoadedInQV

,o.*

FROM

SalesOrderTable AS o

LEFT OUTER JOIN

Last_QV_Load AS l

ON l.TableName = 'SalesOrderTable'

The load script is creating a data model as follows. If you like to see the Load Script,

═►download "DD Show New Orders-1.qvw"

My frontend has now a new field "%IsLoadedInQV" with two values "Mem" and "New" which I can click to see the right Orders in the table object. But please note: You could not use a chart object to display orders here, because there is no dimensionality on the OrderID (this is a measure right now) ...!

Step 3: Dynamically add new OrderIDs to a listbox and create a chart object

The final steps are to link the two data islands via the common OrderID and to "guess" some future OrderIDs already in the script. As written above already, it is necessary to do that in the script at load-time. The ideal situation is an auto-incremental OrderID field, so you will easily know the number range of the next IDs.

Yes, Direct Discovery will be able to link the next new orders to the in-memory model as they come. BUT ... there will only one 'disturbance' in the scene: The QlikView native caching of in-memory queries. We turned off the SQL cache, yes, but the new OrderIDs (the ones arriving after the reload) are in memory, not on SQL. And the memory cache can NOT be set to expire! It can only be invalidated by creating new and new artificial selection combinations, so that QlikView never can use that cache (this is the job of table "InvalidateCache" in the below model). For the current Service Release (11.2 SR5), this is my "clean" workaround (not mentioning dirty ways of disabling the cache with specific internal settings)

The field InvalidateCacheToken contains for each user some hundred of "tokens". Each time, the user clicks on "Refresh" button in the frontend, the next token will be selected and QlikView will (re)start both, the in-memory and the SQL queries to find out about the new situation.

By the way, I don't like to show the field "%OrderID" in a listbox in the frontend, because this will show not only existing OrderIDs but also the next 1000 (or so) "guessed" OrderIDs. I only want to see such %OrderIDs in my listbox which are at the very point in time already existing in the SQL table (before or since the reload). The formula for a selection box to do so is this:

=Aggr(Min(sql.OrderID), %OrderID)

download "DD Show New Orders-Final.qvw" to see the load script and frontend

═►download "Endresult_Presentation_5min.swf" for a Flash-Video showing the final result of this article in action

Oh, almost forgot to mention, that the view "vw_SalesOrderTable" got one more column, a copy of the OrderID called "OrderID_Dim". OrderID is loaded twice in the data model, as a MEASURE FIELD (as in previous steps) but also as a DIMENSION field in the DIRECT SELECT statement. Therefore, the view shows the OrderID twice, as "OrderID" and as "OrderID_Dim".

CREATE VIEW vw_SalesOrderTable AS
SELECT
    1 AS AllRecords
    ,CASE WHEN o.CreationDate <= l.Last_Loaded_Record THEN 'Mem' ELSE 'New' END AS IsLoadedInQV
    ,o.*
    ,o.OrderID AS OrderID_Dim
FROM
    SalesOrderTable AS o
LEFT OUTER JOIN
    Last_QV_Load AS l
    ON l.TableName = 'SalesOrderTable'

Report Inappropriate Content · ‎2014-02-17

Thanks for the documentation.

Zeljko

Report Inappropriate Content · ‎2014-04-02

Cool!

Nice work!

But I got some strange behaivor...

%IsLoadedInQV in the bottom right does not quite work.

First it show New after a total reload,

After the 4:th record I get " 666 ###" as value

After a few more records I get Allocated Memory Exceeded

but putting the %IsLoadedInQV inside a min() seems to do the trick...

It could of course be a SR6 Beta issue...

BR

Hans

Report Inappropriate Content · ‎2014-04-02

So, now I also tested i SR5 and it works fine there.

Some thing happend between SR5 and SR6 Beta...

BR

Hans

ChristofSchwarz · ‎2014-04-04

Thank you for letting me know. I will have a look into this and report a bug.

SR6, however, replaces some workarounds of SR5 already. I will have to rewrite the article to match the SR6 behavior anyways ...

Report Inappropriate Content · ‎2014-04-10

You are welcome!

Please feel free to elaborate little about the workaround/replacements.. 🙂

BR

Hans

Report Inappropriate Content · ‎2014-06-11

Wow.. nice sharing, thank you very much.

I have questions:

Can the DD qvw display automatically refresh without user clicking?

so that we can built an online dashboard to display on big lcd screen.

Currently we already use qvd load with 'Enable Push from Server', and display using IE Plugin.

Moveover, can we combine historical qvd data with online DD load data to display in one qvw ?

Regards,

Alex

Real-time analysis of new incoming transactions with DD