Not applicable

Performance issues with Intervalmatch

Hi,

I'm a QV novice so please be patient with me...

I'm building a solution for an energy company. I was asked to provide a report that gives an accurate count of active connections for each calendar day.

The interval table, named 'Link', contains the following fields:

%Connection_ID - Key to the Connection table
%SupplierRegister_ID - "PK" of the 'Link' table
StartSupplyDate_ID - Connection start date (integer form of a date, e.g. 20090728)
EndSupplyDate_ID - Connection end date (integer form of a date, e.g. 20090728)
%Customer_ID - Key to the Customer table

This table currently contains 1.6M records. It is connected to the master date table using IntervalMatch.

The master date table contains a row for each date since January 1st, 2008 (570 records).

Code:

SET NullValue = $(vDateKey);
NullAsValue EndSupplyDate_ID;

Link:
LOAD %Connection_ID,
     %SupplierRegister_ID,
     StartSupplyDate_ID,
     StartSupplyMessage_ID,
     EndSupplyDate_ID,
     EndSupplyMessage_ID,
     %Customer_ID,
     _Link_created_date
FROM [QVDs\LinkTable.qvd] (qvd);

Connection:
LOAD %Connection_ID,
     EAN,
     [Connection Product],
     [Connection Street],
     [Connection HouseNo],
     [Connection HouseNoExtra],
     [Connection Postcode],
     [Connection City]
FROM [QVDs\Connection.qvd] (qvd);

Customer:
LOAD %Customer_ID,
     [Customer Name],
     [Customer PhoneNo],
     [Customer Email],
     [Customer BirthDate],
     _Customer_created_date
FROM [QVDs\Customer.qvd] (qvd);

Date:
LOAD %Date_ID,
     Date,
     CY_Flag,
     Weekday,
     [Week Number],
     Year,
     Month,
     Day,
     Quarter,
     (Year * 10000) + (Month * 100) + Day AS DateKey
FROM [QVDs\DATE.qvd] (qvd);

LEFT JOIN (Link) INTERVALMATCH (DateKey, %Connection_ID)
LOAD StartSupplyDate_ID,
     EndSupplyDate_ID,
     %Connection_ID
RESIDENT Link;

Because of the sheer data volume, I keep getting the notorious VIRTUAL MEMORY error. After the join, the Link table exceeds 50M records, which is a major performance overhead. As this is part of a POC project, we would like to show that QV can handle large amounts of data without killing the server...

Maybe IntervalMatch is not the right approach? Is there another solution?

Help will be greatly appreciated!

Dror

6 Replies
Not applicable
Author

Hi,

I can see from the code that you are creating multiple records for each connection ID. For example, if a connection has a start date in Jan 2008 and an end date of today, it will create that many records for that single ID (570 records for the same connection, each with a distinct DateKey). My question is: do you really need this kind of data? If yes, what purpose does it serve?

Thanks

Amit

Not applicable
Author

Amit,

I'm aware that a single connection may appear once for each distinct DateKey within its supply span. The idea is to be able to provide analysis on a per-day basis. We could limit the granularity to weekly intervals, but assuming there are roughly 700,000 active connections at any given time and we limit to the current year, the result set would still be about 36.4M records (700,000 x 52 weeks). Still too much...

The purpose of this analysis is to show a breakdown of active connections across various period intervals, broken down by several dimensional properties (that's why I've used the link (star) approach).

Thanks

Dror

rwunderlich
Partner Ambassador/MVP

First, I hope you are trying this in 64-bit, not 32-bit.

Is the only purpose of the IntervalMatch to link %Connection_ID to the various periods? That is, to select a [Week Number] and show all the %Connection_ID values that were active that week? If so, can you use a simple IntervalMatch instead of the extended form? Meaning:

LEFT JOIN(Link) INTERVALMATCH (DateKey)
LOAD StartSupplyDate_ID, EndSupplyDate_ID RESIDENT Link;

I don't know if that will work out better. IntervalMatch does generate a lot of rows. Set Analysis is another approach you may consider.
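
For the Set Analysis route, a minimal sketch of a chart expression, without any IntervalMatch join at all (vSelectedDateKey is a hypothetical variable holding the integer date key of the day being analysed, e.g. 20090728):

```
// Count connections whose supply interval covers the selected day.
// StartSupplyDate_ID / EndSupplyDate_ID are the integer date fields
// from the Link table in the original post.
Count({<
    StartSupplyDate_ID = {"<=$(=vSelectedDateKey)"},
    EndSupplyDate_ID   = {">=$(=vSelectedDateKey)"}
>} DISTINCT %Connection_ID)
```

This keeps the Link table at its original 1.6M rows, at the cost of evaluating the set expression per chart.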

Any chance you can post a QVW with a small number of records? If you can upload a QVW, maybe we can make a better analysis of the problem. See http://community.qlik.com/wikis/qlikview-wiki/preparing-examples-for-upload-reduction-and-data-scram... if you have concerns about privacy.

-Rob

Not applicable
Author

Hello Dror,

Sorry, I accidentally hit the 'Suggest as Answer' button on your post.

I had a similar situation in one of my QV applications. We were using IntervalMatch, which created millions of records and began causing memory issues. Here is what we did in that case:

1) Capture one record per ID for each month from the start date through the end date. For the date key, we used the first day of the month.

2) If any attribute of the ID changes, capture that as a new record. The date key for this record can be the actual date the change happened.

Take the example of a stock 'XYZ' that analysts rated from Jan 08 till Jan 09. In the current scenario we create 365 records for this stock. However, if its rating changed only twice during that period, we can work with just 14 records (12 monthly records plus 2 records where the rating changed). Next, you create a date island table, not linked to any other table in the model, that carries all the date attributes. The chart where you want to show data across intervals should then compare its date key with the date from the island table.
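
As a sketch of step 1 in load script (assuming the Link table and field names from the original post; the date IDs are integers like 20090728, so they are parsed with Date# before the month arithmetic):

```
// One row per connection per month of its supply interval,
// instead of one row per day. IterNo() restarts at 1 for each
// source row and increments while the WHILE clause holds.
MonthlyLink:
LOAD %Connection_ID,
     Date(AddMonths(MonthStart(Date#(StartSupplyDate_ID, 'YYYYMMDD')),
          IterNo() - 1)) AS SnapshotMonth
RESIDENT Link
WHILE AddMonths(MonthStart(Date#(StartSupplyDate_ID, 'YYYYMMDD')),
      IterNo() - 1) <= Date#(EndSupplyDate_ID, 'YYYYMMDD');
```

A connection active for a full year then contributes 12 rows instead of roughly 365.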

Hope this might be helpful in your case....

Thanks

Amit

Not applicable
Author

Amit, Rob

Thanks for your help. It has certainly given me a lead...

I will try to post a scaled-down app later.

Dror

rajni_batra
Specialist

Hi Dror,

Did you find a solution to this? I am facing the same issue.

Thanks,

Rajni Batra