Skip to main content
Announcements
NEW: Seamless Public Data Sharing with Qlik's New Anonymous Access Capability: TELL ME MORE!
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Avoiding a gigantic intervalmatch

Hello,

I know there are a lot of interval topics already but haven't found a solution for my problem.

I have a few houndred thousand members. The members can move from different states, active, inactive and so on over time.

Everytime they get a starttime and an endtime.

So what I do is make an intervalmatch on starttime / endtime in order to get their state at a point in time.

The problem is this goes back about 8 years and doing an intervalmatch over eight years with about 300 000 members evertyime their status changes gives me too many rows. The server gives up at around 1,8 billion rows.

I need to see how many active members each month.

Ingoing and outgoing balance.

How many new members and how many lost members and so on.

If I limit the starttime to say 2008 it works but the problem then is that I don't get the members who have started before 2008 and who haven't changed their membership type/status so the count for active members and balance is incorrect.

Any help is appreciated. I'll see if I can set up an example with some data if it helps.



7 Replies
Not applicable
Author

Take a look at my previous post here http://community.qlik.com/forums/t/23510.aspx

I'm using this for apps where I used to load ~500 millions records for interval matching monthly.

Not applicable
Author

Hello,

Thanks for your answer.

Is it this solution in the discussion you're talking about?


<pre><pre>left join (Vacation)
load StartDate
,EndDate
,autonumber(StartDate & '-' & EndDate) as DateRangeId
resident Vacation;

VacationToDate:
load distinct
DateRangeId,
StartDate + IterNo() - 1 as Date
resident Vacation
while StartDate + IterNo() - 1 <= EndDate;

Not applicable
Author

Yes. It is. It does absolutely the same as intervalMatch, but it's not killing a memory during the process.

Not applicable
Author

Hello,

have been trying this and as you said, it's not killing memory and it seems like I could get the correct results.

However, the problem is the number of rows, the application is unmanageable, there is more data in the app than this.

Any other suggestions as how to both avoid intervalmatch and too many rows?

Not applicable
Author

This approach is doing absolutely the same what intervalMatch does. Check how the intervalMatch works. It's creating many rows for an interval. And you as an application architect have to decide to join it with existing table, or you can create a separate cross reference table for relationship.

Not applicable
Author

Thanks Nick, I understand that it's the same approach it's just that now I have a problem with too much data.

I'll see if a link table is viable, the problem there is that it's very slow when making choices in the application when there's that many rows.

Not applicable
Author

That's true. I had the same problem with my application (23Gb in memory, 4x6 core CPU, 500 millions records with intervals, and from 1 to 2 billions records in cross reference table). The performance for all data was very slow from 1 to 15-30 mins. And one smart guy pointed me on CPU usage. I had a bottleneck there. Some expressions are not multi threaded. I've fixed my expressions, replaced all my macros by actions. And now any of my queries are less than 30 sec.

My suggestion is try to remove all macros and check your CPU usage.