Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in NYC Sept 4th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

Merge with multiple columns or single column data model?

Hi all,

I am looking to get some feedback on what is generally best practice. We have a basic data model called customer usage that is processed at the end of the week, saved to QVD and not re-processed again. In this table we taken all the different types of usage and created a simple 5 column table:

UserId

WeekStartDate

UsageTypeId

UsageUnit

On this table we do Sum or Count on UsageUnit using Set Analysis on the UsageType:

Sum({$<UsageTypeId={5}>}UsageUnit)

We don't do sum on all the usagetypes.

The overall document in particular is 1.6 GB with 400 million rows. As the data has grown we have been using the Document Analyzer to ensure we always get the best performance.

After trying a few things we found UserID and WeekStartDate being rather costly in table and state size due to the user has multiple entries per weekstartdate per userid.

One idea which is favorable is to group and merge this model into:

UserId

WeekStartDate

Usage1

Usage2

Usage3

Usage4

Usage5

In the real life situation the usage types would be named correctly but in this scenario we have over 30 different usages types. Using a simple range of 30 days the data sizes are:

> Original: 56 million rows. / 450mb table bytes. / Virtual memory 544 mb

> Merged: 6.3 million rows. / 246mb table bytes / Virtual memory 318 mb

Is there generally a good practice for something like this? The space saving and size does seem very appealing in the merged model but the null values in the merged model does make me shiver a little (not sure if thats a good or bad thing).

Love to hear your thoughts.

-Chris

0 Replies