Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in NYC Sept 4th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Not applicable

How to integrate detailed and aggregated data sets

Hi guys

I'm having quite a challenge integrating a short term fully detailed data set with a long term aggregated data set.

They contain the same core data (Revenue/Cost), but the aggregated data set only contain the overall dimensions and can thus be compressed to 4% of the rows with fewer columns. The TransactionHead and TransactionDetails tables are thus merged into one aggregated table.

error loading image

The purpose is to have 1 year of full detail (2009-08 to 2010-08, 180 million lines) and three years of aggregated data (2008-01 to 2010-08, 20 million lines), thus saving memory and increasing performance.

The two solutions I see both have pros and cons:

Link Table
Will require a large key with extremely many unique values.

Concatenate
Concatenating aggregated data onto TransactionDetails, will require the CaseNumber key to be added to aggregated data, thus four doubling the number of rows.


This must be a fairly common problem, how do you solve it?


Best regards
Frederik

5 Replies
Not applicable
Author

Still struggling with this issue, anyone got an idea?

Not applicable
Author

Hi,

What is highest level of granularity in TransationAggregated table.

Regards

Ashish Srivastava

Not applicable
Author

Hi Ashish

The TransactionAggregated table has been aggregated using GROUP BY on:

- Department

- Product

- Date (in full months)

- CustomerID (xCaseNumber should have been xCustomerID, sorry)

Revenue and Cost is summed up.

Not applicable
Author

I am not asking about for which filed you have to do group by.

I am asking about what is highest level of granularity level of your TrancationAggregated table.

Regards

Ashish Srivastava

Not applicable
Author

Hi Ashish

Assuming you mean which field has the highest number of unique values, then it is CustomerID with +250.000 unique customers.

If you mean which field has the highest extent of detail, as in Country -> Region -> City, then id say there's no field with the most details, as they they're all of a different kind. (not related directly like Country/Region/City)