How to integrate detailed and aggregated data sets - Qlik Community

Report Inappropriate Content · ‎2010-08-06

Hi guys

I'm having quite a challenge integrating a short term fully detailed data set with a long term aggregated data set.

They contain the same core data (Revenue/Cost), but the aggregated data set only contain the overall dimensions and can thus be compressed to 4% of the rows with fewer columns. The TransactionHead and TransactionDetails tables are thus merged into one aggregated table.

The purpose is to have 1 year of full detail (2009-08 to 2010-08, 180 million lines) and three years of aggregated data (2008-01 to 2010-08, 20 million lines), thus saving memory and increasing performance.

The two solutions I see both have pros and cons:

Link Table
Will require a large key with extremely many unique values.

Concatenate
Concatenating aggregated data onto TransactionDetails, will require the CaseNumber key to be added to aggregated data, thus four doubling the number of rows.

This must be a fairly common problem, how do you solve it?

Best regards
Frederik

Report Inappropriate Content · ‎2010-08-09

Still struggling with this issue, anyone got an idea?

Report Inappropriate Content · ‎2010-08-09

Hi,

What is highest level of granularity in TransationAggregated table.

Regards

Ashish Srivastava

Report Inappropriate Content · ‎2010-08-09

Hi Ashish

The TransactionAggregated table has been aggregated using GROUP BY on:

- Department

- Product

- Date (in full months)

- CustomerID (xCaseNumber should have been xCustomerID, sorry)

Revenue and Cost is summed up.

Report Inappropriate Content · ‎2010-08-09

I am not asking about for which filed you have to do group by.

I am asking about what is highest level of granularity level of your TrancationAggregated table.

Regards

Ashish Srivastava

Report Inappropriate Content · ‎2010-08-09

Hi Ashish

Assuming you mean which field has the highest number of unique values, then it is CustomerID with +250.000 unique customers.

If you mean which field has the highest extent of detail, as in Country -> Region -> City, then id say there's no field with the most details, as they they're all of a different kind. (not related directly like Country/Region/City)