Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hoping this can be done...
trying to pre-aggregate a 175M+ fact table but we need a count distinct on the a dimension.
When we use COUNT(DISTINCT field) with a dimension from the raw table, its applying count distinct to the total as well.
If you put the dimension in the table with a the count distinct the table total will not match the sub-totals.
With the pre-aggregated table I'm using a sum instead of count. For the table chart, the total is correct adding up all the dimension sub-totals but a KPI would be incorrect because its including duplicates across dimensions.
Is there a way to pre-aggregate and have the distinct count correct ?
Hope I'm explaining this correctly. Help me Obi-wan kenobi.
I don't think this would be possible in concept, regardless of any software concerns - you have no way of knowing about these duplicates until filters have been made. Assuming that is the case, the only way to achieve what you're describing would be to pre-aggregate every possible combination of filters, which seems like it would cause worse performance than what you started with.
If you don't have a filtering issue, you could pre-aggregate separately both at the group by level, and at the totals level, which would prevent the duplication issue. You would need to adjust your measures to handle this, of course.
Not sure if you are talking about tables in the UI or tables in the datamodel.
If your issue is that in a table in the UI, if the expression is count(distinct Field) ventilated on a dimension, then the total is doing a global count(distinct Field), what you can do is to change the totals function to Sum in the UI.
What about using such approaches?
count(distinct Field1&Field2&Field3)
sum(aggr(count(distinct Field1), Field2, Field3))