I have 40 million records and 5 million unique customer IDs. The Aggr below works correctly but is slow. What can I do to make it faster? I want to group customers into 3 parts according to how their volumes change between two dates: increase, decrease, no change.
If this is still too slow, you will need to consider changes to your data model to reduce the number of tables the involved fields belong to, and to simplify their associations (i.e. developing the data model in the direction of a star schema or a single big flat table).
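As a rough sketch of that flattening, the dimension fields that the expression needs could be joined into the fact table in the load script, so the chart no longer has to traverse associations at calculation time. All table, field, and file names below are hypothetical placeholders, not taken from your app:

```
// Hypothetical sketch: denormalize dimension fields into the fact table
Facts:
LOAD CustomerID, DateID, Volume
FROM Facts.qvd (qvd);

// Pull the needed customer attributes directly into Facts
Left Join (Facts)
LOAD CustomerID, Region, Segment
FROM Customers.qvd (qvd);

// Pull the needed calendar attributes directly into Facts
Left Join (Facts)
LOAD DateID, Month, Year
FROM Calendar.qvd (qvd);
```

The trade-off is a larger fact table in RAM, but the chart engine then builds its virtual table from a single table instead of resolving several associations.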
Usually the above-mentioned approach of removing redundant calculations of an expression improves performance (an if() always calculates all branches completely, and the true/false check of the branches is applied afterwards).
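To illustrate the idea of removing redundant calculations: a nested if() that repeats the same sums in every branch can often be replaced with a construct that evaluates the sums only once. The field names and set expressions below are hypothetical, only meant to show the shape:

```
// Slow variant: the two sum() terms are evaluated in every branch
if(sum({<Date={'$(vDate2)'}>} Volume) - sum({<Date={'$(vDate1)'}>} Volume) > 0, 'increase',
   if(sum({<Date={'$(vDate2)'}>} Volume) - sum({<Date={'$(vDate1)'}>} Volume) < 0, 'decrease',
      'no change'))

// Faster variant: the difference is computed once, sign() maps it to -1/0/1,
// and pick() selects the label (sign()+2 gives index 1, 2 or 3)
pick(sign(sum({<Date={'$(vDate2)'}>} Volume) - sum({<Date={'$(vDate1)'}>} Volume)) + 2,
     'decrease', 'no change', 'increase')
```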
Nevertheless, there might be options in your data model to optimize these expressions. If the expression uses fields from all 4 dimension tables and the fact table, the virtual table on which the calculation is performed might be quite heavy - and although aggregations like the sum() here are multi-threaded, the creation of the virtual table is a single-threaded process. More on this can be found here: Logical-Inference-and-Aggregations
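One way to avoid building that heavy virtual table at all is to classify each customer once in the load script, so the chart only has to group by a precomputed flag. This is a hedged sketch with hypothetical field names (CustomerID, Date, Volume) and assumes the two comparison dates are known at reload time:

```
// Hypothetical sketch: precompute the change category per customer at load time
ChangeFlags:
LOAD CustomerID,
     // sign() of the volume delta: -1 decrease, 0 no change, 1 increase
     pick(sign(Vol2 - Vol1) + 2, 'decrease', 'no change', 'increase') as VolumeChange;
LOAD CustomerID,
     sum(if(Date = '$(vDate1)', Volume)) as Vol1,
     sum(if(Date = '$(vDate2)', Volume)) as Vol2
RESIDENT Facts
GROUP BY CustomerID;
```

Afterwards a chart can simply use VolumeChange as a dimension with sum(Volume) as the measure, which removes the per-customer Aggr() from the front end entirely.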