5 Replies Latest reply: Nov 16, 2016 7:45 AM by Marcus Sommer RSS

    AGGR Function(Performance Issue)

    Tony Francis

      Hi,

      I have been using the AGGR function in qlik and had a question.

      The expression i am using is AGGR(Nodistinct Count(Distinct USERS_META_ID),WEEK_ID,SEGMENT_1,SEGMENT_2).

      So this expression creates a table in memory which has Count(Distinct USERS_META_ID) grouped at week,Segment_1 and Segment_2 level for all the 52 weeks and all the segments in data. The performance is poor.

      Now my question is even if i filter only for 2 weeks and few segments, the data will be grouped for all the weeks and segments and then filtered for the required week and segments. So is there a way in which in the aggr function we can only group by the selected elements of week and segment rather than all the elements. This will mean that the table created will only have few weeks instead of many and may improve performance.

       

      Thank You

        • Re: AGGR Function(Performance Issue)
          Jonathan Dienst

          My understanding is that the Aggr() dimensions are filtered to the possible values, but I don't know if that happens before or after building the Aggr() table.

           

          One would hope that it filters before building the Aggr(). I have not tested this rigorously, but my experience suggests that this is indeed the case.

           

          Perhaps an expert from Qlik would like to comment here.

            • Re: AGGR Function(Performance Issue)
              Tony Francis

              Thank You Jonathan for the quick reply, i have a feeling filtering happens after the aggr() table is created, as even when i select 1 week and 1 element from the segment columns the report takes some time to execute. As you mentioned it would be great if an expert could confirm the behavior.

                • Re: AGGR Function(Performance Issue)
                  Marcus Sommer

                  Are you sure that you need the NODISTINCT? Also if this aggr isn't a calculated dimension an outer aggregation like: sum(aggr(Function, Dim1, Dim2)) might be useful.

                   

                  Another important influence for the performance (of each calculation) is from where the used fields come from - come they from a single table the performance will be significantly better as if the fields comes from different tables maybe connected over several table-hops and/or link-tables.

                   

                  - Marcus