QlikView Expressor: Processing Large Amounts of Data with Join, Aggregate and Unique Operators

    The Join, Aggregate and Unique operators compare multiple records before determining the output flow. All three operators have a Method property for specifying where the records are held while being compared. For smaller amounts of data, processing can be performed efficiently by holding records In Memory or On Disk.

    With large amounts of data (100 MB or more), processing In Memory or On Disk can reduce performance significantly and can even cause the application to fail. The recommended practice is to use the Sorted Method instead of processing In Memory or On Disk. When the Method property is set to Sorted, the operators do not need to store records temporarily because the incoming records are already presorted on the key.
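    The difference between the two approaches can be illustrated outside Expressor. The following is a conceptual Python sketch, not Expressor code; the field names `key` and `amount` and both function names are hypothetical. It contrasts an aggregation that must keep a running total for every key until the input ends with one that relies on presorted input and only ever holds the current group.

```python
from itertools import groupby
from operator import itemgetter

def aggregate_in_memory(records):
    """Sum 'amount' per 'key'. A total for every key is held until the input ends."""
    totals = {}
    for rec in records:
        totals[rec["key"]] = totals.get(rec["key"], 0) + rec["amount"]
    return sorted(totals.items())

def aggregate_sorted(records):
    """Sum 'amount' per 'key', assuming the input is presorted by 'key'.

    Each total is emitted as soon as the key changes, so only the
    current group is held at any time.
    """
    for key, group in groupby(records, key=itemgetter("key")):
        yield key, sum(rec["amount"] for rec in group)

# Illustrative data, already sorted by 'key'.
records = [
    {"key": "A", "amount": 10},
    {"key": "A", "amount": 20},
    {"key": "B", "amount": 5},
]

print(aggregate_in_memory(records))
print(list(aggregate_sorted(records)))
```

    Both functions return the same totals for sorted input; the sorted version gives wrong results if the input is not actually sorted on the key, which is why the upstream sort matters.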

    To presort the input records for Join, Aggregate and Unique operators, a Sort operator must be placed upstream of those operators. The Sort operator must use the same key or keys for sorting that the downstream operator uses for its processing, and the sort order must be the same. For example, in the following dataflow, a Join operator, preceded by a Sort operator on each of its input ports, is followed by an Aggregate operator.

     

    608d1335533360-bpsortdf.png

    The Property panels for the Sort, Join and Aggregate operators show that the key used in all of them is the same: SalesOrderID.

    609d1335533402-bpsort1.png
    610d1335533434-bpsort2.png
    611d1335533483-bpjoin1.png
    612d1335533522-bpaggregate1.png
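
    The shape of this dataflow (two sorts feeding a join, followed by an aggregate, all keyed on SalesOrderID) can be imitated in Python as a rough sketch. This is not how Expressor implements its operators; the function names, the `Customer` and `LineTotal` fields, and the sample rows are assumptions made for illustration. Python's `sorted()` stands in for the Sort operator and does load its input into memory.

```python
from itertools import groupby
from operator import itemgetter

KEY = "SalesOrderID"

def sorted_join(left, right, key=KEY):
    """Inner join of two inputs that are both presorted ascending on `key`.

    Only the current group of right-side rows is held at any time.
    """
    by_key = itemgetter(key)
    right_groups = groupby(right, key=by_key)
    r_key, r_rows = None, []
    exhausted = False
    for l_key, l_rows in groupby(left, key=by_key):
        # Advance the right side until its key catches up with the left key.
        while not exhausted and (r_key is None or r_key < l_key):
            nxt = next(right_groups, None)
            if nxt is None:
                exhausted = True
            else:
                r_key, r_rows = nxt[0], list(nxt[1])
        if exhausted:
            break
        if r_key == l_key:
            for l_row in l_rows:
                for r_row in r_rows:
                    yield {**l_row, **r_row}

def sorted_total(rows, key=KEY, field="LineTotal"):
    """Sum `field` per `key`, assuming rows arrive sorted on `key`."""
    for k, group in groupby(rows, key=itemgetter(key)):
        yield k, sum(row[field] for row in group)

# Hypothetical unsorted inputs.
headers = [
    {"SalesOrderID": 3, "Customer": "C"},
    {"SalesOrderID": 1, "Customer": "A"},
    {"SalesOrderID": 2, "Customer": "B"},
]
details = [
    {"SalesOrderID": 2, "LineTotal": 5},
    {"SalesOrderID": 1, "LineTotal": 10},
    {"SalesOrderID": 1, "LineTotal": 7},
    {"SalesOrderID": 4, "LineTotal": 1},
]

# One sort per join input, on the same key and in the same order as the join.
sorted_headers = sorted(headers, key=itemgetter(KEY))
sorted_details = sorted(details, key=itemgetter(KEY))

joined = sorted_join(sorted_headers, sorted_details)
totals = list(sorted_total(joined))
print(totals)
```

    Because the join emits rows in key order, the aggregate step in this sketch can consume them directly without a further sort.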