QlikView Expressor Tutorial Advanced: Using the Pivot Column Operator

    Originally published on 09-22-2011 01:55 PM

     

    The Pivot Column operator converts multiple records into a single record that contains selected fields from each of the multiple records.  That is, this operator performs a many-to-one transformation.  Using this operator requires no coding; all configuration is performed through a graphical interface.

    Let's suppose you have a file with monthly corporate sales data.  The first line is a header and the following lines are the sales data by month, by year, for selected companies.

    Company,Year,RevenueMonth,Revenue
    Telco1,2011,JanRev,1101
    Telco1,2011,FebRev,1102
    Telco1,2011,MarRev,1103
    Telco1,2011,AprRev,1104
    Telco1,2011,MayRev,1105
    Telco1,2011,JunRev,1106
    Telco1,2011,JulRev,1107
    Telco1,2011,AugRev,1108
    Telco1,2011,SepRev,1109
    Telco1,2011,OctRev,1110
    Telco1,2011,NovRev,1111
    Telco1,2011,DecRev,1112
    Telco2,2010,JanRev,1001
    Telco2,2010,FebRev,1002
    Telco2,2010,MarRev,1003
    Telco2,2010,AprRev,1004
    Telco2,2010,MayRev,1005
    Telco2,2010,JunRev,1006
    Telco2,2010,JulRev,1007
    Telco2,2010,AugRev,1008
    Telco2,2010,SepRev,1009
    Telco2,2010,OctRev,1010
    Telco2,2010,NovRev,1011
    Telco2,2010,DecRev,1012

    You want to pivot this data so that each row of an output file contains all of the data for a single year for a specific company.

    Company,Year,JanRev,FebRev,MarRev,AprRev,MayRev,JunRev,JulRev,AugRev,SepRev,OctRev,NovRev,DecRev
    Telco1,2011,1101,1102,1103,1104,1105,1106,1107,1108,1109,1110,1111,1112
    Telco2,2010,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012

    This is the type of operation the Pivot Column operator performs.

    Let's first implement this basic example.  The dataflow includes only three operators: Read File, Pivot Column, Write File.

    pivot_column1.png

    Each incoming record, that is, row from the source file, contains four fields while each outgoing record will contain 14 fields.  Although the Read File operator is configured to drop the header row, the default schema for the input file defines an attribute whose name is the same as the field header.  As with all operators in the Transformers grouping, you will work with the attribute names within the operator.

    To configure the Pivot Column operator, there are four tabs that require attention.

    1. Place the Read File operator onto the dataflow and set its properties.
    2. Place the Pivot Column operator onto the dataflow and connect to the Read File operator.
    3. Select the Pivot Column operator and click the Edit Pivot link in its Properties panel or the Edit Pivot button in the Operators grouping of the ribbon bar.  This opens the Pivot Column wizard.
    4. Tab 1, Build Output, is where you describe the structure of the record emitted by the operator.  Notice that the Input Attributes panel lists all of the attributes in the incoming record.
      pivot_column2.png
    5. In this example, the emitted record will include the attributes Company, Year, and 12 attributes, each representing a single month's sales.  To add Company and Year to the listing of output attributes, select thee attributes in the input attributes listing and click the Add button that is between the Input Attributes and Output Attributes panels.  The attribute names are transferred to the Output Attributes panel.
      pivot_column3.png
    6. Now you need to manually add an attribute for each month's revenue. Click the Add button that is above the Output Attributes panel to open the Add Attributes window.  Create the required 12 attributes.  In this example, all attribute types are strings, but in an actual use case you will probably have changed the input attributes that represent revenue to decimal types.
      pivot_column4b.png
      • Note the right-facing arrow before Company and Year and the diamond before the months.  These icons indicate that Company and Year will be directly initialized with values from the incoming record and that the months will be initialized with values selected by the pivot operation.
    7. Click Next or click on Tab 2 to move to the next tab, Specify Transfers.  Notice the arrow between the input attributes Company and Year and the identically named output attributes.  The operator can make this decision as the attribute names are identical. However, you could make an alternative assignment by clicking on the small triangle and selecting a different input attribute from the drop down list.
    8. Since none of the remaining input attributes are assigned directly to output attributes, move to Tab 3.
    9. On Tab 3, Specify Pivots, you define the pivot operation. Start by clicking the Add Pivot button, which places a pivot descriptor into the tab.
      pivot_column5.png
    10. Next, click Select..., and in the Select Attributes window, select which data from the incoming record you want emitted.  In this example, you want to emit data for each month, so select all the attributes.  Then click Select.
      pivot_column6a.png
    11. The Output attributes listing now includes the name of the pivot attribute.  In the Pivot from drop down under the Input attributes label, select RevenueMonth, the attribute in the incoming record that will identify the month.
    12. Then in the Value from drop down, select Revenue, the attribute in the incoming record that contains the monthly revenue value.
      pivot_column7a.png
    13. Move to the final tab, Edit Output, whihc presents a graphical representation of the pivot operation.  Use this information to confirm that the output is what you desire.
      pivot_column8a.png
    14. If desired, you can also change the names of the values under the pivot column (RevenueMonth).  Simply click on the appropriate cell and change.
    15. Click OK to complete the process and close the Pivot Column wizard.

    As a second example, let's use a more involved incoming record - a record that includes both revenue and expense data.  In this use case, you want to create multiple pivots, one for each month's expense and one for each month's revenue and then combine into a single record that includes all revenue and expense data for a single year of a specific company.

    Company,Year,RevenueMonth,Revenue,ExpenseMonth,Expense
    Telco1,2011,JanRev,1101,JanEx,2201
    Telco1,2011,FebRev,1102,FebEx,2202
    Telco1,2011,MarRev,1103,MarEx,2203
    Telco1,2011,AprRev,1104,AprEx,2204
    Telco1,2011,MayRev,1105,MayEx,2205
    Telco1,2011,JunRev,1106,JunEx,2206
    Telco1,2011,JulRev,1107,JulEx,2207
    Telco1,2011,AugRev,1108,AugEx,2208
    Telco1,2011,SepRev,1109,SepEx,2209
    Telco1,2011,OctRev,1110,OctEx,2210
    Telco1,2011,NovRev,1111,NovEx,2211
    Telco1,2011,DecRev,1112,DecEx,2212
    Telco2,2010,JanRev,1001,JanEx,2001
    Telco2,2010,FebRev,1002,FebEx,2002
    Telco2,2010,MarRev,1003,MarEx,2003
    Telco2,2010,AprRev,1004,AprEx,2004
    Telco2,2010,MayRev,1005,MayEx,2005
    Telco2,2010,JunRev,1006,JunEx,2006
    Telco2,2010,JulRev,1007,JulEx,2007
    Telco2,2010,AugRev,1008,AugEx,2008
    Telco2,2010,SepRev,1009,SepEx,2009
    Telco2,2010,OctRev,1010,OctEx,2010
    Telco2,2010,NovRev,1011,NovEx,2011
    Telco2,2010,DecRev,1012,DecEx,2012

    In this example, you will set up a pivot for the revenues and a second pivot for the expenses.  Each pivot must include the same number of attributes.  The following screen shots show the tabs in the Pivot Column wizard.

    pivot_column9.png

    pivot_column10.png

    pivot_column11.png

    pivot_column12.png

    Note how in Tab 3 the two pivots are configured.  One deals only with the data related to revenue and the other with the data related to expenses.  As shown in Tab 4, each output record will include 26 fields: Company, Year, and 12 fields for monthly revenue and 12 fields for monthly expenses.

    The outut file includes the following content.

    Company,Year,JanRev,FebRev,MarRev,AprRev,MayRev,JunRev,JulRev,AugRev,SepRev,OctRev,NovRev,DecRev,JanEx,FebEx,MarEx,AprEx,MayEx,JunEx,JulEx,AugEx,SepEx,OctEx,NovEx,DecEx
    Telco1,2011,1101,1102,1103,1104,1105,1106,1107,1108,1109,1110,1111,1112,2201,2202,2203,2204,2205,2206,2207,2208,2209,2210,2211,2212
    Telco2,2010,1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1011,1012,2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012