13 Replies Latest reply: Jan 11, 2013 10:39 AM by John Lifter RSS

    Row merge with logic

      Hi,

       

      I would like to be able to transform some data which includes some aggregation and a group by.

       

      The data I have consists of 2 rows of data that have the same ID but some different attribute values. The main requirement I am trying to achieve is the ability to take say Column A's 2 rows and transpose them out into a single row, but with some logic applied. At the same time I would like to Group the two rows and Sum one of the other values.

       

      I have attached a simple example file that shows the starting data and the target with some notes.

       

      If anyone can help in pointing out the right artifacts to use and approaches that would be great.

       

      James

        • Re: Row merge with logic
          Mark McCoid

          Here is an example that will hopefully help out:

           

          Work:

          LOAD ID,

               Name,

               Type,

               SpotPrice,

               Currency,

               Term

          FROM

          C:\Hold\RowConsolidate.xlsx

          (ooxml, embedded labels, table is Sheet2);

           

          Tab1:

          LOAD ID,

               Name,

               IF(SpotPrice<0, 'Pay ' & Type, 'Received ' & Type) as NewType,

               SpotPrice,

               Currency,

               Term

          Resident Work

          WHERE Type = 'Fixed';

          JOIN

          LOAD ID,

               SpotPrice as RSpotPrice,

               Term as ReceivedTerm

          Resident Work

          WHERE Type = 'Floating';

           

          Final:

          LOAD

                    ID,

                    Name,

                    NewType,

                    SpotPrice + RSpotPrice as SpotPrice,

                    Term,

                    ReceivedTerm,

                    Currency

          Resident Tab1;

           

          Drop table Tab1;

          drop table Work;

           

          I always like to load the table into Qlikview first and then manipulate using QV functions.  After you load the whole table in, you can load your first set of rows, then the second set joining on ID.  Then one final pass through you can add the logic to get the price and then drop the work tables.

           

          I hope this helps.


          Mark

          • Re: Row merge with logic

            Just to confirm:

            1. Each swap will always be represented by two rows.
            2. One row in a swap will be Type Fixed and one row Type Floating
            3. One row of a swap will always have a negative SpotPrice while the other row in the swap will have a positive SpotPrice
            4. There are only two SwapType values: Pay Fixed, Receive Floating

             

            To implement this logic in QlikView Expressor you would most likely use a Function Rule in an Aggregate operator, supplying code in the prepare, aggregate, and result functions. 

            • In the prepare function, you would initialize an empty Datascript table
            • In the aggregate function, you would store each record from a swap into the Datascript table
            • In the result function, you would examine the two records saved in the Datascript table to determine the swap type based on whether Fixed or Floating contained the negative value.  Then, initialize the output values as appropriate.
              • Re: Row merge with logic

                Thanks John.

                 

                Your assumptions are correct, apart from 4, there can be alternative variations (Pay Floating,Rec Fixed) etc, but I dont think that poses to much trouble.

                 

                I think your approach sounds fine as the prepare/agg/result seems to solve the logic and doing it within the aggregate transform should do the math. However the row of data has about 100+ other attributes that I probably wont need to feed through the logic or aggregate (they may have to go through seperate logic) or they are the same for each row, if I feed these through a plain transform seperately I assume I will still end up with 2 rows for each swap but how will the output of the aggregate function be seen duplicated for each swap?

                 

                I'm going to try to build something now so I can see what the output is, I just need to have a read around this Datascript table piece, is this th normal way to hold a set of data to process?

                 

                 

                James

                  • Re: Row merge with logic

                    If there really are 100+ other fields, then you probably don't want to pass them into this Aggregate as it will really crowd the rule editor (although you could take this approach adding the values from the first or last record to the output record).  Your additional transformations could be performed in a Transform operator downstream of this Aggregate operator.

                     

                    Alternatively, you can block these attributes from entering this Aggregate in an upstream Transform.  

                     

                    If you pass these fields through another stream to carry out additional transformations, you can join them back to the aggregated record with a Join operator.  You need to decide if you only want one final record emitted for each swap.  If you do, then this second stream also goes into an Aggregate operator, but you can simply emit one record that contains the values derived from the first record (or last record).  Then, there will be only one record emitted from each Aggregate and only a single record emitted after you join their content together.

                     

                    If, on the other hand, you want an emitted record for both of the records documenting the swap, there would be no need for the second Aggregate operator.  The Join operator will emit two records for each swap.

                     

                    Using a table to store each record in an aggregation group is generally not something that is necessary within the Aggregate operator.  The concept behind this operator is that only the running calculation is maintained for a group, the values in each incoming record are unimportant after the math has been performed.  But in your use case, you need all the records in the group to decide on the Swap Type, so you need to save the records in the aggregate function and analyze them in the result function.  Since each swap only involves two records, this is not an issue (even if all 100+ fields were present).  If, on the other hand your groups contain a very large number of records, there is the possibility that sufficient memory will not exist to hold all the records.  In this case, you may need to drop some of the values from each record and only hold those you need for the follow-on analysis.

                      • Re: Row merge with logic

                        Hi John,

                         

                        I have attempted to build this and after reading around the datascript language I have put together a data flow, now below is the script from the aggregate artifact, it works (as in doesnt error) which is a good start, the output is quite what I was expecting, I managed to join it back to the rest of the data (without filtering so I still have two rows), but the valued that I expect to be summed seems to be the result of the same row * 2 as opposed to the sum of 2 rows.

                         

                        Does the script look ok, or have I missed something? I was trying to use a sum() function in the result but it kept throwing an error so I followed the example in the help file.

                         

                        Datascript_aggregate.jpg

                          • Re: Row merge with logic

                            You are close, but the code in the aggregate function is only saving the value from the second record in each group.

                             

                            In prepare, your code should be work={ }

                            Then in aggregate, you code should be work[#work+1] = input

                             

                            Then don't you need to compare the entries in the two records to determine the SwapType?  To extract a value use code similar to work[1].SwapType.

                              • Re: Row merge with logic

                                Again, many thanks John.

                                 

                                Now that I have put my coding hat back on I can see how this works, this great example of mixing Code with data.

                                 

                                So it's cracked, I managed to get the aggregate to output the values I was expecting and yes I see how you can just use the index pointer in the data table to get the values you after. I can see why the Sum would'nt help me, that's just me thinking of SQL again..

                                 

                                I have also added the downstream aggregator and added all the columns as the group by and I get the single row output I was after.

                                 

                                This is a great example for anyone interested the aggregate script is below which can be used to obtain specific row in a group and perform some aggregation, worth noting that you specify the Group By values in the artifacts properties in the dataflow window. 

                                 

                                 

                                aggregate_script_working.jpg

                                 

                                aggregate_script_working1.jpg

                                  • Re: Row merge with logic

                                    The code in your latest version of the result function is not as rich as the code in your prior version.  Are you sure you are doing all the analysis that you wanted?  Don't you need to compare the values in Type and SpotPrice from both records to determine the value to be assigned to SwapType?  Or can you make that decision based just on the values in the first record of each pairing?  I don't understand the logic behind setting the SwapType value to Pay Fixed, Receive Fixed, Pay Floating, and Receive Floating.  The logic of your decision function doesn't match the required result presented in the data file.

                                      • Re: Row merge with logic

                                        The result I currently get isn't exactly perfect, I was more interested in getting the whole aggregation and calculation working rather than the logic behind the Swap Type, in the previous script I had been playing with variations of the decision function to see results.

                                         

                                        Yes, I still need to do a check to decide the swap type, I will implement that shortly, the logic behind the Type selection hasn't changed , but the required result has, in all cases each record the Fixed part takes precendent.

                                         

                                        So every grouped Swap will be xxx Fixed , xxx being Pay or Receive depending on the sign of the price.

                                         

                                        I have implemented this using the code below, I've tested this and it works.

                                         

                                        output.SwapType = decision (

                                         

                                        work[1].Type == "Fixed" and work[1].SpotPrice <0,"Pay Fixed",

                                         

                                        work[1].Type == "Floating" and work[1].SpotPrice <0,"Receive Fixed",

                                         

                                        work[1].Type == "Fixed" and work[1].SpotPrice >0, "Receive Fixed",

                                         

                                        work[1].Type == "Floating" and work[1].SpotPrice >0, "Pay Fixed"

                                         

                                        )

                                          • Re: Row merge with logic

                                            OK, much clearer. 

                                             

                                            Nice use of the decision function.  As this code illustrates, the proper decision is made regardless of which of the two records per swap is stored in work[1].

                                             

                                            Thanks for this example.  It shows a nice usage of the Aggregate operator function rule.

                            • Re: Row merge with logic
                              khadar basha

                              Jamesjnr,

                               

                                  Please check the attachment,Please let me know.

                               

                               

                              Thanks

                              SP