Not applicable

Row merge with logic

Hi,

I would like to be able to transform some data which includes some aggregation and a group by.

The data I have consists of 2 rows that have the same ID but some different attribute values. The main requirement I am trying to achieve is the ability to take, say, Column A's 2 rows and transpose them into a single row, with some logic applied. At the same time I would like to group the two rows and sum one of the other values.

I have attached a simple example file that shows the starting data and the target with some notes.

If anyone can help in pointing out the right artifacts to use and approaches that would be great.

James


13 Replies
markmccoid
Partner - Creator II

Here is an example that will hopefully help out:

Work:
LOAD ID,
     Name,
     Type,
     SpotPrice,
     Currency,
     Term
FROM
C:\Hold\RowConsolidate.xlsx
(ooxml, embedded labels, table is Sheet2);

Tab1:
LOAD ID,
     Name,
     IF(SpotPrice<0, 'Pay ' & Type, 'Received ' & Type) as NewType,
     SpotPrice,
     Currency,
     Term
Resident Work
WHERE Type = 'Fixed';

JOIN
LOAD ID,
     SpotPrice as RSpotPrice,
     Term as ReceivedTerm
Resident Work
WHERE Type = 'Floating';

Final:
LOAD ID,
     Name,
     NewType,
     SpotPrice + RSpotPrice as SpotPrice,
     Term,
     ReceivedTerm,
     Currency
Resident Tab1;

Drop table Tab1;
Drop table Work;

I always like to load the table into QlikView first and then manipulate it using QV functions.  After you load the whole table in, you can load your first set of rows, then the second set, joining on ID.  Then, in one final pass, you can add the logic to get the price and drop the work tables.

I hope this helps.


Mark

Not applicable
Author

Just to confirm:

  1. Each swap will always be represented by two rows.
  2. One row in a swap will be Type Fixed and one row Type Floating
  3. One row of a swap will always have a negative SpotPrice while the other row in the swap will have a positive SpotPrice
  4. There are only two SwapType values: Pay Fixed, Receive Floating

To implement this logic in QlikView Expressor you would most likely use a Function Rule in an Aggregate operator, supplying code in the prepare, aggregate, and result functions. 

  • In the prepare function, you would initialize an empty Datascript table
  • In the aggregate function, you would store each record from a swap into the Datascript table
  • In the result function, you would examine the two records saved in the Datascript table to determine the swap type based on whether Fixed or Floating contained the negative value.  Then, initialize the output values as appropriate.
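
The three steps above could be sketched in Datascript roughly as follows (Datascript is Lua-based; the function signatures and the way the output record is populated depend on how the Function Rule is set up in the rule editor, so treat the names here as illustrative, not definitive):

```lua
-- prepare: runs once at the start of each group
function prepare()
  work = {}                      -- empty Datascript table to collect the group's records
end

-- aggregate: runs once per incoming record in the group
function aggregate(input)
  work[#work + 1] = input        -- save the whole record for later analysis
end

-- result: runs after the last record in the group
function result()
  -- order the two records so we know which is Fixed and which Floating
  local fixed, floating = work[1], work[2]
  if fixed.Type ~= 'Fixed' then
    fixed, floating = floating, fixed
  end
  -- the negative SpotPrice tells you which leg is being paid
  local swapType
  if fixed.SpotPrice < 0 then
    swapType = 'Pay Fixed'
  else
    swapType = 'Receive Fixed'
  end
  return fixed.ID, swapType, fixed.SpotPrice + floating.SpotPrice
end
```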
Not applicable
Author

Thanks Mark, unfortunately this is for QlikView Expressor.

Not applicable
Author

Thanks John.

Your assumptions are correct, apart from 4: there can be other variations (Pay Floating, Receive Fixed, etc.), but I don't think that poses too much trouble.

I think your approach sounds fine, as the prepare/aggregate/result functions seem to solve the logic, and doing it within the aggregate transform should handle the math. However, each row of data has 100+ other attributes that I probably won't need to feed through the logic or the aggregate (they may have to go through separate logic, or they are the same for each row). If I feed these through a plain transform separately, I assume I will still end up with 2 rows for each swap, but how will the output of the aggregate function be duplicated across each swap's rows?

I'm going to try to build something now so I can see what the output is; I just need to read around this Datascript table piece. Is this the normal way to hold a set of data to process?

James

Not applicable
Author

If there really are 100+ other fields, then you probably don't want to pass them into this Aggregate, as it will really crowd the rule editor (although you could take this approach, adding the values from the first or last record to the output record).  Your additional transformations could be performed in a Transform operator downstream of this Aggregate operator.

Alternatively, you can block these attributes from entering this Aggregate in an upstream Transform.  

If you pass these fields through another stream to carry out additional transformations, you can join them back to the aggregated record with a Join operator.  You need to decide if you only want one final record emitted for each swap.  If you do, then this second stream also goes into an Aggregate operator, but you can simply emit one record that contains the values derived from the first record (or last record).  Then, there will be only one record emitted from each Aggregate and only a single record emitted after you join their content together.

If, on the other hand, you want an emitted record for both of the records documenting the swap, there would be no need for the second Aggregate operator.  The Join operator will emit two records for each swap.

Using a table to store each record in an aggregation group is generally not necessary within the Aggregate operator.  The concept behind this operator is that only the running calculation is maintained for a group; the values in each incoming record are unimportant after the math has been performed.  But in your use case you need all the records in the group to decide on the Swap Type, so you need to save the records in the aggregate function and analyze them in the result function.  Since each swap only involves two records, this is not an issue (even if all 100+ fields were present).  If, on the other hand, your groups contain a very large number of records, there is the possibility that sufficient memory will not exist to hold all the records.  In that case, you may need to drop some of the values from each record and hold only those you need for the follow-on analysis.
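
To illustrate that last point, the aggregate function can save just the fields the result function will inspect rather than the whole input record (the field names here are taken from this thread; adapt them to your own schema):

```lua
function aggregate(input)
  -- keep only the fields needed to decide the swap type and do the sum,
  -- rather than the full 100+ field record
  work[#work + 1] = {
    Type      = input.Type,
    SpotPrice = input.SpotPrice
  }
end
```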

Not applicable
Author

Hi John,

I have attempted to build this, and after reading around the Datascript language I have put together a data flow. Below is the script from the Aggregate artifact. It runs without error, which is a good start, but the output isn't quite what I was expecting. I managed to join it back to the rest of the data (without filtering, so I still have two rows), but the value that I expected to be summed seems to be the same row's value * 2 rather than the sum of the 2 rows.

Does the script look OK, or have I missed something? I was trying to use a sum() function in the result but it kept throwing an error, so I followed the example in the help file.

Datascript_aggregate.jpg

Not applicable
Author

You are close, but the code in your aggregate function is only saving the value from the second record in each group.

In prepare, your code should be: work = { }

Then in aggregate, your code should be: work[#work+1] = input

Then don't you need to compare the entries in the two records to determine the SwapType?  To extract a value, use code similar to work[1].SwapType.
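
Put together, the corrected prepare and aggregate code looks roughly like this (the comparison in result is only a sketch; adapt it to your own fields):

```lua
function prepare()
  work = {}                  -- fresh Datascript table for each group
end

function aggregate(input)
  work[#work + 1] = input    -- append each record; saves both, not just the last
end

-- then, in result, compare the saved records, e.g.
--   if work[1].SpotPrice < 0 then ... end
-- and extract individual values with work[1].SwapType, work[2].SpotPrice, etc.
```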

Anonymous
Not applicable
Author

Jamesjnr,

    Please check the attachment and let me know.

Thanks

SP

Not applicable
Author

Thanks Shaik, this discussion is for QlikView Expressor unfortunately, but thanks for the effort.