Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

How calculate Average in Load Script (SCD)?

Hi all,

I need to calculate Average Net Price (grouped by Key A,B,C and Year) and store the result in table. The result table will be used in another calculation process.

The issue is the Net Price located on other table (I'm using Slow Changing Dimension data model which explained in here)

Source Table

ABCDate
A1B1C11 Jan 2016
A1B1C15 May 2016
A1B1C113 May 2016
A1B2C25 Feb 2016
A1B2C210 Mar 2016

Other Table contain Net Price in USD

ABCStart DateEnd DateNet Price
A1B1C11 Jan 201615 Mar 201650
A1B1C116 Mar 201630 May 2016125
A1B2C21 Jan 201615 Feb 201675
A1B2C216 Feb 201630 Mar 201693

Result Table that I Expected

ABCYearAverage Net Price
A1B1C12016100 (=(50+125+125)/3)
A1B2C2201684 (=(75+93)/2)

It will be easy if this is not Slow Changing Dimension

Load A,

         B,

         C,

         (some logic to extract year from Date) as Year,

         Avg(Net Price in USD)   ---> located in other table, How?

Group By A | B | C | Year

Resident [Source]:

Best Regards,

Satria

1 Solution

Accepted Solutions
sunny_talwar

Hi Satria.... are you keeping that tables separate? May be, to perform this calculation, you can join the intervalmatch table back to your main table so that you have all the fields in a single table to perform this calculation?

SourceTable:

LOAD A,

B,

C,

Date#(Date, 'D MMM YYYY') as Date,

AutoNumber(A&'|'&B&'|'&C) as Key;

LOAD * INLINE [

    A, B, C, Date

    A1, B1, C1, 1 Jan 2016

    A1, B1, C1, 5 May 2016

    A1, B1, C1, 13 May 2016

    A1, B2, C2, 5 Feb 2016

    A1, B2, C2, 10 Mar 2016

];

AnotherTable:

LOAD A,

B,

C,

Date#([Start Date], 'D MMM YYYY') as StartDate,

Date#([End Date], 'D MMM YYYY') as EndDate,

[Net Price],

AutoNumber(A&'|'&B&'|'&C) as Key;

LOAD * INLINE [

    A, B, C, Start Date, End Date, Net Price

    A1, B1, C1, 1 Jan 2016, 15 Mar 2016, 50

    A1, B1, C1, 16 Mar 2016, 30 May 2016, 125

    A1, B2, C2, 1 Jan 2016, 15 Feb 2016, 75

    A1, B2, C2, 16 Feb 2016, 30 Mar 2016, 93

];

Left Join (SourceTable)

IntervalMatch (Date, Key)

LOAD StartDate,

EndDate,

Key

Resident AnotherTable;

Left Join (SourceTable)

LOAD *

Resident AnotherTable;

DROP Table AnotherTable;

FinalTable:

LOAD A,

B,

C,

Year(Date) as Year,

Avg([Net Price]) as [Average Net Price]

Resident SourceTable

Group By A, B, C, Year(Date);

DROP Table SourceTable;

View solution in original post

6 Replies
sunny_talwar

Hi Satria.... are you keeping that tables separate? May be, to perform this calculation, you can join the intervalmatch table back to your main table so that you have all the fields in a single table to perform this calculation?

SourceTable:

LOAD A,

B,

C,

Date#(Date, 'D MMM YYYY') as Date,

AutoNumber(A&'|'&B&'|'&C) as Key;

LOAD * INLINE [

    A, B, C, Date

    A1, B1, C1, 1 Jan 2016

    A1, B1, C1, 5 May 2016

    A1, B1, C1, 13 May 2016

    A1, B2, C2, 5 Feb 2016

    A1, B2, C2, 10 Mar 2016

];

AnotherTable:

LOAD A,

B,

C,

Date#([Start Date], 'D MMM YYYY') as StartDate,

Date#([End Date], 'D MMM YYYY') as EndDate,

[Net Price],

AutoNumber(A&'|'&B&'|'&C) as Key;

LOAD * INLINE [

    A, B, C, Start Date, End Date, Net Price

    A1, B1, C1, 1 Jan 2016, 15 Mar 2016, 50

    A1, B1, C1, 16 Mar 2016, 30 May 2016, 125

    A1, B2, C2, 1 Jan 2016, 15 Feb 2016, 75

    A1, B2, C2, 16 Feb 2016, 30 Mar 2016, 93

];

Left Join (SourceTable)

IntervalMatch (Date, Key)

LOAD StartDate,

EndDate,

Key

Resident AnotherTable;

Left Join (SourceTable)

LOAD *

Resident AnotherTable;

DROP Table AnotherTable;

FinalTable:

LOAD A,

B,

C,

Year(Date) as Year,

Avg([Net Price]) as [Average Net Price]

Resident SourceTable

Group By A, B, C, Year(Date);

DROP Table SourceTable;

Anonymous
Not applicable
Author

Hi Sunny,

Thanks for the idea. I never thought before that Intervals table (AnotherTable) could be joined with Events table (SourceTable). Some questions:

1. Below source code mean is to joining the Bridge table onto the transaction table (Events table/SourceTable), isn't it?

Left Join (SourceTable)

IntervalMatch (Date, Key)

LOAD StartDate,

EndDate,

Key

Resident AnotherTable;


2. I read that the prerequisites to do this (join Bridge with Events) is many to one relationship. Unfortunately I have many to many relationship as there is overlap intervals in Intervals table (AnotherTable). How to fix this issue?

Thanks

Satria

sunny_talwar

1) Yes

2) Would you be able to share a sample where you have the many to many relationship and the output you expect to see?

Anonymous
Not applicable
Author

Hi Sunny,

I think it will be more clear for you if i explain more about the data modelling before I give many to many relationship sample. I might create wrong data modelling.

Data Modelling:

[Events]:

LOAD *,

          A & B & C as [TempKey],

          A & B & C & '|' & [Trans_Date] as [Key+TransDate];

LOAD A

           B

           C

           [Trans_Date]

From Events.QVD

[Interval]:

LOAD *,

           A & B & C & '|' & [Start_Date] & '|' & [End_Date] as [Key+Interval];

LOAD A

           B

           C

           [Start_Date]

           [End_Date]

           [Net_Price]

From Interval.QVD

[TempBridgeTable]:

IntervalMatch(Trans_Date,TempKey)

Load Distinct [Start_Date]

                      [End_Date]

                      A & B & C as [TempKey]

Resident [Interval];

[Bridge]:

Load [TempKey] & '|' & [Trans_Date] as [Key+TransDate],

         [TempKey] &  '|' & [Start_Date] &  '|' & [End_Date] as [Key+Interval]

Resident [TempBridgeTable];


DROP Field [TempKey];

DROP Table [TempBridgeTable];

Expected Output

Straight Chart Table ---> Final output

A   | B   | C   | Trans_Date    | Net Price | Average Net Price Last Year Group By A,B,C

A1 | B1 | C1 |  10 Mar 2017 |  200

A1 | B1 | C1 |  15 Mar 2017 |  150

Interval Table ( there are many interval overlap cases, this is one of them)

A   | B   | C   | Start Date     | End Date       |  Net Price

A1 | B1 | C1 | 1 Feb 2017   | 20 Mar 2017  |  100

A1 | B1 | C1 | 15 Jan 2017  | 25 Mar 2017 |   200

A1 | B1 | C1 | 10 Mar 2017 | 14 Mar 2017  |   300

If found overlap interval, Net Price will be calculated using Average formula

10 Mar 2017 --> (100+200+300)/3 = 600/3 = 200

15 Mar 2017 --> (100+200)/2 = 300/2 = 150

To show Net Price in Straight Table, I'm using this formula and it's working:

=if(count([Key+Interval]) = 0,0,sum([Net Price])/count([Key+Interval]))

but for Average Net Price Last Year Group By A,B,C, I can't calculate this using Set Analysis as there is Trans_Date in straight table. That's why I'm think to calculate the average value in other table and then use ApplyMap

sunny_talwar

It seems you have already done a bit of scripting and also have a partial chart working for you... would you be able to share this sample app here? It might be easy to work with what you have and take it forward then to try to recreate the whole thing here.

Anonymous
Not applicable
Author

Hi Sunny,

The size of the app is quite huge as it contains multiple KPI so I don't attach it here.

I try your idea to join Interval and Event table (using Left Join) and it's work greatly.

There is small issue where the values of net price become duplicates in event table (after left join process) as there is additional key (lets call this dimension DN) in Event table.

So to calculate Net Price in chart, I need to use below formula (DN is not displayed in this chart)

= Sum(Net Price)/Count(DN)

Many thanks for your help.

Best Regards,

Satria