Skip to main content
Announcements
See what Drew Clarke has to say about the Qlik Talend Cloud launch! READ THE BLOG
cancel
Showing results for 
Search instead for 
Did you mean: 
MarZu
Contributor II
Contributor II

sum distinct values with a script in load editor

Hello, I want to sum the entries of the field "value" only for distinct values of another one "Sequence Id".
 
See below a pseudo example: the "calculate" table shall reuse previously loaded fields from the table "tag".
 
It sounds easy in visualisation but I have to do it in the load editor and apparently the same syntax is not available (aggr?). Please note that the it is not a matter to delete the rows which appears double.
 
 
Anyone has an Idea? How would be instead with an if ...then structure?
Thanks!!!
 
[calculate]:
Load
Sum(Aggr([Value]),distinct [Sequence Id])) as Counter
resident
tag;
Labels (1)
1 Solution

Accepted Solutions
MarZu
Contributor II
Contributor II
Author

Indeed I found the solution by using an intermediate step and it works as I wanted and I share it here:

 

Temp:

Load Distinct 

   [Sequence Id],

'dummy' as Dummy,

   Max([Value]) as MaxCounter
Resident
   tag
Group By
   [Sequence Id];

 

Load Distinct 

   [Sequence Id],

Dummy,
   Sum([MaxCounter]) as Counter
Resident
   Temp
Group By
   [Dummy];

View solution in original post

5 Replies
Chirantha
Support
Support

Please see if this helps

Load
   [Sequence Id],
   Sum([Value]) as Counter
Resident
   tag
Group By
   [Sequence Id];
 

udit_k
Partner - Creator II
Partner - Creator II

Load Distinct 

   [Sequence Id],
   Sum([Value]) as Counter
Resident
   tag
Group By
   [Sequence Id];

MarZu
Contributor II
Contributor II
Author

Hello, Thanks for the answer however it does not help me, perhaps I have missed one important info:

Imagine the values can be just 0 or 1. for the same sequnce Id I can get many rows and each can have value 0 or 1. I need to have a boolean OR over those values , so that maximum 1 can appear. the sum shall be done over all the sequences therefore the maximum expected number will be the number of the sequences.

 

With the proposed solution above in contrast I receive different summed values for the sequences.

Can you please help to "limit" and have max. 1 for each sequence? Thanks!

MarZu_1-1689239312119.png

 

 

MarZu
Contributor II
Contributor II
Author

Hi Thank you, but it seems to produces the same result as the first solution despite DISTINCT I get counted all the values for the same single sequence.

 

Perhaps it helps my observation: Imagine the values can be just 0 or 1. for the same sequence Id I can get many rows and each can have value 0 or 1. I need to have a boolean OR over those values , so that maximum 1 can appear. the sum shall be done over all the sequences therefore the maximum expected number will be the number of the sequences.  --> is there a way to do that? Thanks

MarZu
Contributor II
Contributor II
Author

Indeed I found the solution by using an intermediate step and it works as I wanted and I share it here:

 

Temp:

Load Distinct 

   [Sequence Id],

'dummy' as Dummy,

   Max([Value]) as MaxCounter
Resident
   tag
Group By
   [Sequence Id];

 

Load Distinct 

   [Sequence Id],

Dummy,
   Sum([MaxCounter]) as Counter
Resident
   Temp
Group By
   [Dummy];