Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello folks, Need some help to fix an issue in production. Any help would be greatly appreciated.
I am trying to figure to suppress double counting of a dimension value. Attaching screen shot explaining the issue.
Have the source data loaded as below:
Soource Data : | |
pid | tin |
p1 | t1 |
p1 | t2 |
p2 | t1 |
p3 | t2 |
p4 | t2 |
p5 | t3 |
p6 | t3 |
p7 | t3 |
The requirement is for each of pid, RelatedPIDS should be computed via tin and through the tin against pid get other pids and show as count including current pid. I am double counting for pid p1 since it shared to tins t1 and t2.
using the expression to compute RelatedPIDS field.
sum( Aggr(NODISTINCT count( distinct {1<tin={"*"} >}pid),tin))
For pid p1, it is double counting since it has overlap with tin t1 and t2. I am getting count of 5 instead of 4 since p1 is counted twice which is shared with tin t1 and t2.
Can some one suggest how to suppress double count? Example data attached below.
mapSource:
Mapping Load
tin,
concat(pid,',') as t
group by tin;
Load * Inline [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
NoConcatenate
s1:
Load pid,
subfield(t1,',') as t2;
Load pid,
tin,
ApplyMap('mapSource',tin,null()) as t1;
Load * Inline [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
In the front end
dim:Pid
measure: count(distinct t2)
Try this,
tab1:
LOAD * INLINE [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
Left Join(tab1)
LOAD tin, pid As pid2
Resident tab1;
Hello Saran,
My approach using set analysis with nested aggregation did not work
Your solution works, Many thanks. Appreciate your time and timely help.
mapSource:
Mapping Load
tin,
concat(pid,',') as t
group by tin;
Load * Inline [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
NoConcatenate
s1:
Load pid,
subfield(t1,',') as t2;
Load pid,
tin,
ApplyMap('mapSource',tin,null()) as t1;
Load * Inline [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
In the front end
dim:Pid
measure: count(distinct t2)
Try this,
tab1:
LOAD * INLINE [
pid, tin
p1, t1
p1, t2
p2, t1
p3, t2
p4, t2
p5, t3
p6, t3
p7, t3
];
Left Join(tab1)
LOAD tin, pid As pid2
Resident tab1;
Output:
Hello Saran,
My approach using set analysis with nested aggregation did not work
Your solution works, Many thanks. Appreciate your time and timely help.
Hello Priyanka,
My approach using set analysis with nested aggregation did not work
Your solution works, Many thanks to you and Saran. Appreciate your time and timely help.