Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 
Anonymous
Not applicable

Count num of rows with duplicate field(s) using AGGR and COUNT

Hi,  I'm trying to retrieve a list of rows where Costs per Month per Agent is duplicate.

I have my data setup as follows:

SalesmanProductCostMonthRptTypeAgent
LeroyCar1604GeneralUSA
RichardCar20003GeneralFrance
SamuelPlane20003GeneralFrance
LeroyPlane150002GeneralUSA
SamuelCar3002GeneralUSA
VictorCar3002GeneralUSA
RichardBus5001GeneralUSA
JohanCar5001GeneralUSA
JohanBus7001GeneralUSA
LeroyBoat8001GeneralUSA
DavidPlane9005GeneralUSA

I've setup the  Dimensions and Expressions as follows

Dimensions:

1. Aggr(if(Month>=1 and Month<=4 and RptType = 'General' and Count(Month&'_'&Cost)>=2,Cost),Cost)

2. Salesman

3. Month

Expressions:

1. Agent

2. count(aggr(count(Month&'_'&'_'&Agent&'_'&Cost),Salesman))

This is currently what I have, which is incorrect

   

CostMonthAgentSalesman#Rows with same Cost+Month+Agents
5001USARichard1
5001USAJohan1
3002USASamuel1
3002USAVictor1
20003FranceRichard0
20003FranceSamuel0

This is what I expect

   

CostMonthAgentSalesman#Rows with same Cost+Month+Agents
5001USARichard2
5001USAJohan2
3002USASamuel2
3002USAVictor2
20003FranceRichard2
20003FranceSamuel2

I want to highlight rows by counting the duplicate Month+Cost+Agent via expression 2, but I'm not sure what it should be.

Can someone pls advise what should I be using? Thanks!

1 Solution

Accepted Solutions
swuehl
MVP
MVP

What about an expression like

Sum(TOTAL<Month,Agent,Cost>

     Aggr(Count(Month&Agent& Cost)

          , Month, Agent, Cost)

)

Or a script solution:

TAB1:

LOAD * INLINE [

Salesman Product Cost Month RptType Agent

Leroy Car 160 4 General USA

Richard Car 2000 3 General France

Samuel Plane 2000 3 General France

Leroy Plane 15000 2 General USA

Samuel Car 300 2 General USA

Victor Car 300 2 General USA

Richard Bus 500 1 General USA

Johan Car 500 1 General USA

Johan Bus 700 1 General USA

Leroy Boat 800 1 General USA

David Plane 900 5 General USA

] (delimiter is '\t');

Left JOIN

LOAD Month, Agent, Cost,

Count(Month&Agent&Cost) as Count

RESIDENT TAB1

GROUP BY Month, Agent, Cost;

View solution in original post

4 Replies
sunny_talwar

From your sample data, I see none of the rows to have any duplicates when I see Cost+Month+Agent? Is this data incomplete or am I missing something?

Anonymous
Not applicable
Author

Hi Sunny, Thanks for your reply.

Pls allow me to clarify by my meaning of "duplicate". I was trying to say that some rows showed up with the same values of Cost,Month and Agent... and I need to highlight these rows if the Cost, Month and Agent for the current row's Salesman is NOT unique i.e. count > =1

After applying the dimensions/expressions stated on my sample data, I was able to generate this straight-table chart:

CostMonthAgentSalesman#Rows with same Cost+Month+Agents
5001USARichard1
5001USAJohan1
3002USASamuel1
3002USAVictor1
20003FranceRichard0
20003FranceSamuel0

As you can see, columns Cost,Month and Agent are duplicated 2x for both Richard and Johnson: 500 , 1, USA

Similarly, for Samuel and Victor, they both have 300,2,USA and Richard and Samuel has 2000,3,France

Hope that helps to clarify what I meant by "duplicates". Thanks.

swuehl
MVP
MVP

What about an expression like

Sum(TOTAL<Month,Agent,Cost>

     Aggr(Count(Month&Agent& Cost)

          , Month, Agent, Cost)

)

Or a script solution:

TAB1:

LOAD * INLINE [

Salesman Product Cost Month RptType Agent

Leroy Car 160 4 General USA

Richard Car 2000 3 General France

Samuel Plane 2000 3 General France

Leroy Plane 15000 2 General USA

Samuel Car 300 2 General USA

Victor Car 300 2 General USA

Richard Bus 500 1 General USA

Johan Car 500 1 General USA

Johan Bus 700 1 General USA

Leroy Boat 800 1 General USA

David Plane 900 5 General USA

] (delimiter is '\t');

Left JOIN

LOAD Month, Agent, Cost,

Count(Month&Agent&Cost) as Count

RESIDENT TAB1

GROUP BY Month, Agent, Cost;

Anonymous
Not applicable
Author

Hi Stefan,

Thanks! I was examining my app and wondering how to implement it as my data model had changed with some conditions and I had to add in more calculated dimensions than before...

Nevertheless, got it to work using your solution!