Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I was trying to do a Group By similar to SQL when the load contains an if condition.
It was giving me an "Invalid Expression" error so I added all fields not aggregated in the Group By.
DataTemp:
LOAD * INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
Data:
LOAD
if(fail=1,'Other',Vendor) as Vendor,
Sum(Units) as Units
Resident DataTemp
Group by Vendor, fail;
Drop Table DataTemp;
=========================================
Result:
Vendor | Units |
Other | 1 |
Other | 2 |
E | 5 |
D | 14 |
C | 3 |
The code works but it's not aggregating all rows as I intended.
What I want is this:
Vendor | Units |
Other | 3 |
E | 5 |
D | 14 |
C | 3 |
Is there a way of achieving this without doing another Resident Load with another Group By?
LOAD
Vendor,
Sum(Units) as Units
Group by Vendor, fail;
LOAD if(fail=1,'Other',Vendor) as Vendor,Units,fail INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
How using a preceding load, maybe something like this ?
DataTemp:
LOAD * INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
Data:
load
Vendor,
Sum(Units) as Units
Group by Vendor
;
LOAD
if(fail=1,'Other',Vendor) as Vendor,
Units
Resident DataTemp
;
Drop Table DataTemp;
DataTemp:
Load
if(fail = 1, 'Other',Vendor) as Vendor,
fail,
Units
Inline
[
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
Data:
Load
Vendor,
SUM(Units) as TotalUnit Resident DataTemp
Group By Vendor,fail;
Drop Table DataTemp;
u can also try this,
DataTemp:
LOAD * INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
DataTemp2:
Noconcatenate LOAD
if(fail=1,'Other',Vendor) as Vendor,
Units,
fail
Resident DataTemp;
Drop table DataTemp;
Data:
Noconcatenate Load Vendor,
Sum(Units) as Units,
fail
Resident DataTemp2
Group by Vendor,fail;
Drop table DataTemp2;
LOAD
Vendor,
Sum(Units) as Units
Group by Vendor, fail;
LOAD if(fail=1,'Other',Vendor) as Vendor,Units,fail INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
You can try like this:
DataTemp:
Load if(fail=1,'Other',Vendor) as Vendor, Units;
LOAD * INLINE [
Vendor, fail, Units
A, 1, 1
B, 1, 2
C, 0, 3
D, 0, 4
D, 0, 10
E, 0, 5
];
Data:
Load Vendor,Sum(Units) as Sum_Units
Resident DataTemp
Group by Vendor;
output:
Vendor | Sum_Units |
C | 3 |
D | 14 |
E | 5 |
Other | 3 |
It now looks really obvious and I should know better but I completely forgot one could do a Group By on a preceding load. Makes sense!
Opted for anbu1984 response as it is the cleanest and the one that is closest to my real case scenario.
Thank you all for your suggestions!