Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a dataset that I joined to a master calendar where I need to fill in the missing values. Here is an example of the data:
Date | LU | Plan ID | Item ID | Qty |
1/1/2020 | ||||
1/2/2020 | FCST | 1 | 333 | 4 |
1/3/2020 | ||||
1/1/2020 | SO | 5 | 345 | 5 |
1/2/2020 | ||||
1/3/2020 | ||||
1/1/2020 | WO | 1 | 445 | 1 |
1/2/2020 | WO | 1 | 445 | 7 |
1/3/2020 |
This is what I'd like to do with the dataset. Fill in the null dimension values and place a zero in for the measure.
Date | LU | Plan ID | Item ID | Qty |
1/1/2020 | FCST | 1 | 333 | 0 |
1/2/2020 | FCST | 1 | 333 | 4 |
1/3/2020 | FCST | 1 | 333 | 0 |
1/1/2020 | SO | 5 | 345 | 5 |
1/2/2020 | SO | 5 | 345 | 0 |
1/3/2020 | SO | 5 | 345 | 0 |
1/1/2020 | WO | 1 | 445 | 1 |
1/2/2020 | WO | 1 | 445 | 7 |
1/3/2020 | WO | 1 | 445 | 0 |
Is there an easy way to do this?
Thanks in advance!
May be this :
Data:
LOAD * INLINE [
Date, LU, Plan ID, Item ID, Qty
1/1/2020, , , ,
1/2/2020, FCST, 1, 333, 4
1/3/2020, , , ,
1/1/2020, SO, 5, 345, 5
1/2/2020, , , ,
1/3/2020, , , ,
1/1/2020, WO, 1, 445, 1
1/2/2020, WO, 1, 445, 7
1/3/2020, , , ,
];
Tmp1:
noconcatenate
load LU, [Plan ID], [Item ID] resident Data where len(LU);
join
load distinct Date resident Data;
Tmp2:
noconcatenate
load distinct * resident Data where len(LU);
join load * resident Tmp1;
drop table Data,Tmp1;
output:
noconcatenate
load Date, LU, [Plan ID], [Item ID], if(len(Qty)=0,0,Qty) as Qty resident Tmp2;
drop table Tmp2;
I am assuming you have a separate table for Data and Calendar, which is the most efficient way. If not modify Step_Two to load from Data.
Data:
Load * InLine [
LU, Date, Plan ID, Item ID, Qty
FCST, 02/01/2020, 1, 333, 4
SO, 01/01/2020, 5, 345, 5
WO, 01/01/2020, 1, 445, 1
WO, 02/01/2020, 1, 445, 7
];
Calendar:
Load * InLine [
Date
01/01/2020
02/01/2020
03/01/2020
];
Step_One:
Load Distinct
Date & '-' & LU as BlankKey
Resident Data
;
Step_Two:
Load Distinct
Date as TempDate
Resident Calendar
;
Outer Join (Step_Two)
Load
LU as TempLU,
[Plan ID] as TempPlanID,
[Item ID] as TempItemID
Resident Data
;
Concatenate (Data)
Load
TempLU as LU,
TempDate as Date,
TempPlanID as [Plan ID],
TempItemID as [Item ID],
0 as Qty
Resident Step_Two
Where Not Exists(BlankKey, TempDate & '-' & TempLU)
;
Drop Tables Step_One, Step_Two;
May be this :
Data:
LOAD * INLINE [
Date, LU, Plan ID, Item ID, Qty
1/1/2020, , , ,
1/2/2020, FCST, 1, 333, 4
1/3/2020, , , ,
1/1/2020, SO, 5, 345, 5
1/2/2020, , , ,
1/3/2020, , , ,
1/1/2020, WO, 1, 445, 1
1/2/2020, WO, 1, 445, 7
1/3/2020, , , ,
];
Tmp1:
noconcatenate
load LU, [Plan ID], [Item ID] resident Data where len(LU);
join
load distinct Date resident Data;
Tmp2:
noconcatenate
load distinct * resident Data where len(LU);
join load * resident Tmp1;
drop table Data,Tmp1;
output:
noconcatenate
load Date, LU, [Plan ID], [Item ID], if(len(Qty)=0,0,Qty) as Qty resident Tmp2;
drop table Tmp2;
I am assuming you have a separate table for Data and Calendar, which is the most efficient way. If not modify Step_Two to load from Data.
Data:
Load * InLine [
LU, Date, Plan ID, Item ID, Qty
FCST, 02/01/2020, 1, 333, 4
SO, 01/01/2020, 5, 345, 5
WO, 01/01/2020, 1, 445, 1
WO, 02/01/2020, 1, 445, 7
];
Calendar:
Load * InLine [
Date
01/01/2020
02/01/2020
03/01/2020
];
Step_One:
Load Distinct
Date & '-' & LU as BlankKey
Resident Data
;
Step_Two:
Load Distinct
Date as TempDate
Resident Calendar
;
Outer Join (Step_Two)
Load
LU as TempLU,
[Plan ID] as TempPlanID,
[Item ID] as TempItemID
Resident Data
;
Concatenate (Data)
Load
TempLU as LU,
TempDate as Date,
TempPlanID as [Plan ID],
TempItemID as [Item ID],
0 as Qty
Resident Step_Two
Where Not Exists(BlankKey, TempDate & '-' & TempLU)
;
Drop Tables Step_One, Step_Two;
Thank you both!