Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I have a table with some missing days:
INPUT:
LOAD * INLINE [
Company, Day, Product, Value
Comp_1, 02.12.2014, Product_1, 100
Comp_1, 02.12.2014, Product_2, 100
Comp_1, 03.12.2014, Product_1, 100
Comp_1, 03.12.2014, Product_2, 100
Comp_1, 04.12.2014, Product_1, 100
Comp_2, 02.12.2014, Product_1, 200
Comp_2, 02.12.2014, Product_2, 200
Comp_2, 04.12.2014, Product_1, 200
Comp_2, 04.12.2014, Product_2, 200
];
Comp_1, 04.12.2014, Product_2
Comp_3, 03.12.2014, Product_1
Comp_3, 03.12.2014, Product_2
Is missing
I want to add this strings with value = 0 to my table.
How can I acheive this?
INPUT:
LOAD *,
AutoNumberHash128(Company, Day, Product) as ID
INLINE [
Company, Day, Product, Value
Comp_1, 02.12.2014, Product_1, 100
Comp_1, 02.12.2014, Product_2, 100
Comp_1, 03.12.2014, Product_1, 100
Comp_1, 03.12.2014, Product_2, 100
Comp_1, 04.12.2014, Product_1, 100
Comp_1, 05.12.2014, Product_1, 150
Comp_1, 06.12.2014, Product_2, 300
Comp_2, 02.12.2014, Product_1, 200
Comp_2, 02.12.2014, Product_2, 200
Comp_2, 04.12.2014, Product_1, 200
Comp_2, 04.12.2014, Product_2, 200
];
tabTemp:
LOAD Distinct Company Resident INPUT;
Join
LOAD Distinct Day Resident INPUT;
Join
LOAD Distinct Product, 0 as Value Resident INPUT;
Concatenate (INPUT)
LOAD * Resident tabTemp
Where not Exists (ID, AutoNumberHash128(Company, Day, Product));
DROP Table tabTemp;
Left Join (INPUT)
LOAD Company,
Max(Day) as MaxDay
Resident INPUT
Where Value > 0
Group By Company;
Right Join (INPUT)
LOAD Distinct
Company,
Day
Resident INPUT
Where Day <= MaxDay;
DROP Field MaxDay;
hope this helps
regards
Marco
HI Oleg,
Please elaborate more,so can help!!
Regards
Neetha
Hi...
first table as you wrote
...
concatenate //autoconcatenate would be performed anyway
Load * inline [
Company, Day, Product, Value
Comp_1,04.12.2014, Product_2, 0
...
...etc.
];
BR
M
Hi,
one solution could be:
INPUT:
LOAD *,
AutoNumberHash128(Company, Day, Product) as ID
INLINE [
Company, Day, Product, Value
Comp_1, 02.12.2014, Product_1, 100
Comp_1, 02.12.2014, Product_2, 100
Comp_1, 03.12.2014, Product_1, 100
Comp_1, 03.12.2014, Product_2, 100
Comp_1, 04.12.2014, Product_1, 100
Comp_2, 02.12.2014, Product_1, 200
Comp_2, 02.12.2014, Product_2, 200
Comp_2, 04.12.2014, Product_1, 200
Comp_2, 04.12.2014, Product_2, 200
];
tabTemp:
LOAD Distinct Company Resident INPUT;
Join
LOAD Distinct Day Resident INPUT;
Join
LOAD Distinct Product, 0 as Value Resident INPUT;
Concatenate (INPUT)
LOAD * Resident tabTemp
Where not Exists (ID, AutoNumberHash128(Company, Day, Product));
DROP Table tabTemp;
hope this helps
regards
Marco
Thank you. This is almost exactly what I'm looking for.
In case, when maximum date for "Comp_1" is 04.12.2014 and maximum date for "Comp_2" is 01.03.2013 we receive a lot of zero values for "Comp_2". Can we remove all this values that exceed the maximum date for each company?
Thank you for your answer. But in this case I have to make a table with missing values manually. This is impossible when we have a large amount of data.
INPUT:
LOAD *,
AutoNumberHash128(Company, Day, Product) as ID
INLINE [
Company, Day, Product, Value
Comp_1, 02.12.2014, Product_1, 100
Comp_1, 02.12.2014, Product_2, 100
Comp_1, 03.12.2014, Product_1, 100
Comp_1, 03.12.2014, Product_2, 100
Comp_1, 04.12.2014, Product_1, 100
Comp_1, 05.12.2014, Product_1, 150
Comp_1, 06.12.2014, Product_2, 300
Comp_2, 02.12.2014, Product_1, 200
Comp_2, 02.12.2014, Product_2, 200
Comp_2, 04.12.2014, Product_1, 200
Comp_2, 04.12.2014, Product_2, 200
];
tabTemp:
LOAD Distinct Company Resident INPUT;
Join
LOAD Distinct Day Resident INPUT;
Join
LOAD Distinct Product, 0 as Value Resident INPUT;
Concatenate (INPUT)
LOAD * Resident tabTemp
Where not Exists (ID, AutoNumberHash128(Company, Day, Product));
DROP Table tabTemp;
Left Join (INPUT)
LOAD Company,
Max(Day) as MaxDay
Resident INPUT
Where Value > 0
Group By Company;
Right Join (INPUT)
LOAD Distinct
Company,
Day
Resident INPUT
Where Day <= MaxDay;
DROP Field MaxDay;
hope this helps
regards
Marco
It's exactly what I need. Thanks a lot.
I little bit changed the order in JOINs for better performance.
INPUT:
LOAD *,
AutoNumberHash128(Company, Day, Product) as ID
INLINE [
Company, Day, Product, Value
Comp_1, 02.12.2014, Product_1, 100
Comp_1, 02.12.2014, Product_2, 100
Comp_1, 03.12.2014, Product_1, 100
Comp_1, 03.12.2014, Product_2, 100
Comp_1, 04.12.2014, Product_1, 100
Comp_1, 05.12.2014, Product_1, 150
Comp_1, 06.12.2014, Product_2, 300
Comp_2, 02.12.2014, Product_1, 200
Comp_2, 02.12.2014, Product_2, 200
Comp_2, 04.12.2014, Product_1, 200
Comp_2, 04.12.2014, Product_2, 200
];
tabTemp:
LOAD Distinct Company Resident INPUT;
Join
LOAD Distinct Day Resident INPUT;
Join
LOAD Distinct Product, 0 as Value Resident INPUT;
Left Join (tabTemp)
LOAD Company, Max(Day) as MaxDay
Resident INPUT
Group By Company;
Right Join (tabTemp)
LOAD DISTICNT Company, Day
Resident tabTemp
Where Day <= MaxDay;
Concatenate (INPUT)
LOAD * Resident tabTemp
Where not Exists (ID, AutoNumberHash128(Company, Day, Product));
DROP Table tabTemp;
What do you think?