Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
My script stops execution at the very end. The data load is quite large. When subsets of the data are used, the data loads fine. These are the last few lines of code:
Range:
LOAD *, if(RangeEnd-RangeStart<=31, 1, if(RangeEnd-RangeStart<=93, 3, date(RangeEnd,'M'))) as avgDiv,
if(substringcount(Range, 'Jan-Mar') >0, replace(Range,'Jan-Mar ','Q1 '),
if(substringcount(Range, 'Apr-Jun') >0, replace(Range,'Apr-Jun ','Q2 '),
if(substringcount(Range, 'Jul-Sep') >0, replace(Range,'Jul-Sep ','Q3 '),
if(substringcount(Range, 'Oct-Dec') >0, replace(Range,'Oct-Dec ','Q4 '),Range)))) as Range1
RESIDENT Ranges;
DROP TABLE Ranges;
Drop Field Range ;
Rename field Range1 to Range ;
LEFT JOIN (Range) IntervalMatch (feeddate) LOAD RangeStart, RangeEnd RESIDENT Range;
DROP FIELDS RangeStart, RangeEnd;
Is there a way to rewrite this without using JOIN and/or Interval Match?
Thanks in advance.
instead of
LEFT JOIN (Range) IntervalMatch (feeddate) LOAD RangeStart, RangeEnd RESIDENT Range;
use this:
LEFT JOIN (Range)
LOAD RangeStart + interno() - 1 as feeddate
RESIDENT Range
while RangeStart + interno() - 1<= RangeEnd;
I don't see anything specifically wrong with it, but it can depend on the fields that you aren't showing us.
For instance, let's say your ranges are specific to a FeedType. And the same table with feeddate also has FeedType. Then you would need to include FeedType in your intervalmatch, or you'd get more data than you intended.
Other than something like that, I can't imagine why the intervalmatch and left join would cause problems. Yes, it's going to duplicate every field in your Range table for every feeddate, but compression should make than a near non-issue. You could always TRY leaving off the left join and just doing the intervalmatch, which would prevent this duplication. I doubt it would help, though.
My only real suggestion has nothing to do with your problem, which is to handle your quarters like this. Same results, I just think it's a little cleaner, but that's just a personal preference thing.
Quarters:
MAPPING LOAD * INLINE [
Months, Quarter
Jan-Mar, Q1
Apr-Jun, Q2
Jul-Sep, Q3
Oct-Dec, Q4
];
mapsubstring('Quarters',Range) as Range1