Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Dear All
I have a problem, what I cannot solve.
I am loading data where in some cases two kind of data exists for the same day and same external_id, and I need only one ( the two types are CONTRACTED value and RECALCULATED value) . I would like to load only the recalculated value if the date and external ID are identical. How is it possible with load script?
Example:
EXT_ID | Date_valid_from | STATUS | Value |
1 | 2014.08.01 | CONTRACTED | 5 |
1 | 2014.08.01 | RECALCULATION | 6 |
2 | 2014.08.01 | CONTRACTED | 10 |
2 | 2014.09.01 | RECALCULATION | 11 |
3 | 2014.09.01 | CONTRACTED | 4 |
3 | 2014.09.01 | RECALCULATION | 5 |
From the first two rows I would like to load the second one and drop the first one:
EXT_ID | Date_valid_from | STATUS | Value |
1 | 2014.08.01 | CONTRACTED | 5 |
1 | 2014.08.01 | RECALCULATION | 6 |
In the 3rd and 4th row EXT_ID is the same , but Dates are different, so I need both rows:
EXT_ID | Date_valid_from | STATUS | Value |
2 | 2014.08.01 | CONTRACTED | 10 |
2 | 2014.09.01 | RECALCULATION | 11 |
Row 5 and 6 is similar to the first one: I need only the RECALCULATION where EXT_ID and Date_valid_from are identical:
EXT_ID | Date_valid_from | STATUS | Value |
3 | 2014.09.01 | CONTRACTED | 4 |
3 | 2014.09.01 | RECALCULATION | 5 |
So after LOAD I would require the following rows in the table:
EXT_ID | Date_valid_from | STATUS | Value |
1 | 2014.08.01 | RECALCULATION | 6 |
2 | 2014.08.01 | CONTRACTED | 10 |
2 | 2014.09.01 | RECALCULATION | 11 |
3 | 2014.09.01 | RECALCULATION | 5 |
Could you help me with loading script?
Best regards and thanks
Zoltan
SCRIPT
Directory;
tmp:
LOAD EXT_ID, Date_valid_from, STATUS, Value
FROM RECALC_LOAD.xlsx (ooxml, embedded labels, table is Munka1);
final:
NoConcatenate
load *
Resident tmp
where Peek(EXT_ID) <> EXT_ID or peek(Date_valid_from) <> Date_valid_from
order by EXT_ID, STATUS desc;
drop table tmp;
RESULT
SCRIPT
Directory;
tmp:
LOAD EXT_ID, Date_valid_from, STATUS, Value
FROM RECALC_LOAD.xlsx (ooxml, embedded labels, table is Munka1);
final:
NoConcatenate
load *
Resident tmp
where Peek(EXT_ID) <> EXT_ID or peek(Date_valid_from) <> Date_valid_from
order by EXT_ID, STATUS desc;
drop table tmp;
RESULT
Thank you Massimo, you solved my problem!
Zoltan