Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hello,
I've got the following data source files:
File_01.02.2018.xlsx // plan for February
File_03.02.2018.xlsx // plan for February with changes
File_01.03.2018.xlsx // plan for March
... etc.
Question/ Issue:
Every time we get a new file with changes (within one month), we need to replace the data from a new file's date and till the end of the month.
P.S. The data doesn't contain any "date" fields etc. Date is stored only in the file's header.
Attempts:
I know that there's the REPLACE LOAD statement but it won't work for this case.
Data:
Replace LOAD
Store,
Week,
Day, // Thu, Fri,...
Flag, // 1 or null
Month(Date( Date#( right( FileBaseName(), 10) , 'DD.MM.YYYY'), 'DD-MM-YYYY')) as Month,
Year(Date( Date#( right( FileBaseName(), 10) , 'DD.MM.YYYY'), 'DD-MM-YYYY')) as Year
FROM
Expected result:
File_01.02.2018.xlsx
Store | Thu | Fri | Sat | Sun |
---|---|---|---|---|
PN | 1 | |||
CHE | 1 | 1 | 1 | 1 |
PN | 1 |
File_03.02.2018.xlsx
Store | Thu | Fri | Sat | Sun |
---|---|---|---|---|
PN | 1 | 1 | 1 | |
CHE | 1 | 1 | ||
PN | 1 | 1 |
01.02.2018 is Thursday and 03.02.2018 is Saturday so we need to start replacing from Saturday
Result plan for February:
Store | Thu | Fri | Sat | Sun |
---|---|---|---|---|
PN | 1 | 1 | 1 | |
CHE | 1 | 1 | 1 | |
PN | 1 | 1 | 1 |
With my best regards,
Evgeniy
I understand he gets a modified file but only the data from the data in the filename to the end of the month should be read. The example tables he gave were pretty clear.
I think the easiest thing would be to check if there is a new file for that month and then only load until that date(where date<file date. from the previous file.
For the dates you will need to add them to the data your self. but that should be easy enough.
Hope I was clear enough
hello,
question : is you new file containing all date regardless of it has modified or not ?
if you have all date, as you have created fields with month and year, you could :
determine the month and year of the newly received file
delete all existing records already stored
append the new data
If you have only modifications, you have to delete only records modified. try with an exist clause
I understand he gets a modified file but only the data from the data in the filename to the end of the month should be read. The example tables he gave were pretty clear.
Try this it will show new reloaded file data if your data is daily updated.
instead of replacing file only fetch current date file.
Set vCurrDate=Date(Date(today()),'DD.MM.YYYY');
t1:
Load *
From 'C:\data\File_$(vCurrDate).xlsx';
Thanks for your suggestions
I've got the expected result based on the following steps, but I am not sure in this way.
What do you think of it?
1. Table for files headers
FOR Each File in filelist ('$(vRoot)'&'\*.xlsx')
Files:
Load
Date(right( Subfield( '$(File)' , '.xlsx', 1),10),'DD.MM.YYYY') as FileDate,
num(Month(Date(right( Subfield( '$(File)' , '.xlsx', 1),10),'DD.MM.YYYY'))) as FileMonth
autogenerate 1;
next File
2. Are there any changes?
isChangeTable:
load *,
if( FileMonth = peek(FileMonth) , 1, 0) as isChange
Resident Files
Order by FileDate asc;
DROP Table Files;
3. Load the Data (based on these two files)
for i = 1 to 2
let FileDate = 'RouteList-' & FieldValue('FileDate', $(i));
let isChange = FieldValue('isChange', $(i));
Data:
Load
Store,
Week,
Day, // Thu, Fri,...
Flag, // 1 or null
Month(Date( Date#( right( FileBaseName(), 10) , 'DD.MM.YYYY'), 'DD-MM-YYYY')) as Month,
Year(Date( Date#( right( FileBaseName(), 10) , 'DD.MM.YYYY'), 'DD-MM-YYYY')) as Year
FROM
4. Joining dates
left Join(Data)
...
5. And the final stage. If there are any changes on the next step - replace the "Flag" value to null
Data_Final:
Load
Month,
Year,
Date,
Day,
Store,
if( $(isChange)+1 = 1 and Date >= '03.02.2018' , Null(), Flag_tmp ) as Flag
Resident Data;
DROP Table Data;
NEXT
I know it looks like very raw and "hard coded"
With my best regards,
Evgeniy
Nice to read you got this to work. It is indeed a bit hard coded.
When getting the filedates and such why don't you just get the file path/name as well so you can use this to load the data? Any reason for that?
I Think I would make a loop through all the files you listed in the table: "isChangeTable" And in the loop, peek in the next record to see if there is another file from that month. If so get the date of that file and load where Date of current file < date < that peeked date. If not just load the rest of the month.
Hope you understand what I am getting at. If it isn't clear Ill try to be more clear. Its early in the morning here.
Hello and thanks for your comments!
Do you suggest to replace my final 5th stage like this?
for i = 1 to <count_of_files>
...
let nextDate = FieldValue('FileDate', $(i)+1);
let currentDate = FieldValue('FileDate', $(i));
let isChange = FieldValue('isChange', $(i)+1);
... // data loading, joining dates
5th stage:
IF isChange = 1 THEN
NoConcatenate
Data_Final:
Load *
Resident Data where Date < '$(nextDate)' ;
DROP Table Data;
ELSE
Concatenate (Data_Final)
Load *
Resident Data where Date >= '$(currentDate)' ;
DROP Table Data;
ENDIF
P.S. I've got the same (expected) result
With my best regards,
Evgeniy
I would not even load the data from the file.
So in the 3th part where u load the data use: WHERE Date<'$(nextDate)' AND Date >= '$(currentDate)' ;
Not completely sure what u are doing with the 4th stage. u dont say what you are joining.
Data files don't contain of any "date" field by default. So in the 4th stage I join "Dates" to the existing data