Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi,
I read in a loop several files and combine them into one table, let's say 3 files. Each of the files contains a field which is named [Extraction Date]. This field is fix for each file. I now want to create in the load script a new field which is called [Latest Extraction Date]. This shall be 1 if the [Extraction Date] is max, otherwise 0.
Here an example:
File 1:
[Extraction Date] = "2024-01-08T18:31:15.143156100"
File 2:
[Extraction Date] = "2024-01-09T18:31:15.143156100"
File 3:
[Extraction Date] = "2024-01-10T18:31:15.143156100"
I want to create a new field ([Latest Extraction Date]) for the combined table but this field shall only for data in File 3 be "1".
Here is what I tried (maybe much better solutions apply):
I created a new file which contains only the latest extraction date:
[LatestExtractionDate] = "2024-01-10T18:31:15.143156100"
Then, I have following load script:
LOAD
LatestExtractionDate
FROM [lib://mylib/latest.xlsx]
(ooxml, embedded labels, table is [default]);
LOAD
ID,
Title,
"Extraction Date",
IF(LatestExtractionDate=[Extraction Date],1,0) as "Latest Extraction Date"
FROM [lib://mylib/*.xlsx]
(ooxml, embedded labels, table is [RAW-Data]);
Unfortunately, all data is marked with 0 which means, it is not working. How do I get this beast running? Is there a better solution for that?
Maybe this can help you:
Data:
NoConcatenate Load
10 + RecNo() as Data,
Timestamp(Timestamp#('2024-01-08T18:31:15.143156100', 'YYYY-MM-DDThh:mm:ss.fffffffff')) as [Extraction Date] // Use Dates instead of Timestamps if the level of detail is not needed
AutoGenerate 3;
Concatenate Load
20 + RecNo() as Data,
Timestamp(Timestamp#('2024-01-09T18:31:15.143156100', 'YYYY-MM-DDThh:mm:ss.fffffffff')) as [Extraction Date]
AutoGenerate 3;
Concatenate Load
30 + RecNo() as Data,
Timestamp(Timestamp#('2024-01-10T18:31:15.143156100', 'YYYY-MM-DDThh:mm:ss.fffffffff')) as [Extraction Date]
AutoGenerate 3;
ExtractionDates:
NoConcatenate Load Distinct
"Extraction Date"
Resident Data;
Join Load
Max("Extraction Date") as "Extraction Date",
1 as Flag
Resident ExtractionDates;
Join(Data) Load Distinct
"Extraction Date",
RangeMax(Flag, 0) as "Latest Extraction Date"
Resident ExtractionDates;
Drop Table ExtractionDates;
Hi, the problem is, that the number of files is not limited to 3 as well as the Extraction Dates differ over time (although they are fix per file). I don't think that this solution is a generic one.
offtopic:
Your comment:
// Use Dates instead of Timestamps if the level of detail is not needed
How can I only use the dates? Because you are right: the detail is not needed.
Here's an example how you can extract just the date:
Date(Date#(Left('2024-01-08T18:31:15.143156100', 10), 'YYYY-MM-DD')) as [Extraction Date]
The 3 loads at the beginning were just an example. You can load the data using loops or a wildcard operator in the filename. As long as the final table has the "Extraction Date" field it, the remaining parts of the script will work.
The script above is really complex and I am not able to transfer it. Isn't there a simpler way?
This already is one of the simplest ways...
Here is a quick mockup:
Date:
NoConcatenate Load
Date(Date#(Left([Extraction Date], 10), 'YYYY-MM-DD')) as [Extraction Date]
Inline [
Extraction Date
2024-01-08T18:31:15.143156100
2024-01-09T18:31:15.143156100
2024-01-10T18:31:15.143156100
];
If your Extraction Date field is already formatted as a timestamp, you can do this instead:
Date([Extraction Date]) as [Extraction Date]
What I dont understand is your fixed dates, like
2024-01-08T18:31:15.143156100
. In my case, these are changing.