Hi,
My dashboard currently loads a folder full of monthly Excel reports from a network share. Each row has a numeric identifier that I want to be unique in the final loaded data. The problem is that the same ID can occur again in later reports, where other fields on the same row have been updated. A normal distinct load doesn't work, since it only keeps the first matching row, which leaves my data partly out of date.
The reports' naming convention is Report-YYMM, which could maybe be used to determine the load order. The only other thing I can think of is renaming the files so that they are loaded from latest to oldest.
Any ideas?
Hi,
You can use FileTime(), which returns the time a file was last modified, to sort the files.
Thanks for the suggestion! I'd still have to loop through all of the files in the folder to find the modified date and then load them in the sorted order. I'm not sure how to do that in the load script.
I think the best approach is to loop over the Excel files from newest to oldest,
using "not Exists(IDField)" so that a row from an older file is skipped when its ID has already been loaded.
Here is a script you can try:
// Get the list of Excel files in a directory
set vDirectory = 'E:\SomeDirectory';
for each vFoundFile in filelist('$(vDirectory)\*.xls')
    Files:
    Load
        '$(vFoundFile)' as FileName,
        FileTime('$(vFoundFile)') as Timestamp
    AutoGenerate 1;
next
// Sort the files by timestamp, newest first
SortedFiles:
Load
    FileName as FileNameSorted
Resident Files
Order By Timestamp desc;
// Load data from the Excel files listed in table SortedFiles.
// "embedded labels" assumes the first row of each sheet holds the field names;
// with "no labels" the fields would be named @1, @2, ... instead.
for i = 0 to NoOfRows('SortedFiles') - 1
    let iExcelFile = Peek('FileNameSorted', i, 'SortedFiles');
    // Keep only rows whose ID was not already loaded from a newer file
    Data:
    Load
        IDField, Field1, Field2
    FROM
        [$(iExcelFile)] (biff, embedded labels, table is [SheetName$])
    Where not Exists(IDField);
next
// Remove the helper tables from the data model
Drop Tables Files, SortedFiles;
Thanks! That did the trick after some tweaks. I decided to use the suffix in the filename instead of FileTime(), since someone might accidentally edit old reports.
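The tweaked script was not posted, so the following is only a minimal sketch of how the sort key could be taken from the filename instead of FileTime(). It assumes every file is named Report-YYMM.xls, as described in the question. The field name Period is made up for illustration, and the directory is the same placeholder used in the script above.

```
// Sketch only: build the file list with a sort key taken from the filename.
// Assumes names like Report-1203.xls; Period is a hypothetical field name.
set vDirectory = 'E:\SomeDirectory';

for each vFoundFile in filelist('$(vDirectory)\Report-*.xls')
    Files:
    Load
        '$(vFoundFile)' as FileName,
        // Take the last path segment, strip the extension, keep the last
        // four characters (YYMM) and read them as a number
        Num#(Right(SubField(SubField('$(vFoundFile)', '\', -1), '.', 1), 4), '0000') as Period
    AutoGenerate 1;
next

// Newest period first
SortedFiles:
Load
    FileName as FileNameSorted
Resident Files
Order By Period desc;

Drop Table Files;
```

The loop that loads the data with "Where not Exists(IDField)" can then stay as it is. A two-digit year only sorts correctly while all reports fall within the same century.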