Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Join us in Toronto Sept 9th for Qlik's AI Reality Tour! Register Now
cancel
Showing results for 
Search instead for 
Did you mean: 
Wodge
Partner - Contributor III
Partner - Contributor III

Pulling in latest files with different file time stamps

Hello!

For context, this is QlikCloud, connecting to a OneDrive.

I have been looking into an odd issue in my application and I hope I can find some advice. The issue I am having is that I have the below filenames. However, the seconds increase by 1... So when I use the below variable to try and pull in File2, Qlik looks at the max date and time which is in File4 rather than finding the latest version of File2.

These files are imported into this location everyday, and I am using this variable to pull in the latest version of each file, but I am running into the above issue. 

I cannot use DD-MM-YYYY because then the FROM syntax does not pick up the file (as we are omitting seconds from the variable), and I have to use a variable because a wildcard does not work with a OneDrive Connector... 

File1 DD-MM-YYYY HH-MM-SS

File2 DD-MM-YYYY HH-MM-SS

File3 DD-MM-YYYY HH-MM-SS

File4 DD-MM-YYYY HH-MM-SS

Variable Syntax: Let vMaxDateTime = Date(Peek('MaxFileDate',0,'Date'),'DD_MM_YYYY')

Any ideas?

Labels (2)
1 Solution

Accepted Solutions
Sayed_Mannan
Creator II
Creator II

The issue seems to be that the variable vMaxDateTime is getting the maximum date across all files, not just the latest version of each individual file. One way to solve this could be to create a separate variable for each file and get the maximum date for each file separately.

However, since you're dealing with a large number of files and the seconds are increasing by one, another approach could be to round the seconds to the nearest minute or hour, so that the files from the same minute or hour are treated as the same file.

Here's how you could implement this:


SET TimestampFormat='DD-MM-YYYY hh:mm:ss';
SET DateFormat='DD-MM-YYYY';

Data:
LOAD
FileBaseName() as FileName,
timestamp#(trim(purgechar(lower(FileBaseName()),'data')),'DD-MM-YYYY hh-mm-ss') as Time
FROM
[lib://YourOneDriveConnection/*] (ooxml, embedded labels, table is Sheet1);

Max:
LOAD
FileName,
Max(Time) as MaxTime
Resident
Data
Group By
FileName;

DROP Table Data;

FOR i=0 to NoOfRows('Max')-1
LET vFileName = Peek('FileName', i, 'Max');
LET vMaxTime = Peek('MaxTime', i, 'Max');
Data:
LOAD *
FROM
[lib://YourOneDriveConnection/$(vFileName) $(vMaxTime).xlsx] (ooxml, embedded labels, table is Sheet1);
NEXT i


In this script, it first loads all the files and their timestamps. Then it finds the maximum timestamp for each file. Finally, it loads the data from each file with the maximum timestamp. 

Please try this solution and see if it resolves your issue. 

 

View solution in original post

1 Reply
Sayed_Mannan
Creator II
Creator II

The issue seems to be that the variable vMaxDateTime is getting the maximum date across all files, not just the latest version of each individual file. One way to solve this could be to create a separate variable for each file and get the maximum date for each file separately.

However, since you're dealing with a large number of files and the seconds are increasing by one, another approach could be to round the seconds to the nearest minute or hour, so that the files from the same minute or hour are treated as the same file.

Here's how you could implement this:


SET TimestampFormat='DD-MM-YYYY hh:mm:ss';
SET DateFormat='DD-MM-YYYY';

Data:
LOAD
FileBaseName() as FileName,
timestamp#(trim(purgechar(lower(FileBaseName()),'data')),'DD-MM-YYYY hh-mm-ss') as Time
FROM
[lib://YourOneDriveConnection/*] (ooxml, embedded labels, table is Sheet1);

Max:
LOAD
FileName,
Max(Time) as MaxTime
Resident
Data
Group By
FileName;

DROP Table Data;

FOR i=0 to NoOfRows('Max')-1
LET vFileName = Peek('FileName', i, 'Max');
LET vMaxTime = Peek('MaxTime', i, 'Max');
Data:
LOAD *
FROM
[lib://YourOneDriveConnection/$(vFileName) $(vMaxTime).xlsx] (ooxml, embedded labels, table is Sheet1);
NEXT i


In this script, it first loads all the files and their timestamps. Then it finds the maximum timestamp for each file. Finally, it loads the data from each file with the maximum timestamp. 

Please try this solution and see if it resolves your issue.