Skip to main content
Woohoo! Qlik Community has won “Best in Class Community” in the 2024 Khoros Kudos awards!
Announcements
Nov. 20th, Qlik Insider - Lakehouses: Driving the Future of Data & AI - PICK A SESSION
cancel
Showing results for 
Search instead for 
Did you mean: 
martin_hamilton
Creator
Creator

Load latest file based on file prefix - script

Hi

I have data refreshing on a daily basis into a directory but I only wish to load the latest file based on the prefix of the file name e.g.

20180101FILENAME1.csv

20180102FILENAME1.csv

20180103FILENAME1.csv

20180104FILENAME1.csv

20180101FILENAME2.csv

20180102FILENAME2.csv

20180103FILENAME2.csv

20180104FILENAME2.csv

so in the above example I would only load the following:

20180104FILENAME1.csv

20180104FILENAME2.csv

I dont want to use the file properties as this could be the same for all files, I need to use the file name.

Many Thanks

Martin 

7 Replies
Gysbert_Wassenaar

Something like this:

For each vFile in FileList('D:\ABC\XYZ\Data\*.csv')

    LET vFileName = SubField('$(vFile)','\',-1);

    LET vDate = Date(Date#(Left('$(vFileName)',8),'YYYYMMDD'),'YYYYMMDD');

    tmpFiles:

    LOAD '$(vFile)' as File,  '$(vFileName)' as FileName, '$(vDate)' as Date AutoGenerate 1;

next

Files:

LOAD Max(Date) as Date RESIDENT tmpFiles;

LEFT JOIN (Files)

LOAD File, Date RESIDENT tmpFiles;

DROP TABLE tmpFiles;

For i = 0 to NoOfRows('Files') -1

LET vSourceFile = peek('File',$(i),'Files')

    data:

    LOAD * FROM [$(vSourceFile)] (txt, ....);

Next


talk is cheap, supply exceeds demand
martin_hamilton
Creator
Creator
Author

Thanks Gysbert, I plan to look into applying the script changes tomorrow.

thanks Martin

martin_hamilton
Creator
Creator
Author

Hi Gysbert

I dont think I am fully understanding the script, I have implemented the following but its throwing an error saying "

The following error occurred:

Table 'tmpFiles' not found

The error occurred here:

Files: LOAD Max(Date) as Date RESIDENT tmpFiles

For each vFile in FileList('C:\Users\martin.hamilton\Documents\Qlik\Sense\Data Directory\FileListByDate\*.csv');


LET vFileName = SubField('$(vFile)','\',-1);

LET vDate = Date(Date#(Left('$(vFileName)',8),'YYYYMMDD'),'YYYYMMDD');


tmpFiles:


Load '$(vFile)' as File,  '$(vFileName)' as FileName, '$(vDate)' as Date AutoGenerate 1;

   

next


Files:


LOAD Max(Date) as Date RESIDENT tmpFiles;


LEFT JOIN (Files)


LOAD File, Date RESIDENT tmpFiles;


DROP TABLE tmpFiles;


For i = 0 to NoOfRows('Files') -1


LET vSourceFile = peek('File',$(i),'Files');


data:


LOAD

*


FROM [$(vSourceFile)]

(txt, codepage is 28599, embedded labels, delimiter is ',', msq);


next


Any advice?

Gysbert_Wassenaar

That looks like an error from Qlik Sense. Qlik Sense uses LIB references to files instead of paths like Qlikview does. You'll have to define a folder connection in Qlik Sense first to the folder that contains your csv files. Then use the LIB reference instead: For each vFile in FileList('LIB://MyDataDirectory\FileListByDate\*.csv');

MyDataDirectory is in this example the name of the folder connection to C:\Users\martin.hamilton\Documents\Qlik\Sense\Data Directory\FileListByDate\


talk is cheap, supply exceeds demand
martin_hamilton
Creator
Creator
Author

Thats all worked fine for a single file name with differing prefixes but when I have multiple files names to load it only picks the first one in the tmpFiles as opposed to uniquely pulling each one out.

Do I have to have multiple Load statements for each file? I notice that the variable vSourceFile only contains a single file so subsequently it doesnt know about the others although it has listed them in table Files (i think this is based on the use of Peek)

So as an example I would load the following tables based on the combination of date and file name:

example.JPG

Thanks

Martin

Gysbert_Wassenaar

Ah, more details. Wish you could have added those earlier. So we're not dealing with dates, but timestamps. And you don't want the latest date that's the same for the different kind of files, but latest timestamp per type of file, which will be different for the different kind of files.

So not LET vDate = Date(Date#(Left('$(vFileName)',8),'YYYYMMDD'),'YYYYMMDD');

But LET vDate = TimeStamp(TimeStamp#(Left('$(vFileName)',8),'YYYYMMDDhhmmss'),'YYYYMMDDhhmmss');


And you need the kind of file as well: LET vFileKind = Mid(SubField('$(vFile)','\',-1),15) as FileKind;


Then you want the maximum timestamp per FileKind:


Files:

LOAD FileKind, Max(Date) as Date RESIDENT tmpFiles GROUP BY FileKind;




talk is cheap, supply exceeds demand
priyasawant
Creator II
Creator II

can anyone help me i am having same kind of issue..

I am having a QVD extractor which extracts QVD as per the Business Date & Business Name

for e.g.

20200930_ABC_(Reload date and timestamp) like 20201120 18:33:10

20200930_DCS_20201120 18:40:20

so on as and when i reload the app i have a new created generated for 4 different Business, but now into my main App i have to load all the Business data along with the latest reload date.