Anonymous
Not applicable

How to process the 2 latest files from a folder based on modified date

I have a scenario where an external system creates a set of files in a folder every 10 minutes or so. How can I process only the 2 latest files out of all the files present in the folder, sorted by file creation/modified date in descending order?

 

Ex : C:/Sample/Incoming_Files -> Emp_YYYYMMDDHHMMSS is the file format

 

Emp_20190801001534.csv

Emp_20190801041522.csv

Emp_20190801031555.csv

Emp_20190801001504.csv

Emp_20190801101517.csv

 

How can I process the 2 latest files from the folder based on modified date?

Could someone please make it quick?

 

Thanks,

Kiran Kumar

2 Replies
Ganshyam
Creator II

Hello,

 

Make use of tFileProperties, which will give you the modified date and time of the file.
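For anyone who wants to see the modified-date idea outside of a Talend component, here is a minimal plain-Java sketch. It is not what tFileProperties does internally; it only illustrates reading each file's last-modified time, sorting newest first and keeping the first two. The class and method names (LatestByModifiedTime, newestFiles, modifiedMillis) are made up for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Talend: picks the newest files by modified time.
class LatestByModifiedTime {

    // Returns up to `limit` regular files from `dir`, most recently modified first.
    static List<Path> newestFiles(Path dir, int limit) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .sorted(Comparator.<Path>comparingLong(LatestByModifiedTime::modifiedMillis).reversed())
                    .limit(limit)
                    .collect(Collectors.toList());
        }
    }

    // Last-modified time in epoch milliseconds; wraps the checked exception
    // so the method can be used inside a comparator.
    static long modifiedMillis(Path file) {
        try {
            return Files.getLastModifiedTime(file).toMillis();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling newestFiles with the incoming folder and a limit of 2 would return the two most recently modified files. Note that this relies on the file system's modified time, which can differ from the timestamp embedded in the file name.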

 

Thanks

Ganshyam 

Anonymous
Not applicable
Author

There are a few steps to this. I have created an example job which can be seen below....

0683p000009M7RE.png

First of all, use a tFileList to monitor your folder and retrieve CSV files. Using this component is described in the documentation.

 

The next component is a tJavaFlex. Create two output columns for it: a filedate column (Date) and a filepath column (String). You need to convert the timestamp in the filename to a date and pass the filepath through unchanged. This is done in the Main code section. The code I have used is below....

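// File name of the current tFileList iteration, e.g. "Emp_20190801001534.csv"
String filename = ((String)globalMap.get("tFileList_1_CURRENT_FILE"));
// Strip the 4-character "Emp_" prefix and the ".csv" extension, leaving only the timestamp
filename = filename.substring(4, filename.length()-4);

// Parse the timestamp into a Date and pass the full file path through unchanged
row1.filedate = routines.TalendDate.parseDate("yyyyMMddHHmmss", filename);
row1.filepath = ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH"));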
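To check the substring and date pattern outside of Talend, the same parsing can be sketched in plain Java. This is only an illustration under assumptions: it uses java.time instead of routines.TalendDate, the class name FileNameTimestamp is made up, and it assumes every file name has a 4-character prefix and a 4-character extension as in the example.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-alone check of the file name parsing; not Talend code.
class FileNameTimestamp {

    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Same slicing as the tJavaFlex code: skip "Emp_" and drop ".csv".
    static LocalDateTime parse(String fileName) {
        String stamp = fileName.substring(4, fileName.length() - 4);
        return LocalDateTime.parse(stamp, STAMP);
    }

    public static void main(String[] args) {
        // prints 2019-08-01T10:15:17
        System.out.println(parse("Emp_20190801101517.csv"));
    }
}
```

A file name that does not follow the Emp_YYYYMMDDHHMMSS.csv layout would make the parse throw an exception, so it may be worth restricting the tFileList filemask to that pattern.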

After that, the records are sent to a tHashOutput set with the "Append" option on.

 

In the next subjob, you start with a tHashInput component. This is linked to the tHashOutput and has the same schema.

Join that to a tSortRow component. My config is shown below...
0683p000009M7RJ.png

This sorts the files by date.

 

The next component is the tMap. This is used to keep only the two newest files. I do this by creating a "count" variable in the tMap, populated with the Numeric.sequence method, so each record is assigned a number as it passes through the tMap. The tMap output filter then only lets through records with a count less than 3. This can be seen below...

0683p000009M7RO.png

After this, I have a tLogRow simply to show the result. 
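The sort and filter steps can also be sketched in plain Java to see the expected result for the sample files in the question. This is a rough stand-in rather than the job itself: the list sort plays the role of tSortRow (newest first, which is the order the question asks for), the counter plays the role of the tMap sequence variable, and the class and method names are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical plain-Java stand-in for the tSortRow + tMap steps.
class TwoNewestFiles {

    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Extracts the timestamp from a name such as "Emp_20190801001534.csv".
    static LocalDateTime fileDate(String fileName) {
        return LocalDateTime.parse(fileName.substring(4, fileName.length() - 4), STAMP);
    }

    static List<String> newest(List<String> fileNames, int keep) {
        List<String> sorted = new ArrayList<>(fileNames);
        // Sort step: order by the parsed date, newest first.
        sorted.sort((a, b) -> fileDate(b).compareTo(fileDate(a)));

        List<String> kept = new ArrayList<>();
        int count = 0; // stands in for the sequence number assigned in the tMap
        for (String name : sorted) {
            count++;
            if (count <= keep) { // same effect as "count less than 3" when keep is 2
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "Emp_20190801001534.csv", "Emp_20190801041522.csv",
                "Emp_20190801031555.csv", "Emp_20190801001504.csv",
                "Emp_20190801101517.csv");
        // prints [Emp_20190801101517.csv, Emp_20190801041522.csv]
        System.out.println(newest(names, 2));
    }
}
```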

Note: I have just realised that the example is set up for the two oldest files. Just switch the tSortRow component's sort order to desc to get the two newest.