I have googled this to death and am not sure whether I can do this with Talend, or whether I am asking about the wrong components. My Talend job goes to a folder with numerous dated Excel files and pulls the two most recent. I was hoping I could load them both into the same Postgres table via tPostgresqlOutput. However, in the tFileInputExcel component I am using ((String)globalMap.get("MostRecent")), and it therefore pulls only one file. Is there a way to pull both files? Here are the current components:
tPostgresqlConnection, trigger OK to tFileList_1 (pointing at the folder with multiple Excel files), iterate to tIterateToFlow_2, main to tBufferOutput_1, main to tLogRow_2. tBufferInput_1 is also triggered by tFileList_1, then main to tExtractRegexFields_1, main to tSortRow_1, main to tLogRow_3, main to tSampleRow_1 with "1..2", main to tLogRow_4, main to tFlowToIterate_1, whose basic settings have key "MostRecent" and value = CURRENT_FILEPATH. The second row in the customize box has key = "LastFileName" and value = "CURRENT_FILE". I am sure this code can be changed, but I'm not a Java person, so I really have no clue. I tried to take a snapshot of the Talend job but could not paste it in.
Thank you for your help.
What @TRF has said is a good solution.
Using the tFileList you can order the files returned by "modified date". With a little experimentation (for example, choosing descending order so that the newest files come first), this lets you retrieve a list of filenames in date order. Connect the tFileList to a tJava component with the following code:
int count = 0;
// Check whether "count" already exists in the globalMap.
if (globalMap.get("count") != null) {
    // Add 1 to the current "count" value.
    count = ((Integer) globalMap.get("count")).intValue() + 1;
} else {
    // First file: create the initial "count" value.
    count = 1;
}
// Update or set the globalMap value.
globalMap.put("count", count);
Now connect your tJava to the tFileInputExcel using a RunIf link. In the RunIf expression use the following logic:
((Integer)globalMap.get("count")).intValue()<=2
Then (if the rest of your job is configured OK) you should be able to limit the file read to the most recent two files.
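To see what the counter and the RunIf condition do together, here is a minimal standalone sketch. It is not Talend-generated code: a plain HashMap stands in for globalMap, the loop stands in for tFileList iterating in newest-first order, and the class name, method names, and file names are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RunIfCounterSketch {

    // Same logic as the tJava snippet: increment "count", or create it as 1.
    static int nextCount(Map<String, Object> globalMap) {
        int count = 1;
        if (globalMap.get("count") != null) {
            count = ((Integer) globalMap.get("count")).intValue() + 1;
        }
        globalMap.put("count", count);
        return count;
    }

    // Simulates the iteration: each file bumps the counter, and the
    // RunIf-style condition lets only the first two files through.
    static List<String> firstTwo(List<String> filesNewestFirst) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String> read = new ArrayList<>();
        for (String file : filesNewestFirst) {
            nextCount(globalMap);
            if (((Integer) globalMap.get("count")).intValue() <= 2) {
                read.add(file); // this is where tFileInputExcel would run
            }
        }
        return read;
    }

    public static void main(String[] args) {
        // Hypothetical file names, already sorted newest first.
        List<String> files = Arrays.asList(
                "report_2024-03.xlsx", "report_2024-02.xlsx", "report_2024-01.xlsx");
        System.out.println(firstTwo(files));
    }
}
```

Note that the sketch only picks the two most recent files if the list really is ordered newest first; with ascending order it would pick the two oldest.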
OK, it sounds like you may already have an approach that doesn't require my method, but I'm afraid I cannot see your attachment. You should be able to insert an image of your job by clicking on the "photos" button here and selecting your screenshot.
Thanks to rhall for telling me how to upload pictures. You are probably thinking by now that I should be looking for a new career. Here is the job in total, and then the two components I thought needed different code.
Ah, I see. If you have already narrowed the files down to the two you want by the time you get to the tFlowToIterate, you simply need to connect the tFlowToIterate to the Excel component using an iterate link. This will run the last subjob once for every file (in this case, twice).
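As a rough illustration of what the iterate link does, here is a small standalone sketch. It assumes, as in the job described above, that tFlowToIterate stores each row's path under the key "MostRecent" before the downstream subjob runs. The HashMap, class name, and file paths are stand-ins for illustration only, not actual Talend internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IterateLinkSketch {

    // One pass of the loop corresponds to one iteration of the iterate link.
    static List<String> runPerRow(List<String> paths) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        List<String> opened = new ArrayList<>();
        for (String path : paths) {
            // tFlowToIterate: overwrite the key with the current row's value.
            globalMap.put("MostRecent", path);
            // tFileInputExcel: re-read the key on every iteration.
            String fileName = ((String) globalMap.get("MostRecent"));
            opened.add(fileName);
        }
        return opened;
    }

    public static void main(String[] args) {
        // Hypothetical paths for the two rows that survive tSampleRow "1..2".
        System.out.println(runPerRow(Arrays.asList(
                "/data/in/report_2024-03.xlsx", "/data/in/report_2024-02.xlsx")));
    }
}
```

The key holds only one value at a time, so the expression ((String)globalMap.get("MostRecent")) yields a different file on each iteration rather than both files at once.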
I made what I thought was the correct change, but although the job runs, it only captures the last file. What did I do wrong? Do I need to change the code in the tFileInputExcel? Thanks for your patience. This is really important, as I have other Talend jobs like this where I need to do the same thing.