Anonymous
Not applicable

Point me towards the correct components?

Talend comes with soooo many components, which is great, but as a noobie I don't know where to start looking for the tools that might help me accomplish the following:

I've successfully built a job in TOS that pulls data from a folder's worth of CSV files and copies it to a MySQL database using tFileList, tFileInputDelimited, etc.

What I'd like to do now is log the name of each file that gets copied and the datetime it was copied. Then, when the job is run in the future, I want to consult that log and limit my export to the files which either haven't been copied or have been edited since the last time they were copied. So, a couple questions:

  1. At the end of each file iteration, how can I generate a single row containing the filename and current time to insert into a db?
  2. Within each iteration, how can I compare the current file's name and last-modified date against the data in the transaction log, and then short-circuit the iteration if appropriate?
7 Replies
fdenis
Master

After your iteration, use an OnSubjobOk link to a tRowGenerator:
1 row, one String column, with the value (String)globalMap.get("tFilelist1_FILE……").
From there you know how to insert it into the db.

For the second point:
after the file iteration, add a tMap with a lookup on tMysqlInput.

good luck
Anonymous
Not applicable
Author

Not sure I understand what you're saying for my second point. The only output from tFile is an iterate row and that is not a valid input for a tMap, so I don't understand what you mean when you say "after file iteration add tmap".

Anonymous
Not applicable
Author

In regards to your first point. I have added a tRowGenerator element but I don't see any way of calling globalMap.get("tFilelist1_FILE……").


Jesperrekuh
Specialist

I suggest reading the manual, specifically:
- OnComponentOk vs OnSubjobOk
- tLogCatcher, tStatCatcher, tFlowMeter
In the case of files:
- tFileProperties, which can generate an MD5 hash.

For tracing what happened and for job restarts, always (yes, always) create 2 or 3 additional columns:
- SRC_LOAD_DT: fill it with TalendDate.getCurrentDate()
- JOB_PID: fill it with pid (which is the process identifier)
- MD5_FILENAME: contains the hash from tFileProperties
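
To make the MD5 idea concrete, here is a minimal plain-Java sketch of how such a hash can be computed with the standard library. It is not what tFileProperties does internally; the class and method names (AuditColumns, md5Hex, md5OfFile) are made up for illustration, and it assumes the hash is taken over the file's contents.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class AuditColumns {
    // Hypothetical helper: returns the MD5 digest of the bytes as lowercase hex.
    public static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: hashes the whole file content (assumption: small files).
    public static String md5OfFile(Path file) throws IOException {
        return md5Hex(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        // A throwaway file stands in for one of the CSV files.
        Path tmp = Files.createTempFile("sample", ".csv");
        Files.write(tmp, "id,name\n1,foo\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("MD5_FILENAME = " + md5OfFile(tmp));
        Files.delete(tmp);
    }
}
```

A changed hash for a file name that is already in the log is a sign the file was edited since the last load.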
fdenis
Master

Select "…" and, in the value, enter (String)globalMap.get("tFileList_1_CURRENT_FILE")
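
In case the expression looks cryptic, here is a rough plain-Java sketch of what it does. It assumes globalMap behaves like a map from String to Object, which is why the cast to String is needed. The map below is filled by hand as a stand-in for what tFileList would provide, and the class name, method name and file name are hypothetical.

```java
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

public class FileLogRow {
    // Hypothetical helper: builds the two values of one log row,
    // the current file name and the time it was copied.
    public static Object[] buildRow(Map<String, Object> globalMap, Date now) {
        // Values come back as Object, hence the cast.
        String fileName = (String) globalMap.get("tFileList_1_CURRENT_FILE");
        return new Object[] { fileName, now };
    }

    public static void main(String[] args) {
        // Stand-in for the job's globalMap (assumption: filled by tFileList at run time).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileList_1_CURRENT_FILE", "orders.csv"); // hypothetical file name

        Object[] row = buildRow(globalMap, new Date());
        System.out.println(row[0] + " copied at " + row[1]);
    }
}
```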
fdenis
Master

You have a folder with your CSV files, so I guess you use tFileList - iterate - tFileInput.
On this tFileInput, add an OnSubjobOk link to write the file name to your db.

Later, when you want to load only new files, insert a tMap (used as a filter) on the row link between tFileInput and tMysqlOutput.

fdenis
Master

The best way to do that is to link tFileList directly to a tIterateToFlow. Use a tMap and tMysqlInput to filter the files to upload, then link this file list to a tFlowToIterate to load the data and add the file name to the db.
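
The filter itself boils down to one comparison per file. Here is a minimal plain-Java sketch of that rule, assuming the log table has been read into a map from file name to the time of the last copy (in milliseconds). The class name, method name and sample values are hypothetical, and in the job the same condition would live in the tMap filter expression rather than in a separate class.

```java
import java.util.HashMap;
import java.util.Map;

public class CopyFilter {
    // Hypothetical helper: a file needs copying if it was never logged,
    // or if it was modified after the last time it was copied.
    public static boolean needsCopy(String fileName, long lastModifiedMillis,
                                    Map<String, Long> copiedAtMillis) {
        Long lastCopy = copiedAtMillis.get(fileName);
        return lastCopy == null || lastModifiedMillis > lastCopy;
    }

    public static void main(String[] args) {
        // Stand-in for the rows read from the log table (made-up values).
        Map<String, Long> log = new HashMap<>();
        log.put("orders.csv", 1_000L);

        System.out.println(needsCopy("orders.csv", 500L, log));    // unchanged since last copy
        System.out.println(needsCopy("orders.csv", 2_000L, log));  // edited after last copy
        System.out.println(needsCopy("customers.csv", 500L, log)); // never copied
    }
}
```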