Anonymous
Not applicable

Dynamic Job, Single job for multiple files, Multiple Schema, etc.

Hi Everyone,

 

One of the questions I get whenever I visit a client for a Talend project or assignment is:

Can I have one single job that processes multiple files with different schemas, formats, etc.? For example, I should be able to carry out the following transformations:

1) Read Data from File

2) Sort based on column(s)

3) Filter records on some condition(s)

4) Aggregate data

5) Store it in an individual table (each file will have a separate table)

 

Which file to process, which column(s) to sort on, the condition used to filter data, and the columns to aggregate: all of this information should be passed to the Talend job through a configuration file or table.
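To make the request concrete, here is a minimal sketch of a configuration-driven pipeline covering steps 1 to 4. It is written in plain Python, not Talend, and only illustrates the idea. Every name in it (the CONFIG keys, run_pipeline, the sample file and table names) is an assumption made up for illustration, and step 5 is reduced to returning the target table name together with the aggregated rows.

```python
import csv
import operator
from collections import defaultdict

# Hypothetical configuration entry; all key names are assumptions for illustration.
CONFIG = {
    "source": "sales.csv",
    "target_table": "sales_summary",
    "sort_by": ["region"],
    "filter": {"column": "amount", "op": ">", "value": 100},
    "group_by": ["region"],
    "sum_column": "amount",
}

# Comparison operators the filter condition may name.
OPS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def run_pipeline(config, lines):
    """Read CSV lines, then sort, filter and aggregate them as the config says."""
    # 1) Read: the column names come from the header row, not from the code.
    rows = list(csv.DictReader(lines))
    # 2) Sort on the configured column(s), compared as strings.
    rows.sort(key=lambda row: [row[col] for col in config["sort_by"]])
    # 3) Filter on the configured numeric condition.
    cond = config["filter"]
    keep = OPS[cond["op"]]
    rows = [r for r in rows if keep(float(r[cond["column"]]), cond["value"])]
    # 4) Aggregate: sum one column per group.
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[col] for col in config["group_by"])
        totals[key] += float(row[config["sum_column"]])
    # 5) Storing is left out; return the table name and the aggregated data.
    return config["target_table"], dict(totals)
```

With a real file this could be called as run_pipeline(CONFIG, open(CONFIG["source"], newline="")). A second file with a different schema would only need a second configuration entry, which is the behaviour I am asking about.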

 

Those who have worked with other ETL tools such as Pentaho or Ab Initio will know this is very much possible with those tools. In Pentaho, the metadata injection feature allows you to achieve this.

 

I understand there is a feature available in the form of dynamic schema (Enterprise Edition), but it does not really allow you to implement the use case mentioned above.

 

I have had a hard time explaining this to one of my clients. But on second thought, it appears a feature like this would be useful in situations where multiple source files need to go through a fixed set of transformations.

 

Therefore, I just want to understand what community members and the Talend team think about this.

2 Replies
Anonymous
Not applicable
Author

Hi
In Talend, there is a different component to read each type of file format, and the columns to be sorted and aggregated must be defined at design time; this information can't be passed at runtime. So it is impossible to achieve this use case with a single generic job.

Regards
Shong
Anonymous
Not applicable
Author

Hi Shong,

 

Thank you very much for the quick response. Here are a few reasons I mentioned to the client. The more generic a job you try to design:

1) Overall design becomes complex and difficult to maintain

2) Testing such jobs also becomes difficult

3) A massive configuration table / file means you need to train people to provide accurate information to the job

4) If such a job breaks down, debugging also becomes challenging
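As a rough illustration of point 3, a generic job would need some check like the one below before it runs, so that a mistyped column name in the configuration is reported up front and does not fail halfway through. This is a plain Python sketch, not a Talend feature. The key names in REQUIRED_KEYS and the function validate_entry are hypothetical.

```python
# Hypothetical set of keys a configuration entry must provide.
REQUIRED_KEYS = {"source", "target_table", "sort_by", "filter", "group_by", "sum_column"}


def validate_entry(entry, available_columns):
    """Return a list of problems; an empty list means the entry looks usable."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - entry.keys())]

    # Collect every column name the entry refers to.
    referenced = list(entry.get("sort_by", [])) + list(entry.get("group_by", []))
    if "sum_column" in entry:
        referenced.append(entry["sum_column"])
    cond = entry.get("filter")
    if isinstance(cond, dict) and "column" in cond:
        referenced.append(cond["column"])

    # Report each unknown column once, in the order it was first referenced.
    for col in dict.fromkeys(referenced):
        if col not in available_columns:
            problems.append(f"unknown column: {col}")
    return problems
```

Even with a check like this, someone still has to read the messages and correct the configuration, so the training effort does not go away.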

 

Thanks,

Nishad Joshi.