shamu
Contributor

Apply a schema dynamically to data

Hi Everyone

I'm relatively new to Talend, and I am trying to work out if it's possible to build a "generic" job to extract data from MSSQL to a delimited text file, and vice versa: a "generic" job to load from a text file into a SQL database (or any other target that has a data-typed structure, e.g. Excel).

I guess what I am asking is whether it is possible to "apply" a schema to the data once a job has started, so that I do not have to specify a schema at design time — e.g. a component which can read the schema from a metadata file and apply it to a delimited data flow?

For the generic extract, I have been able to somewhat achieve this by using SQL trickery to output a single delimited row to a target text file. This is OK; however, it would be better if I could also output the data to, e.g., an Excel file with its data types intact.

I thought it might be possible with dynamic schemas; however, from what I have been able to ascertain so far, it seems that at some point you need to specify (i.e. hard-code) in the job the schema to either write to or extract from a dynamic column. You don't seem to be able to pass the values as a parameter, and there is no component to apply a schema from an external file?

Is this at all possible with standard components, or am I going to have to improve my Java skills and write my own?
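For what it's worth, the core of the idea — reading a schema from a metadata file at runtime and applying it to a delimited row — can be sketched in a few lines of plain Java. This is only a toy illustration; the `name:Type` metadata format and the class/method names are made up, not Talend components:

```java
import java.util.*;

public class DynamicSchemaSketch {
    // Parse a metadata line like "id:Integer;name:String;amount:Double"
    // into an ordered map of column name -> type name.
    static LinkedHashMap<String, String> parseSchema(String meta) {
        LinkedHashMap<String, String> schema = new LinkedHashMap<>();
        for (String col : meta.split(";")) {
            String[] parts = col.split(":");
            schema.put(parts[0], parts[1]);
        }
        return schema;
    }

    // Apply the schema to one delimited row: convert each string field
    // to the declared Java type so downstream steps keep typing intact.
    static List<Object> applySchema(LinkedHashMap<String, String> schema, String row) {
        String[] fields = row.split(",", -1);
        List<Object> typed = new ArrayList<>();
        int i = 0;
        for (String type : schema.values()) {
            String raw = fields[i++];
            switch (type) {
                case "Integer": typed.add(Integer.valueOf(raw)); break;
                case "Double":  typed.add(Double.valueOf(raw));  break;
                default:        typed.add(raw);
            }
        }
        return typed;
    }
}
```

A component that did this generically would essentially be this loop driven by an external file rather than a design-time schema.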

4 Replies
JohnRMK
Creator II

Hello,

Which version of Talend are you using?

Open Studio or the paid version?

 

In the paid version you already have the dynamic schema feature, and there is almost nothing to program.

Let me know your version and I can guide you in detail.

shamu
Contributor
Author

Hi JohnRMK

 

I am using the subscription version 6.4.

 

I do have the dynamic schema, but as I mentioned, I don't know that it is flexible enough, as you still need to define a schema somewhere within the job; you cannot supply it from an external source. Even with something like the positional file example provided on the Talend help portal (which is about as close as I've seen to what I would like to achieve), it lets you define a schema with tSetDynamicSchema, but that dynamic schema can then only be used in a limited number of other components. And even then, it looks like if you wanted to actually output your data to a target data object, you would still need to define a schema in that output?


JohnRMK
Creator II

Hello,

Sorry for the delay. In fact, the dynamic schema lets you define just one column when the job is designed; at runtime, it performs the mapping automatically.

 

A use case:

Ingesting several hundred CSV files into several tables in a database:

- Each CSV file has a different schema (Person, Customer, Lead or Document, for example).

- The classic approach is to develop a job for each table, with a different tMap each time, so there are as many Talend jobs as there are different schemas.

- With the dynamic schema, you can do the work in one job, because at execution time it retrieves the schema and does the mapping automatically.

You can also migrate an entire database in a single job.

It's very powerful and it saves dev time.
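To illustrate what "retrieves the schema at runtime" means in the CSV case, here is a toy Java sketch that infers a schema from a file's header and first data row. It is only a rough analogy for what the dynamic schema does internally; the class and method names are invented:

```java
import java.util.*;

public class CsvSchemaInference {
    // Infer column names from the header line and crude types from the
    // first data row -- a toy version of what a runtime schema
    // detection step would do, so one job can handle many files.
    static LinkedHashMap<String, String> infer(String header, String firstRow) {
        String[] names = header.split(",");
        String[] values = firstRow.split(",", -1);
        LinkedHashMap<String, String> schema = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            schema.put(names[i], guessType(values[i]));
        }
        return schema;
    }

    // Try the narrowest type first, fall back to String.
    static String guessType(String v) {
        try { Integer.parseInt(v); return "Integer"; } catch (NumberFormatException ignored) {}
        try { Double.parseDouble(v); return "Double"; } catch (NumberFormatException ignored) {}
        return "String";
    }
}
```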

 

Can you give me your use case and what you want to accomplish, so I can identify the problem?

 

Two small limitations of the dynamic schema:

- You cannot do any transformations unless you use Java code.

- There are some limits at the database level with the "max" size, which it cannot detect; but you can use a Java routine to automatically generate the table, then the schema, to retrieve the data.
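As a sketch of the kind of Java routine that could generate the target table from an inferred schema, something like this could work. The type mapping and all names here are illustrative assumptions, not a Talend-provided routine:

```java
import java.util.*;

public class TableDdlSketch {
    // Build a CREATE TABLE statement from an ordered name -> Java-type
    // map, using a rough mapping to SQL Server column types.
    static String createTable(String table, LinkedHashMap<String, String> schema) {
        Map<String, String> sqlTypes = new HashMap<>();
        sqlTypes.put("Integer", "INT");
        sqlTypes.put("Double", "FLOAT");
        sqlTypes.put("String", "NVARCHAR(255)");
        StringJoiner cols = new StringJoiner(", ");
        for (Map.Entry<String, String> e : schema.entrySet()) {
            cols.add(e.getKey() + " " + sqlTypes.getOrDefault(e.getValue(), "NVARCHAR(MAX)"));
        }
        return "CREATE TABLE " + table + " (" + cols + ")";
    }
}
```

Running the generated DDL before the load is what lets a single job create whatever table the incoming file needs.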

I remain available if you have any questions.

shamu
Contributor
Author

Thanks for your reply, John.

 

I'd be grateful if you could explain how to set up the dynamic schema for the following scenarios:

  • Processing over 100 different SQL queries out to delimited files. The queries have varied schemas (some are probably shared), ranging from one output column to around 20 columns.
  • As above, but outputting the data to an existing Excel file where I can keep the format of the incoming columns (e.g. output decimals as decimals, not text).
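In a generic extract along these lines, each query's column names and types would normally come from JDBC's `ResultSetMetaData` at runtime. The formatting half of that idea might look like the following toy sketch; the class, method names and the quoting convention are assumptions, and the actual metadata lookup against a live connection is left out:

```java
import java.util.*;

public class GenericExtractSketch {
    // Emit a delimited header from whatever column names the current
    // query happens to have (as ResultSetMetaData would report them).
    static String headerLine(List<String> names, char sep) {
        return String.join(String.valueOf(sep), names);
    }

    // Emit one delimited data line while preserving typing: numeric
    // values stay unquoted, everything else is quoted as text.
    static String dataLine(List<Object> values, char sep) {
        StringJoiner line = new StringJoiner(String.valueOf(sep));
        for (Object v : values) {
            if (v instanceof Number) line.add(v.toString());
            else line.add("\"" + v + "\"");
        }
        return line.toString();
    }
}
```

Keeping the values as typed objects up to this point is also what would let an Excel writer set real numeric cells instead of text.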

 

Otherwise, it would be nice just to get an understanding of how to set up the dynamic schema so I can create more generic jobs for reuse.