Hi All
I am new to Talend and trying to build a data warehouse using daily exports from Salesforce. As it stands, the plan is to export the data like-for-like to an MS SQL staging database before using SQL (much more comfortable for me, in the first instance!) to transform the data into my reporting structure.
After a quick review of the underlying Salesforce structure, I think I need data from around 50+ objects, which will therefore map to 50+ SQL tables.
Playing around with Talend (Open Source version), I have created a single job that uses the SF connection to export to a single table, and expanded that job to export to multiple tables. I have also had a look at tRunJob, but have not found an example of how the SF connection should be passed from parent to child (assuming that would be the appropriate way to do it, i.e. one connection in the parent for all the child jobs).
After googling for a while, I have yet to find a design pattern or best practice for a straightforward dump of multiple objects. Any suggestions/examples would be much appreciated.
Thanks in advance
Darren
Hi,
My "Talend only" best practice method for this:
1) Create a job for every Salesforce object to transfer it to the DB. I duplicated a template job for every SF object.
Those jobs handle the connection to SF by themselves (you do not need to pass the SF connection down to them). The common SF credentials should be kept in a shared context group that all the jobs use (see the first sketch below the list).
2) I have a "main" job, which calls all the SF jobs to import the different SF objects.
The "main" job uses the following flow: tFixedFlowInput -> tFlowToIterate -> tRunJob (see the second sketch below).
tFixedFlowInput: (settings screenshot)
tRunJob: (settings screenshot)
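To make the shared-context idea from 1) concrete, here is a minimal stand-alone sketch. The variable and file names (sf_username, salesforce_context.properties, etc.) are my own assumptions, not taken from the jobs above; in Talend, a context group would define these values and each child job's tSalesforceConnection would reference them as context.sf_username and so on.

import java.io.FileReader;
import java.util.Properties;

// Reads the shared Salesforce credentials from a plain .properties file,
// much like loading a Talend context from a file (e.g. via tContextLoad).
public class SalesforceContextSketch {
    public static void main(String[] args) throws Exception {
        Properties context = new Properties();
        context.load(new FileReader("salesforce_context.properties"));

        // In the child jobs these map to context.sf_username,
        // context.sf_password and context.sf_endpoint.
        String username = context.getProperty("sf_username");
        String endpoint = context.getProperty("sf_endpoint");
        System.out.println("Connecting to " + endpoint + " as " + username);
    }
}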
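And here is a hedged sketch of what the main job's flow boils down to. The child job names are hypothetical; in the real job, tFixedFlowInput supplies one row per child job name, tFlowToIterate turns each row into an iteration, and tRunJob launches the named job (rather than the print statement below).

import java.util.List;

// Stand-alone illustration of the tFixedFlowInput -> tFlowToIterate -> tRunJob
// pattern: iterate over a fixed list of child job names and run each one.
public class MainJobSketch {
    public static void main(String[] args) {
        List<String> childJobs = List.of(
                "ImportAccount", "ImportContact", "ImportOpportunity");
        for (String jobName : childJobs) {
            // tRunJob would launch the Talend child job named jobName here.
            System.out.println("Running child job: " + jobName);
        }
    }
}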
This works for me.
Later, once this is up and running for you, you may run into problems with the fixed Talend schemas: if Salesforce attributes change (e.g. get deleted), which is normal in a living system, your Talend import job will fail when it can't find the corresponding SF attribute.
My current solution is to dynamically load the schema for every SF object through the SF REST API and build a Talend XML schema from that information. That way I only have to load a new XML schema for an SF object into Talend when its SF attributes change.
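In case it helps anyone trying the dynamic approach, here is a hedged sketch of the underlying REST call (the Salesforce "describe" resource). The instance URL, API version and access token are placeholders, and turning the returned JSON field list into a Talend XML schema is left out.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Fetches the field metadata for one Salesforce object via the REST API.
// The response JSON contains a "fields" array with each attribute's name,
// type and length -- the raw material for generating a Talend schema.
public class DescribeSObject {
    public static void main(String[] args) throws Exception {
        String instanceUrl = "https://yourInstance.my.salesforce.com"; // placeholder
        String accessToken = "00D...your OAuth token...";              // placeholder
        String objectName  = "Account";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(instanceUrl
                        + "/services/data/v58.0/sobjects/" + objectName + "/describe"))
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}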
Thank you. Much appreciated.
I have currently grouped my objects and created several jobs, each covering a group of related objects, and with the info above I will build them into one parent job.
I have already run into the changed-schema issue and will look to tighten up the process around SF changes in the first instance (although I will be checking out the dynamic load option soon!).
Thanks again for responding
Darren