Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi all,
If I have to do an ETL job wih a few transformations(filters, joins, aggregates ...), is it a good practice to do everything using SQL script in the query of the DB input?
I mean, have a job with only DBInput -> DBOutput and in the query of the input include a big query with all the transformations needed. for example:
"select a.a1,a.a2, b.1,sum(b.b2) as b2 from a, b where a.a1 = b.a1 and a.a3 = 'xx' group by 1,2,3;"
Other similar approach would be using directly a tDBrow inc ase both input and output are part of same database:
"insert into c (select a.a1,a.a2, b.1,sum(b.b2) as b2 from a, b where a.a1 = b.a1 and a.a3 = 'xx' group by 1,2,3"
I think none of this options are good practice but I have not seen it in any Talend paper or documentation.This is not flexible and all the good things of using an ETL tool are not used. Otherwise, why would you use Talend Studio? You can directly run the SQL query in your database using an external scheduler.
Can you share any Talend documention where these type of best practices are included?
Thanks,
Aldi
Thanks for taking a look to my comment.
At least, is correct my assumption that this is not a good practice?