Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
Hi All,
I would like to join multiple tables(say 10 tables) from Oracle database each having huge amount of Data. I was looking at using tMap component which does the look-up for joining of tables. But i realized that this will cause Out-of-memory issue since tMap stores the look-up data in-memory.
I have seen the "enable stream" option in tMySQLInput component but couldn't see the same in tOracleInput.
1.) Is there a way for joining multiple tables in Oracle using streams?
2.) What is the best and fastest way of joining data between multiple tables considering each table having huge data?
Sorry for the VERY late reply to this question. I stumbled upon this today. The problem you have with joining in this way is that you need to load all of the data into your lookups before a join can be made. You have already pointed that out. But streams won't help here. If you need to check every row within your lookup data to find a match, streaming won't save you any memory or time. You *could* try loading at each row, but that would mean a new query for every row from your main data set. That would take ages.
The best way to deal with situation like this IF the data is coming from the same database, is to carry out the join in the SQL. Let the database carry out its job ad what it is best at....joining its data. If the ten tables are distributed over less than 10 databases but more than 1, then see if you can carry out the majority of the joins in each database, the join the rest inside Talend. Be aware that you can also use the Temp Data Directory Path option in the tMap to store large amounts of data.....
https://help.talend.com/r/eQZTskQyWGsPFEh60rnfhA/4fo9MPcKnihV3h4HvZ3yrw