Anonymous
Not applicable

Normal join query between two tables in tPostgresqlInput component vs. tMap join

Dear All,

I am new to Talend and need to understand joins in Talend.

I am building a Job in which I join two or three tables with the tMap component.

My question: if I can write a join query in the tPostgresqlInput component that does the same thing, why would I use tMap to join two tables?

Could anyone explain the real difference between a normal join query between two tables in the tPostgresqlInput component and a tMap join? Does it have any bearing on memory management or Job execution time (processing speed)?

 

      


Accepted Solutions
ankit7359
Creator II

Hi @mac_vardam07,

Greetings of the day,

Welcome to Talend. Joins in Talend can be done in several ways: as you pointed out, you can perform them in the database components, or in other Talend components such as tMap or tJoin.

Let me give you a clear idea:

Apart from the DB components, two components can perform joins. One is tJoin, in which you can perform an inner join, including an inner join that captures the rejected records (without the inner-join option it behaves like a left outer join on the main flow).

The other is tMap, whose join model can be Inner Join or Left Outer Join; a right outer join is obtained by swapping the main and lookup flows, and a full outer join needs a workaround. tMap costs some performance, mainly memory: by default it loads the whole lookup flow into memory before processing the main rows, and the code it generates is fairly complex, so performing joins in it increases memory usage further.

For example, Table A and Table B are joined in tMap, where you define the key column and the join type as required.

DEMO SCENARIO:

TABLE_A ------>
                 tMap ------> OUTPUT_COMPONENT
TABLE_B ------>

Number of components used: 4 (including inputs and output).

The joins and any other required logic are defined in tMap.
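To make the memory cost of this scenario concrete, here is a minimal Python sketch of a lookup-style join: the whole lookup table is loaded into an in-memory dictionary, then the main rows are streamed against it. This is only an illustration of the idea, not the code Talend generates; the function name `lookup_join`, the sample tables, and the choice to keep the last lookup row per key are all assumptions made for the example.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def lookup_join(
    main_rows: Iterable[Row],
    lookup_rows: Iterable[Row],
    key: str,
    lookup_cols: List[str],
    inner: bool = False,
) -> Iterator[Row]:
    """Join main_rows to lookup_rows on `key` (hypothetical helper)."""
    # The entire lookup flow is held in memory, so memory use grows
    # with the size of the lookup table. Last row per key wins here.
    lookup = {row[key]: row for row in lookup_rows}

    # Main rows are streamed one at a time against the in-memory lookup.
    for row in main_rows:
        match = lookup.get(row[key])
        if match is None and inner:
            continue  # inner join: drop main rows without a match
        out = dict(row)
        for col in lookup_cols:
            # left outer join: unmatched rows get None for lookup columns
            out[col] = match[col] if match is not None else None
        yield out


# Made-up sample data standing in for TABLE_A and TABLE_B.
table_a = [
    {"id": 1, "name": "alice"},
    {"id": 2, "name": "bob"},
    {"id": 3, "name": "carol"},
]
table_b = [
    {"id": 1, "city": "Pune"},
    {"id": 3, "city": "Delhi"},
]

left_rows = list(lookup_join(table_a, table_b, "id", ["city"]))
inner_rows = list(lookup_join(table_a, table_b, "id", ["city"], inner=True))
```

With a small lookup table this is cheap; with a lookup table of many millions of rows the dictionary is what dominates memory, which is the trade-off described above.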

With the DB components, by contrast, the join is performed at the database level, and only the result moves through the Job, row by row.

So, for example, Table A and Table B are joined in the database, and the result set is sent to the next component or transformation one row at a time.

DEMO SCENARIO:

TDBINPUT_COMPONENT (query joining the 2 tables, with the DB connection taken from the Repository) ------> OUTPUT_DBCOMPONENT

Number of components: 2

The join is performed at the DB level and the output is then produced.
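As a rough illustration of the database-level join, the sketch below runs a single join query and simply iterates over the result set, the way a DB input component would pass joined rows downstream. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the table names, columns, and data are invented for the example.

```python
import sqlite3

# In-memory SQLite database used here as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, city TEXT);
    INSERT INTO table_a VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO table_b VALUES (1, 'Pune'), (3, 'Delhi');
    """
)

# The kind of query that would go in the input component's Query field:
# the database does the join, the client only receives joined rows.
JOIN_QUERY = """
    SELECT a.id, a.name, b.city
    FROM table_a AS a
    LEFT JOIN table_b AS b ON b.id = a.id
    ORDER BY a.id
"""

joined_rows = []
for row in conn.execute(JOIN_QUERY):
    # Each row arrives already joined; no lookup table is held client-side.
    joined_rows.append(row)

conn.close()
```

Here the client never builds a lookup structure of its own; the database's planner and indexes do the matching, which is why this form tends to need fewer components and less Job memory when both tables live in the same database.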

There is another consideration, however: a direct DB -> DB flow can be slow. If you want better performance, you can try DB -> FILE -> DB, which usually improves it.

I hope the difference is clear now.

Please reach out to the Talend Community if necessary.
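The DB -> FILE -> DB pattern can be sketched as two separate steps: export the query result to a delimited file, then load that file into the target in one batch. The Python below is only an illustration under stated assumptions: `sqlite3` stands in for both databases, and `export_to_csv`, `load_from_csv`, and the table names are hypothetical. In a Talend Job the equivalent would typically be a DB input writing to a delimited file, followed by a bulk-load component on the target side; whether this is actually faster depends on data volume and the database's bulk loader.

```python
import csv
import os
import sqlite3
import tempfile


def export_to_csv(conn: sqlite3.Connection, query: str, path: str) -> int:
    """Step 1 (DB -> FILE): stream the query result into a CSV file."""
    count = 0
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for row in conn.execute(query):
            writer.writerow(row)
            count += 1
    return count


def load_from_csv(conn: sqlite3.Connection, path: str, table: str) -> int:
    """Step 2 (FILE -> DB): load the staged file in a single transaction."""
    with open(path, newline="", encoding="utf-8") as f:
        rows = [tuple(r) for r in csv.reader(f)]
    with conn:  # commits once at the end instead of once per row
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return len(rows)


# Two in-memory databases standing in for the source and target systems.
source = sqlite3.connect(":memory:")
source.executescript(
    """
    CREATE TABLE src (id INTEGER, name TEXT);
    INSERT INTO src VALUES (1, 'alice'), (2, 'bob');
    """
)
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE tgt (id INTEGER, name TEXT)")

with tempfile.TemporaryDirectory() as tmp:
    stage_path = os.path.join(tmp, "stage.csv")
    exported = export_to_csv(source, "SELECT id, name FROM src ORDER BY id", stage_path)
    loaded = load_from_csv(target, stage_path, "tgt")

result = target.execute("SELECT id, name FROM tgt ORDER BY id").fetchall()
```

The staged file decouples reading from writing, so the load side can use whatever batch or bulk mechanism the target database offers instead of row-by-row inserts.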

Thanks,

Ankit.

 

Anonymous
Not applicable
Author

Hello,

Could you please give us some background about your use case? The memory consumption will depend on the size of the dimension tables, the size of your fact table, the data transformations, and so on.

Best regards

Sabrina

 
 
 

Anonymous
Not applicable
Author

Hi Ankit,

 

Thank you for the solution. I understand the difference. You mentioned that the tMap component consumes some memory, but does it process faster than a normal query written in the DB component?

Both affect the performance of the Job, so which approach would you prefer when building Jobs in Talend?

"if you want to enhance the performance then you try this... DB---> FILE---> DB...(This usually improves the Performance.)": please give me an example of how to do this.

