Solved: Re: [resolved] Join to 1 or the other table - Qlik Community

BremmeyrISS · ‎2017-03-30

How do I do this in Talend Studio? What components will allow me to get the desired results?[color=#000000][font=Arial, Helvetica, sans-serif] I expect there is a recommend method to accomplish this task. [/font][/color]

I want to create a CSV file with data always from table A. Also based on a value in each row from table A get a row from table B or C. There are 2 values of concern in table A. ID and alternate number (alt_no). In all cases there will be a row in B where A.ID = A.ID Yet when A.alt_no is > 0 [color=#000000][font=Arial, Helvetica, sans-serif] we want values from the alternate table C where C.ID = A.ID and C.alt_no = A.alt_no. Furthermore the CSV file will have 1 column from table B or C. For example get the value for CSV column name city from B or C depending on which one was used. [/font][/color]
My 1st idea is to use tJDBCInput for the 3 tables and have them go into tMap. Then use the expression editor to select the desired rows with a left outer join. I have not discovered a way to build the expression to work this way.
Talend integration cloud SaaS edition of Talend studio version 6.3

TXAggie00 · ‎2017-03-31

Are all three tables in the same database? If so, I would forget about bringing the tables into a tMap and joining them up that way. Talend does not currently pushdown any database operations based on the design flow. It is a code generator not a code/sql generator. For the sake of performance, push down as much DB operations as you can and let the database do what it is designed to do. Based on what you said and the assumption that all tables are from the same database, think of it in terms of SQL:

SELECT A.*, B.CITY
FROM TABLEA A
  INNER JOIN TABLEB B
    ON A.ID = B.ID
WHERE A.ALT_NO <= 0

UNION ALL

SELECT A.*, C.CITY
FROM TABLEA A
  INNER JOIN TABLEC C
    ON    A.ID = C.ID
      AND A.ALT_NO = C.ALT_NO
WHERE A.ALT_NO > 0

If that is the type of results you are looking for, stick that query in a SQLInput component, add a tMap if any other data manipulation is needed and add you CSVOutputFile component.

Thanks,
Scott

View solution in original post

Anonymous · ‎2017-03-30

Hi,

I want to create a CSV file with data always from table A. Also based on a value in each row from table A get a row from table B or C. There are 2 values of concern in table A. ID and alternate number (alt_no). In all cases there will be a row in B where A.ID = A.ID  Yet when A.alt_no is > 0  we want values from the alternate table C where C.ID = A.ID and C.alt_no = A.alt_no.  Furthermore the CSV file will have 1 column from table B or C. For example get the value for CSV column name city from B or C depending on which one was used.

Could you please elaborate your case with an example with input and expected output values?
Best regards
Sabrina

TXAggie00 · ‎2017-03-31

Are all three tables in the same database? If so, I would forget about bringing the tables into a tMap and joining them up that way. Talend does not currently pushdown any database operations based on the design flow. It is a code generator not a code/sql generator. For the sake of performance, push down as much DB operations as you can and let the database do what it is designed to do. Based on what you said and the assumption that all tables are from the same database, think of it in terms of SQL:

SELECT A.*, B.CITY
FROM TABLEA A
  INNER JOIN TABLEB B
    ON A.ID = B.ID
WHERE A.ALT_NO <= 0

UNION ALL

SELECT A.*, C.CITY
FROM TABLEA A
  INNER JOIN TABLEC C
    ON    A.ID = C.ID
      AND A.ALT_NO = C.ALT_NO
WHERE A.ALT_NO > 0

If that is the type of results you are looking for, stick that query in a SQLInput component, add a tMap if any other data manipulation is needed and add you CSVOutputFile component.

Thanks,
Scott

BremmeyrISS · ‎2017-03-31

I think that will work . I will give it a try. Thank you Scott.

BremmeyrISS · ‎2017-03-31

Yes, Yes, YES. Thank you Scott. That worked just the way I want. My take away is right on target with your explanation. Selection values, joins, etc. can be added to the SQL of the tJDBCInput component.
I recall from somewhere that it is better to split up the table access in the job with multiple tJDBCInput components. That is the main reason I was trying to join or merge the data with tMap. At this point with this job I will take whatever works. I think that idea was in the context of planning to deploy the job with TIC - Taland Integration Cloud.

TXAggie00 · ‎2017-03-31

Glad it worked for you.

I think as a general ETL architecture design, I agree with what you read: it should be split up for maintainability reasons and to put the business logic as part of the design. One thing I have learned about Talend as compared to other Gartner Q1 ETL tools, is that Talend is not as efficient in this regard. Most tools recognize the use of tables from the same datastore/database and generate the necessary SQL code to make it as efficient as possible and to push down most of the heavy lifting to the database. This frees up precious memory to do actual transformations of your data. Talend, on the otherhand, just generates java code on a component by component basis and will, too many times, use a lot of server resources that could otherwise be done at the db level. Unless they are on different sources, I use pure SQL and then manipulate/transform as needed in the job. Even if they are different source databases, I usually (depending on the size) load into temp tables on the target DB and use the ELT components or SQL based query to derive my true source recordset. Performance and optimization is just as important to ETL architecture as is the defined business logic for the data integration.

Thanks,
Scott

BremmeyrISS · ‎2017-03-31

In the tMap component there is a filter expression option. Can that be used in effect the same way as a where clause in SQL ?

TXAggie00 · ‎2017-03-31

Yep, sure can. You can use it on the input or any output.

Thanks,
Scott

BremmeyrISS · ‎2017-04-03

Thank you.

[resolved] Join to 1 or the other table

Talend Data Integration

v6.x