Skip to main content
Announcements
A fresh, new look for the Data Integration & Quality forums and navigation! Read more about what's changed.
cancel
Showing results for 
Search instead for 
Did you mean: 
Shalafi
Contributor II

Union with tMap

Hello, community.

 

I have a requirement that I am not sure how to accomplish in Talend.

 

I have two similar tables (almost same scheme) in two different databases. I have to "merge" the data from both tables into one single output. Since the tables are in different databases, two different tDBInput components are used to get the data.

 

Attending to one column (id), some rows could be found in both tables; in that case the row from table A is taken.

In other cases, I have data in table A that is not present in table B (attending to column id). In that case the row from table A is taken.

Until this point this could be accomplished with a "TableA LEFT JOIN TableB" structure, but...

In other cases, I have data in table B that is not present in table A (attending to column id). In that case the row from table B is taken.

 

An example:

 

Table A:

ID OTHER_FIELD

1 A_1

2 A_2

 

Table B:

ID OTHER_FIELD

1 B_1

3 B_3

 

Expected output:

ID OTHER_FIELD

1 A_1 -> present in both tables; taken from A

2 A_2 -> only present in table A

3 B_3 -> only present in table B

 

I'm not sure how to develop this requirement in Talend. I think I have to use tMap, but I need some example about its configuration for this particular scenario. Maybe I'm wrong and tMap is not the better option; in that case, please let me know your suggestion.

 

Thank you somuch

 

EDIT:

if I could perform a single query over both tables (I can't since they are in different databases), this query should do the job:

select a.id, a.other_column

from table_a a left join table_b b on a.id = b.id

union

select b.id, b.other_column

from table_b b left join table_a a on a.id = b.id

where a.id is null

Labels (3)
1 Solution

Accepted Solutions
Shalafi
Contributor II
Author

Hi again.

That was perfect. Your approach did the job.

I have simplified it a little bit because I thought that the join between A (main) and B (lookup) is, in fact, not needed. I mean, all data from A will be taken, even when the data is present in B.

So, the tricky part is the join between B (main) and A (lookup), where only the unmatched have to be taken.

Finally, the data is stored and restore from hash tables before doing the merge. I'm not sure about why is this needed, but Talend does not allow me to merge the outputs directly without this components, so I think that is beyond my knowlegde in Talend.

Here the snapshots showing the job, the tMap configuration and the logs:

0693p000009ocO4AAI.jpg

0693p000009ocO9AAI.jpg

0693p000009ocOEAAY.jpg

 

Thank you so much!

View solution in original post

6 Replies
vikramk
Creator II

Hi  Shalafi,

 

If the Schema of the tables and data type is same then you can get this by using tUnite component. Please go through below link:

https://help.talend.com/reader/mjoDghHoMPI0yuyZ83a13Q/2YsQPg30hhoQZ9rHExZ9Qw

https://help.talend.com/reader/WfrS1ZaBst3az8zJBGoMSQ/reordz4zg7ldIHmFzxqd9w

Let me know if helps

Anonymous
Not applicable

Hi

You need to do inner join two times, please refer to this KB articles.

The first time, table A is main table and table B is lookup table, do an inner join based on id column and get the matched rows, store the result to memory using tHashOutput for used later.

The second time, table B is main table and table A is lookup table, do an inner join based on id column and get the unmatched rows ( these rows only present in table B), store the result to memory for used later .

 

Finally, read back the two output from memory using tHashInput and merge them.

 

Regards

Shong

 

Shalafi
Contributor II
Author

Hello. Thank you for your answer.

Unfortunately, this method does not do the job. The tUnite simply merge data from both tables and does not take into account the requirement about the same id (the join part). Anyway, I made a test to confirm that and as I said, the result is not the expected one (the rows marked in red should not be present). You can see this test in the image.

0693p000009ocM3AAI.jpg

Shalafi
Contributor II
Author

Hi. Thank you for your answer.

Uoo! This is pretty simple once is read (simple solutions are always good solutions). In fact, this is a direct porting of the SQL that I shared, so this is perfect. Let me make a test to confirm that the approach is working and I will share the result.

Thank you so much again

Shalafi
Contributor II
Author

Hi again.

That was perfect. Your approach did the job.

I have simplified it a little bit because I thought that the join between A (main) and B (lookup) is, in fact, not needed. I mean, all data from A will be taken, even when the data is present in B.

So, the tricky part is the join between B (main) and A (lookup), where only the unmatched have to be taken.

Finally, the data is stored and restore from hash tables before doing the merge. I'm not sure about why is this needed, but Talend does not allow me to merge the outputs directly without this components, so I think that is beyond my knowlegde in Talend.

Here the snapshots showing the job, the tMap configuration and the logs:

0693p000009ocO4AAI.jpg

0693p000009ocO9AAI.jpg

0693p000009ocOEAAY.jpg

 

Thank you so much!

Anonymous
Not applicable

Great, glad to see you get it working! Thanks for your feedback!