Hello,
I have input data from a tMap. I need to use some of its columns (the TALEND_COLx placeholders in the query below) to run a SQL query from a tDBInput lookup. I have several thousand input rows, so I don't want to use the "Reload at each row" option, as it takes too long. I just need a method to correctly inject my tMap data into the query and retrieve the result.
The query:
select distinct a.ID_FAM,
into :idFam%, :idAy%, :idOrg1%, :IdProd$
from AY a, FPROD fp, PROD P, ORG O, DCSE D
where a.AY_SS = 'TALEND_COL1'
and a.ID_FAM = fp.ID_FAM
and fp.ID_ORG = a.ID_ORG
and fp.ID_ORG = P.ID_ORG
and fp.ID_PROD = P.ID_PROD
and fp.ID_ORG = O.ID_ORG
and D.DCSE_NOM = 'NOEMIE'
and ((('NOEMIE' <> 'NOEMIE' and D.DCSE_PP = 0) and fp.ID_PROD = 'SANTE') or (('NOEMIE' = 'NOEMIE' or D.DCSE_PP = 1) and (fp.ID_PROD = O.ORG_PROD or P.PROD_PRESTATION_SANTE = 1)))
and fp.ID_FAM = a.ID_FAM
and ((a.AY_DATNAIS = decode('TALEND_COL2', 1, to_date('TALEND_COL2', 'dd/mm/yyyy'), a.AY_DATNAIS)) or (a.AY_DATNAIS2 = 'TALEND_COL2'))
and a.AY_RNG = TALEND_COL3
and exists (select 'a'
from ADC ad
where ad.ID_AY = a.ID_AY
and ad.ID_PROD = fp.ID_PROD
and nvl(ad.id_fprod, fp.id_fprod) = fp.id_fprod
and to_date('TALEND_COL4', 'DD/MM/RR') between ad.ADC_DATDEB and nvl(ad.ADC_DATRAD, sysdate+1))
and to_date('TALEND_COL4', 'DD/MM/RR') between fp.FP_DATADH and nvl(fp.FP_DATRAD, sysdate+1);
A small illustration of what I have started to do:
I moved the conditions that require input values after the SELECT, except for TALEND_COL4, where I hard-coded the value myself, since that is exactly what I need your help with.
Thank you very much @Richard Hall, the SQL query works very well and is better optimised. As expected, with these values (4 columns) it returns nothing, but with other values it can return a row.
So I come back to the main problem: what method can I use to inject the values from the tMap into the SQL query of the tDBInput?
If you want to inject values, you HAVE to use the "Reload at each row" option; otherwise, how would the values get into the query?
However, there is another way. How many rows are returned if you leave out the WHERE clause conditions that use these injected values? I suspect that you can carry out your lookup by simply loading ALL of the returned data once and using the tMap to join to this data in order to filter it down.
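For illustration, a rough sketch only (assuming AY_SS, AY_DATNAIS, AY_RNG, FP_DATADH and FP_DATRAD are the columns currently compared against TALEND_COL1..4): the lookup query would drop the injected predicates and return those columns instead, so that the tMap can do the matching:
select distinct a.ID_FAM,
a.AY_SS, -- was compared to TALEND_COL1
a.AY_DATNAIS, -- was compared to TALEND_COL2
a.AY_RNG, -- was compared to TALEND_COL3
fp.FP_DATADH, fp.FP_DATRAD -- range-checked against TALEND_COL4
from AY a, FPROD fp, PROD P, ORG O, DCSE D
where a.ID_FAM = fp.ID_FAM
and fp.ID_ORG = a.ID_ORG
and fp.ID_ORG = P.ID_ORG
and fp.ID_PROD = P.ID_PROD
and fp.ID_ORG = O.ID_ORG
and D.DCSE_NOM = 'NOEMIE'
-- keep the remaining static conditions; remove only the ones
-- that reference a TALEND_COLx value
The equality comparisons then become tMap joins; the TALEND_COL4 date range is not an equality, so it has to be applied as a filter expression after the join.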
OK @Richard Hall. Is the tMap setting below correct?
I used a filter because I need to run the query for only a few rows (about 4000) out of 20,000.
The query with its WHERE clauses returns at most one row; its purpose is to find a member using the data I inject from the tMap.
OK, I think there is some confusion here. How many rows of data are in your main source (the file)? I assume you are saying that there are 20,000 in your lookup and that you are reducing this to 4000 with your filter in the tMap. Out of interest, why aren't you filtering the number of rows from 20,000 to 4000 in your SQL?
If you are only using 4000 rows in your lookup, you may as well just load all of the data once and use the tMap joins to get to the data you need. When you join your Main to your Lookup, it will exclude any data that does not meet that join. So, it will be quicker than firing the query on every row.
@Richard Hall My tMap input data is about 20,000 lines. I just need to run the query on 4000 of those rows, the ones with type 223 or 255.
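The filter itself is just a Java boolean expression in the tMap (the row and column names here are illustrative):
row1.type == 223 || row1.type == 255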
My result after injecting the values (of more than 4000 rows) as input to the SQL query should look like this:
These are of course not the real data, but for each line the query must return the member's identifier if it exists, otherwise null. I don't usually have problems with tDBInput, except for this query.
Here is how I usually do it, with a simple example. The original query:
"SELECT id_org
from org
where org_noc = TALEND_COL1"
In tDBInput:
In tMap:
And I get my output for the 4000 lines.
In the current case this is really difficult, because the clause that needs the input value is not of the simple form WHERE col1 = TALEND_COL1; the value is buried deep inside the SQL query itself...
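In that simple case, the tDBInput query is a Java string, so I concatenate in the value that the tMap stored in globalMap (a sketch, with a hypothetical key name "col1"):
"SELECT id_org from org where org_noc = '" + ((String)globalMap.get("col1")) + "'"
With "Reload at each row" enabled, this query is rebuilt and executed once per main row.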
I tested injecting the data with globalMap (as in my screenshot in the last comment). As expected it works, but the execution time of the job went from 1-2 minutes to 27 minutes. This is not surprising: since it runs the SQL query for each line, with 4000 lines it takes a long time.
You don't need to inject the values and reload at each row. Just use the query without the injected values and join inside the tMap. Let the tMap joins deal with finding the rows that you need. 20,000 rows is not too large for a lookup if it is just a single query. When you do not select "Reload at each row", all of the rows are loaded once and stored in memory in the job; then every Main row that comes through is matched against the lookup data in memory.
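Concretely, under the same assumptions as the sketch above (row1 = the main flow, row2 = the lookup): join row1.TALEND_COL1 to row2.AY_SS and row1.TALEND_COL3 to row2.AY_RNG, set the join model to Left Outer Join so that unmatched rows come out with null (as your expected output requires), and express the TALEND_COL4 date-range check as a tMap output filter, something like this (adjust the date pattern to your actual input format):
row2.FP_DATADH == null
|| (TalendDate.parseDate("dd/MM/yyyy", row1.TALEND_COL4).compareTo(row2.FP_DATADH) >= 0
&& (row2.FP_DATRAD == null
|| TalendDate.parseDate("dd/MM/yyyy", row1.TALEND_COL4).compareTo(row2.FP_DATRAD) <= 0))
TalendDate.parseDate is a standard Talend routine. The first test keeps the unmatched rows (their lookup columns are null with a Left Outer Join); matched rows that fall outside the date range are dropped by this filter, so if you need those to come out with a null identifier as well, move the check into a ternary on the output column instead.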