Skip to main content
Announcements
Join us at Qlik Connect for 3 magical days of learning, networking,and inspiration! REGISTER TODAY and save!
cancel
Showing results for 
Search instead for 
Did you mean: 
TXAggie00
Contributor III
Contributor III

ELT Components with mutiple connections

I have a question regarding ELT and Talend.  Having worked with many ETL tools (Informatica, SAP Data Services, SSRS, etc), I have always tried to design jobs using as much ELT as possible to lighten the load on the job servers and push down as much logic to the database as possible.  I am currently working on a project where the client has Talend (subscription not open studio) with a MS SQL Server ODS.  I am having a bit of an issue with the ELT components and the different connections.  I have three different connections to SQL Server for the different databases and different schemas (all on the same server and instance, but the architect is very schema-happy).  All the db metadata is setup using context variables for the different environments.  Is it possible to use the ELT components with the different connections?  The current setup I have is a very common scenario.  I have a stage table in the staging database/schema which I filter based on incremental logic.  I also have my target table which I left outer to the staging to get my inserts/updates.  Unfortunately I only have the option to select one connection in the ELTMap component.  As a work-around to that limitation, I setup the flow to load a data transfer table (Staging table with filtered, incremental results creates a table in the same database/schema as the target, which I drop in the postjob flow), but I am getting errors because the ELT components don't fully qualify the tables with the schema specified in the connection.  I realize I can do this using my own query using standard DBInput/DBOutput components and use the globalMap to get the schema/database of the two connection to make it dynamic (or heck, even use an SP if SQL code is the only route), but this seems counter-intuitive for an ETL tool.  Am I missing something obvious?
Thanks,
Scott
FYI, we are using 6.3.1, which we have already found a bug in the tMSSqlOutput component, so it may be a bug in the tELTMSSqlMap component that is doesn't qualify the tables.  We ended up having to create a custom component for tMSSqlOutput using the 6.2.1 version of the same component.
Labels (2)
5 Replies
Anonymous
Not applicable

Hi,
 t<DB>ELT components are some what related to Extract-Load-Transform (ELT) method while t<DB>xxx is related to Extract-Transform-Load method (ETL).
The difference is that the operation of t<DB>ELT component is available inside a database.
Best regards
Sabrina
TXAggie00
Contributor III
Contributor III
Author

Thanks Sabrina, these are the components I am referring to.
Okay, I managed to make it work with multiple connections, but the update SQL that is generated is hogwash.  It is generating the SQL like the following:
UPDATE TargetTableName
SET ColumnName = (
  SELECT StagingColumnName
   FROM Stage.StagingTable StagingAlias
      LEFT OUTER JOIN Target.TargetTableName TargetAlias
         ON TargetAlias.SourceId = StagingAlias.SourceId
   WHERE TargetAlias.PrimaryKey IS NOT NULL
      AND StagingAlias.SourceTimeStamp > ),
   NextColumnName = (
  SELECT NextStagingColumnName
   FROM Stage.StagingTable StagingAlias
      LEFT OUTER JOIN Target.TargetTableName TargetAlias
         ON TargetAlias.SourceId = StagingAlias.SourceId
   WHERE TargetAlias.PrimaryKey IS NOT NULL
      AND StagingAlias.SourceTimeStamp > ),

...etc.
What I would expect to see would be:
UPDATE Target.TargetTableName 
  SET ColumnName = StagingAlias.StagingColumnName,
      NextColumnName = StagingAlias.NextStagingColumnName,
  ...etc
FROM Staging.StagingTableName StagingAlias
 LEFT OUTER JOIN Target.TargetTableName TargetAlias
   ON TargetAlias.SourceId = StagingAlias.SourceId
WHERE TargetAlias.PrimaryKey IS NOT NULL
 AND StagingAlias.SourceTimestamp >

The current generated SQL (this is specifically an MS SQL component constantly generates failures because is try's to update a NULL value to the target table when there is a null constraint on that column.  The current WHERE clause brings back 0 records, so nothing should be inserted or updated.  I am at the point where I feel like I need to modify yet another 6.3.1 component.
Any suggestions?
Thanks,
Scott
TXAggie00
Contributor III
Contributor III
Author

I ended up modifying the existing component and creating a custom component.  The only component I modified was the ELTMSSqlOutput component, but in the future, in order to make this correction optional, I may need to touch the other two components.
If anyone is interested here are the two corrections I made to tELTMSSqlOutput_main.javajet:
// Line 285 - 304:  Two lines weere modified: line 288, the updatesSetStmt and line 303, the updateQuery variable creation
}else{
insertColName.append(colStmt.getName()+suffix);
insertValueStmt.append(colStmt.getValue()+suffix);
updateSetStmt.append(colStmt.getName()+"=\"+routines.system.StringUtils.splitSQLColumns(selectQueryColumnsName)+\"" +suffix);
}
}
counterOuter ++;
}

if(("INSERT").equals(dataAction)){
%>
String insertQuery = "INSERT INTO "+tableName_<%=cid%>+"(<%=insertColName.toString()%>)  ("+select_query+")";
pstmt_<%=cid %> = conn_<%=cid%>.prepareStatement(insertQuery);
<%
}else if (("UPDATE").equals(dataAction)){
%>
String updateQuery = "UPDATE "+tableName_<%=cid%>+" SET <%=updateSetStmt.toString()%> " + select_query.substring(select_query.indexOf("FROM"), select_query.length()).trim();
pstmt_<%=cid %> = conn_<%=cid%>.prepareStatement(updateQuery);

The other caveat was that in the properties for this component, "Default Schema Name" should be an empty String and "Use different table name" should be check and the name of the alias table used in the map should be used here.
Thanks,
Scott
prajwalagurram
Partner - Contributor
Partner - Contributor

Hi,

Can you please explain how you were able to manage to make it work for multiple connections ? 

I have a similar scenario where the source and target connections are different.

 

Thanks,

Prajwala

Divija_Agrawal
Contributor
Contributor

Hi,

You're right when you say that in ELTMap, Talend only supports one connection, and depending on the size of multiple databases or schemas being used, it could be a problem. One solution might be to generate a fully qualified table name by embedding context variables within an ELT component to qualify a schema. Otherwise, making use of DBInput/DBOutput components or even stored procedures might allow for better control over SQL logic, even if it is not totally ELT-driven.

It could also be a bug in version 6.3.1, so an upgrade or a message to Talend support will help. More insight into ELT processes can be found here, an article on optimizing ELT pipelines; you may find this helpful: ELT Pipeline.