We intend to use a generic Talend job to bring data from an Oracle database into Hadoop. Some of the tables have a varying number of fields with CLOB or LONG data type.
Questions:
1. How can this job be made generic, so that it imports tables with or without CLOB columns?
2. How can a varying number of CLOB Java mappings be passed dynamically?
The Talend job works when I add "Java Mapping" values under "Advanced Settings" to map a CLOB field to a Java data type. However, when I try to make the job generic, it fails while processing other tables. I parameterized "Java Mapping" to use context parameters, but for tables that have no CLOB fields the Talend job still fails, even when I pass NULL.
Thank you
To make the whole process generic, the approach I can think of is to store the full set of source tables and their corresponding configuration ("Java mapping", "delims", etc.) in a metadata table. Talend would read the metadata table row by row, pick up all the configuration Sqoop needs for that particular table, and build the complete sqoop command. Once that is done, you could run the command through tSSH. The only caveat is that you won't be able to use tSqoopImport.
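As a rough illustration of the command-building step, here is a minimal sketch in Python (the same logic could be written in a tJava component). The metadata column names (jdbc_url, username, password_file, table_name, target_dir, java_mapping, field_delim) and the sample values are hypothetical; only the sqoop options themselves (--connect, --username, --password-file, --table, --target-dir, --map-column-java, --fields-terminated-by) are standard Sqoop import arguments. The key point is that --map-column-java is left out entirely when a table has no CLOB mapping, rather than being passed with an empty or NULL value.

```python
import shlex


def build_sqoop_command(row):
    """Build a 'sqoop import' command line from one metadata row (a dict).

    The dict keys are hypothetical metadata column names, not anything
    defined by Talend or Sqoop.
    """
    args = [
        "sqoop", "import",
        "--connect", row["jdbc_url"],
        "--username", row["username"],
        "--password-file", row["password_file"],
        "--table", row["table_name"],
        "--target-dir", row["target_dir"],
    ]

    # Only add the Java mapping when the table actually has CLOB/LONG columns.
    # A NULL or blank value in the metadata table means "no mapping needed".
    java_mapping = (row.get("java_mapping") or "").strip()
    if java_mapping:
        args += ["--map-column-java", java_mapping]

    # Optional field delimiter, also driven by the metadata row.
    field_delim = row.get("field_delim")
    if field_delim:
        args += ["--fields-terminated-by", field_delim]

    # Quote each argument so the string is safe to hand to a remote shell.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # Hypothetical metadata rows: one table with CLOB columns, one without.
    metadata_rows = [
        {
            "jdbc_url": "jdbc:oracle:thin:@//dbhost:1521/ORCL",
            "username": "etl_user",
            "password_file": "/user/etl/.ora_pwd",
            "table_name": "DOCUMENTS",
            "target_dir": "/data/raw/documents",
            "java_mapping": "BODY=String,NOTES=String",
            "field_delim": "|",
        },
        {
            "jdbc_url": "jdbc:oracle:thin:@//dbhost:1521/ORCL",
            "username": "etl_user",
            "password_file": "/user/etl/.ora_pwd",
            "table_name": "CUSTOMERS",
            "target_dir": "/data/raw/customers",
            "java_mapping": None,
            "field_delim": "|",
        },
    ]
    for row in metadata_rows:
        print(build_sqoop_command(row))
```

Each generated string would then be passed to tSSH as the command to execute. Because the mapping for a table is a single comma-separated value in the metadata row, any number of CLOB columns can be handled without changing the job.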