Hi there,
I'm trying to update a database with multiple tables (100+) from CSV files (in a directory) using the following job design and components.
[screenshots: input and output components]
The issue here is that all my tables (in the DB) have different schemas and different primary keys (most of my PKs are composite keys made from multiple columns, usually 1 to 5).
For example:
[screenshots: mysql_pk, mysql_pk1]
So I'm not sure how I could dynamically set my PK in the schema.
Does anyone have an idea how to make this work?
A screenshot of the job design/components would be very much appreciated.
Currently, I only have a dynamic column in my schema since all my keys are different.
[screenshot: tMap]
When I tried to run the job, it failed with 'Schema must have a key':
Exception in component tDBOutput_2 (test02) java.lang.RuntimeException: For update, Schema must have a key
Hello,
I'm not sure you'll be able to achieve this without using some tricks.
I think you should first load the CSV data into "loading tables" (tables you only truncate and insert into), and then update your final tables from that data.
You can either write all the queries manually or build them on the fly from the MySQL catalog.
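For the loading-table pattern, here is a minimal sketch in MySQL; the loading-table name and CSV path are illustrative, not from your job:

/*load_tableTarget1.sql*/
-- empty the loading table, then bulk-load the CSV into it
TRUNCATE TABLE stg_tableTarget1;
LOAD DATA LOCAL INFILE '/data/csv/tableTarget1.csv'
INTO TABLE stg_tableTarget1
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;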
Regards
Right now, I'm trying the lengthy method below.
Job1: Load the CSVs and insert them into a temporary database (dynamic schema)
Job2: Inject the constraints into the temporary tables (sketched below)
Job3: Update the real database from the temporary tables' data
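For Job2, a minimal sketch of injecting a composite primary key, assuming MySQL; the table and column names here are only placeholders:

/*inject_constraint_tmp_orders.sql*/
-- add the composite PK so the later update can match rows on it
ALTER TABLE tmp_orders ADD PRIMARY KEY (order_id, line_no);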
If there's a more straightforward and better way to do this, please let me know.
Thank you
Hi,
First, load all the data into empty tables used only as intermediate storage, either with Talend (easy with a dynamic schema) or directly from the database:
Talend BulkExec components, MSSQL BULK INSERT, Oracle SQL*Loader, etc.
Then, update the final tables from the intermediate tables; there are two solutions:
1) Create as many update_[tableName].sql files as needed; in each file, manually enter the required update query, such as:
/*update_tableTarget1.sql*/
UPDATE tableTarget1 t, tableSource1 s SET t.data1 = s.data1, t.data2 = s.data2 WHERE t.id = s.id
Then, within your Talend job, loop over the SQL files and use a t<Database>Row to execute the update query (you can even parallelize the iterate link after the tFileList):
tFileList (*.sql) -iterate-> tFileInputFullRow -row-> tFlowToIterate -iterate-> t<Database>Row
The tFileInput* reads the SQL file and the tFlowToIterate puts the query into the globalMap (under a key such as "row1.line", named after the connection and column), but you can use other methods as well.
Or
2) Build the queries dynamically from the database catalog.
You can get a list of tables to update (this heavily depends on your database):
select table_name, ... from <system_tables> where ...
You then loop over the tables, retrieve the list of columns, identify which of them are keys and so on, build all the queries dynamically, and use a t<Database>Row to run all the updates.
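As an illustration for MySQL, a minimal sketch against information_schema, assuming a schema named 'mydb' (the name is a placeholder):

/*list_pk_columns.sql*/
-- one row per primary-key column, in key order, for every table in the schema
SELECT TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = 'mydb'
  AND CONSTRAINT_NAME = 'PRIMARY'
ORDER BY TABLE_NAME, ORDINAL_POSITION;

/*list_all_columns.sql*/
-- all columns per table, to build the SET clause of each update
SELECT TABLE_NAME, COLUMN_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydb'
ORDER BY TABLE_NAME, ORDINAL_POSITION;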
The first method is easier but harder to maintain. The second method requires more work at first, but then you'll be able to add tables or change formats much faster.