
Whole Database Migration (MSSQL->MySQL) using Dynamic Schema
Hi,
I recently started using Talend Cloud Data Integration (Talend Studio 7.2)
and I'm figuring out how to dynamically migrate a whole MSSQL Server database to MySQL (blank DB).
So far, I've heard that it's possible to do that with a Dynamic Schema.
However, I only found a tutorial for a single table (data) migration, and some posts that didn't explain much about the settings and components to use.
Here's my job design (UPDATED!!!)
I'm using a global variable to dynamically define the table; however, I'm not sure how I can define the schema dynamically.
**I suppose my query is right.
Can you let me know how I can define the schema dynamically here?
Output Component
The other option I found was to use a job template; however, it doesn't help with syntax errors that occur during migration, like the following.
sample1)
Exception in component tDBOutput_1 (Job_tbl_mem) java.sql.SQLSyntaxErrorException: BLOB, TEXT, GEOMETRY or JSON column 'msrepl_tran_version' can't have a default value at com.mysql.cj.jd
sample2)
[FATAL]: local_test.job1_tbl_mem_0_1.Job1_tbl_mem - tDBOutput_4 Invalid default value for 'mobile_inq' java.sql.SQLSyntaxErrorException: Invalid default value for 'mobile_inq' at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:120) at com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97) at
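Both sample errors come from MySQL rejecting DEFAULT clauses on BLOB/TEXT/GEOMETRY/JSON columns in DDL carried over from MSSQL. As a minimal sketch (the class, method name, and regex are mine, not part of any Talend component), the generated DDL could be cleaned in a tJavaRow-style step before it is executed:

```java
// Sketch: strip DEFAULT clauses that MySQL forbids on BLOB, TEXT,
// GEOMETRY and JSON columns. DdlFixer is a hypothetical helper name.
public class DdlFixer {
    public static String stripIllegalDefaults(String ddl) {
        // Remove "DEFAULT <value>" that directly follows one of the
        // forbidden column types; other columns keep their defaults.
        return ddl.replaceAll(
                "(?i)\\b(BLOB|TEXT|GEOMETRY|JSON)\\s+DEFAULT\\s+('[^']*'|\\w+)",
                "$1");
    }

    public static void main(String[] args) {
        String ddl = "CREATE TABLE t (msrepl_tran_version TEXT DEFAULT '', id INT DEFAULT 0)";
        System.out.println(stripIllegalDefaults(ddl));
        // CREATE TABLE t (msrepl_tran_version TEXT, id INT DEFAULT 0)
    }
}
```

The INT column keeps its default; only the types MySQL complains about are touched.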
In addition, I was hoping to find a solution where one job could perform a whole database migration, e.g. iterating over the tables in the source DB and then generating output into the destination DB.
Please let me know whether I should proceed with Dynamic Schema or the job template, plus the job design/components/settings needed.
Thank you

Thank you for your advice.
I have created a parent job as you mentioned and connected it to the child job (from the blog).
However, there are a few things I'm not sure about.
1) How should I set the schema for the tRunJob component? Is it a dynamic column like the source?
Do I need to set a dynamic job or a context parameter?
2) How should I create and set the context variables? Is it something like the second screenshot?
Can you show me an example?
3) The number of columns and the data types in each table differ. My database contains about a hundred tables, each with a different number of columns and data types, so I was wondering how and what kind of data/code I need to add. Would you mind walking me through the procedure to add the extra data and Java code?
I've used a single database, but in reality you will likely be using different databases. It doesn't make much difference, but you will need to make sure that the database column types are the same if you are following this. It would be possible to add some code to dynamically change the column types, but this would require extra data in the column mapping table and some extra Java code. This is not covered here.
Parent Job
context variable
I use a global variable ((String)globalMap.get("tDBTableList_1_CURRENT_TABLE")) to define the table name in the child job (from the blog), so I thought about defining it in a similar way here, but I don't know how.
Child Job (from the blog)
In addition, you mentioned in the blog that three additional touch-ups are needed for multiple tables.
I think number 1 is what you mentioned earlier,
but I'm hoping you can elaborate more and show examples for numbers 2 and 3.
If it's needed in my case, please let me know.
1. Use Context Variables in your DB components
2. Add columns to your Column_Mapping table to hold the Table Name
You will need to add a bit more supporting data to your Column_Mapping table. If you add an "Old_Table_Name" column and a "New_Table_Name" column, you can query the Column_Mapping table using the "Old_Table_Name" field and the context.source Context variable. That will return the mapping configurations for your source table and return the new table name. This will need to be set as your context.target Context variable value.
3. Create a wrapper Job to call this Job and supply the Table data as Context Variables
The final step for this will be to create a wrapper Job. This is a Job that will query a data set (maybe your Column_Mapping table) to return a list of source tables to be migrated. This data will then be sent to this Job, run using a tRunJob. For every source table identified in the wrapper Job, this Job will be run. Therefore you can start the wrapper Job, it will return each of the source tables, and this Job will dynamically run for each of them.
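Conceptually, the wrapper Job quoted above boils down to a loop like this (a plain-Java sketch: `plan`, the hard-coded mapping rows, and the output strings are illustrative stand-ins for the Column_Mapping query and the tRunJob call, not Talend API):

```java
import java.util.ArrayList;
import java.util.List;

public class WrapperJobSketch {
    // Builds one "run child job" instruction per source table, mimicking
    // the wrapper Job feeding tRunJob via context.source / context.target.
    public static List<String> plan(List<String[]> mapping) {
        List<String> runs = new ArrayList<>();
        for (String[] row : mapping) {
            // row[0] = Old_Table_Name, row[1] = New_Table_Name
            runs.add("child job: source=" + row[0] + " target=" + row[1]);
        }
        return runs;
    }

    public static void main(String[] args) {
        // In the real Job this list comes from querying Column_Mapping.
        List<String[]> mapping = List.of(
                new String[]{"dbo.tbl_mem", "traveltour.tbl_mem"},
                new String[]{"dbo.tbl_tour", "traveltour.tbl_tour"});
        plan(mapping).forEach(System.out::println);
    }
}
```

In Talend itself this loop is the tDBTableList (or mapping-table query) iterating into a tRunJob, with the two values assigned to the child Job's context parameters.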
For your reference, this is the column_mapping input:
column_mapping
Please let me know if you need any clarification.
Thanks a lot

Hello,
"Drop" is the name I gave to a tDBRow component where I execute a query to drop all the tables ("EXEC sp_MSforeachtable 'DROP TABLE ?'").
It's not necessary and you can remove it, because you drop the table via the action on table setting in tDBOutput.
If you have a mapping error, it means that you did not drop the table before the extraction.
Use "drop table if exists".
For the tJava component, it is empty. It is there because tDBInput is an input component and does not allow a main link before it.
For the schema, there is only one column (name=data => type=Dynamic), and you map all possible tables. I have used it on projects with tables of 800 columns and of 3 columns without issues.
The only possible errors are due to the configuration of the database (see the parameters for rounding or truncation of char).
The job I sent you is working properly and I am using it to clone a DB.
You should add controls, for example: test whether there is a record in the source table and, with an IF trigger, execute the data extraction.

Thanks for getting back to me.
I made the changes as you mentioned; however, another error came up.
It says SQLServerException: 'the connection is closed'
sql_connection error
Full Error Message
Starting job test04 at 19:44 04/03/2020.
[statistics] connecting to socket on port 3944
[statistics] connected
Exception in component tDBTableList_1 (test04)
com.microsoft.sqlserver.jdbc.SQLServerException: 接続は閉じられています。(The connection is closed.)
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:191)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:710)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.checkClosed(SQLServerStatement.java:1071)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.checkClosed(SQLServerResultSet.java:387)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:1009)
at local_test.test04_0_1.test04.tDBTableList_1Process(test04.java:545)
at local_test.test04_0_1.test04.runJobInTOS(test04.java:3244)
at local_test.test04_0_1.test04.main(test04.java:3031)
[FATAL]: local_test.test04_0_1.test04 - tDBTableList_1 接続は閉じられています。(The connection is closed.)
com.microsoft.sqlserver.jdbc.SQLServerException: 接続は閉じられています。(The connection is closed.)
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:191)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.checkClosed(SQLServerConnection.java:710)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.checkClosed(SQLServerStatement.java:1071)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.checkClosed(SQLServerResultSet.java:387)
at com.microsoft.sqlserver.jdbc.SQLServerResultSet.next(SQLServerResultSet.java:1009)
at local_test.test04_0_1.test04.tDBTableList_1Process(test04.java:545)
at local_test.test04_0_1.test04.runJobInTOS(test04.java:3244)
at local_test.test04_0_1.test04.main(test04.java:3031)
[statistics] disconnected
Job test04 ended at 19:44 04/03/2020. [exit code=1]
I have checked my SQL Server and tried using it in another job, and everything works fine there...
I'm not sure why the connection is failing here...
For your reference:
dbconnection
tablelist

Check the tDBCommit/tDBRollback and disable "Close Connection".

Thank you so much!!!

Thanks for helping me out previously.
After comparing the input source with the output, I noticed that the primary key columns and composite primary keys are not generated in the output database.
Is there a way to migrate the primary key columns and composite primary keys together with the tables and their data (+ schemas)? My tables have around 1-4 primary key columns.
Source Database: MSSQL
mssql-input
Output: MySQL
mysql-output

Hello,
Like I said in my first post, the job I gave you is just for migrating data.
If you want to add the integrity constraints, you will need to develop a little, but it remains easy.
Here is the approach to automatically migrate the keys.
First, you will request the database schema and list all the integrity constraints.
Here is the query to do that. https://dataedo.com/kb/query/sql-server/list-all-table-constraints
You can use tDBInput to execute the query like the image below.
You recover the data into a file or a SQL Server table, then for each table you generate a query in order to update the structures and add the keys.
You will use tDBRow, and in it you will have a request in the style of
"ALTER TABLE GLOBAL_VARIABLE ADD CONSTRAINT ........". You can use tJavaRow to perform the concatenation, or a tMap, and you can iterate over the different constraints with tFlowToIterate, using the same pattern as the first job:
-> tJava (blank) ==> OnSubjobOk ==> tDBRow (query to add)
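The concatenation step could look like this in plain Java (a sketch: the class and method names are mine, and the parameters mirror the columns the constraint-listing query would return):

```java
public class AlterBuilder {
    // Builds the ALTER TABLE statement that the tJavaRow/tMap step
    // would inject into tDBRow. All three inputs come from the
    // constraint-listing query (table name, constraint name, key columns).
    public static String buildAddPk(String table, String pkName, String columns) {
        return "ALTER TABLE " + table
                + " ADD CONSTRAINT " + pkName
                + " PRIMARY KEY (" + columns + ");";
    }

    public static void main(String[] args) {
        System.out.println(buildAddPk("traveltour.tbl_mem", "PK_tbl_mem", "mem_id"));
        // ALTER TABLE traveltour.tbl_mem ADD CONSTRAINT PK_tbl_mem PRIMARY KEY (mem_id);
    }
}
```

In the Job, the resulting string is what the tFlowToIterate hands to the tDBRow for execution.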
In the example below, I use the same approach to create the tables and add a primary key. In my case, all my tables have an Id column as the key, so it's easy, but you just have to modify the query a bit and it will be fine.
If you want a last tip: create the tables with the constraints first, then migrate the data with the dynamic schema.
You can use a query to recover the columns and their types (or a combination of tDBTableList and tDBColumnList), carry out a conversion to the MySQL types, and generate your requests with tMap or tJava. You execute the creation of the tables with the constraints, then you migrate the data. You can do it in 3 jobs, and you already have 2. It's the same process used in the job attached (a migration from MySQL to Snowflake).
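For the type-conversion step, a minimal sketch of an MSSQL-to-MySQL type lookup (the mapping entries here are a small illustrative subset, not a complete table; unknown types pass through unchanged):

```java
import java.util.Map;

public class TypeMapper {
    // Illustrative MSSQL -> MySQL type mapping; extend for your schema.
    static final Map<String, String> TYPES = Map.of(
            "NVARCHAR", "VARCHAR",
            "DATETIME2", "DATETIME",
            "BIT", "TINYINT(1)",
            "UNIQUEIDENTIFIER", "CHAR(36)");

    public static String toMySql(String mssqlType) {
        // Types without a mapping entry are assumed compatible as-is.
        return TYPES.getOrDefault(mssqlType.toUpperCase(), mssqlType);
    }

    public static void main(String[] args) {
        System.out.println(toMySql("nvarchar")); // VARCHAR
        System.out.println(toMySql("INT"));      // INT
    }
}
```

This lookup would sit between the tDBColumnList output and the generated CREATE TABLE statements.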
Good luck
Snowflake-DynamicMigration-Studio7 (1).zip

Thanks for getting back to me.
Here I have tried creating the job design as you mentioned above; however, I'm not sure how it works here.
You will use tDBRow and in it you will have a request for style "ALTER TABLE GLOBAL_VARIABLE ADD CONSTRAINT ........" you can use tJavaRow to perform the
concatenation or a tMap and you can iterate on the different constraints with tFlowToIterate and
you can use like the first job -> tJava (blank) ==> On Subjub Ok ==> tDBRow (query to add)
I feel that this might not be the right one; could you please point out how I could improve?
job-design & input
tMap (schema)
Next, I want to ask about the statement for altering the table. I replaced 'global_variable' with the global variable string, but I'm not sure what I should put for the constraint:
'PK_' + global variable + PRIMARY KEY (?)
In my case, I usually have about 1-4 primary key columns, and the columns also differ by table.
What should I put for the primary keys? Is there something like a global variable for this?
**I'm still a novice in SQL.
dbRow
Thank you!

Did you check the job I gave you?
What you are trying to do requires a little programming.
You collect your keys from a base table.
Then you group each table with these keys (you can use tAggregateRow); as output you must have <Table_1; Pk_1, Pk_2, ...>.
Then in a tMap you add characters to form a request:
"ALTER TABLE " + GLOBAL_VAR (table name) + " ADD CONSTRAINT " + GLOBAL_VAR (list of table keys) + ";"
ALTER TABLE table_name (to replace with global variable) ADD CONSTRAINT table_PK_name (to replace with global variable) PRIMARY KEY (last_name, first_name); (to replace with global variable after aggregation)
In the image below I use the same principle to add columns to a database.
I generate a dynamic query that I inject into a tDBRow.
I just developed the job for you. I only created the request for the PK; it's up to you to develop the second part of the sub-job for foreign keys or indexes.
I can't do more than that; you have everything you normally need.
Sorry, I made an error in the script (inside the tMap: ALTER TABLE, not ALTER TABME).
(Add the connection to the SQL Server database and it will work without problem)
PK_DYNAMIC_GEN_MySQL.zip

Thank you for your clarification.
I was able to group each table with its keys as follows:
dynamic PK's job design
tMap
dynamic_pk mapping
Here's the result... somehow a table with one key has that key repeated:
traveltour.tbl_mem|ALTER TABLE traveltour.tbl_mem ADD CONSTRAINT PK_tbl_mem,PK_tbl_mem PRIMARY (mem_id,mem_id); Table Name = traveltour.tbl_mem AND the Query is = ALTER TABLE traveltour.tbl_mem ADD CONSTRAINT PK_tbl_mem,PK_tbl_mem PRIMARY (mem_id,mem_id);
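One way to get rid of the repeated key is to deduplicate the aggregated column list before joining it into the statement, e.g. (a sketch; `dedupJoin` is my own helper name, not a Talend function):

```java
import java.util.LinkedHashSet;
import java.util.List;

public class PkListFixer {
    // Joins the aggregated key columns, dropping duplicates such as the
    // repeated mem_id above, while keeping the original column order.
    public static String dedupJoin(List<String> columns) {
        return String.join(",", new LinkedHashSet<>(columns));
    }

    public static void main(String[] args) {
        System.out.println(dedupJoin(List.of("mem_id", "mem_id")));          // mem_id
        System.out.println(dedupJoin(List.of("last_name", "first_name")));   // last_name,first_name
    }
}
```

The same deduplication could be applied to the constraint-name list (PK_tbl_mem,PK_tbl_mem) before concatenation in the tMap.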
However, I still have a question about how I should set up tDBRow.
Here are my tDBRow component and schema settings.
DBRow's setting
For the schema, I made it the same as tMap's output (query_build).
I used a global variable ((String)globalMap.get("tDBTableList_1_CURRENT_TABLE")) for the table name.
As for the query, I'm not sure whether I could use a global variable here or the row name; I kind of used both...
For the PRIMARY KEY in the query statement, I put 'Id', as I'm not sure how I could refer to the key columns here, since I assume I can't use 'rowname.details' like I did in tMap.
I would appreciate it if you could elaborate more on the tDBRow component and schema (points mentioned above).
After making this work, I will move on to trying indexes and foreign keys.
Thank you so much!!
