Do not input private or sensitive data. View Qlik Privacy & Cookie Policy.
Skip to main content

Announcements
Qlik and ServiceNow Partner to Bring Trusted Enterprise Context into AI-Powered Workflows. Learn More!
cancel
Showing results for 
Search instead for 
Did you mean: 
bglaplante
Contributor III
Contributor III

Talend Open Studio tMysqlSCD - special characters cause new version always

I'm having an issue with the tMysqlSCD component.  I ran a load, and then ran another load without changing the input data.  So my expectation would be that there would be no new Type2 SCD records generated as nothing changed.  Surprise! I have some new SCD versions!  What?  HOW?

 

It appears that if the column being checked for changes contains special characters (in this case linefeed LF) there is some flaw in the SCD type 2 detection that will always create a new version.

 

This is a huge pain as a lot of our text fields can have special characters.  Any way to get around this?  I'm on 6.5.1 should that matter.

 

thanks, Bryan

 

Labels (2)
1 Solution

Accepted Solutions
bglaplante
Contributor III
Contributor III
Author

Solution:

in MySqlConnection, additional jdbc parameters, add utf8, for example:

noDatetimeStringSync=true&characterEncoding=utf8

 

Then the UTF characters are loaded to MySQL properly, and thus the SCD doesn't see differences.

View solution in original post

2 Replies
bglaplante
Contributor III
Contributor III
Author

Actually - a correction.  It isn't a LF character causing the issue (as trim() doesn't remove it).  A dump of an offending string finds extended ASCII characters, in this case characters 0xC2 and 0xAC.

 

Looking at the source system the data is "ul Podbipięty 27/1"  Note the special ę  character.  I think this is causing the issue.  The target table is MySql utf8mb4 --- but in MySQL workbench the character no longer looks like ę but instead is showing as a ?

 

My hunch here is that somehow the extended character is being modified by the time it lands in MySQL, and then the SCD is finding it to be 'different' (well it is), and then tries to do a type 2 update, which just repeats the problem.

bglaplante
Contributor III
Contributor III
Author

Solution:

in MySqlConnection, additional jdbc parameters, add utf8, for example:

noDatetimeStringSync=true&characterEncoding=utf8

 

Then the UTF characters are loaded to MySQL properly, and thus the SCD doesn't see differences.