Unlock a world of possibilities! Login now and discover the exclusive benefits awaiting you.
I'm having an issue with the tMysqlSCD component. I ran a load, and then ran another load without changing the input data. So my expectation would be that there would be no new Type2 SCD records generated as nothing changed. Surprise! I have some new SCD versions! What? HOW?
It appears that if the column being checked for changes contains special characters (in this case linefeed LF) there is some flaw in the SCD type 2 detection that will always create a new version.
This is a huge pain as a lot of our text fields can have special characters. Any way to get around this? I'm on 6.5.1 should that matter.
thanks, Bryan
Solution:
in MySqlConnection, additional jdbc parameters, add utf8, for example:
noDatetimeStringSync=true&characterEncoding=utf8
Then the UTF characters are loaded to MySQL properly, and thus the SCD doesn't see differences.
Actually - a correction. It isn't a LF character causing the issue (as trim() doesn't remove it). A dump of an offending string finds extended ASCII characters, in this case characters 0xC2 and 0xAC.
Looking at the source system the data is "ul Podbipięty 27/1" Note the special ę character. I think this is causing the issue. The target table is MySql utf8mb4 --- but in MySQL workbench the character no longer looks like ę but instead is showing as a ?
My hunch here is that somehow the extended character is being modified by the time it lands in MySQL, and then the SCD is finding it to be 'different' (well it is), and then tries to do a type 2 update, which just repeats the problem.
Solution:
in MySqlConnection, additional jdbc parameters, add utf8, for example:
noDatetimeStringSync=true&characterEncoding=utf8
Then the UTF characters are loaded to MySQL properly, and thus the SCD doesn't see differences.