Anonymous
Not applicable

!!! Fighting hard with control characters !!!

Hi!
I'm running Talend 5.0.1 and I'm fighting hard to get rid of, or replace, control characters... This is what I have:
1. tMysqlInput reading from a MySQL database with a utf8_general_ci collation, where some of the characters appear as an "em symbol" in the query output
2. tReplace where I'm trying to replace "\u0025" with an empty string ""
3. tMap component
4. tAdvancedFileOutput where the output encoding is set to UTF8
I thought I'd fix the problem by enclosing the text in "<![CDATA[ ... ]]>", but it didn't help. Furthermore, the tReplace component seems unable to replace the single "em" character when I search for "\u0025". If I don't enclose the text in the CDATA section, the control character is written to the file, which causes problems when I try to index the XML in another system...
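A note on why CDATA may not help here: the XML 1.0 specification restricts which characters may appear anywhere in a document, CDATA sections included, and most control characters below U+0020 (such as U+0019) fall outside that set. The sketch below is only an illustration, not part of the original job; the class and method names are hypothetical. It lists the code points in a string that XML 1.0 does not allow.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a Talend component or API.
final class XmlCharCheck {

    // Mirrors the "Char" production of the XML 1.0 specification.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the code points in the text that XML 1.0 does not allow.
    static List<String> findInvalid(String text) {
        List<String> hits = new ArrayList<String>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXml10Char(cp)) {
                hits.add(String.format("U+%04X", cp));
            }
            i += Character.charCount(cp);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample input containing two control characters.
        System.out.println(findInvalid("ok\u0019text\u001Cmore")); // prints [U+0019, U+001C]
    }
}
```

Running a check like this on a sample of the affected column would show exactly which code points need to be removed before the XML is written.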
Hope you're able to help me here because I'm 100% stuck with this...
Many thanks!
13 Replies
Anonymous
Not applicable
Author

Hmmmm, the characters I'm trying to remove are \u0019, \u0025, \u0028 and \u0029, and each of them is shown as a strange single character.
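One thing that may be worth checking: Java's \uXXXX escapes are hexadecimal. U+0019 is the control character END OF MEDIUM (EM), which would match the "em symbol" seen in the query output, whereas U+0025, U+0028 and U+0029 are the printable characters '%', '(' and ')'. If the numbers 25, 28 and 29 came from a decimal listing (an assumption, the thread does not say), the matching escapes would be \u0019, \u001C and \u001D. A small sketch, with a hypothetical class name, that prints what each escape actually is:

```java
// Hypothetical demo class; it only inspects the escapes quoted in the thread.
final class CodePointDemo {

    // Formats a character as its code point plus whether Java treats it as a control character.
    static String describe(char c) {
        return String.format("U+%04X control=%b", (int) c, Character.isISOControl(c));
    }

    public static void main(String[] args) {
        System.out.println(describe('\u0019')); // prints U+0019 control=true
        System.out.println(describe('\u0025')); // prints U+0025 control=false
        System.out.println('\u0025');           // prints %
        System.out.println('\u0028');           // prints (
        System.out.println('\u0029');           // prints )
        // Decimal 25 written in hexadecimal is 19.
        System.out.println(Integer.toHexString(25)); // prints 19
    }
}
```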
I checked in the original file and it is UTF8-encoded...
Cheers
Anonymous
Not applicable
Author

Hi!
I managed to find a workaround. This is what I did:
Using a tMap component, I invoked the 'replaceAll' method on the column causing the problem: <column>.replaceAll("","")
I hope that it will become possible to use a similar approach using the tReplace component in the future.
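The pattern that was actually passed to replaceAll did not survive in the post above, so the following is only a sketch of how such a tMap expression might look, not the original one. It assumes the goal is to drop every control character except tab, line feed and carriage return; the class name is hypothetical, and `\p{Cntrl}` is the control-character class of `java.util.regex.Pattern`.

```java
// Hypothetical helper; in a tMap expression the same call could be written inline,
// for example row1.myColumn.replaceAll(...) where row1.myColumn is a placeholder name.
final class ControlCharStripper {

    // Removes control characters but keeps tab, line feed and carriage return.
    static String strip(String value) {
        if (value == null) {
            return null; // input columns may be null
        }
        return value.replaceAll("[\\p{Cntrl}&&[^\\t\\n\\r]]", "");
    }

    public static void main(String[] args) {
        // The two control characters are removed, the tab is kept.
        System.out.println(strip("a\u0019b\u001Cc\td"));
    }
}
```

Keeping tab and line breaks is a design choice made for this sketch; using plain `"\\p{Cntrl}"` as the pattern would remove those as well.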
Cheers!
Anonymous
Not applicable
Author

Hi,
Couldn't you use the tReplace component with a regular expression that allows standard characters only? I'm not that good with regex, but I mean something like a pattern that allows only alphanumeric characters and spaces.
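As a rough sketch of that whitelist idea in plain Java (whether tReplace accepts the same pattern is not confirmed in this thread, and the class and method names are hypothetical): everything that is not a letter, a digit or a space gets removed. Note that the ASCII variant also drops accented letters and punctuation, so a Unicode-aware variant is shown next to it.

```java
// Hypothetical helper illustrating a whitelist-style replacement.
final class WhitelistFilter {

    // Keeps only ASCII letters, digits and spaces.
    static String keepAsciiAlnum(String value) {
        return value.replaceAll("[^A-Za-z0-9 ]", "");
    }

    // Keeps any Unicode letter or number, plus spaces.
    static String keepUnicodeAlnum(String value) {
        return value.replaceAll("[^\\p{L}\\p{N} ]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepAsciiAlnum("abc 123\u0019!"));    // prints abc 123
        System.out.println(keepUnicodeAlnum("caf\u00e9 5\u0019")); // keeps the accented letter
    }
}
```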
Hope this helps.
Regards,
Arno
Anonymous
Not applicable
Author

@avdbrink:
hmmm, not sure - I tried to replace something like "\u000c" for example but never got it to work with tReplace... it could be that I provided the parameters wrongly but...
The "workaround" I applied works fine, so I'll stick with that for the time being. Thanks for your suggestion though.
Cheers