Clayton1
Hello,
I have a txt file that contains what appears to be an en dash.
However, Talend doesn't seem to recognize it; it just shows up as a junk character.
I found this older post:
https://community.talend.com/t5/Archive/resolved-how-to-handle-Long-Dash/td-p/179874
but even after changing the encoding of my tFileInputDelimited component from "US-ASCII" to "UTF-8", the tLogRow output still shows the character as unreadable. (I was trying to get Talend to read the file correctly before moving on to the MSSQL database setup.)
–  (en dash)
-  (hyphen)
Thanks
Clayton1
For anyone else who needs to address this in the future, here is the solution, found with help from Talend Support.
I was not using the correct encoding (duh), so:
Change the encoding to "Custom" with the value "Windows-1252".
Thanks.
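Since Talend jobs run on Java, the same fix can be sketched in plain Java for anyone who wants to check a file outside the Studio. This is only a minimal illustration, not what tFileInputDelimited does internally: the class name, the helper methods and the temporary sample file are all made up for the example, and it assumes the JVM ships the "windows-1252" charset (common JDKs do).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ReadCp1252 {
    // Assumption: the input file was saved as Windows-1252 (as in the solution above).
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Writes a one-line sample file in which the en dash is stored as the single byte 0x96.
    static Path writeSample() {
        try {
            Path file = Files.createTempFile("polar-ice", ".txt");
            return Files.write(file, "Polar Ice \u2013 Gum".getBytes(CP1252));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads all lines of the file, decoding the bytes as Windows-1252.
    static List<String> readLines(Path file) {
        try {
            return Files.readAllLines(file, CP1252);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String line = readLines(writeSample()).get(0);
        // Print the code point rather than the character, so console encoding does not matter.
        System.out.printf("Character at index 10: U+%04X%n", (int) line.charAt(10));
    }
}
```

If the character at index 10 comes back as U+2013, the en dash survived the read.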
 
					
				
		
Hello,
Could you please try creating a delimited-file metadata entry and selecting a different encoding for your input file to see if that works? Would you mind posting some sample content from your txt file that contains the en dash?
Best regards
Sabrina
Clayton1
Hi Sabrina,
Yes, I have already tried changing the encoding in the advanced settings of the metadata (tFileInputDelimited).
Studio also has this configuration set: Talend > Specific Settings [checkbox] allow specific characters for .....
I've tried changing the encoding from US-ASCII to UTF-8 and ISO-8859-15, to no avail:
US-ASCII/UTF-8 shows a question mark
ISO-8859-15 shows a blank space
Below is an example of text that Talend does not read correctly:
"Polar Ice – Gum"   en dash  (Talend reads as "Polar Ice � Gum")
"Polar Ice - Gum"     hyphen   (Talend reads fine)
- hyphen
– N (en dash)
— M (em dash)
Thanks
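The symptoms listed above are consistent with a file saved as Windows-1252, where the en dash is the single byte 0x96. The sketch below decodes that byte with each of the encodings tried in this thread; it is an illustration under that assumption, and the class and method names are hypothetical. It prints code points instead of the characters themselves so the console encoding does not distort the result.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, for illustration only.
class DashDecodingDemo {
    // Decodes raw bytes with the named charset; malformed bytes become U+FFFD.
    static String decode(byte[] bytes, String charsetName) {
        return new String(bytes, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Assumption: the file is Windows-1252, so the en dash is stored as byte 0x96.
        byte[] raw = "Polar Ice \u2013 Gum".getBytes(Charset.forName("windows-1252"));

        String[] tried = {"US-ASCII", "UTF-8", "ISO-8859-15", "windows-1252"};
        for (String name : tried) {
            String text = decode(raw, name);
            // Index 10 is the position of the dash in "Polar Ice ? Gum".
            System.out.printf("%-12s -> U+%04X%n", name, (int) text.charAt(10));
        }
    }
}
```

US-ASCII and UTF-8 cannot interpret a lone 0x96 byte and substitute the replacement character U+FFFD, which is usually displayed as a question mark. ISO-8859-15 maps 0x96 to the invisible control character U+0096, which would explain the "blank space". Only Windows-1252 yields U+2013, the en dash.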
Hello,
Thanks for sharing your solution with us.
Best regards
Sabrina
CBailey0504
Neither UTF-8 nor Windows-1252 encoding helped when I was doing a bulk insert between two SQL Server databases. The bulk insert writes to a text file, and that is where the data got distorted.
Changing the encoding to UTF-16 did work, but it makes the text file twice as large and impacts performance a bit.
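The size difference can be checked with a few lines of Java. This is a rough sketch, not a measurement of the actual bulk file: the class and method names are hypothetical, the sample string is the one from earlier in the thread, and it assumes UTF-16 little-endian without a byte order mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class EncodedSizeDemo {
    // Returns how many bytes the text occupies in the given encoding.
    static int encodedSize(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "Polar Ice \u2013 Gum"; // 15 characters, one of them an en dash

        System.out.println("windows-1252: " + encodedSize(text, Charset.forName("windows-1252")));
        System.out.println("UTF-8:        " + encodedSize(text, StandardCharsets.UTF_8));
        System.out.println("UTF-16LE:     " + encodedSize(text, StandardCharsets.UTF_16LE));
    }
}
```

For mostly ASCII text like this, Windows-1252 uses one byte per character and UTF-16 uses two, which matches the "twice as large" observation. UTF-8 sits in between, spending three bytes on the en dash and one on everything else.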
