Anonymous
Not applicable

Convert a CSV UTF-8 to UTF-8-BOM

Hello,

 

Please, I need some help.
I have to output a CSV file in UTF-8-BOM for my client, because in plain UTF-8 several special characters are not encoded properly...

 

In this topic, I found a routine that can re-encode a CSV file, but not to UTF-8-BOM...

https://community.talend.com/t5/Design-and-Development/tChangeFileEncoding-and-UTF8-encoding/m-p/149...

 

Does anybody have a solution to this problem?

PS: without going through the custom component tWriteHeaderLineToFileWithBOM

1 Solution

Accepted Solutions
Anonymous
Not applicable
Author

To do this you will need to add a bit of code. I have just tested it, and it appears to work if you test the file here: https://validator.w3.org/i18n-checker/check#validate-by-upload+

 

The way to do this is to add a tJava to the beginning of your flow (before the file output component). Then add this code (or a variant of this code) to the tJava:

 

String filename = "/Users/richardhall/Downloads/test.CSV";

// try-with-resources closes the stream automatically, so no explicit close() is needed
try (OutputStream out = new FileOutputStream(filename)) {

	// write the 3-byte UTF-8 BOM (EF BB BF) into the new, empty file
	out.write(0xEF);
	out.write(0xBB);
	out.write(0xBF);

} catch (IOException e) {
	e.printStackTrace();
}

You will also need to add these imports to the tJava's advanced settings...

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

What this does is create the file and write the 3 BOM bytes at the beginning of it. The only change you need to make to the file output component is to set it to Append and to not fail if the file already exists.
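As a rough illustration of the two steps described above, here is a minimal standalone sketch, not the code generated by Talend. The class name `BomThenAppend`, the helper methods, the temp file and the sample rows are all made up for the example; `appendRows` only stands in for what the file output component does in Append mode.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class: BOM first, then append the data.
final class BomThenAppend {

    // Step 1 (the tJava part): create or truncate the file and write only the UTF-8 BOM.
    static void writeBom(String filename) {
        try (OutputStream out = new FileOutputStream(filename)) {
            out.write(new byte[] {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF});
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Step 2 (stand-in for the file output component in Append mode):
    // reopen the same file for appending and add UTF-8 encoded rows after the BOM.
    static void appendRows(String filename, String rows) {
        try (OutputStream out = new FileOutputStream(filename, true)) {
            out.write(rows.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs both steps on a temporary file and returns the resulting bytes.
    static byte[] runDemo() {
        try {
            Path tmp = Files.createTempFile("bom-demo", ".csv");
            try {
                writeBom(tmp.toString());
                appendRows(tmp.toString(), "id;name\n1;caf\u00e9\n"); // sample rows, made up
                return Files.readAllBytes(tmp);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = runDemo();
        // Show the first three bytes of the file in hex.
        System.out.printf("%02X %02X %02X%n", bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF);
    }
}
```

The important detail is the second constructor argument `true` in `appendRows`: without append mode the writer would truncate the file and the BOM written in step 1 would be lost.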

 

 


11 Replies
Anonymous
Not applicable
Author

@rhall thank you for your answer!

 

I don't understand how I can write the file path in the tJava before my file is created?

String filename = "/Users/richardhall/Downloads/test.CSV";

 

Do you know how to modify this routine and add your code so that it encodes in UTF-8-BOM (if the selected conversion is UTF-8, for example), please?

 

Thank you again!


routine_encoding.txt
Anonymous
Not applicable
Author

Sorry, that was my filepath I left in there 🙂

 

If you set the filepath in your component with a context variable, just use the same context variable. Otherwise hard code it as I did in my test. Probably best to use a context variable though 🙂

Anonymous
Not applicable
Author

@rhall OK, thank you.
So, I can set my file path (context variable) before the file exists?

 

Thanks !!

 

 

PS:

I tried it but it doesn't work... The generated CSV file is UTF-8, not UTF-8-BOM.

Opened in Excel, the special characters are wrong (é = Ã©)...
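For readers wondering where this kind of garbling comes from: a likely explanation (an assumption, the thread does not confirm how Excel read the file) is that a BOM-less UTF-8 file gets decoded with a legacy single-byte encoding. The small sketch below reproduces the effect with ISO-8859-1 as a stand-in for such an encoding; the class name `MojibakeDemo` and the method `misread` are made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing how UTF-8 bytes look when decoded with the wrong charset.
final class MojibakeDemo {

    // Encode the text as UTF-8, then decode those same bytes as ISO-8859-1,
    // which mimics a reader that wrongly assumes a single-byte encoding.
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String shown = misread("\u00e9"); // "é" is U+00E9, two bytes (C3 A9) in UTF-8
        // Print code points instead of the characters to stay independent of the console encoding.
        for (char c : shown.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

The single character é becomes two characters, U+00C3 (Ã) and U+00A9 (©), which matches the symptom seen in the spreadsheet. A BOM at the start of the file is a hint that lets the reader pick UTF-8 instead.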

 

 


UTF8.JPG
Anonymous
Not applicable
Author

Check the file with the link I provided. That worked for me. I have no other way of testing, so maybe you can tell me how you tested. Run the job without the Java code and test the file with the link I gave. Then run the job with the code switched on and test the file. You should see that the BOM characters are recognised.
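If an online validator is not convenient, the same before/after comparison can be done locally by looking at the first three bytes of the file. This is only a minimal sketch under my own assumptions: the class name `BomCheck` and the method `startsWithUtf8Bom` are hypothetical, and the file path is expected as the first command-line argument.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: reports whether a file begins with the UTF-8 BOM.
final class BomCheck {

    // True when the given leading bytes start with EF BB BF.
    static boolean startsWithUtf8Bom(byte[] head) {
        return head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomCheck <path-to-csv>");
            return;
        }
        byte[] head = new byte[3];
        int total = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            // Read up to three bytes; a single read() call may return fewer.
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break; // file is shorter than three bytes
                }
                total += n;
            }
        }
        boolean bom = startsWithUtf8Bom(Arrays.copyOf(head, total));
        System.out.println(bom ? "UTF-8 BOM found" : "no UTF-8 BOM");
    }
}
```

Running it once on the file produced without the tJava and once on the file produced with it should show the difference directly.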

Anonymous
Not applicable
Author

@rhall 

 

In my tFileOutputDelimited component, which encoding do I have to set?

Do I have to enable "write after", please?

 

Thanks !

 

 

PS: I have tested it; my tFileOutputDelimited component converts the CSV file from UTF-8-BOM to UTF-8...

 

 

IT WORKED!! I had to enable "write after" in the tFileOutputDelimited component!

 

@rhall thank you so much!!

Anonymous
Not applicable
Author

No problem 🙂

Anonymous
Not applicable
Author

@rhall 

I have just one last problem: with this solution I can't get the title row in my CSV...

 

Can you tell me how to add it?
Maybe in the tJava, like out.write("TitleColumnA; TitleColumnB"); ??

Anonymous
Not applicable
Author

Why can't you add your title? I don't understand.
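If the title row does end up being written from the tJava, note that `OutputStream.write` does not accept a String, so the `out.write("...")` idea needs a byte conversion first. Below is a minimal sketch of that variant, assuming the output component is then configured not to write a header of its own. The class name `BomWithHeader`, its methods and the `\n` line ending are my own choices, not something confirmed in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical variant of the tJava code: BOM followed by a header line.
final class BomWithHeader {

    // Builds the bytes to put at the start of the file: EF BB BF, then the header in UTF-8.
    static byte[] bomAndHeader(String headerLine) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        buf.write(0xEF);
        buf.write(0xBB);
        buf.write(0xBF);
        byte[] header = (headerLine + "\n").getBytes(StandardCharsets.UTF_8);
        buf.write(header, 0, header.length);
        return buf.toByteArray();
    }

    // Creates (or truncates) the file and writes the BOM plus header;
    // the data rows are expected to be appended afterwards.
    static void writeTo(String filename, String headerLine) throws IOException {
        try (OutputStream out = new FileOutputStream(filename)) {
            out.write(bomAndHeader(headerLine));
        }
    }

    public static void main(String[] args) {
        byte[] start = bomAndHeader("TitleColumnA;TitleColumnB");
        System.out.println(start.length + " bytes would be written before the data rows");
    }
}
```

In a tJava, only the body of `writeTo` would be needed, with the filename taken from the same context variable used by the file output component.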