Anonymous
Not applicable

Conversion binary->text to a PDF file with Talend Open Studio Data Integration

Hi,

 

I’m new to Talend and I have an issue. I want to extract a PDF file from a PostgreSQL server (BLOB column) and convert the binary file to a text file with Talend Open Studio, so that I can read the contents of the PDF.

Here is my approach:

This is my postgresql table.

 

Table name : attached_file

 

id;name;mime_type;data

1;TEST_report_1.html;text/html;<binary data>

2;pdffile1.pdf;application;<binary data>

 

I extracted the file with this SQL query:

 

COPY (SELECT data FROM attached_file WHERE id = 2) TO 'D:/_users/BMI/testpdf.txt' (FORMAT binary);
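
A note on this command: with FORMAT binary, PostgreSQL wraps the column data in its own binary COPY container (an 11-byte signature, a flags word, a header extension, then a field count and a length word per field), so the exported file is not the raw PDF bytes. The following is a minimal sketch, assuming a dump with one row and one column like the query above, of how the field bytes could be pulled back out in Java. The class and method names are hypothetical and not part of Talend or PostgreSQL.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.util.Arrays;

public class PgCopyBinary {
    // 11-byte signature that starts every PostgreSQL binary COPY file.
    private static final byte[] SIGNATURE = {
        'P', 'G', 'C', 'O', 'P', 'Y', '\n', (byte) 0xFF, '\r', '\n', 0
    };

    /** Returns the raw bytes of the first field of the first row in the dump. */
    public static byte[] firstField(byte[] dump) throws IOException {
        // DataInputStream reads big-endian, which matches the COPY format.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(dump));

        byte[] sig = new byte[SIGNATURE.length];
        in.readFully(sig);
        if (!Arrays.equals(sig, SIGNATURE)) {
            throw new IOException("Not a PostgreSQL binary COPY file");
        }

        in.readInt();                         // flags word (ignored here)
        int extensionLength = in.readInt();   // header extension length
        in.readFully(new byte[extensionLength]);

        short fieldCount = in.readShort();    // -1 would mean the file trailer
        if (fieldCount < 1) {
            throw new IOException("Dump contains no row");
        }

        int length = in.readInt();            // -1 means the field is NULL
        if (length < 0) {
            throw new IOException("Field is NULL");
        }
        byte[] data = new byte[length];
        in.readFully(data);
        return data;
    }
}
```

For example, `PgCopyBinary.firstField(Files.readAllBytes(Paths.get("D:/_users/BMI/testpdf.txt")))` would return the content of the data column for that row. Reading the column directly from the database inside the job avoids this unwrapping step.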

 

I heard that to open the binary file, I must use a tFileInputRaw component in “Read the file as a bytes array” mode. However, I’m stuck on the output step that creates the final file.

 

Regards,

Bastien

2 Replies
TRF
Champion II

Hi,

I think you need to use a tJavaXxxx component for this.

Search Google for "java convert binary to pdf"; it will give you some examples:

http://stackoverflow.com/questions/1131116/pdf-to-byte-array-and-vice-versa

http://myjavaprogramming.blogspot.fr/2011/09/convert-byte-array-to-pdf-in-java.html
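
The idea behind those examples can be sketched as follows: once the column content is available as a byte array, writing it unchanged to a file with a .pdf extension recreates the PDF. This is a minimal sketch using only the Java standard library; the class and method names are hypothetical, and in Talend the logic would presumably live in a custom routine or a tJavaXxxx component. Checking for the "%PDF-" marker is an optional sanity check, not something Talend requires.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class BytesToPdf {
    // Every PDF file starts with this ASCII marker.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the PDF header marker. */
    public static boolean looksLikePdf(byte[] data) {
        if (data == null || data.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (data[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Writes the bytes unchanged to the target path and returns that path. */
    public static Path writePdf(byte[] data, Path target) throws IOException {
        if (!looksLikePdf(data)) {
            throw new IllegalArgumentException("Data does not start with %PDF-");
        }
        return Files.write(target, data);
    }
}
```

Note that this only recreates the PDF file. Getting readable text out of it is a separate step that needs a PDF library (Apache PDFBox is one option), which is not shown here.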

 

Anonymous
Not applicable
Author

Hi,

Thank you for your answer. I read that.

Bastien