Anonymous
Not applicable

[resolved] decode PDF

I have a list of .txt files that contain encoded PDFs (base64). I am trying to decode and save them back to .pdf files. I am starting with one .txt file to test. 
tFileList ---> tFileInputDelimited -----> tJavaRow ------> tFileOutputDelimited
In tFileInputDelimited, I set the row separator to something like "\nnnnnnnnnnnnnnn\nnnnnnnn" so that the whole file is treated as one row.
In tJavaRow, I have:
  byte[] buf = new sun.misc.BASE64Decoder().decodeBuffer(input_row.pdf_in);
  output_row.pdf_out = new String(buf);
But the output file test.pdf is not readable (Adobe Reader reports that it is damaged and could not be repaired).
What am I doing wrong?

Accepted Solutions
Anonymous
Not applicable
Author

I suggest you test the extraction and decoding outside Talend in a simple Java project. Once you know how to do it correctly, you can apply that knowledge in a Talend job. By the way, I would create a routine instead of coding it entirely in a tJavaRow. A static method in a routine can easily be developed and tested outside Talend.
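
For what it's worth, a standalone test along those lines could look roughly like the sketch below. It is only an illustration, not a confirmed fix for this job: the class name PdfDecodeRoutine and both method names are made up, and it uses java.util.Base64 (Java 8 or later) in place of sun.misc.BASE64Decoder. The one thing it does differently from the tJavaRow code above is that the decoded bytes are never passed through new String(buf). A PDF is binary data, and converting arbitrary bytes to a String and back is generally lossy, which would be consistent with a file that Adobe Reader reports as damaged.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical routine class; the class and method names are illustrative only.
class PdfDecodeRoutine {

    // Decode Base64 text to raw bytes. The MIME decoder ignores line breaks
    // and any other characters outside the Base64 alphabet.
    static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    // Read an encoded .txt file and write the decoded bytes directly to the
    // target file. The binary content is never converted to a String.
    static void decodeFile(String inPath, String outPath) throws IOException {
        byte[] encoded = Files.readAllBytes(Paths.get(inPath));
        byte[] pdf = Base64.getMimeDecoder().decode(encoded);
        Files.write(Paths.get(outPath), pdf);
    }

    // Usage: java PdfDecodeRoutine <encoded.txt> <decoded.pdf>
    public static void main(String[] args) throws IOException {
        decodeFile(args[0], args[1]);
    }
}
```

If this produces a readable PDF outside Talend, the same static method could be called from the job, provided the decoded column is carried as byte[] and written out as raw bytes rather than through a delimited text output.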
