Solved: [resolved] decode PDF - Qlik Community

Anonymous · ‎2014-11-17

I have a list of .txt files that contain encoded PDFs (base64). I am trying to decode and save them back to .pdf files. I am starting with one .txt file to test.
tFileList ---> tFileInputDelimited -----> tJavaRow ------> tFileOutputDelimited
In tFileInputDelimited, I set the row separator to something like "\nnnnnnnnnnnnnnn\nnnnnnnn" so that the whole file is treated as one row.
In tJavaRow,
byte[] buf = new sun.misc.BASE64Decoder().decodeBuffer(input_row.pdf_in);
output_row.pdf_out = new String(buf);
but the output file test.pdf is not readable (Adobe Reader: damaged and could not be repaired).
What am I doing wrong?

Anonymous · ‎2014-11-18

I suggest you test the extraction and decoding outside Talend in a simple Java project first. Once you know how to do it right, you can apply that in a Talend job. By the way, I would create a routine instead of coding it entirely in a tJavaRow. A static method in a routine can easily be developed and tested outside Talend.
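
As a starting point for that standalone test, here is a minimal sketch of such a routine. A likely cause of the damaged file is the `new String(buf)` step: it converts binary PDF bytes to text using the platform charset, which is lossy, and tFileOutputDelimited then writes text rather than raw bytes. The sketch keeps the decoded data as `byte[]` and writes it directly to disk. The class name `PdfBase64Routine` and both method names are hypothetical, and it assumes Java 8 or later for `java.util.Base64`; on an older JVM a different decoder would be needed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

// Hypothetical routine class; the names are illustrative, not from the original post.
public class PdfBase64Routine {

    // Decode base64 text into raw bytes. The MIME decoder ignores line
    // breaks and other characters outside the base64 alphabet, which
    // helps when the encoded text is wrapped across many lines.
    public static byte[] decode(String base64Text) {
        return Base64.getMimeDecoder().decode(base64Text);
    }

    // Read a base64 .txt file and write the decoded bytes to outPath.
    // The bytes are never converted to a String after decoding, so the
    // binary content is written unchanged.
    public static void decodeFile(String inPath, String outPath) throws IOException {
        String text = new String(Files.readAllBytes(Paths.get(inPath)), StandardCharsets.US_ASCII);
        Files.write(Paths.get(outPath), decode(text));
    }
}
```

Outside Talend this can be tested with a small main method or unit test that calls `PdfBase64Routine.decodeFile("test.txt", "test.pdf")`. Inside a job, one option would be to call the same method from a tJava or tJavaRow inside the tFileList iteration, passing the current file path, so that no delimited input or output component touches the binary data.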

Java

Talend Data Integration

v5.x