Hello,
Is there a way to merge several PDF files into one in Talend Studio?
Thank you for your help.
Hervé.
Hi, I think you can do it with a custom component or with hand-written Java code.
You can find custom components on https://exchange.talend.com/
For the Java approach, you can use org.apache.pdfbox
with this routine:
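import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

// Written against the PDFBox 2.x API. Place the method inside your routine class.
public static void mergePDFFiles(List<File> files, String mergedFileName) {
    try {
        PDFMergerUtility pdfMerger = new PDFMergerUtility();
        // Set the output path once, before queuing any sources.
        pdfMerger.setDestinationFileName(mergedFileName);
        for (File file : files) {
            // Queue each source; the merger opens the files itself.
            pdfMerger.addSource(file);
        }
        // Merge once, after all sources have been added.
        pdfMerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    } catch (IOException e) {
        System.err.println("Error merging files: " + e.getMessage());
    }
}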
Hello Jeremy,
Thanks for your answer.
I searched for "merge PDF" in the custom components, but without success.
I'm not an expert in Studio, so I'm not sure I understand how to do the second approach you mentioned.
But I'll try as soon as I can.
Hervé.
Hi,
Instead of a Java routine, you can alternatively use Python together with Talend Open Studio.
Steps:
1) Create a Python file named "pdf_merge.py" on your local machine, in the same folder as the PDF files.
2) Reuse the code snippet below:
# Targets PyPDF2 before 3.0; later releases renamed PdfFileMerger to PdfMerger.
from PyPDF2 import PdfFileMerger

fileCount = 116  # Change to the number of files you have
mergedObject = PdfFileMerger()
# Files are expected to be named Test_pdf_1.pdf ... Test_pdf_<fileCount>.pdf
for fileNumber in range(1, fileCount + 1):
    mergedObject.append('Test_pdf_' + str(fileNumber) + '.pdf')
mergedObject.write("mergedfilesoutput.pdf")
mergedObject.close()
3) Add it to your existing Talend graph flow.
For Example,
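A note on step 2: if the number of files is not known in advance, the script could discover them instead of hard-coding the range. The sketch below is only one possible variant, not part of the original answer. The function names numbered_pdfs and merge_pdfs are made up for illustration, and it assumes the same Test_pdf_<number>.pdf naming and a PyPDF2 version before 3.0.

```python
import re
from pathlib import Path


def numbered_pdfs(folder, prefix="Test_pdf_"):
    """Return the files named <prefix><number>.pdf in folder, ordered by number."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.pdf", re.IGNORECASE)
    found = []
    for path in Path(folder).iterdir():
        match = pattern.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Sort numerically so that file 10 comes after file 9, not after file 1.
    found.sort(key=lambda item: item[0])
    return [path for _, path in found]


def merge_pdfs(folder, output_name="mergedfilesoutput.pdf"):
    """Merge every numbered PDF in folder into output_name (hypothetical helper)."""
    # Imported here so the listing function above works without PyPDF2 installed.
    # Assumes PyPDF2 < 3.0, where the class is still called PdfFileMerger.
    from PyPDF2 import PdfFileMerger

    merger = PdfFileMerger()
    for path in numbered_pdfs(folder):
        merger.append(str(path))
    merger.write(str(Path(folder) / output_name))
    merger.close()
```

Calling merge_pdfs(".") at the end of pdf_merge.py would then merge whatever numbered files are present in the script's working directory.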