hcroce
Contributor

Merge several PDFs into one

Hello,

Is there a way to merge several PDFs into one in Talend Studio?

Thank you for your help.

Hervé.

3 Replies
gjeremy1617088143
Creator III

Hi, I think you can do it with a custom component or with hand-written Java code.

You can find custom components on https://exchange.talend.com/

For the Java approach, you can use org.apache.pdfbox with this routine:

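import java.io.File;
import java.io.IOException;
import java.util.List;
import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.multipdf.PDFMergerUtility;

// Assumes PDFBox 2.x and an existing "logger" field in the enclosing class.
public void mergePDFFiles(List<File> files, String mergedFileName) {
    try {
        PDFMergerUtility pdfmerger = new PDFMergerUtility();
        pdfmerger.setDestinationFileName(mergedFileName);
        // Register every source file first, in the order they should appear.
        for (File file : files) {
            pdfmerger.addSource(file);
        }
        // Merge once, after all sources have been added.
        pdfmerger.mergeDocuments(MemoryUsageSetting.setupTempFileOnly());
    } catch (IOException e) {
        logger.error("Error merging files: " + e.getMessage());
    }
}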

 


hcroce
Contributor
Author

Hello Jeremy,

Thanks for your answer.

 

I searched for "merge PDF" in the custom components, but without success.

 

I'm not an expert in Studio, so I'm not sure I understand how to do the second approach you mentioned.

But I'll try as soon as I can.

 

Hervé.

Sourav_Roy
Contributor II

Hi,

 

Instead of using Java routines, you can alternatively use Python together with Talend Open Studio.

 

Steps to achieve the same:

1) Create a Python file named "pdf_merge.py" on your local machine, in the same folder as the PDF files.

2) Reuse the code snippet below:

from PyPDF2 import PdfFileMerger  # newer PyPDF2 releases call this class PdfMerger

mergedObject = PdfFileMerger()

# Change 117 to the number of files you have plus one (e.g. 116 files -> 117)
for fileNumber in range(1, 117):
    mergedObject.append('Test_pdf_' + str(fileNumber) + '.pdf')

mergedObject.write("mergedfilesoutput.pdf")
mergedObject.close()

3) Add it to your existing Talend job flow. For example:

[Screenshot of the Talend job: 0695b00000Get0AAAR.png]
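
The hard-coded range in step 2 has to be edited whenever the number of files changes. As a possible alternative, here is a minimal sketch that finds the files itself and sorts them by their number. It assumes the files follow the Test_pdf_<number>.pdf naming from the snippet above and that an older PyPDF2 release providing PdfFileMerger is installed. The function names numbered_pdfs and merge_pdfs are made up for this example.

```python
import re
from pathlib import Path


def numbered_pdfs(folder, prefix="Test_pdf_"):
    """Return files named <prefix><number>.pdf in `folder`, sorted by number."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.pdf$", re.IGNORECASE)
    found = []
    for path in Path(folder).iterdir():
        match = pattern.match(path.name)
        if match and path.is_file():
            # Keep the number as an int so that 2 sorts before 10.
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


def merge_pdfs(paths, output):
    """Append each PDF in `paths`, in order, and write the result to `output`."""
    # Imported here so the file listing above works without PyPDF2 installed.
    # Assumption: an older PyPDF2 release that still provides PdfFileMerger.
    from PyPDF2 import PdfFileMerger

    merger = PdfFileMerger()
    for path in paths:
        merger.append(str(path))
    merger.write(str(output))
    merger.close()
```

To merge everything in the current folder, call merge_pdfs(numbered_pdfs("."), "mergedfilesoutput.pdf") at the end of pdf_merge.py. The output name does not match the Test_pdf_<number>.pdf pattern, so a previous result is not picked up as an input on the next run.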