Anonymous
Not applicable

Split a PDF into single pages

Hi,

I need a way to split PDFs into single pages within a Talend job so that I can process each page further.

Does anybody have a good solution for this?

Thanks


Accepted Solutions
Anonymous
Not applicable
Author

In the meantime I've found a solution, so I thought I'd post it here in case someone needs it.

I've written a small routine:

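package routines;

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

// The class name is a placeholder; use the name of your own Talend routine.
public class PdfSplitter {

    // Splits the PDF at path arg into one file per page: <directory>/1.pdf, <directory>/2.pdf, ...
    // Written against the PDFBox 2.x API (in PDFBox 3.x, PDDocument.load was replaced by Loader.loadPDF).
    public static void splitPdf(String arg, String directory) throws IOException {
        // try-with-resources closes the source document even if splitting or saving fails
        try (PDDocument document = PDDocument.load(new File(arg))) {
            List<PDDocument> pages = new Splitter().split(document);

            int i = 1;
            for (PDDocument page : pages) {
                try {
                    // new File(directory, name) works with or without a trailing separator on directory
                    page.save(new File(directory, i + ".pdf"));
                } finally {
                    // every split page is its own PDDocument and has to be closed as well
                    page.close();
                }
                i++;
            }
        }
    }
}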

It takes the given PDF and writes each page as a separate file (1.pdf, 2.pdf, ...) into the given directory.
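One thing to watch with the routine above: the output files are named only by page number, so splitting a second PDF into the same directory overwrites the pages of the first. A possible way around that is to derive the file name from the source PDF. The sketch below is only a suggestion and not part of the original routine; the class PageNames and the methods baseName and pageFile are made-up names, and it uses nothing but the Java standard library. The page number is zero-padded so that the files sort in page order.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper (not part of PDFBox or Talend) for naming the split pages.
final class PageNames {

    private PageNames() {}

    // Returns the file name without its last extension, e.g. "/in/invoice.pdf" -> "invoice".
    static String baseName(String sourcePdf) {
        String name = Paths.get(sourcePdf).getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    // Builds <directory>/<base>_<page>.pdf, with the page number zero-padded
    // to the number of digits in totalPages.
    static Path pageFile(String directory, String sourcePdf, int page, int totalPages) {
        int width = String.valueOf(totalPages).length();
        String fileName = String.format(Locale.ROOT, "%s_%0" + width + "d.pdf",
                baseName(sourcePdf), page);
        return Paths.get(directory).resolve(fileName);
    }
}
```

Inside the loop of the routine, the save call could then take PageNames.pageFile(directory, arg, i, pages.size()).toFile() as its target instead of the plain page number.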

