The `export:images` command walks every page of a PDF and saves each embedded image to a separate file. Output filenames follow the pattern `<prefix>-<n>.<ext>`, where the extension is determined by how the image is encoded in the PDF (for example `.jpg`, `.png`, `.tiff`, or `.jp2`). Duplicate image objects are skipped automatically, so an image shared by several pages is written only once.
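The naming and de-duplication behaviour described above can be illustrated with a small, self-contained sketch. This is not PDFBox's implementation: the class and method names are hypothetical, the counter starting at 1 is an assumption, and "duplicate" is modelled here as "the same object instance seen again".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only; not part of PDFBox.
final class ImageNamingSketch {

    /** Default prefix: the input path with its final extension removed. */
    static String defaultPrefix(String inputPath) {
        int slash = Math.max(inputPath.lastIndexOf('/'), inputPath.lastIndexOf('\\'));
        int dot = inputPath.lastIndexOf('.');
        // Only strip a dot that belongs to the file name and is not its first character.
        return dot > slash + 1 ? inputPath.substring(0, dot) : inputPath;
    }

    /** Builds a name following the <prefix>-<n>.<ext> pattern. */
    static String outputName(String prefix, int n, String ext) {
        return prefix + "-" + n + "." + ext;
    }

    /** Assigns one output name per distinct image object, skipping repeats. */
    static List<String> namesFor(String prefix, List<Object> images, String ext) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        List<String> names = new ArrayList<>();
        int n = 1; // assumption: numbering starts at 1
        for (Object image : images) {
            if (seen.add(image)) { // add() returns false for an instance already seen
                names.add(outputName(prefix, n++, ext));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Object shared = new Object(); // stands in for an image used on two pages
        List<Object> images = new ArrayList<>();
        images.add(shared);
        images.add(new Object());
        images.add(shared);
        System.out.println(namesFor(defaultPrefix("/docs/report.pdf"), images, "png"));
    }
}
```

Tracking images by identity rather than by content keeps the check cheap, at the cost of missing two separate objects that happen to hold identical pixels.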
Usage
Options
| Option | Default | Description |
|---|---|---|
| `-i`, `--input` | (required) | Path to the input PDF file |
| `-password` | (none) | Password to open an encrypted PDF or certificate keystore |
| `-prefix` | (auto) | Filename prefix for extracted images; defaults to the input PDF path without extension |
| `-useDirectJPEG` | `false` | Force raw extraction of JPEG/JPX streams regardless of colorspace or masking |
| `-noColorConvert` | `false` | Extract images in their original colorspace where possible (CMYK images may be written as TIFF) |
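
As a rough illustration of how these options combine, the sketch below assembles an argument list for the command from Java. Several details are assumptions rather than documented facts: that the tool is launched with `java -jar` against an application JAR, the JAR file name `pdfbox-app.jar`, and that option values are passed as separate tokens. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of PDFBox.
final class ExportImagesCommandSketch {

    /** Builds the argument list; null means "leave this option at its default". */
    static List<String> buildCommand(String appJar, String inputPdf, String prefix,
                                     String password, boolean useDirectJpeg,
                                     boolean noColorConvert) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(appJar);          // assumption: path to the PDFBox application JAR
        cmd.add("export:images");
        cmd.add("-i");            // the only required option
        cmd.add(inputPdf);
        if (prefix != null) {
            cmd.add("-prefix");
            cmd.add(prefix);
        }
        if (password != null) {
            cmd.add("-password");
            cmd.add(password);
        }
        if (useDirectJpeg) {
            cmd.add("-useDirectJPEG");
        }
        if (noColorConvert) {
            cmd.add("-noColorConvert");
        }
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand("pdfbox-app.jar", "report.pdf",
                                        "out/img", null, false, true);
        System.out.println(String.join(" ", cmd));
        // To actually launch it, one could pass the list to ProcessBuilder:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Keeping the arguments as a list instead of a single string avoids quoting problems when the input path or prefix contains spaces.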
Output format
The file extension is chosen automatically based on the image encoding stored in the PDF:

- JPEG (`DCTDecode`) → `.jpg`
- JPEG 2000 (`JPXDecode`) → `.jp2`
- CCITT/grayscale → `.tiff`
- JBIG2 and all other formats → `.png`
- Images with alpha masks → `.png` (regardless of original encoding)

When `-noColorConvert` is set, images with more than 3 channels (such as CMYK) are written as `.tiff`; a TIFF codec must be available on the classpath.
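
The selection rules in this section can be read as a small decision function. The sketch below is one possible encoding of them, not the tool's actual code: the class and method names are hypothetical, the order in which the rules are checked is an assumption (the text does not say which rule wins when several apply), and the "CCITT/grayscale" case is modelled only by the PDF filter name `CCITTFaxDecode`.

```java
// Hypothetical helper, for illustration only; not part of PDFBox.
final class ImageExtensionSketch {

    /**
     * @param filter         name of the PDF stream filter, e.g. "DCTDecode"; may be null
     * @param hasAlphaMask   whether the image carries an alpha mask
     * @param channels       number of colour channels (4 for CMYK)
     * @param noColorConvert whether -noColorConvert was given
     * @return file extension without the leading dot
     */
    static String extensionFor(String filter, boolean hasAlphaMask,
                               int channels, boolean noColorConvert) {
        if (hasAlphaMask) {
            return "png";   // masked images: PNG regardless of encoding
        }
        if (noColorConvert && channels > 3) {
            return "tiff";  // assumption: checked before the filter-based rules
        }
        if ("DCTDecode".equals(filter)) {
            return "jpg";
        }
        if ("JPXDecode".equals(filter)) {
            return "jp2";
        }
        if ("CCITTFaxDecode".equals(filter)) {
            return "tiff";
        }
        return "png";       // JBIG2 and everything else
    }

    public static void main(String[] args) {
        System.out.println(extensionFor("DCTDecode", false, 3, false));
        System.out.println(extensionFor("DCTDecode", true, 3, false));
        System.out.println(extensionFor("FlateDecode", false, 4, true));
    }
}
```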