PDFBox 3.0 is a major release with a significant number of API-level breaking changes, many of which affect code that simply opens or saves a PDF. The changes were made to improve memory efficiency, reduce I/O overhead, and clean up the public API after years of accumulated deprecated code. This guide covers the changes you are most likely to encounter and shows you what to update.
Dependency changes
The Maven groupId org.apache.pdfbox is the same in both 2.x and 3.x, so no groupId change is required; only the version of each pdfbox dependency in your build file needs updating. What changed is that the I/O functionality was extracted into a separate pdfbox-io artifact (see below). You may need to add the new pdfbox-io dependency if your code directly uses RandomAccessRead.
New pdfbox-io module
In 3.0, the I/O primitives (RandomAccessRead, RandomAccessReadBuffer, RandomAccessReadBufferedFile, and related classes) were moved from org.apache.pdfbox.io inside the pdfbox JAR into a dedicated Maven module: pdfbox-io. The pdfbox artifact declares a compile dependency on pdfbox-io, so most projects do not need to add it explicitly. If your code imports classes from org.apache.pdfbox.io directly, add:
pom.xml
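A sketch of the explicit declaration, assuming a Maven build and version 3.0.0 (the version is an assumption here; align it with the pdfbox version you actually use):

```xml
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox-io</artifactId>
    <!-- assumed version; keep it identical to your pdfbox version -->
    <version>3.0.0</version>
</dependency>
```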
Loader API replaces PDDocument.load()
The most common breaking change for 2.x users is that PDDocument.load(...) has been removed. In 3.0, all documents must be opened through the org.apache.pdfbox.Loader class.
Loading from a File
Before (2.x)
After (3.x)
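A minimal sketch of the File case under 3.x, with the removed 2.x call kept as a comment for comparison. The class name, the countPages helper, and the document.pdf path are illustrative placeholders, not part of PDFBox.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;

class LoadFromFileExample {

    /** Opens the given PDF file and returns its page count. */
    static int countPages(File file) throws IOException {
        // 2.x equivalent (removed in 3.x): PDDocument.load(file)
        try (PDDocument document = Loader.loadPDF(file)) {
            return document.getNumberOfPages();
        }
    }

    public static void main(String[] args) throws IOException {
        // "document.pdf" is a placeholder path
        File file = new File(args.length > 0 ? args[0] : "document.pdf");
        System.out.println(countPages(file));
    }
}
```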
Loading from a byte array
Before (2.x)
After (3.x)
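For a byte array the change is only the entry point. The sketch below assumes the whole PDF is already in memory; the class and helper names are hypothetical.

```java
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;

class LoadFromBytesExample {

    /** Parses an in-memory PDF and returns its page count. */
    static int countPages(byte[] pdfBytes) throws IOException {
        // 2.x equivalent (removed in 3.x): PDDocument.load(pdfBytes)
        try (PDDocument document = Loader.loadPDF(pdfBytes)) {
            return document.getNumberOfPages();
        }
    }
}
```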
Loading from an InputStream
Direct InputStream loading is no longer supported in Loader.loadPDF() in PDFBox 3.x (it was removed to avoid ambiguity with the RandomAccessRead overload). Wrap your stream in a RandomAccessReadBuffer first:
Before (2.x)
After (3.x)
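The wrapping step might look like the following sketch. It assumes the stream is small enough to hold in memory, since RandomAccessReadBuffer reads the stream's contents into a buffer; the class name, helper name, and file path are placeholders.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.io.RandomAccessReadBuffer;
import org.apache.pdfbox.pdmodel.PDDocument;

class LoadFromStreamExample {

    /** Reads a PDF from the stream and returns its page count. */
    static int countPages(InputStream in) throws IOException {
        // 2.x equivalent (removed in 3.x): PDDocument.load(in)
        try (PDDocument document = Loader.loadPDF(new RandomAccessReadBuffer(in))) {
            return document.getNumberOfPages();
        }
    }

    public static void main(String[] args) throws IOException {
        // "document.pdf" is a placeholder path; the caller still closes the stream
        try (InputStream in = Files.newInputStream(Paths.get("document.pdf"))) {
            System.out.println(countPages(in));
        }
    }
}
```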
Loading with a password
Before (2.x)
After (3.x)
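For encrypted documents, the password moves to the matching Loader.loadPDF overload. A short sketch, assuming a File source and a user or owner password supplied by the caller (the class and helper names are hypothetical):

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;

class LoadWithPasswordExample {

    /** Opens an encrypted PDF with the given password and returns its page count. */
    static int countPages(File file, String password) throws IOException {
        // 2.x equivalent (removed in 3.x): PDDocument.load(file, password)
        try (PDDocument document = Loader.loadPDF(file, password)) {
            return document.getNumberOfPages();
        }
    }
}
```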
RandomAccessRead changes
PDFBox 3.0 makes org.apache.pdfbox.io.RandomAccessRead the primary abstraction for reading PDF data. Two concrete implementations cover the most common use cases:
| Class | When to use |
|---|---|
| RandomAccessReadBufferedFile | Reading from a file on disk; memory-efficient, uses NIO |
| RandomAccessReadBuffer | Reading from a byte array or InputStream loaded into memory |
If your 2.x code used RandomAccessBufferedFileInputStream or SequentialSource, you now use RandomAccessRead instead.
After (3.x) — reading from a file via RandomAccessRead
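A sketch of the file-backed variant, assuming you want to hand Loader an explicit RandomAccessRead rather than a File. The class and helper names are placeholders. Both resources are declared in the try header so the source is released even if parsing fails.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.io.RandomAccessRead;
import org.apache.pdfbox.io.RandomAccessReadBufferedFile;
import org.apache.pdfbox.pdmodel.PDDocument;

class RandomAccessFileExample {

    /** Opens a PDF through a file-backed RandomAccessRead and returns its page count. */
    static int countPages(File file) throws IOException {
        try (RandomAccessRead source = new RandomAccessReadBufferedFile(file);
             PDDocument document = Loader.loadPDF(source)) {
            return document.getNumberOfPages();
        }
    }
}
```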
Removed and renamed classes
PDFBox 3.0 deleted a large amount of deprecated API that had been marked for removal since 2.x. The tables below list the most commonly used removals.
PDDocument changes
| Removed in 3.x | Replacement |
|---|---|
| PDDocument.load(File) | Loader.loadPDF(File) |
| PDDocument.load(InputStream) | Loader.loadPDF(new RandomAccessReadBuffer(stream)) |
| PDDocument.load(byte[]) | Loader.loadPDF(byte[]) |
| PDDocument.setAllSecurityToBeRemoved(boolean) | Pass an empty password to Loader.loadPDF |
I/O class renames and moves
| Removed / renamed in 3.x | Replacement |
|---|---|
| RandomAccessFile (pdfbox package) | RandomAccessReadBufferedFile (pdfbox-io module) |
| RandomAccessBuffer (old) | RandomAccessReadBuffer (pdfbox-io module) |
| RandomAccessBufferedFileInputStream | RandomAccessReadBufferedFile |
| SequentialSource | RandomAccessRead interface |
| ScratchFile / ScratchFileBuffer | RandomAccessReadWriteBuffer (default in-memory scratch) |
MemoryUsageSetting changes
In 2.x, PDDocument.load() accepted a MemoryUsageSetting parameter to control whether temporary data was stored in memory or on disk. In 3.x, this is controlled by passing a StreamCacheCreateFunction to the Loader.loadPDF() overloads.
After (3.x): custom stream cache
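One way this could look is sketched below. It assumes the IOUtils.createTempFileOnlyStreamCache() factory from pdfbox-io as the source of the StreamCacheCreateFunction and mirrors a 2.x call that used MemoryUsageSetting.setupTempFileOnly(); the class and helper names are hypothetical.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.io.IOUtils;
import org.apache.pdfbox.io.RandomAccessStreamCache.StreamCacheCreateFunction;
import org.apache.pdfbox.pdmodel.PDDocument;

class StreamCacheExample {

    /** Opens a PDF with temporary data kept in a temp file and returns its page count. */
    static int countPages(File file) throws IOException {
        // 2.x equivalent (removed in 3.x):
        //   PDDocument.load(file, MemoryUsageSetting.setupTempFileOnly())
        StreamCacheCreateFunction cache = IOUtils.createTempFileOnlyStreamCache();
        try (PDDocument document = Loader.loadPDF(file, cache)) {
            return document.getNumberOfPages();
        }
    }
}
```

IOUtils.createMemoryOnlyStreamCache() is the in-memory counterpart if you prefer to avoid temp files.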
Font API changes
PDType1Font built-in font constants (e.g. PDType1Font.HELVETICA) were removed. Use the Standard14Fonts.FontName enum instead:
Before (2.x)
After (3.x)
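A minimal sketch of the font change, with the removed 2.x constant shown as a comment. The class and helper names are illustrative only.

```java
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;

class StandardFontExample {

    /** Creates the standard Helvetica font the 3.x way. */
    static PDType1Font helvetica() {
        // 2.x equivalent (removed in 3.x): PDType1Font.HELVETICA
        return new PDType1Font(Standard14Fonts.FontName.HELVETICA);
    }

    public static void main(String[] args) {
        System.out.println(helvetica().getName());
    }
}
```

Because the 3.x form constructs a new object on each call, code that referenced the old constant in many places may want to create the font once and reuse it.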
Summary checklist
Use this checklist when updating a 2.x project to 3.x:
- Replace every PDDocument.load(...) call with the corresponding Loader.loadPDF(...) overload.
- Add import org.apache.pdfbox.Loader; wherever you open a PDF.
- If you load from an InputStream, wrap it: new RandomAccessReadBuffer(stream).
- Replace PDType1Font.HELVETICA (and similar constants) with new PDType1Font(FontName.HELVETICA).
- If you depend directly on RandomAccessBuffer, RandomAccessFile, or SequentialSource, update imports to the new org.apache.pdfbox.io classes.
- Update your pdfbox dependency version to 3.0.0.
- Verify your JDK is Java 8 or higher.