Apache PDFBox is a powerful, open source Java library that gives you full programmatic control over PDF documents. Whether you need to generate reports, extract text, render pages to images, fill forms, or sign documents digitally, PDFBox provides a rich API for the task. It is licensed under the Apache License 2.0.
Quickstart
Add PDFBox to your Maven or Gradle project and write your first PDF in minutes.
Creating PDFs
Learn to create PDF documents with text, images, fonts, and graphics.
Text Extraction
Extract plain text or positional text data from any PDF document.
Rendering
Render PDF pages to BufferedImage, PNG, JPEG, or TIFF at any DPI.
PDF Manipulation
Merge, split, overlay, and reorder PDF documents programmatically.
Encryption & Security
Password-protect and decrypt PDFs with standard or public-key encryption.
Digital Signatures
Apply and verify digital signatures, timestamps, and visible signatures.
Command-Line Tools
Use PDFBox directly from the command line without writing any Java code.
Why PDFBox?
PDFBox is a mature, Apache-licensed Java library trusted in production environments worldwide. It covers the complete PDF lifecycle, from document creation to parsing, rendering, form processing, and cryptographic signing, and it is free to use under the Apache License 2.0.
Add the dependency
Add PDFBox to your Maven pom.xml or Gradle build file. Requires Java 11 or higher.
Load or create a document
Use Loader.loadPDF() to open an existing file, or new PDDocument() to start fresh.
Process the document
Extract text with PDFTextStripper, render pages with PDFRenderer, write page content with PDPageContentStream, or sign with PDSignature.
PDFBox 3.x introduced significant API changes from 2.x. If you are upgrading, see the Migration Guide for a full list of breaking changes and how to address them.
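The load, create, and process steps above can be sketched in a few lines of Java. This is a minimal illustration rather than an official sample: it assumes the PDFBox 3.x `pdfbox` artifact is on the classpath, and the class name `PdfBoxSketch`, its helper methods, and the use of `PDType1Font` with `Standard14Fonts.FontName.HELVETICA` are choices made here for illustration, not something this page prescribes.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.pdfbox.Loader;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.font.PDType1Font;
import org.apache.pdfbox.pdmodel.font.Standard14Fonts;
import org.apache.pdfbox.rendering.PDFRenderer;
import org.apache.pdfbox.text.PDFTextStripper;

// Hypothetical helper class; the names are illustrative, not part of PDFBox.
public class PdfBoxSketch {

    // Start fresh with new PDDocument(), write one line of text, and
    // return the saved PDF as bytes.
    public static byte[] createPdf(String message) throws IOException {
        try (PDDocument doc = new PDDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            PDPage page = new PDPage();
            doc.addPage(page);
            // The content stream must be closed before the document is saved.
            try (PDPageContentStream content = new PDPageContentStream(doc, page)) {
                content.beginText();
                // Assumption: PDFBox 3.x way of selecting a standard 14 font.
                content.setFont(new PDType1Font(Standard14Fonts.FontName.HELVETICA), 12);
                content.newLineAtOffset(72, 700);
                content.showText(message);
                content.endText();
            }
            doc.save(out);
            return out.toByteArray();
        }
    }

    // Open an existing PDF with Loader.loadPDF() and extract its plain text.
    public static String extractText(byte[] pdf) throws IOException {
        try (PDDocument doc = Loader.loadPDF(pdf)) {
            return new PDFTextStripper().getText(doc);
        }
    }

    // Render the first page (index 0) to a BufferedImage at the given DPI.
    public static BufferedImage renderFirstPage(byte[] pdf, float dpi) throws IOException {
        try (PDDocument doc = Loader.loadPDF(pdf)) {
            return new PDFRenderer(doc).renderImageWithDPI(0, dpi);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = createPdf("Hello, PDFBox");
        System.out.println(extractText(pdf).trim());
        BufferedImage image = renderFirstPage(pdf, 72f);
        System.out.println(image.getWidth() + " x " + image.getHeight());
    }
}
```

Each helper opens its document in a try-with-resources block so that the `PDDocument` is always closed. To work with files on disk instead of byte arrays, `Loader.loadPDF` also accepts a `java.io.File`.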
Key capabilities
AcroForm support
Create, fill, and flatten interactive PDF forms including text fields, checkboxes, radio buttons, and digital signature fields.
Font embedding
Embed TrueType, Type 1, and CFF fonts — including subsetting — for portable, self-contained PDF files.
XMP Metadata
Read and write XMP metadata streams using the XMPBox module for standards-compliant document metadata.
Modules overview
Learn about the pdfbox, fontbox, xmpbox, and io sub-modules that make up the PDFBox toolkit.