Method ID
ID: 03 04 01 (hex)
The PPMd method is identified by the 3-byte sequence 03 04 01 in 7z archive format.
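As a small illustration, a reader of 7z coder metadata could compare a coder's method ID bytes against this sequence. The sketch below is not part of the 7-Zip API; the function name `is_ppmd_method` is hypothetical and exists only for this example.

```python
# Method ID from the text above: 03 04 01 (hex).
PPMD_METHOD_ID = bytes([0x03, 0x04, 0x01])


def is_ppmd_method(method_id: bytes) -> bool:
    """Return True if the given method ID bytes identify PPMd (hypothetical helper)."""
    return bytes(method_id) == PPMD_METHOD_ID
```

For example, `is_ppmd_method(bytes.fromhex("030401"))` is true, while any other byte sequence is rejected.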
Overview
7-Zip includes two PPMd implementations:
PPMd Variant H (PPMd7)
Original PPMd algorithm from 2001
- Based on Dmitry Shkarin’s PPMd var.H
- Used in 7z archives
- Order: 2-64
PPMd Variant I (PPMd8)
Improved PPMd algorithm from 2002
- Based on Dmitry Shkarin’s PPMd var.I
- Better compression ratio
- Order: 2-16
PPMd is particularly effective for:
- Text files (source code, logs, documents)
- Structured data (CSV, JSON, XML)
- Database dumps
- Configuration files
How PPMd Works
PPMd uses Prediction by Partial Matching:
- Context Modeling: Analyzes previous bytes to build statistical models
- Probability Prediction: Predicts the probability of each possible next byte
- Arithmetic Coding: Encodes bytes using their predicted probabilities
- Adaptive Learning: Updates models as more data is processed
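The four steps above can be sketched with a toy model. This is a teaching aid under simplifying assumptions, not PPMd itself: it uses a single fixed order, Laplace smoothing instead of PPMd's estimators, and it reports the ideal arithmetic-coding cost (the sum of -log2 p) rather than producing a bitstream. The names `ToyContextModel` and `ideal_code_length_bits` are hypothetical.

```python
import math
from collections import Counter, defaultdict


class ToyContextModel:
    """Fixed-order adaptive byte model (a teaching toy, not PPMd)."""

    def __init__(self, order: int = 2):
        self.order = order
        self.counts = defaultdict(Counter)  # context bytes -> next-byte counts

    def probability(self, context: bytes, symbol: int) -> float:
        """Laplace-smoothed estimate of P(symbol | context)."""
        seen = self.counts.get(context, Counter())
        total = sum(seen.values())
        return (seen[symbol] + 1) / (total + 256)

    def update(self, context: bytes, symbol: int) -> None:
        """Adaptive learning: record that `symbol` followed `context`."""
        self.counts[context][symbol] += 1


def ideal_code_length_bits(data: bytes, order: int = 2) -> float:
    """Bits an ideal arithmetic coder would spend under the toy model."""
    model = ToyContextModel(order)
    bits = 0.0
    for i, symbol in enumerate(data):
        context = data[max(0, i - order):i]                      # context modeling
        bits += -math.log2(model.probability(context, symbol))  # prediction + coding cost
        model.update(context, symbol)                            # adaptive learning
    return bits
```

On repetitive input such as `b"ab" * 200`, the estimated cost falls well below the 8 bits per byte of the raw data, because the model quickly learns which byte follows each context.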
Context Orders
PPMd maintains multiple context models of different lengths (orders).
Higher orders provide better compression for structured data but require more memory.
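To make the idea of orders concrete, the following sketch lists the context of each order at a given position in a byte string. The helper `contexts_at` is hypothetical and only illustrates what "order N" means; it says nothing about how PPMd stores its contexts internally.

```python
def contexts_at(data: bytes, position: int, max_order: int) -> dict[int, bytes]:
    """Return {order: context} for every order that fits before `position`.

    The order-N context is the N bytes immediately preceding `position`;
    order 0 is the empty context.
    """
    return {
        order: data[position - order:position]
        for order in range(0, min(max_order, position) + 1)
    }
```

For `contexts_at(b"compress", 4, 3)`, the byte being predicted is `r`, and the contexts are `b""` (order 0), `b"p"` (order 1), `b"mp"` (order 2), and `b"omp"` (order 3).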
Escape Mechanism
When a byte is not found in the current context, PPMd uses an “escape” mechanism.
This allows the algorithm to fall back to lower-order contexts when needed.
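The fallback can be sketched as follows. This is a simplified illustration that assumes the classic PPM "method C" escape estimate (escape count equals the number of distinct symbols seen in the context) and omits symbol exclusion; PPMd's actual escape estimation is more elaborate. The functions `train` and `predict` are hypothetical names.

```python
from collections import Counter, defaultdict


def train(data: bytes, max_order: int):
    """Count next-byte frequencies for every context of length 0..max_order."""
    counts = defaultdict(Counter)
    for i, symbol in enumerate(data):
        for order in range(min(max_order, i) + 1):
            counts[data[i - order:i]][symbol] += 1
    return counts


def predict(counts, history: bytes, symbol: int, max_order: int):
    """Return (probability, order_used); order_used is -1 for the uniform fallback."""
    probability = 1.0
    for order in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - order:]
        seen = counts.get(context)
        if not seen:
            continue  # context never observed: try a shorter one at no cost
        total = sum(seen.values())
        distinct = len(seen)
        if symbol in seen:
            return probability * seen[symbol] / (total + distinct), order
        probability *= distinct / (total + distinct)  # cost of emitting an escape
    return probability / 256, -1  # symbol never seen anywhere: uniform over bytes
```

After `counts = train(b"abracadabra", 2)`, the call `predict(counts, b"ab", ord("r"), 2)` is answered directly in the order-2 context with probability 2/3, whereas `predict(counts, b"ab", ord("c"), 2)` escapes twice and is answered at order 0 with probability 1/144.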
PPMd7 (Variant H)
PPMd var.H is the original PPMd implementation used in 7z archives.
Parameters
Order (maxOrder)
Type: unsigned
Range: PPMD7_MIN_ORDER (2) to PPMD7_MAX_ORDER (64)
Default: 6
Maximum context length for prediction.
- Low order (2-4): Faster, less memory, lower compression
- Medium order (6-8): Balanced (recommended)
- High order (16-64): Best compression for structured text
Memory Size (mem)
Type: UInt32
Range: PPMD7_MIN_MEM_SIZE (2 KB) to PPMD7_MAX_MEM_SIZE (~4 GB)
Default: 16 MB
Amount of memory allocated for context models.
Typical settings:
- 1-4 MB: Fast, lower compression
- 16-64 MB: Good balance (recommended)
- 128-256 MB: Maximum compression for large text files
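A caller might validate both parameters against the documented ranges before allocating a model. The sketch below mirrors the limits listed above; it is not the 7-Zip API. `check_ppmd7_params` is a hypothetical helper, and `APPROX_MAX_MEM_SIZE` is an assumed stand-in for the "~4 GB" upper bound, since the exact value is not given here.

```python
PPMD7_MIN_ORDER = 2
PPMD7_MAX_ORDER = 64
PPMD7_MIN_MEM_SIZE = 2 * 1024      # 2 KB
APPROX_MAX_MEM_SIZE = 0xFFFFFFFF   # assumption: UInt32 maximum as a stand-in for "~4 GB"


def check_ppmd7_params(order: int = 6, mem: int = 16 * 1024 * 1024) -> None:
    """Raise ValueError if order or mem falls outside the documented ranges."""
    if not PPMD7_MIN_ORDER <= order <= PPMD7_MAX_ORDER:
        raise ValueError(
            f"order {order} is outside {PPMD7_MIN_ORDER}..{PPMD7_MAX_ORDER}"
        )
    if not PPMD7_MIN_MEM_SIZE <= mem <= APPROX_MAX_MEM_SIZE:
        raise ValueError(f"mem {mem} bytes is outside the supported range")
```

Calling `check_ppmd7_params()` with no arguments checks the documented defaults (order 6, 16 MB) and returns without error.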
Memory Requirements
PPMd7 memory usage:
- Encoding: Same as memory size parameter
- Decoding: Same as memory size parameter
- Stack: Minimal (< 1 KB)
Example configurations:
- Order 6, 16 MB: ~16 MB RAM (both compress/decompress)
- Order 8, 64 MB: ~64 MB RAM (both compress/decompress)
- Order 16, 256 MB: ~256 MB RAM (both compress/decompress)
API Usage
Decoding
PPMd8 (Variant I)
PPMd var.I is an improved version with better compression ratio.
Key Differences from PPMd7
Maximum Order Limit
PPMd8 has a lower maximum order (16, compared with 64 for PPMd7).
This is due to improved context modeling that provides better compression with lower orders.
Restore Method
PPMd8 supports different memory restoration methods:
- RESTART: More aggressive memory recycling
- CUT_OFF: Conservative memory management
FREEZE mode is disabled due to compatibility issues between PPMdI rev.1 and rev.2.
Improved Context Statistics
PPMd8 uses different statistical models.
These provide better probability estimation for certain data patterns.
API Usage
Performance Characteristics
Typical performance on modern hardware (Intel Core i7, 3.5 GHz):
Compression:
- Order 6, 16 MB: ~1-2 MB/s
- Order 8, 64 MB: ~0.5-1 MB/s
- Order 16, 256 MB: ~0.3-0.5 MB/s
Decompression:
- Order 6: ~5-10 MB/s
- Order 8: ~3-7 MB/s
- Order 16: ~2-5 MB/s
Compression ratio compared with LZMA:
- Text files: 5-20% better
- Source code: 10-25% better
- Binary executables: Similar or worse
- Multimedia: Much worse (use LZMA)
Compression Ratio Comparison
Text Files (Source Code, Logs)
| Method | Relative Compressed Size (lower is better) | Speed |
|---|---|---|
| PPMd (order 8) | 100% (best) | Slow |
| PPMd (order 6) | 102-105% | Medium |
| LZMA2 | 110-120% | Medium |
| BZip2 | 115-125% | Fast |
| Deflate | 130-150% | Very Fast |
Structured Data (XML, JSON, CSV)
| Method | Relative Compressed Size (lower is better) | Speed |
|---|---|---|
| PPMd (order 10) | 100% (best) | Very Slow |
| PPMd (order 6) | 103-107% | Slow |
| LZMA2 | 115-130% | Medium |
| BZip2 | 120-135% | Fast |
| Deflate | 140-160% | Very Fast |
Binary Executables
| Method | Relative Compressed Size (lower is better) | Speed |
|---|---|---|
| LZMA2 | 100% (best) | Medium |
| PPMd | 105-115% | Slow |
| BZip2 | 110-120% | Fast |
| Deflate | 120-140% | Very Fast |
Command Line Usage
Best Practices
Choosing Order
A higher order gives better compression but is slower and uses more memory:
- Start with order 6 (default)
- For highly structured data, try order 8-10
- For maximum compression, test up to order 16
- Monitor compression time and memory usage
Memory Size Selection
Memory should be:
- At least 1-2x the file size for small files
- 10-20% of file size for large files
- At least 4 MB for order 6 and above
- At most what both the encoder and the decoder can allocate
When NOT to Use PPMd
Avoid PPMd for:
- Already compressed data: JPEG, PNG, MP3, video files
- Binary executables: Use LZMA2 with BCJ filter
- Random data: No method can compress it
- Memory-constrained systems: Use LZMA2 or Deflate
- Speed-critical workloads: Use LZMA2 with multithreading
Error Codes
| Symbol | Value | Description |
|---|---|---|
| PPMD7_SYM_END | -1 | End of payload marker |
| PPMD7_SYM_ERROR | -2 | Data corruption error |
| PPMD8_SYM_END | -1 | End of payload marker |
| PPMD8_SYM_ERROR | -2 | Data corruption error |
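A decode loop typically distinguishes these two negative values from ordinary byte values. The sketch below shows that control flow only; it takes a plain iterable of integers as a stand-in for the values a symbol decoder would return, and `collect_symbols` is a hypothetical helper rather than a 7-Zip function.

```python
PPMD7_SYM_END = -1    # end of payload marker (from the table above)
PPMD7_SYM_ERROR = -2  # data corruption error (from the table above)


def collect_symbols(symbols) -> bytes:
    """Gather decoded bytes until the end marker; raise on the error code.

    `symbols` is any iterable of ints standing in for decoder return values.
    """
    out = bytearray()
    for sym in symbols:
        if sym == PPMD7_SYM_END:
            break  # normal termination: ignore anything after the marker
        if sym == PPMD7_SYM_ERROR:
            raise ValueError("PPMd stream is corrupt")
        out.append(sym)
    return bytes(out)
```

With the input `[104, 105, -1, 33]`, the loop stops at the end marker and returns `b"hi"`; an input containing `-2` raises `ValueError` instead.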
PPMd7 vs PPMd8
| Feature | PPMd7 (var.H) | PPMd8 (var.I) |
|---|---|---|
| Year | 2001 | 2002 |
| Max Order | 64 | 16 |
| Compression | Excellent | Slightly better |
| Speed | Medium | Slightly slower |
| Memory | Same as setting | Same as setting |
| Restore method | Simple | Multiple options |
| 7z default | Yes | No |
| Compatibility | Wider | Limited |
For most use cases, PPMd7 (var.H) is recommended as it’s the standard PPMd implementation in 7z archives.
See Also
- LZMA2 Compression - Better for general-purpose compression
- LZMA Compression - Original LZMA algorithm
- Compression Methods Overview - Compare all methods
- Source files: C/Ppmd7.c, C/Ppmd7.h, C/Ppmd8.c, C/Ppmd8.h