LZMA (Lempel-Ziv-Markov chain Algorithm) is an improved version of the well-known LZ77 compression algorithm. It is designed to maximize compression ratio while keeping decompression fast and decompression memory requirements low.
Method ID
ID: 03 01 01 (hex)
The LZMA method is identified by the 3-byte sequence 03 01 01 in the 7z archive format.
Overview
LZMA provides excellent compression ratios through:
- Dictionary-based LZ77 algorithm
- Range encoding for entropy coding
- Markov chain-based probability model
- Optimized match finding
LZMA is the original algorithm used in .lzma files and was the default method in 7z archives before LZMA2 was introduced. It remains widely used for single-threaded compression.
Properties File Format
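LZMA compressed files have a 13-byte header: one properties byte (encoding lc, lp, and pb), a 4-byte little-endian dictionary size, and an 8-byte little-endian uncompressed size.
Compression Parameters
LZMA supports the following encoding properties, defined in LzmaEnc.h:13-39: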
Dictionary Size (dictSize)
Type: UInt32
Range: (1 << 12) to (1 << 27) for 32-bit, (1 << 12) to (3 << 29) for 64-bit
Default: 1 << 24 (16 MB)
The dictionary size determines how far back the encoder can reference previous data. Larger dictionaries provide better compression but require more memory.
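To illustrate why the dictionary size matters, the sketch below uses Python's standard `lzma` module. This is an assumption for illustration: the module wraps liblzma rather than the 7-Zip encoder, and its `dict_size` option is taken here as the counterpart of dictSize. The input is a block of random bytes repeated twice, so the second copy can only be encoded as a back-reference if the dictionary reaches back to the first copy.

```python
import lzma
import os

def compressed_size(data: bytes, dict_size: int) -> int:
    """Compress `data` as a .lzma stream with the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA1, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters))

# 256 KiB of incompressible bytes, repeated once: the only redundancy
# is a match that lies 256 KiB back in the stream.
block = os.urandom(256 * 1024)
data = block + block

small = compressed_size(data, 64 * 1024)    # dictionary cannot reach the first copy
large = compressed_size(data, 1024 * 1024)  # dictionary covers the whole input

print(f"64 KiB dictionary: {small} bytes")
print(f"1 MiB dictionary:  {large} bytes")
```

With the small dictionary the output stays close to the input size, while the larger dictionary lets the encoder replace the second copy with references to the first.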
Literal Context Bits (lc)
Type: int
Range: 0 to 8
Default: 3
Number of high bits of the previous byte used as context for literal encoding. Higher values can improve compression of text files but increase memory usage.
Literal Position Bits (lp)
Type: int
Range: 0 to 4
Default: 0
Number of low bits of the current position used as context for literal encoding.
The constraint lc + lp <= 4 is recommended but not enforced. Higher values significantly increase memory usage.
Position Bits (pb)
Type: int
Range: 0 to 4
Default: 2
Number of low bits of the current position used as context for match length encoding.
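In the .lzma header, lc, lp, and pb are stored together in a single properties byte computed as (pb * 5 + lp) * 9 + lc. The sketch below shows that packing and its inverse. The helper names `pack_props` and `unpack_props` are hypothetical and are not part of the LZMA SDK.

```python
def pack_props(lc: int, lp: int, pb: int) -> int:
    """Pack lc/lp/pb into the single LZMA properties byte."""
    if not (0 <= lc <= 8 and 0 <= lp <= 4 and 0 <= pb <= 4):
        raise ValueError("lc must be 0..8, lp and pb must be 0..4")
    return (pb * 5 + lp) * 9 + lc

def unpack_props(byte: int) -> tuple[int, int, int]:
    """Recover (lc, lp, pb) from a properties byte."""
    if not 0 <= byte < 9 * 5 * 5:
        raise ValueError("properties byte out of range")
    lc = byte % 9
    rest = byte // 9
    lp = rest % 5
    pb = rest // 5
    return lc, lp, pb

default_byte = pack_props(lc=3, lp=0, pb=2)
print(hex(default_byte))           # 0x5d
print(unpack_props(default_byte))  # (3, 0, 2)
```

The default settings (lc=3, lp=0, pb=2) give 0x5D, which is why that byte appears at the start of many .lzma files.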
Compression Level (level)
Type: int
Range: 0 to 9
Default: 5
The compression level sets several parameters automatically:
| Level | Dict Size | Algorithm | Fast Bytes |
|---|---|---|---|
| 0 | 64 KB | Fast | 32 |
| 1 | 256 KB | Fast | 32 |
| 2 | 1 MB | Fast | 32 |
| 3 | 4 MB | Fast | 32 |
| 4 | 16 MB | Fast | 32 |
| 5 | 16 MB | Normal | 32 |
| 6 | 32 MB | Normal | 32 |
| 7 | 64 MB | Normal | 64 |
| 8 | 128 MB | Normal | 64 |
| 9 | 256 MB | Normal | 64 |
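The table can be transcribed into a small lookup, for example when parameters need to be chosen programmatically. The values below are copied from the table as written and have not been checked against LzmaEnc.c. `LevelDefaults` and `defaults_for_level` are hypothetical names.

```python
from typing import NamedTuple

class LevelDefaults(NamedTuple):
    dict_size: int   # bytes
    algo: int        # 0 = fast, 1 = normal
    fast_bytes: int

KB, MB = 1 << 10, 1 << 20

# Dictionary size per level, transcribed from the table above.
_DICT_SIZES = [64 * KB, 256 * KB, 1 * MB, 4 * MB, 16 * MB,
               16 * MB, 32 * MB, 64 * MB, 128 * MB, 256 * MB]

def defaults_for_level(level: int) -> LevelDefaults:
    """Return the table row for a compression level in 0..9."""
    if not 0 <= level <= 9:
        raise ValueError("level must be 0..9")
    return LevelDefaults(
        dict_size=_DICT_SIZES[level],
        algo=0 if level < 5 else 1,          # Fast for levels 0-4, Normal for 5-9
        fast_bytes=32 if level < 7 else 64,  # 32 below level 7, 64 from level 7
    )

print(defaults_for_level(5))
```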
Algorithm (algo)
Type: int
Range: 0 (fast) or 1 (normal)
Default: 1
- Fast (0): faster compression, lower ratio; paired with hash chain match finding by default
- Normal (1): slower compression, better ratio; paired with binary tree match finding by default
Fast Bytes (fb)
Type: int
Range: 5 to 273
Default: 32 (level < 7) or 64 (level >= 7)
Number of fast bytes: when the encoder finds a match at least this long, it accepts the match without searching for a longer one. Higher values can improve compression ratio but slow down encoding.
Binary Tree Mode (btMode)
Type: int
Range: 0 (hash chain) or 1 (binary tree)
Default: 1
Number of Hash Bytes (numHashBytes)
Type: int
Range: 2, 3, or 4
Default: 4
Match Counter (mc)
Type: UInt32
Range: 1 to 1 << 30
Default: 32
Maximum number of match candidates to check.
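The match-finder settings (algo, fb, btMode, numHashBytes, mc) have rough counterparts in liblzma, which Python's `lzma` module exposes as `mode`, `nice_len`, `mf`, and `depth`. The mapping in the sketch below is an assumption made for illustration. It is not the 7-Zip API, and the two encoders are not expected to produce identical output.

```python
import lzma

data = b"the quick brown fox jumps over the lazy dog\n" * 2000

def compress_with(mode: int, mf: int, nice_len: int, depth: int) -> bytes:
    """Compress `data` as a .lzma stream with explicit match-finder settings."""
    filters = [{
        "id": lzma.FILTER_LZMA1,
        "mode": mode,          # assumed analogue of algo
        "mf": mf,              # assumed analogue of btMode + numHashBytes
        "nice_len": nice_len,  # assumed analogue of fb
        "depth": depth,        # assumed analogue of mc
    }]
    return lzma.compress(data, format=lzma.FORMAT_ALONE, filters=filters)

# Fast mode with a hash chain match finder using 4 hash bytes.
fast = compress_with(lzma.MODE_FAST, lzma.MF_HC4, nice_len=32, depth=32)
# Normal mode with a binary tree match finder using 4 hash bytes.
normal = compress_with(lzma.MODE_NORMAL, lzma.MF_BT4, nice_len=64, depth=32)

for name, blob in (("fast/HC4", fast), ("normal/BT4", normal)):
    # Both settings must round-trip; only size and speed differ.
    assert lzma.decompress(blob, format=lzma.FORMAT_ALONE) == data
    print(f"{name}: {len(blob)} bytes")
```

These settings only affect how the encoder searches for matches. Any conforming decoder can read the result regardless of which combination was used.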
Number of Threads (numThreads)
Type: int
Range: 1 or 2
Default: 2 (if multithreading available)
Write End Mark (writeEndMark)
Type: unsigned
Range: 0 or 1
Default: 0
Whether to write an end-of-payload marker (EOPM).
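Whether a stream relies on an end marker can be seen in the 13-byte .lzma header: an uncompressed-size field of all 0xFF bytes means the size is unknown, so the decoder has to stop at the marker. The sketch below assumes the common .lzma layout (1 properties byte, a 4-byte little-endian dictionary size, an 8-byte little-endian uncompressed size). It uses Python's `lzma` module, which wraps liblzma rather than the 7-Zip encoder, and `parse_lzma_header` is a hypothetical helper.

```python
import lzma
import struct

def parse_lzma_header(blob: bytes) -> dict:
    """Split the 13-byte .lzma header into its three fields."""
    if len(blob) < 13:
        raise ValueError("need at least 13 bytes")
    # "<" little-endian, "B" properties byte, "I" dictionary size, "Q" uncompressed size
    props, dict_size, size = struct.unpack("<BIQ", blob[:13])
    return {
        "props": props,
        "dict_size": dict_size,
        # All-ones means the size is unknown and the stream ends with a marker.
        "uncompressed_size": None if size == 0xFFFFFFFFFFFFFFFF else size,
    }

filters = [{"id": lzma.FILTER_LZMA1, "dict_size": 1 << 16}]
blob = lzma.compress(b"hello " * 100, format=lzma.FORMAT_ALONE, filters=filters)
header = parse_lzma_header(blob)
print(header)
```

Streams written this way by liblzma carry no size and end with a marker, which corresponds to encoding with writeEndMark set to 1. A stream written with writeEndMark set to 0 would instead be expected to store the real size in that field.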
Memory Requirements
Encoding
Memory required for compression (LzmaEnc.c:41): (dictSize * 11.5 + 6 MB) + state_size
- dictSize: Dictionary size in bytes
- state_size: (4 + (1.5 << (lc + lp))) KB
- Default state_size (lc=3, lp=0): 16 KB
For level 5 (16 MB dictionary): ~190 MB for encoding
For level 9 (256 MB dictionary): ~2.9 GB for encoding
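The figures above can be reproduced with a short calculation. The sketch assumes the encoder estimate dictSize * 11.5 + 6 MB + state_size, which is consistent with the ~190 MB and ~2.9 GB values quoted here. The function names are hypothetical.

```python
KB, MB = 1 << 10, 1 << 20

def state_size(lc: int = 3, lp: int = 0) -> int:
    """State size in bytes: (4 + (1.5 << (lc + lp))) KB."""
    # 1.5 KB is 1536 bytes, so the shift can be done in integer arithmetic.
    return 4 * KB + (1536 << (lc + lp))

def encoder_memory(dict_size: int, lc: int = 3, lp: int = 0) -> int:
    """Estimated encoder memory in bytes (assumed formula, see lead-in)."""
    return dict_size * 23 // 2 + 6 * MB + state_size(lc, lp)

print(state_size() // KB, "KB")                     # 16 KB
print(round(encoder_memory(16 * MB) / MB), "MB")    # 190 MB
print(round(encoder_memory(256 * MB) / MB), "MB")   # 2950 MB
```

The dictionary term dominates: raising lc + lp from 3 to 4 adds only 12 KB of state, while doubling the dictionary roughly doubles the total.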
Decoding
Memory required for decompression:
- Stack usage: 200-400 bytes for local variables
- Default state_size: 16 KB
- Dictionary buffer: Equal to dictionary size used during encoding
API Usage
Encoding
Decoding
Error Codes
LZMA encoder and decoder can return the following status codes:
| Code | Description |
|---|---|
| SZ_OK | Success |
| SZ_ERROR_DATA | Data error during decoding |
| SZ_ERROR_MEM | Memory allocation error |
| SZ_ERROR_PARAM | Incorrect parameter in properties |
| SZ_ERROR_UNSUPPORTED | Unsupported properties |
| SZ_ERROR_INPUT_EOF | Needs more bytes in input buffer |
| SZ_ERROR_WRITE | Write callback error |
| SZ_ERROR_OUTPUT_EOF | Output buffer overflow |
| SZ_ERROR_PROGRESS | Break from progress callback |
| SZ_ERROR_THREAD | Multithreading error |
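The SZ_* codes belong to the C API. As a loose analogy only, the sketch below shows how two of the failure cases surface in Python's `lzma` module, which wraps liblzma and reports decoding failures as a single `LZMAError` exception: input that ends too early (comparable to SZ_ERROR_INPUT_EOF) and an invalid properties byte (comparable to SZ_ERROR_UNSUPPORTED). `try_decode` is a hypothetical helper.

```python
import lzma

good = lzma.compress(b"example payload " * 64, format=lzma.FORMAT_ALONE)

def try_decode(blob: bytes) -> str:
    """Return 'ok' or the name of the exception raised while decoding."""
    try:
        lzma.decompress(blob, format=lzma.FORMAT_ALONE)
        return "ok"
    except lzma.LZMAError as exc:
        return type(exc).__name__

truncated = good[: len(good) // 2]  # input ends before the stream does
bad_props = b"\xff" + good[1:]      # properties byte outside the valid range

print(try_decode(good))       # ok
print(try_decode(truncated))  # LZMAError
print(try_decode(bad_props))  # LZMAError
```

Code that calls the C API directly can tell these cases apart by return value, which the Python wrapper does not expose as separate exception types.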
Performance Characteristics
Typical performance on modern hardware (Intel Core i7, 3.5 GHz):
Compression:
- Level 5: ~2-3 MB/s
- Level 9: ~1-2 MB/s
Decompression:
- ~20-40 MB/s (single-threaded)
Compression ratio:
- Text files: 15-25% of original size
- Executable files: 30-50% of original size
- Multimedia files: 70-95% of original size (poorly compressible)
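Figures like these depend heavily on the data and the machine, so it is worth measuring on representative input. The sketch below is one way to do that with Python's `lzma` module. Its presets are liblzma presets, assumed here to be only roughly comparable to 7-Zip levels, and `measure` is a hypothetical helper.

```python
import lzma
import time

def measure(data: bytes, preset: int) -> tuple[float, float]:
    """Return (compressed/original size ratio, compression seconds) for one preset."""
    start = time.perf_counter()
    blob = lzma.compress(data, format=lzma.FORMAT_ALONE, preset=preset)
    elapsed = time.perf_counter() - start
    return len(blob) / len(data), elapsed

# Highly repetitive sample text; real files will usually compress less well.
sample = b"".join(b"line %d: some fairly repetitive log text\n" % i
                  for i in range(20000))

for preset in (1, 6):
    ratio, seconds = measure(sample, preset)
    print(f"preset {preset}: {ratio:.1%} of original size in {seconds:.3f} s")
```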
Command Line Usage
See Also
- LZMA2 Compression - Improved version with multithreading
- PPMd Compression - Better for text files
- Compression Methods Overview - Compare all methods
- Source files: C/LzmaEnc.c, C/LzmaEnc.h, C/LzmaDec.c, C/LzmaDec.h