GZIP is a widely-used compression format based on the DEFLATE algorithm, commonly used for single-file compression and TAR archives.

Format Overview

GZIP provides:
  • Universal support - Available on all Unix/Linux systems
  • Fast compression - Good speed-to-ratio balance
  • Deflate algorithm - LZ77 + Huffman coding
  • Single file compression - Compresses one file or stream
  • CRC32 integrity - Built-in error detection
  • Metadata support - Filename, timestamp, OS information
GZIP is designed for single-file compression. For multiple files, use TAR+GZIP (.tar.gz) or switch to 7z/ZIP formats.

Format Structure

From source/CPP/7zip/Archive/GzHandler.cpp:38-86, GZIP file structure:
+-------------------+
| Header (10+ bytes)|
+-------------------+
| Compressed Data   |
+-------------------+
| Footer (8 bytes)  |
+-------------------+

Header Format

// Magic signature
static const Byte kSignature_0 = 0x1F;
static const Byte kSignature_1 = 0x8B;
static const Byte kSignature_2 = 8;  // Deflate method

// Header structure:
// Bytes 0-1:   Magic (0x1F 0x8B)
// Byte 2:      Compression method (8 = Deflate)
// Byte 3:      Flags
// Bytes 4-7:   Modification timestamp (Unix time)
// Byte 8:      Extra flags (compression level)
// Byte 9:      Operating system
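
The layout above can be read with a few bounds-checked byte accesses. The following is a minimal sketch, not the 7-Zip implementation: the `GzipFixedHeader` struct and the `ParseGzipFixedHeader` function are hypothetical names introduced for illustration. It assumes the timestamp is stored little-endian, as the GZIP specification requires.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration; this is not the 7-Zip CItem class.
struct GzipFixedHeader {
  std::uint8_t method;      // byte 2: compression method (8 = Deflate)
  std::uint8_t flags;       // byte 3: header flags
  std::uint32_t mtime;      // bytes 4-7: modification time, little-endian Unix time
  std::uint8_t extraFlags;  // byte 8: extra flags
  std::uint8_t hostOS;      // byte 9: operating system
};

// Fills `out` and returns true if the buffer starts with a valid fixed header.
// Returns false if the buffer is shorter than 10 bytes or the signature differs.
bool ParseGzipFixedHeader(const std::uint8_t *p, std::size_t size, GzipFixedHeader &out) {
  if (size < 10 || p[0] != 0x1F || p[1] != 0x8B || p[2] != 8)
    return false;
  out.method = p[2];
  out.flags = p[3];
  out.mtime = static_cast<std::uint32_t>(p[4])
            | (static_cast<std::uint32_t>(p[5]) << 8)
            | (static_cast<std::uint32_t>(p[6]) << 16)
            | (static_cast<std::uint32_t>(p[7]) << 24);
  out.extraFlags = p[8];
  out.hostOS = p[9];
  return true;
}
```

A caller would read the first 10 bytes of the file into a buffer and pass them in; optional fields selected by the flags byte follow the fixed part and are not handled here.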

Header Flags

From GzHandler.cpp:45-53:
namespace NFlags {
  const Byte kCrc     = 1 << 1;  // Header CRC present
  const Byte kExtra   = 1 << 2;  // Extra field present
  const Byte kName    = 1 << 3;  // Original filename present
  const Byte kComment = 1 << 4;  // Comment present
}
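
To show how these bits are tested against header byte 3, here is a small sketch that turns a flags byte into a readable list. The bit values are copied from the listing above; the `NFlagsSketch` namespace and the `DescribeGzipFlags` function are hypothetical names, not part of 7-Zip.

```cpp
#include <cstdint>
#include <string>

namespace NFlagsSketch {
  // Same bit positions as the NFlags listing above.
  const std::uint8_t kCrc     = 1 << 1;
  const std::uint8_t kExtra   = 1 << 2;
  const std::uint8_t kName    = 1 << 3;
  const std::uint8_t kComment = 1 << 4;
}

// Returns a space-separated list of the optional fields the flags byte announces.
std::string DescribeGzipFlags(std::uint8_t flags) {
  std::string s;
  auto add = [&s](const char *name) {
    if (!s.empty()) s += ' ';
    s += name;
  };
  if (flags & NFlagsSketch::kCrc)     add("CRC");
  if (flags & NFlagsSketch::kExtra)   add("EXTRA");
  if (flags & NFlagsSketch::kName)    add("NAME");
  if (flags & NFlagsSketch::kComment) add("COMMENT");
  return s;
}
```

For example, a flags byte of 0x08 has only the filename bit set, which is what a typical archive written with a stored name carries.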

Extra Flags

Indicates compression level used:
namespace NExtraFlags {
  const Byte kMaximum = 2;  // Maximum compression (-mx=9)
  const Byte kFastest = 4;  // Fastest compression (-mx=1)
}
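
The extra flags byte is only a hint: it distinguishes the two extremes and says nothing about intermediate levels. A minimal sketch of interpreting header byte 8, using a hypothetical `DescribeExtraFlags` helper (not a 7-Zip function) and the two values from the listing above:

```cpp
#include <cstdint>
#include <string>

// Values copied from the NExtraFlags listing above.
const std::uint8_t kXflMaximum = 2;
const std::uint8_t kXflFastest = 4;

// Hypothetical helper: describes what header byte 8 says about the encoder setting.
std::string DescribeExtraFlags(std::uint8_t extraFlags) {
  if (extraFlags == kXflMaximum) return "maximum compression";
  if (extraFlags == kXflFastest) return "fastest compression";
  return "unspecified";  // any other value carries no level information
}
```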

Footer Format

Bytes 0-3:   CRC32 of uncompressed data
Bytes 4-7:   Size of uncompressed data (mod 2^32)
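GZIP stores the uncompressed size as a 32-bit value (size mod 2^32), so for inputs of 4 GiB or more the stored size no longer matches the real size. Larger inputs can still be compressed and decompressed, but tools that rely on this field report the wrong size. Where an exact stored size matters, use XZ or 7z formats.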
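
The footer layout above, and the effect of the 32-bit size field, can be sketched as follows. This is an illustration under stated assumptions: `GzipFooter`, `ReadGzipFooter`, and `StoredSize32` are hypothetical names, both fields are read as little-endian, and the buffer is assumed to hold exactly one GZIP member with no trailing data.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct for illustration.
struct GzipFooter {
  std::uint32_t crc32;   // footer bytes 0-3
  std::uint32_t size32;  // footer bytes 4-7: uncompressed size mod 2^32
};

static std::uint32_t ReadLE32(const std::uint8_t *p) {
  return static_cast<std::uint32_t>(p[0])
       | (static_cast<std::uint32_t>(p[1]) << 8)
       | (static_cast<std::uint32_t>(p[2]) << 16)
       | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Reads the last 8 bytes of the buffer as the footer.
// Requires at least the 10-byte header plus the 8-byte footer.
bool ReadGzipFooter(const std::uint8_t *data, std::size_t size, GzipFooter &out) {
  if (size < 18)
    return false;
  const std::uint8_t *p = data + size - 8;
  out.crc32 = ReadLE32(p);
  out.size32 = ReadLE32(p + 4);
  return true;
}

// The value a writer stores in the size field for a given true size.
std::uint32_t StoredSize32(std::uint64_t trueSize) {
  return static_cast<std::uint32_t>(trueSize & 0xFFFFFFFFu);
}
```

With this arithmetic a 5 GiB input is recorded as 1 GiB, which is why the field alone cannot be trusted for large files.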

Supported Operating Systems

GZIP records the OS on which the file was compressed (GzHandler.cpp:62-110):
namespace NHostOS {
  enum EEnum {
    kFAT = 0,      // MS-DOS, Windows
    kAMIGA,        // Amiga
    kVMS,          // VMS
    kUnix,         // Unix, Linux
    kVM_CMS,       // VM/CMS
    kAtari,        // Atari
    kHPFS,         // OS/2
    kMac,          // Macintosh
    kZ_System,     // Z-System
    kCPM,          // CP/M
    kTOPS20,       // TOPS-20
    kNTFS,         // Windows NT
    kQDOS,         // QDOS
    kAcorn,        // Acorn RISC OS
    kVFAT,         // Windows 95/NT
    kMVS,          // MVS
    kBeOS,         // BeOS
    kTandem,       // Tandem
    kUnknown = 255
  };
}
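
Because the values are consecutive from 0 to 17, header byte 9 can be mapped to a display name with a simple table lookup. The sketch below uses a hypothetical `GzipHostOSName` function; the names follow the order of the listing above, and every value outside the table is reported as unknown.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: maps the host OS byte to a display name.
std::string GzipHostOSName(std::uint8_t hostOS) {
  // Index in this table equals the enum value in the listing above.
  static const char *const kNames[] = {
    "FAT", "AMIGA", "VMS", "Unix", "VM/CMS", "Atari", "HPFS", "Macintosh",
    "Z-System", "CP/M", "TOPS-20", "NTFS", "QDOS", "Acorn", "VFAT", "MVS",
    "BeOS", "Tandem"
  };
  const std::size_t count = sizeof(kNames) / sizeof(kNames[0]);
  if (hostOS < count)
    return kNames[hostOS];
  return "Unknown";  // covers 255 and any unassigned value
}
```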

Usage Examples

Compress Single File

7z a -tgzip file.txt.gz file.txt
Creates file.txt.gz and keeps file.txt in place. Unlike the gzip command, 7z does not remove the original by default.

Compress with Maximum Compression

7z a -tgzip -mx=9 file.txt.gz file.txt

Compress with Fast Speed

7z a -tgzip -mx=1 file.txt.gz file.txt

Decompress File

7z x file.txt.gz
Extracts to file.txt.

Create TAR.GZ Archive

7z a -ttar archive.tar files/
7z a -tgzip archive.tar.gz archive.tar
Or using system tar:
tar -czf archive.tar.gz files/

Compress with Filename Stored

7z a -tgzip -mx=9 compressed.gz original.txt
Stores “original.txt” as the original filename in the GZIP header.
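
To see where that name lives in the file, the sketch below walks the header and returns the stored filename. It assumes the field order defined by the GZIP specification (optional extra field first, then the zero-terminated name); `GzipStoredName` is a hypothetical helper, not a 7-Zip function.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the original filename stored in the header,
// or an empty string if none is present or the header is truncated.
std::string GzipStoredName(const std::uint8_t *p, std::size_t size) {
  if (size < 10 || p[0] != 0x1F || p[1] != 0x8B)
    return "";
  const std::uint8_t flags = p[3];
  std::size_t pos = 10;  // optional fields start after the fixed header

  if (flags & (1 << 2)) {  // extra field: 2-byte little-endian length, then payload
    if (size - pos < 2)
      return "";
    const std::size_t extraLen = p[pos] | (static_cast<std::size_t>(p[pos + 1]) << 8);
    pos += 2;
    if (size - pos < extraLen)
      return "";
    pos += extraLen;
  }

  if (!(flags & (1 << 3)))  // filename flag not set
    return "";

  std::string name;
  while (pos < size && p[pos] != 0)
    name += static_cast<char>(p[pos++]);
  if (pos == size)  // ran off the buffer without finding the terminating zero
    return "";
  return name;
}
```

Passing the first few hundred bytes of compressed.gz is normally enough, since the name sits directly after the fixed header unless an extra field precedes it.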

Compression Levels

Performance Comparison (100 MB text file)

Level    Time   Size    Ratio   Speed
-mx=1    2s     35 MB   35%     Fastest
-mx=3    3s     32 MB   32%     Fast
-mx=5    5s     30 MB   30%     Normal (default)
-mx=7    8s     29 MB   29%     High
-mx=9    12s    28 MB   28%     Maximum
The Deflate algorithm gives diminishing returns at higher compression levels. Levels 6-7 often provide the best balance.

Advanced Usage

Compress Standard Input

echo "Hello, World!" | 7z a -tgzip -si hello.gz

Decompress to Standard Output

7z x -so file.gz

Multiple Files (TAR+GZIP)

# Create TAR archive first
7z a -ttar data.tar folder/
# Compress TAR with GZIP
7z a -tgzip -mx=9 data.tar.gz data.tar
# Remove intermediate TAR
rm data.tar

Test Integrity

7z t file.gz
Verifies CRC32 checksum and structure.
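
The checksum being verified is the standard CRC-32 (reflected polynomial 0xEDB88320) over the uncompressed data. The bitwise sketch below shows the calculation; it favors clarity over speed and is not the table-driven code 7-Zip uses. `Crc32Sketch` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 as used in the GZIP footer.
std::uint32_t Crc32Sketch(const std::uint8_t *data, std::size_t size) {
  std::uint32_t crc = 0xFFFFFFFFu;  // initial value
  for (std::size_t i = 0; i < size; i++) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; bit++) {
      // Shift right; apply the polynomial only when the dropped bit was 1.
      const std::uint32_t mask = 0u - (crc & 1u);
      crc = (crc >> 1) ^ (0xEDB88320u & mask);
    }
  }
  return crc ^ 0xFFFFFFFFu;  // final inversion
}
```

A verifier decompresses the stream, runs this over the output, and compares the result with the first four footer bytes. The well-known check value for the ASCII string "123456789" is 0xCBF43926.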

View File Info

7z l -slt file.gz
Shows:
  • Original filename
  • Modification time
  • Uncompressed size
  • CRC32 value
  • Host OS

Implementation Details

GZIP handler in 7-Zip:
source/CPP/7zip/Archive/GzHandler.cpp
Key components:
class CItem {
  Byte Flags;
  Byte ExtraFlags;
  Byte HostOS;
  UInt32 Time;
  UInt32 Crc;
  UInt32 Size32;
  AString Name;
  AString Comment;
};
Uses compression codecs from:
source/CPP/7zip/Compress/DeflateDecoder.h
source/CPP/7zip/Compress/DeflateEncoder.h

Comparison with Other Formats

GZIP vs BZIP2

Feature            GZIP                                BZIP2
Algorithm          Deflate (LZ77+Huffman)              Burrows-Wheeler
Speed              Fast                                Slower
Ratio              Good                                Better
File size limit    None (stored size wraps at 4 GiB)   Unlimited
Compression time   1x                                  ~3x
Decompression      1x                                  ~2x

GZIP vs XZ

Feature           GZIP                                XZ
Algorithm         Deflate                             LZMA2
Speed             Fast                                Slow
Ratio             Good                                Excellent
File size limit   None (stored size wraps at 4 GiB)   16 EB
Memory usage      Low                                 High
Decompression     Fast                                Medium

GZIP vs 7z

Feature             GZIP             7z
Multiple files      No (needs TAR)   Yes
Solid compression   No               Yes
Encryption          No               Yes (AES-256)
Compression ratio   Good             Excellent
Compatibility       Universal        Requires 7-Zip

Best Practices

For Web Servers

Use -mx=6 for optimal speed/size balance on HTTP compression

For Log Files

Use -mx=9 for maximum compression of text logs

For Backups

Combine with TAR for multiple files: .tar.gz

For Fast Compression

Use -mx=1 or -mx=3 for quick compression

Common Use Cases

Web Content Compression

# Compress JavaScript
7z a -tgzip -mx=9 script.min.js.gz script.min.js

# Compress CSS
7z a -tgzip -mx=9 styles.css.gz styles.css

Log File Rotation

# Compress yesterday's log
7z a -tgzip -mx=9 app-$(date -d yesterday +%Y%m%d).log.gz app.log.1

Database Backup

# Dump and compress database
mysqldump mydb | 7z a -tgzip -si backup-$(date +%Y%m%d).sql.gz

Source Code Archive

tar -cf - project/ | 7z a -tgzip -mx=9 -si project-$(date +%Y%m%d).tar.gz

Limitations

GZIP format limitations:
  • Single file only (use TAR for multiple files)
  • Stored uncompressed size is a 32-bit field and wraps at 4 GiB
  • No encryption support
  • No error recovery or redundancy
  • Lower compression ratio than BZIP2/XZ/LZMA

Compatibility

Universal Support

GZIP is supported by:
  • All Linux distributions (gunzip, gzip command)
  • macOS (built-in gzip)
  • Windows (7-Zip, WinZip, various tools)
  • Web browsers (HTTP compression)
  • Programming languages (zlib library)

HTTP Compression

GZIP is the most common HTTP content encoding, signaled by the response header:
Content-Encoding: gzip
Supported by all modern browsers.

Performance Tips

  1. Use appropriate compression level - Level 6 is usually optimal
  2. Pre-filter data - Remove redundant data before compression
  3. Use TAR for multiple files - Don’t compress files individually
  4. Consider alternatives - Use XZ for better ratio, BZIP2 for medium ratio
  5. Parallel compression - Use pigz for multi-core compression
