Overview
HHblits combines the sensitivity of HMM-HMM comparison with the speed of a prefilter to perform lightning-fast iterative sequence searches. It’s designed to detect remote homologs that might be missed by traditional sequence search methods like BLAST.
Key Features
- Iterative searching: Builds a profile by merging significant matches in multiple rounds
- HMM-HMM comparison: Compares profile HMMs for maximum sensitivity
- Fast prefilter: Uses ungapped alignment for rapid candidate identification
- MSA generation: Creates high-quality multiple sequence alignments
When to Use HHblits
Use HHblits when you need to:
- Find remote homologs: Detect distantly related proteins that share similar structures but low sequence identity
- Build comprehensive MSAs: Generate multiple sequence alignments for downstream analysis
- Perform iterative searches: Progressively expand your search profile across multiple rounds
- Search large databases: Leverage the prefilter for fast searches in databases like Uniclust30 or BFD
For searching pre-built HMM databases without iteration, use hhsearch instead.
Basic Usage
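A basic run needs a query, a database, and at least one output file. The sketch below assembles such a command in Python from the -i, -d, -o, -oa3m, -n, and -cpu options described under Key Parameters. The helper names build_hhblits_command and run_hhblits are hypothetical (they are not part of HH-suite), and the file names and database basename are placeholders.

```python
import shutil
import subprocess


def build_hhblits_command(query, database, out_hhr, out_a3m, iterations=2, cpus=4):
    """Return an argv list for a basic hhblits run (hypothetical helper)."""
    return [
        "hhblits",
        "-i", str(query),        # input query (FASTA, A3M, or HMM)
        "-d", str(database),     # database basename
        "-o", str(out_hhr),      # results in HHR format
        "-oa3m", str(out_a3m),   # output MSA in A3M format
        "-n", str(iterations),   # number of search iterations
        "-cpu", str(cpus),       # number of CPU threads
    ]


def run_hhblits(cmd):
    """Run the command, failing early if the hhblits binary is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("hhblits not found on PATH")
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder paths: replace with your own query and database basename.
    cmd = build_hhblits_command(
        "query.fasta", "databases/uniclust30", "query.hhr", "query.a3m"
    )
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when paths contain spaces.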
Common Use Cases
Building a Profile for Structure Prediction
Generate a deep MSA for AlphaFold or other structure prediction tools:
Searching Multiple Databases
Search against multiple databases in a single run:
One-liner for Quick Homology Search
From README.md example:
Generate MSA Without Filtering
Keep all sequences in the result alignment:
Key Parameters
Input/Output Options
- -i <file> - Input query (FASTA, A3M, or HMM format)
- -o <file> - Results in HHR format (default: <infile>.hhr)
- -oa3m <file> - Output MSA in A3M format
- -opsi <file> - Output MSA in PSI-BLAST format
- -ohhm <file> - Output HMM file
Search Parameters
- -d <name> - Database basename (can be specified multiple times)
- -n [1,8] - Number of search iterations (default: 2)
- -e [0,1] - E-value threshold for inclusion (default: 0.001)
- -E [0,inf] - E-value threshold for reporting (default: depends on mode)
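Because -d can be repeated and -n and -e have documented ranges, it can help to validate these values before launching a long search. The following is a minimal sketch under that assumption: search_args is a hypothetical helper, the range checks simply mirror the ranges listed for -n and -e, and the database basenames in the usage are placeholders.

```python
def search_args(databases, iterations=2, evalue=0.001):
    """Build the search-related part of an hhblits argv list (hypothetical helper)."""
    if not databases:
        raise ValueError("at least one database basename is required")
    if not 1 <= iterations <= 8:
        raise ValueError("-n must be in [1, 8]")
    if not 0 <= evalue <= 1:
        raise ValueError("-e must be in [0, 1]")

    args = []
    for db in databases:
        # -d is repeated once per database basename
        args += ["-d", str(db)]
    args += ["-n", str(iterations), "-e", str(evalue)]
    return args


if __name__ == "__main__":
    # Placeholder basenames: substitute the databases you have installed.
    print(search_args(["databases/uniclust30", "databases/bfd"], iterations=3))
```

The returned list can be appended to an argv list that already contains the program name and the input/output options.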
Alignment Filtering
- -id [0,100] - Maximum pairwise sequence identity % (default: 90)
- -diff [0,inf] - Keep diverse sequences, Ndiff per 50-column block (default: 1000)
- -cov [0,100] - Minimum coverage with master sequence % (default: 0)
- -qid [0,100] - Minimum sequence identity with master % (default: 0)
- -qsc [-inf,100] - Minimum score per column (default: -20.0)
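To make the -cov and -qid thresholds more concrete, the sketch below computes coverage and identity against the master sequence for two alignment rows restricted to match columns (equal length, with "-" for gaps). These are simplified definitions chosen for illustration only; HH-suite's exact formulas may differ, and all function names here are hypothetical.

```python
def coverage_pct(master, seq):
    """Percent of master residue columns in which seq also has a residue."""
    if len(master) != len(seq):
        raise ValueError("rows must have the same number of columns")
    master_cols = [i for i, c in enumerate(master) if c != "-"]
    if not master_cols:
        return 0.0
    covered = sum(1 for i in master_cols if seq[i] != "-")
    return 100.0 * covered / len(master_cols)


def identity_pct(master, seq):
    """Percent identical residues over columns where both rows have a residue."""
    if len(master) != len(seq):
        raise ValueError("rows must have the same number of columns")
    pairs = [(a, b) for a, b in zip(master, seq) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    same = sum(1 for a, b in pairs if a == b)
    return 100.0 * same / len(pairs)


def passes_filter(master, seq, min_cov=0.0, min_qid=0.0):
    """Apply simplified analogues of the -cov and -qid thresholds."""
    return coverage_pct(master, seq) >= min_cov and identity_pct(master, seq) >= min_qid


if __name__ == "__main__":
    master = "ACDEFGHIKL"
    hit = "ACD---HIRL"
    print(coverage_pct(master, hit), identity_pct(master, hit))
    print(passes_filter(master, hit, min_cov=60, min_qid=50))
```

With the default thresholds of 0 for both options, every row passes, which matches the documented defaults of no coverage or identity filtering against the master.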
Performance Options
- -cpu <int> - Number of CPU threads (default: system-dependent)
- -noprefilt - Disable prefilter for maximum sensitivity
- -maxfilt <int> - Max hits passing second prefilter
Output Format
In A3M format:
- Uppercase = match states (aligned columns)
- Lowercase = insert states (residues inserted relative to the match columns, i.e. gaps in the consensus)
- "-" = delete states (gaps in the sequence)
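The character conventions above can be applied directly when post-processing an A3M file. The sketch below is a minimal reader, assuming FASTA-style ">" header lines and ignoring anything that appears before the first header; the function names are hypothetical and not part of HH-suite.

```python
def parse_a3m(text):
    """Parse A3M text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them per record.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def match_columns(sequence):
    """Drop lowercase insert states, keeping match residues and '-' deletions."""
    return "".join(c for c in sequence if not c.islower())


def count_states(sequence):
    """Count match (uppercase), insert (lowercase), and delete ('-') states."""
    return {
        "match": sum(c.isupper() for c in sequence),
        "insert": sum(c.islower() for c in sequence),
        "delete": sequence.count("-"),
    }


if __name__ == "__main__":
    example = ">query\nACDEFG\n>hit1\nAC-EfgFG\n"
    for name, seq in parse_a3m(example):
        print(name, match_columns(seq), count_states(seq))
```

Once the lowercase inserts are removed, each row has one character per match column, so rows can be compared column by column.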
Tips and Best Practices
Advanced Options
Controlling MAC Realignment
Adjust the Maximum Accuracy (MAC) algorithm parameters:
Filtering During Search
Control intermediate filtering to manage memory:
Custom Pseudocount Settings
Fine-tune context-specific pseudocounts:
Parallel Versions
HH-suite provides optimized parallel versions when compiled with OpenMP or MPI support:
- hhblits_omp - OpenMP multi-threaded version for shared-memory systems
- hhblits_ca3m - OpenMP version optimized for compressed CA3M databases in FFindex format
- hhblits_mpi - MPI version for distributed computing clusters
Related Tools
- hhsearch - Search HMM databases without iteration
- hhmake - Convert MSA to HMM format
- hhfilter - Filter alignments by sequence identity
- hhalign - Pairwise HMM-HMM alignment