HHblits is HH-suite’s most powerful tool for sensitive, iterative protein sequence searching. It represents both query and database sequences as Hidden Markov Models (HMMs) and performs iterative searches to build high-quality multiple sequence alignments.

Overview

HHblits combines the sensitivity of HMM-HMM comparison with a fast prefilter that discards most database entries before the full comparison, which keeps iterative searches quick. It is designed to detect remote homologs that traditional sequence search methods such as BLAST can miss.

Key Features

  • Iterative searching: Builds a profile by merging significant matches in multiple rounds
  • HMM-HMM comparison: Compares profile HMMs for maximum sensitivity
  • Fast prefilter: Uses ungapped alignment for rapid candidate identification
  • MSA generation: Creates high-quality multiple sequence alignments

When to Use HHblits

Use HHblits when you need to:
  • Find remote homologs: Detect distantly related proteins that share similar structures but low sequence identity
  • Build comprehensive MSAs: Generate multiple sequence alignments for downstream analysis
  • Perform iterative searches: Progressively expand your search profile across multiple rounds
  • Search large databases: Leverage the prefilter for fast searches in databases like Uniclust30 or BFD
For searching pre-built HMM databases without iteration, use hhsearch instead.

Basic Usage

1. Single iteration search

Perform a single search iteration to find homologs:
hhblits -i query.fas -o results.hhr -n 1 -d uniclust30

2. Iterative search with MSA output

Perform multiple iterations and generate an alignment:
hhblits -i query.fas -o results.hhr -oa3m query.a3m -d uniclust30

3. Control search sensitivity

Adjust E-value threshold and iteration count:
hhblits -i query.fas -o results.hhr -n 3 -e 0.001 -d uniclust30
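
One way to catch a mistyped database basename before starting any of the searches above is to check for the files it should resolve to. The sketch below assumes the usual HH-suite database layout of six files per basename (_cs219, _a3m and _hhm, each with .ffdata and .ffindex); some databases may ship fewer. The helper name hhdb_missing_files is hypothetical and not part of HH-suite.

```bash
# Hypothetical helper: print every expected database file that is missing.
# Assumes the layout <base>_{cs219,a3m,hhm}.{ffdata,ffindex}.
hhdb_missing_files() {
  local base=$1 part ext
  for part in cs219 a3m hhm; do
    for ext in ffdata ffindex; do
      [ -e "${base}_${part}.${ext}" ] || printf '%s\n' "${base}_${part}.${ext}"
    done
  done
}

# Example use (not executed here): only search when nothing is missing.
#   missing=$(hhdb_missing_files ./uniclust30)
#   [ -z "$missing" ] && hhblits -i query.fas -o results.hhr -n 1 -d ./uniclust30
```

Printing the missing paths rather than just returning a status makes it easier to see whether the basename is wrong or the download is incomplete.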

Common Use Cases

Building a Profile for Structure Prediction

Generate a deep MSA for AlphaFold or other structure prediction tools:
hhblits -i protein.fas -o protein.hhr -oa3m protein.a3m -n 3 -d uniclust30

Searching Multiple Databases

Search against multiple databases in a single run:
hhblits -i query.fas -o results.hhr \
  -d uniclust30 \
  -d pdb70 \
  -n 2
The README.md example passes a single database as a relative path:
hhblits -i query.fas -o query.hhr -d ./uniclust30
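
When the list of databases comes from a variable or a config file, the repeated -d flags can be generated instead of typed out. This is a small sketch; the helper name hhblits_db_args is hypothetical, and it assumes database basenames contain no newlines.

```bash
# Hypothetical helper: turn database basenames into repeated -d flags.
# Prints one argument per line so the caller can read them into an array.
hhblits_db_args() {
  local db
  for db in "$@"; do
    printf '%s\n' -d "$db"
  done
}

# Example use in bash (not executed here):
#   mapfile -t db_args < <(hhblits_db_args uniclust30 pdb70)
#   hhblits -i query.fas -o results.hhr "${db_args[@]}" -n 2
```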

Generate MSA Without Filtering

Keep all sequences in the result alignment:
hhblits -i query.fas -oa3m query.a3m -d uniclust30 -all

Key Parameters

  • -i <file> - Input query (FASTA, A3M, or HMM format)
  • -o <file> - Results in HHR format (default: <infile>.hhr)
  • -oa3m <file> - Output MSA in A3M format
  • -opsi <file> - Output MSA in PSI-BLAST format
  • -ohhm <file> - Output HMM file
  • -d <name> - Database basename (can be specified multiple times)
  • -n [1,8] - Number of search iterations (default: 2)
  • -e [0,1] - E-value threshold for inclusion (default: 0.001)
  • -E [0,inf] - E-value threshold for reporting (default: depends on mode)
  • -id [0,100] - Maximum pairwise sequence identity % (default: 90)
  • -diff [0,inf] - Keep diverse sequences, Ndiff per 50-column block (default: 1000)
  • -cov [0,100] - Minimum coverage with master sequence % (default: 0)
  • -qid [0,100] - Minimum sequence identity with master % (default: 0)
  • -qsc [-inf,100] - Minimum score per column (default: -20.0)
  • -cpu <int> - Number of CPU threads (default: system-dependent)
  • -noprefilt - Disable prefilter for maximum sensitivity
  • -maxfilt <int> - Max hits passing second prefilter
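
In a pipeline it can be useful to reject out-of-range values before a long search starts. The sketch below checks only the two ranges listed above, -n in [1,8] and -e in [0,1]. The function name valid_hhblits_params is hypothetical, and hhblits performs its own argument checking regardless.

```bash
# Hypothetical pre-flight check: succeeds only if both values are in range.
# Usage: valid_hhblits_params <iterations> <inclusion_evalue>
valid_hhblits_params() {
  local n=$1 evalue=$2
  case $n in
    [1-8]) ;;          # a single digit between 1 and 8
    *) return 1 ;;
  esac
  # Accept plain or scientific notation (for example 0.001 or 1e-3) within [0,1].
  awk -v e="$evalue" 'BEGIN {
    exit !(e ~ /^[0-9.]+([eE][-+]?[0-9]+)?$/ && e + 0 >= 0 && e + 0 <= 1)
  }'
}

# Example use (not executed here):
#   valid_hhblits_params 3 0.001 && hhblits -i query.fas -o results.hhr -n 3 -e 0.001 -d uniclust30
```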

Output Format

Query         query_protein
Match_columns 250
No_of_seqs    1500 out of 2000

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 1ABC_A Protein description      99.9     0.0   0.0    250.5  10.2  230   1-240    5-238 (300)
  2 2XYZ_B Another protein          98.5   0.001   1e-6   180.3   8.5  200  10-220   15-218 (250)
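
The hit table shown above can be filtered with standard text tools. The following sketch lists hits whose Prob value meets a cutoff. Because hit descriptions may contain spaces, it counts the numeric columns from the end of each line, which assumes the last two fields are separated as in the sample ("5-238 (300)"); real files can fuse them for long templates, so treat this as a starting point. The function name hhr_hits_above is hypothetical.

```bash
# Hypothetical helper: read an HHR file on stdin and print
# rank, hit identifier, Prob and E-value (tab-separated) for hits with Prob >= cutoff.
hhr_hits_above() {
  awk -v min="$1" '
    $1 == "No" && $2 == "Hit" { in_table = 1; next }  # header row starts the table
    in_table && NF == 0       { exit }                # first blank line ends it
    in_table && NF >= 11 {
      prob = $(NF - 8); evalue = $(NF - 7)            # columns counted from the end
      if (prob + 0 >= min + 0) printf "%s\t%s\t%s\t%s\n", $1, $2, prob, evalue
    }
  '
}

# Example use (not executed here):
#   hhr_hits_above 95 < results.hhr
```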
In A3M format:
  • Uppercase = match states (aligned columns)
  • Lowercase = insert states (gaps in the consensus)
  • - = delete states (gaps in the sequence)
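
Because insert states are the only lowercase characters in A3M sequence lines, removing them leaves one character per match column in every row. A minimal sketch of that conversion follows; the function name a3m_strip_inserts is hypothetical.

```bash
# Hypothetical helper: drop insert states (lowercase letters) from A3M sequence lines.
# Header lines starting with ">" pass through unchanged.
a3m_strip_inserts() {
  LC_ALL=C awk '/^>/ { print; next } { gsub(/[a-z]/, ""); print }'
}

# Example use (not executed here):
#   a3m_strip_inserts < query.a3m > query.matchcols.fas
```

After this step all sequence rows have the same length, so tools that expect a rectangular alignment can read the result.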

Tips and Best Practices

Iteration count: 2-3 iterations usually provide a good balance between sensitivity and specificity. More iterations can improve the MSA but may introduce false positives.
Using -glob (global alignment) with iterative searches (-n 2 or higher) is deprecated because non-homologous segments can enter and corrupt the MSA. Keep the default local alignment mode.
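
To see how the iteration count affects a particular query, the search can be repeated with different -n values and the resulting MSA depths compared. The sketch below is one way to do that; the function names and the iterN output file names are hypothetical. A sharp jump in depth between rounds is not proof of false positives, only a prompt to inspect the new hits.

```bash
# Count sequences (header lines) in an A3M file.
a3m_depth() {
  grep -c '^>' "$1"
}

# Hypothetical sweep: rerun the search with 1, 2 and 3 iterations and report MSA depth.
# Usage: iteration_sweep query.fas uniclust30
iteration_sweep() {
  local query=$1 db=$2 n
  for n in 1 2 3; do
    hhblits -i "$query" -d "$db" -n "$n" \
      -o "iter${n}.hhr" -oa3m "iter${n}.a3m" >/dev/null || return 1
    printf 'n=%s sequences=%s\n' "$n" "$(a3m_depth "iter${n}.a3m")"
  done
}
```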
Database selection:
  • Use Uniclust30 for general protein searches
  • Use BFD for maximum coverage (2.5 billion sequences)
  • Use PDB70 when you need structural information

Advanced Options

Controlling MAC Realignment

Adjust the Maximum Accuracy (MAC) algorithm parameters:
# -mact: threshold for alignment greediness (0 = global, >0.1 = local)
# -realign_max: maximum number of hits to realign
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -mact 0.35 \
  -realign_max 500
Control intermediate filtering to manage memory:
# -interim_filter: NONE or FULL
# -maxseq: maximum number of sequences in the MSA
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -interim_filter FULL \
  -maxseq 10000

Custom Pseudocount Settings

Fine-tune context-specific pseudocounts:
# -pc_hhm_contxt_mode 2: diversity-dependent mode
# -pc_hhm_contxt_a: overall admixture
# -pc_hhm_contxt_b: Neff threshold
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -pc_hhm_contxt_mode 2 \
  -pc_hhm_contxt_a 0.9 \
  -pc_hhm_contxt_b 4.0

Parallel Versions

HH-suite provides optimized parallel versions when compiled with OpenMP or MPI support:
  • hhblits_omp - OpenMP multi-threaded version for shared-memory systems
  • hhblits_ca3m - OpenMP version optimized for compressed CA3M databases in FFindex format
  • hhblits_mpi - MPI version for distributed computing clusters
See the Parallel Computing Guide for usage details.

Related Tools

  • hhsearch - Search HMM databases without iteration
  • hhmake - Convert MSA to HMM format
  • hhfilter - Filter alignments by sequence identity
  • hhalign - Pairwise HMM-HMM alignment

References

For detailed algorithm information, see: Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J (2019). HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 20, 473. doi: 10.1186/s12859-019-3019-7
