HHblits

HHblits is HH-suite’s most powerful tool for sensitive, iterative protein sequence searching. It represents both query and database sequences as Hidden Markov Models (HMMs) and performs iterative searches to build high-quality multiple sequence alignments.

Overview

HHblits combines the sensitivity of HMM-HMM comparison with the speed of a prefilter to perform lightning-fast iterative sequence searches. It’s designed to detect remote homologs that might be missed by traditional sequence search methods like BLAST.

Key Features

Iterative searching: Builds a profile by merging significant matches in multiple rounds
HMM-HMM comparison: Compares profile HMMs for maximum sensitivity
Fast prefilter: Uses ungapped alignment for rapid candidate identification
MSA generation: Creates high-quality multiple sequence alignments

When to Use HHblits

Use HHblits when you need to:

Find remote homologs: Detect distantly related proteins that share similar structures but low sequence identity
Build comprehensive MSAs: Generate multiple sequence alignments for downstream analysis
Perform iterative searches: Progressively expand your search profile across multiple rounds
Search large databases: Leverage the prefilter for fast searches in databases like Uniclust30 or BFD

For searching pre-built HMM databases without iteration, use hhsearch instead.

Basic Usage

Single iteration search

Perform a single search iteration to find homologs:

hhblits -i query.fas -o results.hhr -n 1 -d uniclust30

Iterative search with MSA output

Run the default two iterations (no -n given) and write the resulting alignment in A3M format:

hhblits -i query.fas -o results.hhr -oa3m query.a3m -d uniclust30

Control search sensitivity

Adjust E-value threshold and iteration count:

hhblits -i query.fas -o results.hhr -n 3 -e 0.001 -d uniclust30

Common Use Cases

Building a Profile for Structure Prediction

Generate a deep MSA for AlphaFold or other structure prediction tools:

hhblits -i protein.fas -o protein.hhr -oa3m protein.a3m -n 3 -d uniclust30

Searching Multiple Databases

Search against multiple databases in a single run:

hhblits -i query.fas -o results.hhr \
  -d uniclust30 \
  -d pdb70 \
  -n 2

One-liner for Quick Homology Search

The minimal invocation shown in the HH-suite README.md, relying on the default iteration count and E-value threshold:

hhblits -i query.fas -o query.hhr -d ./uniclust30

Generate MSA Without Filtering

Keep all sequences in the result alignment:

hhblits -i query.fas -oa3m query.a3m -d uniclust30 -all

Key Parameters

Input/Output Options

-i <file> - Input query (FASTA, A3M, or HMM format)
-o <file> - Results in HHR format (default: <infile>.hhr)
-oa3m <file> - Output MSA in A3M format
-opsi <file> - Output MSA in PSI-BLAST format
-ohhm <file> - Output HMM file

Search Parameters

-d <name> - Database basename (can be specified multiple times)
-n [1,8] - Number of search iterations (default: 2)
-e [0,1] - E-value threshold for inclusion (default: 0.001)
-E [0,inf] - E-value threshold for reporting (default: depends on mode)
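Because -d takes a basename and not a single file, a mistyped path usually surfaces only once the search starts. The sketch below checks a basename before running, assuming the usual HH-suite3 layout of six files per database (cs219, a3m and hhm, each as .ffdata and .ffindex). The function name check_hhdb is hypothetical and not part of HH-suite.

```shell
# Report whether every file of an HH-suite database is present.
# Assumption: the database follows the six-file HH-suite3 layout.
# $1 = database basename, e.g. ./uniclust30
check_hhdb() {
  status=0
  for suffix in cs219 a3m hhm; do
    for ext in ffdata ffindex; do
      f="${1}_${suffix}.${ext}"
      if [ ! -f "$f" ]; then
        echo "missing: $f" >&2
        status=1
      fi
    done
  done
  return "$status"
}

# Usage (not executed here):
#   check_hhdb ./uniclust30 && hhblits -i query.fas -o results.hhr -d ./uniclust30
```

Checking all six files up front and listing every missing one makes an incomplete download easier to diagnose than a failure partway through a run.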

Alignment Filtering

-id [0,100] - Maximum pairwise sequence identity % (default: 90)
-diff [0,inf] - Keep diverse sequences, Ndiff per 50-column block (default: 1000)
-cov [0,100] - Minimum coverage with master sequence % (default: 0)
-qid [0,100] - Minimum sequence identity with master % (default: 0)
-qsc [-inf,100] - Minimum score per column (default: -20.0)
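The filtering options above can be combined in one call. The following is a minimal sketch with illustrative thresholds (50% coverage, 20% identity to the master sequence) that are assumptions for the example, not recommended values. The wrapper name filtered_msa is hypothetical.

```shell
# Build an MSA and filter it by redundancy, coverage and identity to the query.
# $1 = query file, $2 = database basename, $3 = output A3M file
# The thresholds are illustrative only:
#   -id 90   drop sequences more than 90% identical to a retained sequence
#   -cov 50  require at least 50% coverage of the master sequence
#   -qid 20  require at least 20% identity with the master sequence
filtered_msa() {
  hhblits -i "$1" -d "$2" -oa3m "$3" -o /dev/null \
    -n 3 \
    -id 90 \
    -cov 50 \
    -qid 20
}

# Usage (not executed here):
#   filtered_msa query.fas uniclust30 query.filtered.a3m
```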

Performance Options

-cpu <int> - Number of CPU threads (default: system-dependent)
-noprefilt - Disable prefilter for maximum sensitivity
-maxfilt <int> - Max hits passing second prefilter
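When many queries have to be searched, -cpu can be combined with a simple loop. This is a sketch that assumes the queries are single-sequence files ending in .fas inside one directory. The function name run_batch and the default of 4 threads are choices made for this example.

```shell
# Run hhblits on every *.fas file in a directory, skipping finished queries.
# $1 = directory with FASTA files, $2 = database basename, $3 = threads (default 4)
run_batch() {
  dir=$1
  db=$2
  threads=${3:-4}
  for query in "$dir"/*.fas; do
    # If nothing matches, the glob stays literal and the file does not exist.
    [ -e "$query" ] || continue
    base="${query%.fas}"
    if [ -s "$base.a3m" ]; then
      echo "skip $query (already done)"
      continue
    fi
    hhblits -i "$query" -d "$db" -cpu "$threads" \
      -o "$base.hhr" -oa3m "$base.a3m" >"$base.log" 2>&1 \
      || echo "failed $query (see $base.log)" >&2
  done
}

# Usage (not executed here):
#   run_batch ./queries uniclust30 8
```

Skipping queries that already have a non-empty A3M file lets an interrupted batch be restarted without repeating finished searches.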

Output Format

Query         query_protein
Match_columns 250
No_of_seqs    1500 out of 2000

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 1ABC_A Protein description      99.9     0.0   0.0    250.5  10.2  230   1-240    5-238 (300)
  2 2XYZ_B Another protein          98.5   0.001   1e-6   180.3   8.5  200  10-220   15-218 (250)
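The hit table can be filtered with standard tools. Since hit descriptions may contain spaces, the sketch below counts columns from the right-hand side of each row. It assumes the columns are separated by whitespace as in the sample above; in real files the fixed-width layout can occasionally fuse neighbouring fields, so treat it as a starting point. The function name hhr_hits is hypothetical.

```shell
# Print "hit_id probability e-value" for hits at or above a probability cutoff.
# $1 = HHR file, $2 = minimum probability in percent
hhr_hits() {
  awk -v min="$2" '
    /^ *No Hit/ { in_table = 1; next }     # header row opens the hit table
    in_table && NF == 0 { exit }           # first blank line closes it
    in_table {
      sub(/[ \t]*\([0-9]+\)[ \t]*$/, "")   # drop the trailing template length
      prob = $(NF - 7)                     # 8th field from the right
      evalue = $(NF - 6)                   # 7th field from the right
      if (prob + 0 >= min + 0) print $2, prob, evalue
    }
  ' "$1"
}

# Sample data copied from the output shown above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Query         query_protein
Match_columns 250
No_of_seqs    1500 out of 2000

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 1ABC_A Protein description      99.9     0.0   0.0    250.5  10.2  230   1-240    5-238 (300)
  2 2XYZ_B Another protein          98.5   0.001   1e-6   180.3   8.5  200  10-220   15-218 (250)

EOF

hhr_hits "$sample" 99    # prints: 1ABC_A 99.9 0.0
```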

In A3M format:

Uppercase = match states (aligned columns)
Lowercase = insert states (gaps in the consensus)
- = delete states (gaps in the sequence)
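These conventions mean that removing the lowercase insert residues from an A3M file leaves every sequence with the same number of columns. A minimal sketch, using a made-up two-sequence alignment (the function name a3m_match_columns is hypothetical):

```shell
# Strip insert states (lowercase letters) from all non-header lines,
# keeping only match columns and "-" deletions.
a3m_match_columns() {
  LC_ALL=C sed '/^>/!s/[a-z]//g' "$1"
}

# Count sequences: one ">" header line per sequence.
a3m_count() {
  grep -c '^>' "$1"
}

# Made-up example: hit1 has a two-residue insertion ("va") and one deletion.
example=$(mktemp)
cat > "$example" <<'EOF'
>query
MKVLA
>hit1
MKvaI-A
EOF

a3m_match_columns "$example"
a3m_count "$example"
```

After stripping, hit1 reads MKI-A, which has the same five columns as the query.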

Tips and Best Practices

Iteration count: 2-3 iterations usually provide a good balance between sensitivity and specificity. More iterations can improve the MSA but may introduce false positives.
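One way to judge the effect of the iteration count on a given query is to run the search with increasing -n and compare how many sequences each MSA contains. The following is a sketch of that comparison; the function names and the output file naming scheme are assumptions made for this example.

```shell
# Count sequences in an A3M file (one ">" header per sequence).
count_a3m_seqs() {
  grep -c '^>' "$1"
}

# Run hhblits with 1..max iterations and report the MSA depth of each run.
# $1 = query file, $2 = database basename, $3 = max iterations (default 3)
iteration_sweep() {
  query=$1
  db=$2
  max=${3:-3}
  n=1
  while [ "$n" -le "$max" ]; do
    out="${query%.*}.n${n}.a3m"
    hhblits -i "$query" -d "$db" -n "$n" -o /dev/null -oa3m "$out" \
      >/dev/null 2>&1 || return 1
    printf '%s iterations: %s sequences\n' "$n" "$(count_a3m_seqs "$out")"
    n=$((n + 1))
  done
}

# Usage (not executed here):
#   iteration_sweep query.fas uniclust30 3
```

A sharp jump in depth between two rounds is a cue to inspect the newly added sequences, since that is where false positives tend to enter the profile.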

Using -glob (global alignment) in iterative searches (-n 2 or higher) is deprecated, because non-homologous segments can corrupt the MSA. Keep the default local alignment mode.

Database selection:

Use Uniclust30 for general protein searches
Use BFD for maximum coverage (2.5 billion sequences)
Use PDB70 when you need structural information

Advanced Options

Controlling MAC Realignment

Adjust the Maximum Accuracy (MAC) algorithm parameters:

# -mact: threshold for greediness (0=global, >0.1=local)
# -realign_max: max number of hits to realign
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -mact 0.35 \
  -realign_max 500

Filtering During Search

Control intermediate filtering to manage memory:

# -interim_filter: NONE or FULL
# -maxseq: max sequences in MSA
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -interim_filter FULL \
  -maxseq 10000

Custom Pseudocount Settings

Fine-tune context-specific pseudocounts:

# -pc_hhm_contxt_mode 2: diversity-dependent mode
# -pc_hhm_contxt_a: overall admixture
# -pc_hhm_contxt_b: Neff threshold
hhblits -i query.fas -o results.hhr -d uniclust30 \
  -pc_hhm_contxt_mode 2 \
  -pc_hhm_contxt_a 0.9 \
  -pc_hhm_contxt_b 4.0

Parallel Versions

HH-suite provides optimized parallel versions when compiled with OpenMP or MPI support:

hhblits_omp - OpenMP multi-threaded version for shared-memory systems
hhblits_ca3m - OpenMP version optimized for compressed CA3M databases in FFindex format
hhblits_mpi - MPI version for distributed computing clusters

See the Parallel Computing Guide for usage details.

Related Tools

hhsearch - Search HMM databases without iteration
hhmake - Convert MSA to HMM format
hhfilter - Filter alignments by sequence identity
hhalign - Pairwise HMM-HMM alignment

References

For detailed algorithm information, see: Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J (2019). HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics 20, 473. doi: 10.1186/s12859-019-3019-7
