HHsearch searches a database of HMMs with a query multiple sequence alignment or HMM. Unlike HHblits, it performs a single-round search without iteration, making it ideal for searching pre-built databases of known protein families.

Overview

HHsearch is optimized for searching databases of profile HMMs, such as PDB, Pfam, or SCOP. It uses HMM-HMM comparison to detect remote homologies with high sensitivity.

Key Features

  • Single-round search: No iterative profile building
  • Database search: Designed for pre-built HMM databases
  • High sensitivity: Detects remote homologs through HMM-HMM comparison
  • Fast: No prefilter overhead or iteration steps

When to Use HHsearch

Use HHsearch when you need to:
  • Search structure databases: Find proteins with similar structures in PDB70
  • Identify protein families: Search against Pfam or other family databases
  • Compare to known folds: Search SCOP or ECOD databases
  • Perform single-round searches: When you have a pre-built query HMM or MSA
For building deep MSAs through iterative searching, use hhblits instead.

Basic Usage

1. Search with a sequence or MSA

Search a database with a single sequence or MSA:
hhsearch -i query.a3m -d pdb70 -o results.hhr

2. Search with an HMM

Use a pre-built HMM as query:
hhsearch -i query.hhm -d scop70 -o results.hhr

3. Control result output

Set the E-value cutoff for the result alignment (-e) and cap the number of listed hits (-Z):
hhsearch -i query.a3m -d pdb70 -o results.hhr -e 0.001 -Z 100

Common Use Cases

Finding Structural Homologs

Search the PDB database to find proteins with similar structures:
hhsearch -i query.a3m -d pdb70 -o results.hhr
This mirrors the primary example in the HH-suite README.md.

Identifying Protein Family Membership

Search Pfam to classify your protein:
hhsearch -i unknown_protein.a3m -d pfam -o pfam_results.hhr -e 1e-3

Generating Tabular Output

Create BLAST-compatible output format:
hhsearch -i query.a3m -d pdb70 -blasttab results.m8

Output Multiple Sequence Alignment

Merge significant matches into an MSA:
hhsearch -i query.a3m -d pdb70 -o results.hhr -oa3m merged.a3m -e 1e-10
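
A quick sanity check on the merged alignment is to count its sequences. The sketch below relies only on the A3M convention that every sequence starts with a ">" header line, as in FASTA. The helper name count_seqs and the tiny alignment are made up for illustration; they are not real hhsearch output.

```shell
#!/bin/sh
# Count sequences in an A3M/FASTA-style file: one '>' header line per sequence.
count_seqs() {
  grep -c '^>' "$1"
}

# Tiny made-up A3M used only for illustration (not real hhsearch output).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
>query
MKTAYIAKQR
>hit_1
MKTAYIAKQR
>hit_2
MKT-YIaAKQR
EOF

n=$(count_seqs "$tmp")
echo "sequences: $n"
rm -f "$tmp"
```

On a real run, point count_seqs at merged.a3m. A count of 1 would suggest that no hit passed the -e cutoff and only the query was written.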

Key Parameters

  • -i <file> - Input query (FASTA, A2M, A3M, or HMM)
  • -o <file> - Output file in HHR format (default: <infile>.hhr)
  • -d <name> - Database name (can specify multiple with multiple -d flags)
  • -oa3m <file> - Write result MSA in A3M format
  • -blasttab <file> - Write results in BLAST tabular format
  • -e [0,1] - E-value cutoff for inclusion in result alignment (default: 0.001)
  • -E [0,inf] - Maximum E-value in result list (default: 1E+06)
  • -p [0,100] - Minimum probability in result list (default: 20)
  • -Z <int> - Maximum number of hits in result list (default: 500)
  • -z <int> - Minimum number of hits in result list (default: 10)
  • -glob - Use global alignment mode (align full query and template)
  • -loc - Use local alignment mode (default, find best local match)
  • -mact [0,1] - Threshold for MAC realignment greediness (default: 0.35)
  • -id [0,100] - Maximum pairwise sequence identity % (default: 90)
  • -cov [0,100] - Minimum coverage with master sequence % (default: 0)
  • -qid [0,100] - Minimum sequence identity with query % (default: 0)
  • -qsc [-inf,100] - Minimum score per column (default: -20.0)

Output Format

Query         protein_query
Match_columns 245
No_of_seqs    150 out of 200

 No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 3ABC_A Crystal structure         99.8   1e-35  1e-40  220.5   9.8  240   5-245    1-240 (248)
  2 4XYZ_B NMR structure             95.2    0.02  2e-07  120.8   7.2  180  45-225   10-190 (210)
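
The hit table can be filtered with standard text tools. The following is a minimal sketch, not an official parser: it assumes the layout shown above, where the template length appears as a separate "(N)" token at the end of each row, and it counts the numeric columns from the end of the line because hit descriptions may contain spaces. The function name filter_hits is hypothetical.

```shell
#!/bin/sh
# Print "hit<TAB>probability<TAB>E-value" for hits at or above a minimum probability.
# Reads an HHR hit table on stdin. Columns are counted from the end of each row,
# assuming the trailing "(length)" token is separated by a space as in the sample.
filter_hits() {
  awk -v min_prob="$1" '
    /^ *No Hit/          { in_table = 1; next }   # header row starts the table
    in_table && NF == 0  { exit }                 # blank line ends the table
    in_table && NF >= 11 && $(NF-8) + 0 >= min_prob + 0 {
      printf "%s\t%s\t%s\n", $2, $(NF-8), $(NF-7)
    }
  '
}

# Sample rows copied from the example output above.
sample=' No Hit                             Prob E-value P-value  Score    SS Cols Query HMM Template HMM
  1 3ABC_A Crystal structure         99.8   1e-35  1e-40  220.5   9.8  240   5-245    1-240 (248)
  2 4XYZ_B NMR structure             95.2    0.02  2e-07  120.8   7.2  180  45-225   10-190 (210)'

hits=$(printf '%s\n' "$sample" | filter_hits 99)
echo "$hits"
```

With a real file, the equivalent call would be filter_hits 99 < results.hhr. Rows whose template range and length are fused into one token would need a different column offset.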

Tips and Best Practices

Database selection:
  • PDB70: For finding structural templates (clustered at 70% identity)
  • Pfam: For protein family classification
  • SCOP: For fold recognition and evolutionary relationships
  • dbCAN: For carbohydrate-active enzymes

HHsearch does not iterate. If you need a deep MSA, run hhblits first, then use the resulting MSA as the hhsearch query.

Local vs global alignment:
  • Use local (default) when domains might be different sizes
  • Use global only when you expect full-length alignment
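
The two-step workflow from the note above (hhblits to build the MSA, then hhsearch with that MSA) might be scripted as follows. This is a sketch under assumptions: the query file name, the databases/UniRef30 prefix, and the choice of three iterations (-n 3) are placeholders, and the run helper is hypothetical. With DRY_RUN=1 (the default here) the commands are only printed, so the script can be inspected without HH-suite installed.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command instead of executing it.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Arguments: query sequence, hhblits database prefix, hhsearch database name.
pipeline() {
  query=$1; seq_db=$2; hmm_db=$3
  # Step 1: build a deeper MSA with iterative hhblits searches.
  run hhblits -i "$query" -d "$seq_db" -n 3 -oa3m query.a3m &&
  # Step 2: single-round hhsearch using that MSA as the query.
  run hhsearch -i query.a3m -d "$hmm_db" -o results.hhr
}

# Placeholder inputs; replace with your own paths.
plan=$(pipeline query.fasta databases/UniRef30 pdb70)
echo "$plan"
```

Setting DRY_RUN=0 and calling pipeline directly executes both steps; the second step only runs if hhblits exits successfully.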

Advanced Options

Realignment Control

Control the Maximum Accuracy (MAC) realignment:
# Disable MAC realignment
hhsearch -i query.a3m -d pdb70 -o results.hhr -norealign
# Or adjust the MAC threshold and limit the number of realignments
hhsearch -i query.a3m -d pdb70 -o results.hhr \
  -mact 0.5 \
  -realign_max 100

Secondary Structure Scoring

Include secondary structure in the scoring:
# -ssm: SS scoring mode (0 = off; 1, 2 = after/during alignment)
# -ssw: weight of the SS score (0-1)
hhsearch -i query.a3m -d pdb70 -o results.hhr \
  -ssm 2 \
  -ssw 0.5

Excluding Query Regions

Exclude specific query positions from alignment:
hhsearch -i query.a3m -d pdb70 -o results.hhr \
  -excl 1-33,97-168

Searching Multiple Databases

Search multiple databases in one run:
hhsearch -i query.a3m \
  -d pdb70 \
  -d scop70 \
  -d pfam \
  -o combined_results.hhr
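
When the database list varies between runs, the repeated -d flags can be generated rather than typed. The helper below is a hypothetical sketch (build_hhsearch_cmd is not part of HH-suite) that only assembles and prints the command line; it assumes file and database names contain no whitespace.

```shell
#!/bin/sh
# Build an hhsearch command line with one -d flag per database.
# Usage: build_hhsearch_cmd <query> <output.hhr> <db1> [<db2> ...]
# Names are assumed to contain no whitespace, since the result is a plain string.
build_hhsearch_cmd() {
  query=$1
  out=$2
  shift 2
  cmd="hhsearch -i $query"
  for db in "$@"; do
    cmd="$cmd -d $db"
  done
  printf '%s -o %s\n' "$cmd" "$out"
}

cmd=$(build_hhsearch_cmd query.a3m combined_results.hhr pdb70 scop70 pfam)
echo "$cmd"
```

The printed line matches the multi-database command above and can be reviewed before it is executed.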

Comparison with HHblits

Feature        HHsearch                 HHblits
Iterations     Single round             Multiple rounds (default 2)
Prefilter      No                       Yes (ungapped alignment)
Best for       Searching HMM databases  Building MSAs, sequence searches
Speed          Fast                     Fast with prefilter
Sensitivity    High                     Very high (iterative)
Database type  HMM databases            Sequence databases

Related Tools

  • hhblits - Iterative sequence searching and MSA building
  • hhmake - Build HMM from alignment
  • hhalign - Pairwise HMM-HMM alignment
  • hhfilter - Filter MSAs by sequence identity

References

Steinegger M, Meier M, Mirdita M, Vöhringer H, Haunsberger SJ, Söding J (2019). HH-suite3 for fast remote homology detection and deep protein annotation. BMC Bioinformatics, 20, 473. doi:10.1186/s12859-019-3019-7
