Overview
HHconsensus reads an alignment in A2M, A3M, or FASTA format and generates a consensus sequence based on the most frequent amino acid at each position, weighted by sequence similarity. It can output the consensus alone or as part of the full alignment.
Key Features
- Consensus generation: Creates representative sequence from MSA
- Multiple output formats: FASTA, A2M, and A3M
- Filtering support: Apply filters before consensus calculation
- Flexible output: Consensus only or full alignment with consensus
When to Use HHconsensus
Use HHconsensus when you need to:
- Create representative sequences: Generate consensus for database entries
- Analyze conservation: Identify conserved positions in alignments
- Reduce MSA to single sequence: Simplify analysis or visualization
- Build consensus databases: Create searchable consensus sequence sets
- Quality control: Check alignment quality via consensus inspection
The consensus sequence is automatically calculated during HMM building in hhmake, but HHconsensus gives you direct access to it.
Basic Usage
Common Use Cases
Extract Consensus Sequence
Generate just the consensus sequence:
Generate Consensus with Filtering
Filter alignment before calculating consensus:
Create Consensus Database
Build a database of consensus sequences:
Output Alignment in Different Formats
Generate alignment with consensus in various formats:
Batch Processing
Process multiple alignments:
Key Parameters
Input/Output Options
- -i <file>: Input alignment (A2M, A3M, FASTA) or HMM
- -s <file>: Output consensus sequence in FASTA (default: <infile>.seq)
- -o <file>: Output alignment with consensus in A3M format
- -oa3m <file>: Same as -o (explicit A3M format)
- -oa2m <file>: Output alignment with consensus in A2M format
- -ofas <file>: Output alignment with consensus in FASTA format
- -v <int>: Verbose mode (0=silent, 1=warnings, 2=verbose)
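As a rough illustration of how these options combine, the sketch below assembles an hhconsensus argument list in Python. The helper names (build_hhconsensus_cmd, run_if_available) and the file names are hypothetical; only the flags listed above are used, and nothing is executed unless the binary is found on PATH.

```python
import shutil
import subprocess

# Alignment output formats mapped to the flags documented above.
ALIGNMENT_FLAGS = {"a3m": "-oa3m", "a2m": "-oa2m", "fasta": "-ofas"}


def build_hhconsensus_cmd(infile, seq_out=None, aln_out=None,
                          aln_format="a3m", verbose=1):
    """Assemble an hhconsensus argument list (hypothetical helper)."""
    if aln_format not in ALIGNMENT_FLAGS:
        raise ValueError(f"unknown alignment format: {aln_format}")
    cmd = ["hhconsensus", "-i", infile]
    if seq_out:
        # -s writes the consensus sequence alone, in FASTA.
        cmd += ["-s", seq_out]
    if aln_out:
        # -oa3m / -oa2m / -ofas write the alignment with the consensus.
        cmd += [ALIGNMENT_FLAGS[aln_format], aln_out]
    cmd += ["-v", str(verbose)]
    return cmd


def run_if_available(cmd):
    """Run the command only when hhconsensus is installed; return success."""
    if shutil.which(cmd[0]) is None:
        return False
    return subprocess.run(cmd).returncode == 0


print(" ".join(build_hhconsensus_cmd("query.a3m", seq_out="query.seq")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names.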
Filtering Options
- -id [0,100]: Maximum pairwise sequence identity % (default: 100)
- -diff [0,inf]: Filter for diversity, keeping at least this many sequences per 50-column block (default: 0)
- -cov [0,100]: Minimum coverage with query % (default: 0)
- -qid [0,100]: Minimum sequence identity with query % (default: 0)
- -qsc [-inf,100]: Minimum score per column with query (default: -20.0)
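To make the coverage and identity thresholds more concrete, here is a minimal Python sketch of how such values could be computed for one aligned sequence against the query. The definitions used (coverage as the share of query residue columns that the sequence also fills, identity as the share of those aligned columns with the same residue) are assumptions for illustration and may differ in detail from what HHconsensus computes internally; the function names are hypothetical.

```python
def coverage_and_identity(query, seq):
    """Return (coverage %, identity %) of seq relative to query.

    Both arguments are rows of the same alignment, with '-' for gaps.
    """
    if len(query) != len(seq):
        raise ValueError("rows must have the same length")
    # Columns where the query has a residue.
    query_cols = [i for i, q in enumerate(query) if q != "-"]
    # Of those, columns where the other sequence also has a residue.
    aligned = [i for i in query_cols if seq[i] != "-"]
    if not query_cols or not aligned:
        return 0.0, 0.0
    identical = sum(1 for i in aligned if query[i].upper() == seq[i].upper())
    coverage = 100.0 * len(aligned) / len(query_cols)
    identity = 100.0 * identical / len(aligned)
    return coverage, identity


def passes_filters(query, seq, cov=0.0, qid=0.0):
    """Mimic the spirit of -cov and -qid: keep seq if both minima are met."""
    coverage, identity = coverage_and_identity(query, seq)
    return coverage >= cov and identity >= qid


print(coverage_and_identity("ACDEFGHI", "ACDE--HK"))
```

With the defaults (both thresholds 0) every sequence passes, which matches the documented defaults of no filtering on coverage or query identity.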
Input Format
- -M a2m: A2M/A3M format (default): upper case = match state, lower case = insert
- -M first: FASTA: the first sequence defines the match states
- -M [0,100]: FASTA: columns with fewer than X% gaps are match states
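The two FASTA rules above can be sketched in a few lines of Python. This is an illustration of the match-state rules as described here, not HHconsensus's own code; the function names are hypothetical, and the A3M conversion follows the usual convention (match columns in upper case, insert residues in lower case, gaps in insert columns dropped).

```python
def match_columns(rows, mode="first"):
    """Return one boolean per alignment column; True marks a match state.

    rows: equal-length aligned sequences with '-' for gaps.
    mode: "first" (first sequence defines match states) or a gap
          percentage X (columns with fewer than X% gaps are match states).
    """
    ncols = len(rows[0])
    if any(len(row) != ncols for row in rows):
        raise ValueError("all rows must have the same length")
    if mode == "first":
        return [c != "-" for c in rows[0]]
    threshold = float(mode)
    flags = []
    for col in range(ncols):
        gaps = sum(1 for row in rows if row[col] == "-")
        flags.append(100.0 * gaps / len(rows) < threshold)
    return flags


def to_a3m_row(row, flags):
    """Render one aligned row in A3M style given the match-state flags."""
    out = []
    for c, is_match in zip(row, flags):
        if is_match:
            out.append(c.upper())   # residue or '-' (deletion) in a match column
        elif c != "-":
            out.append(c.lower())   # insert residue; insert gaps are omitted
    return "".join(out)


rows = ["AC-DE", "ACGDE", "A--DE", "ACG-E"]
flags = match_columns(rows, mode="first")
print([to_a3m_row(r, flags) for r in rows])
```

In this toy alignment the third column is an insert under the first-sequence rule, so the G residues there are written in lower case and the gaps in that column disappear.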
Advanced Options
- -maxseq <int>: Maximum number of input sequences (default: 65535)
- -maxres <int>: Maximum number of HMM columns (default: 20000)
How Consensus is Calculated
The consensus sequence is determined by:
- Sequence weighting: Similar sequences are down-weighted
- Position-specific frequencies: Count amino acids at each position
- Most frequent residue: Select the most common amino acid
- Tie breaking: Use a fixed priority order if frequencies are equal
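The steps above can be sketched in Python as follows. This is a simplified illustration, not HHconsensus's implementation: the weighting scheme (Henikoff-style position-based weights), the alphabetical tie-break order in PRIORITY, and the use of X for columns with no residues are all assumptions made for the example.

```python
import math
from collections import defaultdict

# Hypothetical tie-break order: alphabetical over the 20 standard amino acids.
PRIORITY = "ACDEFGHIKLMNPQRSTVWY"


def henikoff_weights(rows):
    """Position-based weights: sequences with common residues get less weight."""
    weights = [0.0] * len(rows)
    for col in range(len(rows[0])):
        counts = defaultdict(int)
        for row in rows:
            if row[col] != "-":
                counts[row[col]] += 1
        for i, row in enumerate(rows):
            if row[col] != "-":
                weights[i] += 1.0 / (len(counts) * counts[row[col]])
    total = sum(weights)
    if total == 0:
        return [1.0 / len(rows)] * len(rows)
    return [w / total for w in weights]


def priority_rank(aa):
    """Rank used for tie breaking; unknown symbols sort last."""
    idx = PRIORITY.find(aa)
    return idx if idx >= 0 else len(PRIORITY)


def consensus(rows, weights=None):
    """Weighted most-frequent residue per column of a gapped alignment."""
    if weights is None:
        weights = henikoff_weights(rows)
    out = []
    for col in range(len(rows[0])):
        freq = defaultdict(float)
        for row, w in zip(rows, weights):
            if row[col] != "-":
                freq[row[col]] += w
        if not freq:
            out.append("X")  # assumption: placeholder when no residue is seen
            continue
        best = max(freq.values())
        tied = [aa for aa, f in freq.items() if math.isclose(f, best)]
        out.append(min(tied, key=priority_rank))
    return "".join(out)


print(consensus(["ACD", "ACD", "GCE"]))
```

Passing explicit weights shows the effect of down-weighting: with rows ["AC", "GC", "GC"] and weights [0.6, 0.2, 0.2], the first column resolves to A even though G occurs in more sequences.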
Output Formats
Tips and Best Practices
Advanced Use Cases
Named Consensus Sequences
Specify a custom name for the consensus:
Consensus with Strict Quality Control
Generate consensus from only high-quality sequences:
Build NR-style Consensus Database
Create a non-redundant database from alignments:
Integration with Structure Prediction
Generate consensus for AlphaFold input:
Comparison with Master Sequence
The consensus differs from the master sequence:
| Aspect | Master Sequence | Consensus Sequence |
|---|---|---|
| Definition | First sequence in alignment | Most frequent residue per position |
| Coverage | May have gaps | Always complete (no gaps) |
| Represents | One real sequence | Weighted majority across all sequences |
| Use case | Reference sequence | Representative sequence |
Pipeline Integration
Workflow Example
Batch Consensus Generation
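A batch run could be scripted along the following lines. This is a sketch under stated assumptions: the directory layout, the *.a3m pattern, and the helper names plan_batch and run_batch are hypothetical, and only the -i, -s, and -v flags described earlier are used.

```python
import shutil
import subprocess
from pathlib import Path


def plan_batch(input_dir, output_dir, pattern="*.a3m"):
    """Return (command, output_path) pairs, one per alignment file."""
    out = Path(output_dir)
    jobs = []
    for aln in sorted(Path(input_dir).glob(pattern)):
        seq_path = out / (aln.stem + ".seq")
        cmd = ["hhconsensus", "-i", str(aln), "-s", str(seq_path), "-v", "1"]
        jobs.append((cmd, seq_path))
    return jobs


def run_batch(input_dir, output_dir):
    """Run every planned job; return the inputs that failed."""
    if shutil.which("hhconsensus") is None:
        raise RuntimeError("hhconsensus not found on PATH")
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    failed = []
    for cmd, seq_path in plan_batch(input_dir, output_dir):
        result = subprocess.run(cmd, capture_output=True, text=True)
        # Treat a non-zero exit code or a missing output file as a failure.
        if result.returncode != 0 or not seq_path.exists():
            failed.append(cmd[2])
    return failed
```

Separating planning from execution makes it easy to inspect or log the commands before anything runs, and to retry only the inputs returned by run_batch.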
Troubleshooting
Consensus has gaps
If the consensus sequence has gaps (dashes):
- This should not happen: the consensus always has a residue at each match state
- Check if input alignment is properly formatted
- Ensure you’re using A3M/A2M format correctly
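One quick way to confirm whether a consensus file actually contains gap characters is to parse it and list the offending positions. The sketch below assumes the consensus was written as plain FASTA; the function names and the sample record are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gap_positions(sequence):
    """Return 1-based positions of gap characters ('-' or '.')."""
    return [i + 1 for i, c in enumerate(sequence) if c in "-."]


sample = ">example_consensus\nACD-E\nFG\n"
for name, seq in read_fasta(sample):
    print(name, gap_positions(seq))
```

An empty list for the consensus record means no gaps were found; any reported positions point to where the input alignment deserves a closer look.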
Consensus doesn't match expectations
If the consensus seems wrong:
- View alignment to check sequence quality
- Apply filters to remove outliers
- Check if similar sequences are dominating (use the -id filter)
- Remember: consensus is frequency-based, not conservation-based
Empty output file
If no consensus is generated:
- Check input file format
- Verify alignment has at least one sequence
- Look for error messages with -v 2
- Ensure output path is writable
Related Tools
- hhmake - Build HMMs (includes consensus in HMM file)
- hhfilter - Filter alignments before consensus
- hhblits - Build MSAs for consensus generation