Synopsis

hhfilter -i <infile> -o <outfile> [options]

Description

HHfilter filters a multiple sequence alignment by maximum pairwise sequence identity and, relative to the first (seed, or query) sequence, by minimum coverage, minimum sequence identity, or minimum score per column.

Required Parameters

-i
file
required
Read input file in A3M/A2M or FASTA format
-o
file
required
Write to output file in A3M format
-a
file
Append to output file in A3M format

Filter Options

-v
integer
default:"2"
Verbose mode:
  • 0: no screen output
  • 1: only warnings
  • 2: verbose
-id
integer
default:"90"
Maximum pairwise sequence identity (%) (range: 0-100)
-diff
integer
default:"0"
Filter the MSA by selecting the most diverse set of sequences, keeping at least this many sequences in each MSA block of length 50
-cov
integer
default:"0"
Minimum coverage with query (%) (range: 0-100)
-qid
integer
default:"0"
Minimum sequence identity with query (%) (range: 0-100)
-qsc
float
default:"-20.0"
Minimum score per column with query
-neff
float
Target diversity of alignment (default: off, range: 1-inf)
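
The percentage-valued options above (-id, -cov, -qid) are documented with a range of 0-100. A wrapper script might check a value before passing it on; the following is a minimal sketch of such a check. The name `is_percent` is a hypothetical helper for illustration and is not part of HH-suite.

```shell
# Hypothetical helper (not part of HH-suite): succeed only if $1 is an
# integer between 0 and 100, the range documented for -id, -cov and -qid.
is_percent() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -le 100 ]
}

# Possible use in a wrapper script:
#   is_percent "$max_id" || { echo "bad -id value: $max_id" >&2; exit 4; }
```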

Input Alignment Format

-M
string
default:"a2m"
Input alignment format:
  • a2m or a3m: A2M/A3M format, where upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts
  • first: FASTA format where columns with a residue in the first sequence are match states
  • an integer X in 0-100: FASTA format where columns with fewer than X% gaps are match states
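
To make the gap-percentage rule concrete, the sketch below labels each column of a small aligned FASTA block as a match state (M) or an insert (i), using the "fewer than X% gaps" criterion exactly as worded above. It is a toy illustration in awk, not HHfilter's implementation, which may differ (for example by weighting sequences). It assumes one sequence per line and rows of equal length; `match_columns` is a hypothetical name.

```shell
# match_columns X: read an aligned FASTA block from stdin and print one
# character per column: M if fewer than X% of the rows have a gap ('-')
# in that column, otherwise i. Toy sketch; assumes unwrapped sequences.
match_columns() {
  awk -v x="$1" '
    /^>/ { next }                        # skip FASTA header lines
    {
      n++                                # count sequence rows
      len = length($0)
      for (c = 1; c <= len; c++)
        if (substr($0, c, 1) == "-") gaps[c]++
    }
    END {
      out = ""
      for (c = 1; c <= len; c++)
        out = out ((100 * gaps[c] / n < x) ? "M" : "i")
      print out
    }'
}
```

With four rows `AC-E`, `A--E`, `ACDE`, `A--E` and X = 50, columns 1 and 4 have no gaps and are labeled M, while columns 2 and 3 have 50% and 75% gaps and are labeled i.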

Other Options

-maxseq
integer
default:"65535"
Maximum number of input rows (sequences)
-maxres
integer
default:"20000"
Maximum number of HMM columns

Examples

Filter by sequence identity

hhfilter -id 50 -i d1mvfd_.a2m -o d1mvfd_.fil.a2m

Filter by coverage and identity

hhfilter -id 90 -cov 75 -i alignment.a3m -o filtered.a3m
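
Batch filtering

When many alignments need the same thresholds, the command above can be wrapped in a loop. The following is a sketch only: `filter_all` is a hypothetical function name, the `.fil.a3m` output suffix simply follows the naming used in the first example, and `hhfilter` is assumed to be on the PATH.

```shell
# filter_all DIR: run hhfilter on every .a3m file in DIR and write
# NAME.fil.a3m next to each input. Stops at the first failing run and
# returns hhfilter's exit status.
filter_all() {
  dir=$1
  for in_file in "$dir"/*.a3m; do
    [ -e "$in_file" ] || continue                    # no .a3m files found
    case "$in_file" in *.fil.a3m) continue ;; esac   # skip earlier outputs
    out_file="${in_file%.a3m}.fil.a3m"
    hhfilter -id 90 -cov 75 -i "$in_file" -o "$out_file" || return $?
  done
}

# Possible use:
#   filter_all ./alignments
```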

Exit Codes

  • 0: Success
  • 1: File format error
  • 2: File access error
  • 3: Memory error
  • 4: Command line error
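
In scripts, the exit status can be turned into a readable message. The helper below is a small sketch that maps the codes listed above to text; `describe_exit` is a hypothetical name, and the message for any other value is an assumption, since no further codes are documented here.

```shell
# describe_exit CODE: print the meaning of an hhfilter exit status,
# following the list of exit codes above.
describe_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "file format error" ;;
    2) echo "file access error" ;;
    3) echo "memory error" ;;
    4) echo "command line error" ;;
    *) echo "unknown exit code $1" ;;   # not in the documented list
  esac
}

# Possible use:
#   hhfilter -i alignment.a3m -o filtered.a3m
#   describe_exit $?
```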
