Skip to main content

Overview

HH-suite implements sophisticated algorithms for comparing Hidden Markov Models (HMMs) representing protein sequences. The core algorithms include:
  • Viterbi algorithm: Finds optimal alignment path
  • Forward-Backward algorithms: Calculate alignment probabilities
  • MAC (Maximum Accuracy): Posterior decoding for optimal expected accuracy

HMM-HMM Comparison

Profile HMM Structure

HH-suite uses profile HMMs with a linear topology:
Each column (1 to L) contains:
  • Match state (M): Emits aligned residue
  • Insert state (I): Emits extra query residues
  • Delete state (D): Skips template residues

Pair States

Aligning two HMMs creates pair states:
Pair StateQuery StateTemplate StateDescription
MMMatchMatchBoth emit residues
GDGapDeleteQuery gap, template emits
IMInsertMatchQuery insert, template emits
DGDeleteGapQuery emits, template gap
MIMatchInsertQuery emits, template insert
Transitions only occur between MM state and the four other states (no direct I↔D transitions).

Viterbi Algorithm

The Viterbi algorithm finds the single best alignment between two HMMs using dynamic programming.

Algorithm Description

Source: hhviterbialgorithm.cpp
1

Initialization

Initialize the top row (i=0) and leftmost column (j=0) of the dynamic programming matrix.
// Top row initialization
for (j=0; j <= t->L; ++j) {
    sMM[j] = -j * penalty_gap_template;
    sGD[j] = -FLT_MAX;
    sIM[j] = -FLT_MAX;
}
2

Recursion

For each cell (i,j), calculate scores for all pair states:Match-Match (MM):
sMM[i][j] = S(qi, tj) + max(
  sMM[i-1][j-1] + tr_MM_MM,
  sGD[i-1][j-1] + tr_GD_MM,
  sIM[i-1][j-1] + tr_IM_MM,
  sDG[i-1][j-1] + tr_DG_MM,
  sMI[i-1][j-1] + tr_MI_MM
)
Where S(qi, tj) is the emission score (scalar product of emission probabilities).
3

Traceback

Follow backtrace pointers from the maximum score to reconstruct the alignment path.

SIMD Vectorization

HH-suite processes 4-8 alignments in parallel using SIMD instructions:
hhviterbi.h
// Process 4 alignments simultaneously
const int VECSIZE_FLOAT = 4;

simd_float sMM_i_j;
simd_float score_vec = simdf32_set(-FLT_MAX);

Secondary Structure Scoring

SS scoring is integrated into the Viterbi alignment:
hhviterbi.h
static inline float ScoreSS(
    const HMM* q, const HMM* t, 
    const int i, const int j, 
    const float ssw, const int ssm,
    const float S73[NDSSP][NSSPRED][MAXCF],
    const float S37[NSSPRED][MAXCF][NDSSP],
    const float S33[NSSPRED][MAXCF][NSSPRED][MAXCF]
) {
    switch (ssm) {
        case HMM::NO_SS_INFORMATION:
            return 0.0;
        case HMM::PRED_DSSP:  // Query predicted, template DSSP
            return ssw * S37[q->ss_pred[i]][q->ss_conf[i]][t->ss_dssp[j]];
        case HMM::DSSP_PRED:  // Query DSSP, template predicted
            return ssw * S73[q->ss_dssp[i]][t->ss_pred[j]][t->ss_conf[j]];
        case HMM::PRED_PRED:  // Both predicted
            return ssw * S33[q->ss_pred[i]][q->ss_conf[i]]
                            [t->ss_pred[j]][t->ss_conf[j]];
    }
    return 0.0;
}
SS scoring modes:
  • PRED_DSSP: Query has predicted SS, template has DSSP
  • DSSP_PRED: Query has DSSP, template has predicted SS
  • PRED_PRED: Both have predicted SS

Cell-Off Optimization

The “cell-off” mechanism excludes regions from alignment:
hhviterbialgorithm.cpp
#ifdef VITERBI_CELLOFF
// Check if cell should be excluded
if (celloff_matrix.getCellOff(i, j, elem)) {
    sMM[i][j] = -FLT_MAX;  // Exclude this cell
}
#endif
Use cases:
  • Prevent self-alignments with |i-j| < threshold
  • Exclude previously found alignments
  • Enforce minimum overlap constraints

Forward-Backward Algorithms

Forward Algorithm

Calculates the probability of all alignment paths up to position (i,j). Source: hhforwardalgorithm.cpp
Forward recursion (simplified)
// F_MM[i][j] = probability of all paths ending at (i,j) in state MM
for (i = 2; i <= q.L; ++i) {
    for (j = 1; j <= t.L; ++j) {
        double pmatch = ProbFwd(q.p[i], t.p[j]) * Cshift
                       * fpow2(ScoreSS(&q, &t, i, j, ssw, ssm));
        
        F_MM[i][j] = pmatch * (
            F_MM[i-1][j-1] * q.tr[i-1][M2M] * t.tr[j-1][M2M] +
            F_GD[i-1][j-1] * t.tr[j-1][D2M] +
            F_IM[i-1][j-1] * q.tr[i-1][I2M] * t.tr[j-1][M2M] +
            F_DG[i-1][j-1] * q.tr[i-1][D2M] +
            F_MI[i-1][j-1] * t.tr[j-1][I2M]
        );
    }
}
Numerical Stability: Forward probabilities can underflow. HH-suite uses scaling to maintain precision:
scale[i] = 1.0 / max(F_MM[i][:]);
for (j in columns) {
    F_MM[i][j] *= scale[i];
}

Backward Algorithm

Calculates the probability of all alignment paths from position (i,j) to the end. Source: hhbackwardalgorithm.cpp
Backward recursion (simplified)
// B_MM[i][j] = probability of paths from (i,j) to end
for (i = q.L - 1; i >= 1; i--) {
    for (j = t.L - 1; j >= 1; j--) {
        double pmatch = B_MM[i+1][j+1]
                       * ProbFwd(q.p[i+1], t.p[j+1]) * Cshift
                       * fpow2(ScoreSS(&q, &t, i+1, j+1, ssw, ssm));
        
        B_MM[i][j] = (
            pmin +  // Local alignment: can end here
            pmatch * q.tr[i][M2M] * t.tr[j][M2M] +
            B_GD[i][j+1] * t.tr[j][M2D] +
            B_IM[i][j+1] * q.tr[i][M2I] * t.tr[j][M2M] +
            B_DG[i+1][j] * q.tr[i][M2D] +
            B_MI[i+1][j] * q.tr[i][M2M] * t.tr[j][M2I]
        );
    }
}

Posterior Probabilities

Combining Forward and Backward:
P_MM[i][j] = F_MM[i][j] * B_MM[i][j] / P_forward
Where P_forward is the total alignment probability.
P_MM[i][j] = probability that positions i and j are aligned in state MM across all possible alignments.

MAC Algorithm (Maximum Accuracy)

MAC finds the alignment with maximum expected accuracy using posterior probabilities. Source: hhmacalgorithm.cpp, hhposteriordecoder.h

Algorithm Overview

1

Compute Posteriors

Run Forward-Backward to get P_MM[i][j]
2

MAC Recursion

// S[i][j] = expected accuracy up to (i,j)
S[i][j] = max(
    S[i-1][j],              // Delete in query
    S[i][j-1],              // Insert in query
    S[i-1][j-1] + P_MM[i][j] - mact  // Match
);
The mact parameter controls greediness:
  • Higher mact: shorter, more confident alignments
  • Lower mact: longer, more greedy alignments
3

Traceback

Follow the maximum expected accuracy path

MAC vs Viterbi

Viterbi

  • Finds single best path
  • Maximizes joint probability
  • Faster computation
  • Can be overconfident

MAC

  • Maximizes expected accuracy
  • Considers all possible paths
  • More reliable boundaries
  • Requires Forward-Backward
Use -realign flag to apply MAC algorithm to Viterbi hits:
hhsearch -i query.a3m -d pdb70 -realign -mact 0.35 -o results.hhr

Alignment Path Representation

State Encoding

From hhdecl.h:
enum pair_states {
    STOP = 0,
    MM = 2,   // Match-Match
    GD = 3,   // Gap-Delete
    IM = 4,   // Insert-Match
    DG = 5,   // Delete-Gap
    MI = 6    // Match-Insert
};

Backtrace Structure

hhviterbi.h
struct BacktraceResult {
    int* i_steps;      // Query positions
    int* j_steps;      // Template positions
    char* states;      // Pair states (MM/GD/IM/DG/MI)
    int count;         // Number of steps
    int matched_cols;  // Number of MM states
};

Performance Optimizations

SIMD Vectorization

Key optimizations in hhviterbi.cpp:
hhviterbi.h
// Vectorized scoring for 20 amino acid types
simd_float ScalarProd20Vec(simd_float* qi, simd_float* tj) {
    simd_float res0 = simdf32_mul(tj[0], qi[0]);
    simd_float res1 = simdf32_mul(tj[1], qi[1]);
    // ... process all 20 types
    res0 = simdf32_add(res0, res1);
    return res0;
}
#if defined(SSE)
_mm_prefetch((char*)&qi[4], _MM_HINT_T0);
_mm_prefetch((char*)&tj[4], _MM_HINT_T0);
#endif
HH-suite compiles separate libraries for different feature combinations:
  • hhviterbialgorithm (base)
  • hhviterbialgorithm_with_celloff (with exclusion)
  • hhviterbialgorithm_with_celloff_and_ss (exclusion + SS scoring)
This avoids runtime branching in the inner loop.

Memory Layout

Optimized cache access:
hhviterbi.h
// Store all DP states in single array for cache locality
simd_float* sMM_DG_MI_GD_IM_vec;
// Layout: [sMM|sDG|sMI|sGD|sIM] for each column

Complexity Analysis

Time Complexity

AlgorithmComplexityNotes
ViterbiO(L_q × L_t)L_q = query length, L_t = template length
ForwardO(L_q × L_t)Same as Viterbi
BackwardO(L_q × L_t)Same as Viterbi
MACO(L_q × L_t)Requires F+B first
SIMD speedup: ~4-8× faster with vectorization

Space Complexity

  • Viterbi: O(L_t) per alignment (only stores one row)
  • Forward-Backward: O(L_q × L_t) (stores full matrix for posteriors)
  • Backtrace: O(L_q × L_t) (stores backtrace pointers)

Practical Implications

Speed vs Accuracy

  • Use Viterbi-only for fast searches
  • Add -realign for critical alignments
  • MAC improves boundary detection

Memory Constraints

  • Viterbi: minimal memory
  • MAC: needs ~8 bytes × L_q × L_t
  • Use -maxmem to limit realignment

References

HH-suite3 Paper

Steinegger et al., BMC Bioinformatics (2019)

Source Code

View implementation on GitHub

Next Steps

Parameters

Learn how to tune algorithm parameters

Compilation

Build with optimal SIMD flags

Build docs developers (and LLMs) love