Precision configuration is critical for balancing resource utilization and model accuracy in FPGA implementations. hls4ml provides both automatic and manual precision tuning capabilities.

Precision Types

hls4ml supports multiple precision types for fixed-point arithmetic:

FixedPrecisionType

Standard fixed-point representation:
from hls4ml.model.types import FixedPrecisionType

# ap_fixed<16,6> - 16 total bits, 6 integer bits (including sign)
precision = FixedPrecisionType(width=16, integer=6, signed=True)

# Equivalent to: 10 fractional bits, 1 sign bit, 5 integer bits
# Range: [-32, 31.999...]
# Resolution: 2^-10 ≈ 0.00098
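
For intuition, the quantization that ap_fixed<16,6> implies can be emulated with plain numpy. This is an illustrative sketch only; the helper to_fixed is not part of hls4ml:
import numpy as np

def to_fixed(x, width=16, integer=6):
    """Emulate ap_fixed<width,integer> with the default TRN/WRAP behavior."""
    frac = width - integer
    scaled = np.floor(x * 2**frac)  # TRN: truncate toward negative infinity
    wrapped = np.mod(scaled + 2**(width - 1), 2**width) - 2**(width - 1)  # WRAP on overflow
    return wrapped / 2**frac

print(to_fixed(3.14159))  # 3.140625 (resolution 2^-10)
print(to_fixed(40.0))     # -24.0: out-of-range values wrap around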

IntegerPrecisionType

Integer-only (no fractional bits):
from hls4ml.model.types import IntegerPrecisionType

# ap_int<8> or ap_uint<8>
signed_int = IntegerPrecisionType(width=8, signed=True)    # -128 to 127
unsigned_int = IntegerPrecisionType(width=8, signed=False)  # 0 to 255

ExponentPrecisionType

Power-of-2 representation (for po2 quantization):
from hls4ml.model.types import ExponentPrecisionType

# Values are 2^n
po2_precision = ExponentPrecisionType(width=8, signed=True)

XnorPrecisionType

Binary representation for XNOR operations:
from hls4ml.model.types import XnorPrecisionType

# Single-bit binary {0, 1} for XNOR networks
xnor_precision = XnorPrecisionType()

Automatic Precision Inference

The InferPrecisionTypes optimizer pass automatically calculates appropriate precision:
import hls4ml

config = hls4ml.utils.config_from_keras_model(
    model,
    granularity='name',
    default_precision='ap_fixed<16,6>'
)

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='auto_precision'
)

# Precision is automatically inferred for intermediate layers
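
After conversion you can inspect what was inferred. A minimal sketch, assuming the converted ModelGraph API (get_layers() and typed output variables):
# Print the inferred output precision of every layer
for layer in hls_model.get_layers():
    print(f'{layer.name:20s} -> {layer.get_output_variable().type.precision}')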

How Inference Works

1. Analyze input precision: start with the user-defined or quantizer-specified input precision.

2. Propagate through operations: calculate the required precision for each operation based on:
  • Input bit-widths
  • Weight bit-widths
  • Operation type (multiplication, addition, etc.)
  • Number of accumulations

3. Apply maximum precision limits: if a maximum precision is specified in the config, constrain the inferred precision to it.

4. Avoid overflow and underflow: ensure sufficient integer bits to prevent overflow and enough fractional bits to maintain resolution.

Precision Inference Example

For a Dense layer with:
  • Input: ap_fixed<8,3> (8 bits, 3 integer)
  • Weights: ap_fixed<8,3>
  • Bias: ap_fixed<8,3>
  • 128 inputs (n_in = 128)
Inferred accumulator precision:
# Width   = input_width + weight_width + ceil(log2(n_in))
#         = 8 + 8 + ceil(log2(128)) = 8 + 8 + 7 = 23
#
# Integer = input_int + weight_int + ceil(log2(n_in))
#         = 3 + 3 + 7 = 13
#
# Result: ap_fixed<23,13> for the accumulator
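
The same rule can be expressed as a small helper. This is an illustrative sketch of the arithmetic above, not an hls4ml API; the function name accumulator_precision is hypothetical:
from math import ceil, log2

def accumulator_precision(in_width, in_int, w_width, w_int, n_in):
    """Worst-case accumulator precision for a dot product over n_in inputs."""
    growth = ceil(log2(n_in))  # extra bits needed for n_in accumulations
    return f'ap_fixed<{in_width + w_width + growth},{in_int + w_int + growth}>'

print(accumulator_precision(8, 3, 8, 3, 128))  # ap_fixed<23,13>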

Maximum Precision Configuration

Limit inferred precision to control resource usage:
config = hls4ml.utils.config_from_keras_model(
    model,
    default_precision='ap_fixed<16,6>'
)

# Set maximum precision
config['Model']['Precision'] = {
    'default': 'ap_fixed<16,6>',
    'maximum': 'ap_fixed<32,16>'  # Cap all inferred precision
}

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='max_precision'
)
Maximum precision limiting can cause overflow if set too aggressively. Always verify with C simulation.
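
One way to verify is to compile the converted model and compare its C simulation output against the floating-point model on representative data:
import numpy as np

hls_model.compile()  # build the C simulation library
y_keras = model.predict(X_test)
y_hls = hls_model.predict(np.ascontiguousarray(X_test))

# Large discrepancies usually point to overflow from an over-tight precision cap
print('max abs difference:', np.max(np.abs(y_keras - y_hls.reshape(y_keras.shape))))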

Manual Precision Configuration

Override automatic inference with explicit precision:

Layer-Level Precision

config = hls4ml.utils.config_from_keras_model(
    model,
    granularity='name'
)

# Set precision for specific layer
config['LayerName']['fc1'] = {
    'Precision': {
        'weight': 'ap_fixed<8,3>',
        'bias': 'ap_fixed<8,3>',
        'result': 'ap_fixed<16,6>',
        'accum': 'ap_fixed<24,12>'
    }
}

hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    output_dir='manual_precision'
)

Type-Level Precision

# Set precision by layer type
config['LayerType']['Dense'] = {
    'Precision': {
        'weight': 'ap_fixed<6,2>',
        'bias': 'ap_fixed<6,2>',
        'result': 'ap_fixed<12,4>'
    }
}

Rounding and Saturation Modes

Control how values are rounded and saturated:
from hls4ml.model.types import FixedPrecisionType, RoundingMode, SaturationMode

precision = FixedPrecisionType(
    width=16,
    integer=6,
    signed=True,
    rounding_mode=RoundingMode.TRN,        # Truncate (default)
    saturation_mode=SaturationMode.WRAP,   # Wrap around (default)
    saturation_bits=0
)

Rounding Modes

rounding_mode=RoundingMode.TRN
Fastest. Simply drops fractional bits. Can introduce negative bias.

Saturation Modes

saturation_mode=SaturationMode.WRAP
Wrap around on overflow (fastest, default). Can cause severe errors when values exceed the representable range.

saturation_mode=SaturationMode.SAT
Clamp to the largest or smallest representable value on overflow. Costs extra logic, but prevents the catastrophic errors of wrapping.
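
Rounding and saturation can also be selected directly in the precision strings used in the config, using the HLS mode names, for example:
# Round to nearest (convergent) and saturate on overflow for this layer's output
config['LayerName']['fc1']['Precision']['result'] = 'ap_fixed<16,6,AP_RND_CONV,AP_SAT>'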

Bit-Exact Precision Inference

For properly quantized models (QKeras, HGQ), use bit-exact inference:
config = hls4ml.utils.config_from_keras_model(
    qkeras_model,
    granularity='name'
)

# Enable bit-exact inference (automatic for QKeras)
config['Model']['Precision'] = {
    'bit_exact': True
}

hls_model = hls4ml.converters.convert_from_keras_model(
    qkeras_model,
    hls_config=config,
    output_dir='bit_exact'
)
Bit-exact inference is automatically enabled for QKeras and HGQ models. It ignores user-defined precision and trusts the quantizers.

Requirements for Bit-Exact

  • Quantizers between all layers with non-trivial operations
  • Input quantization explicitly defined (QActivation as first layer)
  • All operations supported by bit-exact pass
Bit-exact inference will crash if it encounters unsupported operations or missing quantizers. Use automatic inference instead for unquantized models.
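
As an illustration, a QKeras model that meets these requirements could look like the following; the layer sizes and quantizer settings are arbitrary examples:
from tensorflow.keras.models import Sequential
from qkeras import QActivation, QDense, quantized_bits, quantized_relu

qkeras_model = Sequential([
    # Explicit input quantization as the first layer
    QActivation(quantized_bits(8, 3), input_shape=(16,)),
    QDense(32, kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
    # Quantizer between layers
    QActivation(quantized_relu(6)),
    QDense(5, kernel_quantizer=quantized_bits(6, 0, alpha=1),
           bias_quantizer=quantized_bits(6, 0, alpha=1)),
])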

Precision Profiling

Use profiling to guide precision choices:
from hls4ml.model.profiling import numerical
import matplotlib.pyplot as plt

# Profile with test data
wp, wph, ap, aph = numerical(model=model, hls_model=hls_model, X=X_test)

# Grey boxes show current precision ranges
plt.show()
Interpret profiling results:
  • Box-and-whisker shows weight value distribution
  • Grey box shows representable range with current precision
  • If whiskers extend beyond grey box: increase precision
  • If grey box much larger than whiskers: can reduce precision
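
If you prefer numbers to plots, the integer bits a layer's activations actually need can be estimated directly from the Keras model. The helper needed_integer_bits below is hypothetical, not an hls4ml function:
import numpy as np
import tensorflow as tf

def needed_integer_bits(model, X, layer_name):
    """Integer bits (including sign) needed to cover a layer's output range."""
    sub = tf.keras.Model(model.input, model.get_layer(layer_name).output)
    max_val = np.max(np.abs(sub.predict(X)))
    return int(np.ceil(np.log2(max_val + 1))) + 1

print(needed_integer_bits(model, X_test, 'fc1'))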

Advanced Precision Techniques

Heterogeneous Precision

Use different precision for different parts of the network:
config = hls4ml.utils.config_from_keras_model(model, granularity='name')

# Early layers: higher precision (processing raw inputs)
config['LayerName']['conv1']['Precision'] = {
    'result': 'ap_fixed<16,6>'
}

# Middle layers: medium precision
config['LayerName']['conv2']['Precision'] = {
    'result': 'ap_fixed<12,4>'
}

# Late layers: lower precision (features already extracted)
config['LayerName']['fc1']['Precision'] = {
    'result': 'ap_fixed<8,3>'
}

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='heterogeneous'
)

Accumulator Precision Tuning

Carefully control accumulator precision to balance accuracy and resources:
# Dense layer with 512 inputs
config['LayerName']['fc_large'] = {
    'Precision': {
        'weight': 'ap_fixed<8,3>',
        'bias': 'ap_fixed<8,3>',
        # Large accumulator for many additions
        'accum': 'ap_fixed<32,16>',
        # Smaller result after activation
        'result': 'ap_fixed<16,6>'
    }
}
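
Using the worst-case rule from the inference section, and assuming the inputs to this layer are also ap_fixed<8,3>, the accumulator above has comfortable headroom:
from math import ceil, log2

growth = ceil(log2(512))               # 9 extra bits for 512 accumulations
print(8 + 8 + growth, 3 + 3 + growth)  # 25 total bits, 15 integer bits
# ap_fixed<32,16> therefore covers the worst case with margin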

Dynamic Precision Selection

Automatically tune precision based on profiling:
import numpy as np

def calculate_required_precision(data):
    """Calculate a minimum fixed-point precision from a data distribution"""
    max_val = np.max(np.abs(data))
    integer_bits = int(np.ceil(np.log2(max_val + 1))) + 1  # +1 for the sign bit

    # Target resolution: 2^-8 (8 fractional bits)
    fractional_bits = int(np.ceil(np.log2(256)))

    total_bits = integer_bits + fractional_bits
    return f'ap_fixed<{total_bits},{integer_bits}>'

# Analyze each converted layer's weights and update its config entry
# (assumes granularity='name', so every layer appears under 'LayerName')
for layer in hls_model.get_layers():
    weights = [w.data.flatten() for w in layer.get_weights()]
    if not weights or layer.name not in config['LayerName']:
        continue
    required = calculate_required_precision(np.concatenate(weights))
    config['LayerName'][layer.name]['Precision']['weight'] = required

Precision for Different Backends

Vivado/Vitis

# Standard ap_fixed notation
config['Model']['Precision'] = {
    'default': 'ap_fixed<16,6>',
    'maximum': 'ap_fixed<32,16>'
}

Quartus

# Uses ac_fixed (same semantics)
config['Model']['Precision'] = {
    'default': 'ac_fixed<16,6,true>',  # width, integer, signed
}

Catapult

# ac_fixed format
config['Model']['Precision'] = {
    'default': 'ac_fixed<16,6,true>'
}
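
The backend itself is selected at conversion time, and the precision strings in the config should match that backend's type system, for example:
hls_model = hls4ml.converters.convert_from_keras_model(
    model,
    hls_config=config,
    backend='Quartus',  # or 'Vivado', 'Vitis', 'Catapult'
    output_dir='quartus_precision'
)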

Best Practices

  • Always profile your model with representative test data before manually tuning precision. This provides data-driven guidance.
  • Let hls4ml infer precision automatically, then selectively override problem layers identified through C simulation.
  • Activation precision is more critical than weight precision since errors propagate. Ensure activations don't overflow.
  • After changing precision, always run C simulation with diverse test data to catch overflow/underflow issues.
  • Layers with many accumulations (large Dense, Conv) need higher accumulator precision to avoid overflow.
  • For layers where overflow could be catastrophic, use SAT mode despite the resource cost.
  • Keep notes on why specific precision was chosen for each layer. This helps future debugging and tuning.

Troubleshooting

Accuracy loss after conversion
  • Check for overflow: increase integer bits
  • Check for underflow: increase fractional bits
  • Profile activations to see actual value ranges (see the tracing sketch after this list)
  • Try increasing precision incrementally

Resource usage too high
  • Review profiling: are you using more precision than needed?
  • Set maximum precision to cap inferred types
  • Use heterogeneous precision: reduce precision in less critical layers
  • Consider lower precision for middle layers

Overflow in specific layers
  • Increase integer bits in affected layers
  • Use SAT saturation mode instead of WRAP
  • Review accumulator precision for layers with many operations

Bit-exact inference fails
  • Ensure quantizers between all layers
  • Add QActivation as first layer for input quantization
  • Check that all operations are supported
  • Fall back to automatic inference if needed
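
When a particular layer is suspected, per-layer tracing can help locate where values diverge. A sketch using hls4ml's trace support, comparing the traced C simulation outputs against the Keras activations:
import numpy as np
from hls4ml.model.profiling import get_ymodel_keras

# Enable tracing for every named layer before conversion
for name in config['LayerName']:
    config['LayerName'][name]['Trace'] = True

hls_model = hls4ml.converters.convert_from_keras_model(
    model, hls_config=config, output_dir='trace_debug'
)
hls_model.compile()

hls_pred, hls_trace = hls_model.trace(np.ascontiguousarray(X_test))
keras_trace = get_ymodel_keras(model, X_test)

for name in keras_trace:
    if name in hls_trace:
        diff = np.max(np.abs(keras_trace[name] - hls_trace[name]))
        print(f'{name:20s} max abs difference: {diff:.5f}')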

API Reference

FixedPrecisionType

hls4ml.model.types.FixedPrecisionType(
    width,          # Total bits
    integer,        # Integer bits (including sign)
    signed=True,
    rounding_mode=RoundingMode.TRN,
    saturation_mode=SaturationMode.WRAP,
    saturation_bits=0
)

IntegerPrecisionType

hls4ml.model.types.IntegerPrecisionType(
    width,          # Total bits
    signed=True
)

InferPrecisionTypes Pass

from hls4ml.model.optimizer.passes.infer_precision import InferPrecisionTypes

# Automatically applied during conversion
# Can be configured:
pass_config = {
    'infer_no_bias': False  # set True to assume zero bias for tighter bounds
}
