CodeFusion Studio provides two complementary profiling approaches for embedded AI models. Static resource profiling runs before any deployment and estimates memory requirements, inference latency, compute cycles, and per-layer bottlenecks directly from the model file and target hardware profile, with no physical board needed. Runtime hardware profiling captures live trace data from a running application on the actual device, delivering operator-level inference timing alongside system-level metrics such as CPU load and memory usage. Together, these tools show how your model will behave on the target hardware before and after deployment.
The static resource profiler currently supports TFLM models only. CNN accelerator models on the MAX78002 that use the PyTorch `izer` backend are not supported by the profiler or compatibility analyzer. The Open Profiling Report option in System Planner is disabled for `izer` models.
Static resource profiling
Static profiling analyzes a model file against a target hardware profile without requiring a physical board. It produces a five-section report covering model metadata, memory requirements, hardware performance estimates, per-layer breakdown, and optimization recommendations.
Run from System Planner
Add or select a TFLM model
If no model is configured, click Add Model and complete the model configuration. See Embedded AI Tools for field descriptions.
Open the Profiling Report
Click Open Profiling Report to generate and view the resource profiling report. The report opens as an interactive graphical view with filtering, recommendations panels, and an expandable layer table.
Run from the CLI
Use `cfsutil ai profile` to generate a resource profiling report from the terminal. Provide `--model`, `--soc`, and `--core` at minimum.
Flags
| Flag | Short | Description |
|---|---|---|
| `--model` | `-m` | Path or URL to the model file. |
| `--soc` | `-s` | Target SoC (e.g. MAX32690, MAX32657, ADSP-SC835). |
| `--core` | `-c` | Target core (e.g. CM4, CM33, FX). |
| `--acc` | `-a` | Target accelerator (optional). |
| `--package` | `-p` | SoC package variant (optional). |
| `--format` | | Console output format: text (default) or json. |
| `--report-file` | | Path to write the output report file. |
| `--report-format` | | Report file format: json (default) or text. |
| `--search-path` | `-x` | Additional search path for data models. Can be repeated. |
| `--ignore-cache` | | Bypass cache and fetch the latest remote files. |
Example commands
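As one way to script these invocations, the sketch below assembles a `cfsutil ai profile` command line in Python using only flags from the table above. The helper name `build_profile_command`, the model path `models/example.tflite`, and the report file name `profile.json` are hypothetical placeholders. The SoC and core values are taken from the examples in the flag table. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_profile_command(model, soc, core, console_format=None,
                          report_file=None, report_format=None):
    """Build an argument list for `cfsutil ai profile` (illustrative helper)."""
    # --model, --soc, and --core are the minimum required flags.
    cmd = ["cfsutil", "ai", "profile",
           "--model", model, "--soc", soc, "--core", core]
    if console_format:
        cmd += ["--format", console_format]        # text (default) or json
    if report_file:
        cmd += ["--report-file", report_file]
    if report_format:
        cmd += ["--report-format", report_format]  # json (default) or text
    return cmd


# Hypothetical model path and report file name.
cmd = build_profile_command("models/example.tflite", "MAX32690", "CM4",
                            report_file="profile.json", report_format="json")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `cfsutil` is on the `PATH`.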
Understanding the profiling report
The resource profiling report is organized into five sections that progressively drill down from high-level summaries to detailed layer-by-layer analysis.
1. Model summary
Provides a high-level overview of the analyzed model file before detailed profiling begins.
| Field | Description |
|---|---|
| Model Name | Filename of the model under test. |
| Model Path | Path to the model file on disk. |
| Framework | ML framework type. Currently only TensorFlow Lite is supported. |
| Model Size | Memory required to store the model (in KB). |
| Data Type | Numerical precision defined in the model file (e.g. float32, int8). |
| Layer Count | Total number of layers parsed from the model. |
2. Memory analysis
Shows peak runtime RAM requirements and compares them to the available RAM on the target hardware.
| Field | Description |
|---|---|
| Peak RAM Required | Maximum RAM usage during model execution. |
| RAM Status | Whether peak RAM usage fits the target hardware constraints (OK or overflow). |
| Available RAM | Total RAM available on the target hardware, from the hardware profile. |
| RAM Utilization | Percentage of available RAM consumed. Formula: (Peak RAM ÷ Available RAM) × 100. |
If the profiler detects over-usage, a Memory Issues subsection lists the specific problems (e.g. Peak RAM usage (5880.0 KB) exceeds available RAM (1024.0 KB)). A Memory Recommendations subsection follows with context-specific mitigations.
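The RAM Utilization formula can be checked with a short calculation. The sketch below is a minimal illustration rather than the profiler's own code: the function names are hypothetical, and treating "peak equal to available" as OK is an assumption. It uses the figures from the over-usage example, 5880.0 KB peak against 1024.0 KB available.

```python
def ram_utilization(peak_ram_kb: float, available_ram_kb: float) -> float:
    """Return (Peak RAM / Available RAM) * 100 as a percentage."""
    if available_ram_kb <= 0:
        raise ValueError("available RAM must be positive")
    return peak_ram_kb / available_ram_kb * 100.0


def ram_status(peak_ram_kb: float, available_ram_kb: float) -> str:
    """Classify the result; counting an exact fit as OK is an assumption."""
    return "OK" if peak_ram_kb <= available_ram_kb else "overflow"


# Figures from the over-usage message quoted in this section.
utilization = ram_utilization(5880.0, 1024.0)
status = ram_status(5880.0, 1024.0)
print(f"RAM utilization: {utilization:.1f}% ({status})")
```

With these inputs the model needs roughly 5.7 times the available RAM, so the status is overflow.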
3. Hardware performance
Aggregates per-layer metrics into an overall performance estimate for the target hardware.
| Field | Description |
|---|---|
| Total Cycles | Total compute cycles for full model execution. |
| Estimated Latency | End-to-end inference time in milliseconds, derived from cycles and max CPU clock frequency. |
| Peak Memory | Maximum RAM required during inference. |
| Accelerated Layers | Number of layers executed using hardware accelerators (DSP, NPU). |
| CPU-Only Layers | Number of layers that must run on the CPU without hardware acceleration. |
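Estimated Latency is described as derived from cycles and the maximum CPU clock frequency. The sketch below shows the simplest form of that relationship, assuming latency is total cycles divided by clock frequency with no other correction factors. The profiler may apply additional modeling. The cycle count and the 120 MHz clock are hypothetical example values, not figures from a real report.

```python
def estimated_latency_ms(total_cycles: int, cpu_clock_hz: float) -> float:
    """Convert a cycle count to milliseconds at a given clock frequency."""
    if cpu_clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    # cycles / (cycles per second) gives seconds; multiply by 1000 for ms.
    return total_cycles * 1000.0 / cpu_clock_hz


# Hypothetical inputs: 12 million cycles on a core clocked at 120 MHz.
latency = estimated_latency_ms(12_000_000, 120_000_000)
print(f"Estimated latency: {latency:.2f} ms")
```

Under this simplification, halving the cycle count or doubling the clock frequency halves the estimated latency.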
4. Per-layer performance table
A detailed breakdown of compute and memory requirements for each layer. Use this table to identify specific performance bottlenecks.
In the System Planner graphical report, each row can be expanded by clicking the chevron icon, and the table supports SQL-like queries.
| Field | Description |
|---|---|
| Layer | Index of the layer in the model (e.g. 0, 1, 2). |
| Operator | Operator type used in this layer (e.g. CONV_2D, ADD, FULLY_CONNECTED). |
| Cycles | Total compute cycles to execute the layer. |
| Latency | Estimated runtime of the layer in milliseconds. |
| Energy | Estimated energy consumption in microjoules (µJ). |
| MACs | Number of multiply–accumulate operations. |
| Memory | Memory footprint of the layer in KB. |
| Accel | Whether the layer runs on hardware acceleration (Yes) or CPU only (No). |
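To illustrate how the per-layer fields can point to bottlenecks, the sketch below ranks layers by cycle count and lists the layers that run on the CPU only. The rows are invented sample data and the dictionary keys are hypothetical. This does not reproduce the report's JSON schema or the System Planner query syntax.

```python
# Invented sample rows; keys loosely mirror the table's column names.
layers = [
    {"layer": 0, "operator": "CONV_2D", "cycles": 900_000, "accel": True},
    {"layer": 1, "operator": "ADD", "cycles": 50_000, "accel": False},
    {"layer": 2, "operator": "FULLY_CONNECTED", "cycles": 400_000, "accel": False},
    {"layer": 3, "operator": "CONV_2D", "cycles": 650_000, "accel": True},
]


def top_layers_by_cycles(rows, n=2):
    """Return the n rows with the highest cycle counts, largest first."""
    return sorted(rows, key=lambda row: row["cycles"], reverse=True)[:n]


def cpu_only_layers(rows):
    """Return rows that do not run on a hardware accelerator."""
    return [row for row in rows if not row["accel"]]


total_cycles = sum(row["cycles"] for row in layers)
top = top_layers_by_cycles(layers)
for row in top:
    share = row["cycles"] / total_cycles * 100
    print(f"layer {row['layer']} ({row['operator']}): {share:.1f}% of cycles")
print("CPU-only layers:", [row["layer"] for row in cpu_only_layers(layers)])
```

Layers that dominate the cycle total and also run CPU-only are usually the first candidates to examine.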
5. Optimization opportunities
Reports baseline totals and highlights the specific layers most likely to benefit from optimization, grouped into memory (parameter size) and compute (MACs) categories.
Summary baseline metrics:
| Field | Description |
|---|---|
| Total Parameter Memory | Size of all model weights in KB. |
| Total MACs | Total multiply–accumulate operations for the model. |
Layerwise memory opportunities flag layers with high parameter memory and suggest strategies such as depthwise separable convolution or low-rank factorization.
Layerwise MAC opportunities flag layers with high MACs and suggest strategies such as replacing with depthwise convolution or using sparse matrices.
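The idea of flagging layers against baseline totals can be sketched as follows. This is an assumption-laden illustration, not the profiler's actual criteria: the 50% share threshold, the function name, the dictionary keys, and the sample data are all hypothetical.

```python
# Invented sample data: per-layer MACs and parameter memory in KB.
layers = [
    {"layer": 0, "operator": "CONV_2D", "macs": 6_000_000, "param_kb": 20.0},
    {"layer": 1, "operator": "CONV_2D", "macs": 3_000_000, "param_kb": 40.0},
    {"layer": 2, "operator": "FULLY_CONNECTED", "macs": 1_000_000, "param_kb": 140.0},
]


def flag_heavy_layers(rows, field, threshold=0.5):
    """Return layer indices whose share of the total for `field`
    meets or exceeds `threshold` (a hypothetical cutoff)."""
    total = sum(row[field] for row in rows)
    if total == 0:
        return []
    return [row["layer"] for row in rows if row[field] / total >= threshold]


total_macs = sum(row["macs"] for row in layers)
total_param_kb = sum(row["param_kb"] for row in layers)
mac_heavy = flag_heavy_layers(layers, "macs")
memory_heavy = flag_heavy_layers(layers, "param_kb")
print(f"Total MACs: {total_macs}, flagged layers: {mac_heavy}")
print(f"Total parameter memory: {total_param_kb} KB, flagged layers: {memory_heavy}")
```

In this sample the first convolution dominates compute while the fully connected layer dominates parameter memory, which is why the two categories are reported separately.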
Runtime hardware profiling
Runtime profiling captures live trace data from a model running on actual hardware. It is powered by the Zephelin middleware layer developed by Antmicro for Analog Devices. Zephelin extends Zephyr RTOS with advanced tracing capabilities, including operator-level AI inference analysis, task switch events, interrupts, and timing information.
The Zephelin profiling feature is currently in beta. Some options or trace formats may change in future releases.
Configure profiling options in System Planner
Open the Profiling tab
In System Planner, click the Profiling tab to see a list of cores in your workspace.
Configure profiling options
Expand the core entry and configure the available options:
- Application Callgraph — Enables instrumentation for capturing function call graphs and application-level tracing.
- AI Model Profiling — Enables TFLM inference tracing. Only available when AI models are configured in Embedded AI Tools and the target processor supports it.
- CPU Load — Samples CPU usage at a configurable interval (in milliseconds).
- Memory Usage — Samples memory consumption at a configurable interval (in milliseconds).
Configure interface options
Set the trace transport interface:
- Trace Interface Type: Select UART (USB is planned for a future release).
- Trace Interface: Select the UART port number for trace output (e.g. `0` for UART0, `2` for UART2). Available options depend on UART peripherals allocated in Peripheral Allocation.
- Baud Rate: Displays the configured baud rate for the trace interface (typically 115200).
Capture a profiling trace
Open the Trace Capture panel
On the CFS Home Page, expand the TRACE CAPTURE section. If you see a Setup required or Source not available message, click Configure capture to open the Trace Configuration view.
Configure the trace source
In the Trace Configuration view, set:
- Trace Interface Type: `UART`.
- Serial Port: The serial port connected to your board’s UART trace output (e.g. `/dev/ttyUSB0` on Linux, `/dev/tty.usbserial-*` on macOS, `COM3` on Windows).
- Baud Rate: Must match the firmware configuration (default: 115200).
- Output Directory: Where trace files are saved (default: `<core>/tracefiles`).
- ELF File (optional): Path to the application ELF binary, used to symbolize trace data (default: `<core>/build/zephyr/zephyr.elf`). Without this, function names may not appear in the trace viewer.
- Build Directory (optional): Path to the build output directory for debug symbol lookup.
Start capture
Return to the TRACE CAPTURE panel and click Start Capture.
Close any serial monitor or terminal before starting capture. The trace capture requires exclusive access to the UART port. If another process is using the same port, the capture will fail.
Reset the board
Press the Reset button on your development board to restart the application and begin trace data transmission. The `.ctf` file will appear in the output directory once the device begins transmitting. For applications that run once and exit (such as the AI profiling examples), you may need to press Reset multiple times to capture additional trace data.
Stop capture
Click Stop Capture when you have collected sufficient data. The capture generates timestamped trace files using the naming pattern `tracefile_YYYYMMDD_HHMMSS`. Each board reset during a single capture session produces a separate pair of files:
- `.ctf` file: Binary trace in Common Trace Format, optimized for efficient UART transmission.
- `.tef` file: JSON-based Trace Event Format file for visualization in the Zephelin Trace Viewer.
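Because each reset produces its own timestamped pair, it can help to group the files in the output directory by capture time. The sketch below is a minimal example that assumes the extension is appended directly to the pattern (for instance `tracefile_20250101_120000.ctf`). That exact file name layout, the helper name, and the sample names are assumptions for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed layout: tracefile_YYYYMMDD_HHMMSS followed by .ctf or .tef.
TRACE_NAME = re.compile(r"^(tracefile_(\d{8}_\d{6}))\.(ctf|tef)$")


def group_trace_files(filenames):
    """Group trace file names by timestamped stem, oldest capture first.

    Returns a list of (timestamp, stem, sorted extensions) tuples and
    ignores names that do not match the assumed pattern.
    """
    groups = defaultdict(set)
    for name in filenames:
        match = TRACE_NAME.match(name)
        if match:
            groups[match.group(1)].add(match.group(3))
    result = []
    for stem in sorted(groups):
        stamp = datetime.strptime(stem[len("tracefile_"):], "%Y%m%d_%H%M%S")
        result.append((stamp, stem, sorted(groups[stem])))
    return result


# Invented sample names; the second capture is missing its .ctf file.
names = [
    "tracefile_20250102_093000.tef",
    "tracefile_20250101_120000.ctf",
    "tracefile_20250101_120000.tef",
    "notes.txt",
]
sessions = group_trace_files(names)
for stamp, stem, extensions in sessions:
    print(stamp.isoformat(), stem, extensions)
```

For a real directory, the names could come from `(p.name for p in pathlib.Path(output_dir).iterdir())`, where `output_dir` is the configured Output Directory.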
AI Hardware Profiling view
For workspaces created from an AI model, the AI Hardware Profiling view (.cfs/ai.cfsaiprof file) provides a guided interface for deploying the model to hardware and capturing profiling data. It exposes the same trace capture workflow but adds deployment-specific status indicators and hardware configuration options.
The view includes:
- Building / Built / Error status indicator for the compilation state.
- Undeployed → Deploying → Running → Stopped → Error status indicator for the deployment and trace capture state.
- Host USB port designation dropdown: select the serial port for UART trace output.
- Run with dropdown: select the debug probe type (`J-Link` or `CMSIS-DAP`) used to flash the application.
- Run and Stop buttons to control deployment and trace capture.
To open the view, locate the `.cfs` folder in your workspace and open the `ai.cfsaiprof` file.
Visualize trace data with the Zephelin Trace Viewer
The Zephelin Trace Viewer VS Code extension is automatically installed as a dependency of the CodeFusion Studio extension. You can open `.tef` trace files in two ways:
- Immediately after capture: When capture completes, a notification displays Traces captured successfully with a list of generated files. If multiple files were produced, click Choose a file to open to select which to view.
- From Explorer: In the VS Code Explorer, locate any `.tef` file and click it to open in the trace viewer.
CPU Load
Shows CPU usage sampled at the configured interval. Visualizes how much CPU time is consumed by the AI inference relative to other application tasks.
AI Model Profiling
Displays per-operator inference timing for TensorFlow Lite Micro models. Enables detailed layer-level performance analysis of the live inference run.
Memory Usage
Shows per-thread memory consumption with multiple visualization types, sampled at the configured interval.
Next steps
Check Compatibility First
Run the Compatibility Analyzer to validate operator support and memory constraints before profiling.
Create an AI Workspace
Set up a workspace pre-configured for your model and target device using the GUI wizard or CLI.