The MSVC compiler and linker expose a rich set of optimization features ranging from basic per-file optimizations to cross-module whole-program optimization and profile-guided feedback. Understanding these options — and how they interact — lets you extract the best possible performance or the smallest possible binary from the MSVC toolchain. This guide covers each optimization tier in depth, with real command-line examples and a PGO workflow walkthrough.
Compiler Optimization Levels
The /O family of options selects a predefined combination of lower-level optimization flags.
Option Summary
| Option | Purpose | Expands To |
|---|---|---|
| /Od | Disable all optimization (Debug default) | — |
| /O1 | Minimize code size | /Og /Os /Oy /Ob2 /GF /Gy |
| /O2 | Maximize execution speed (Release default) | /Og /Oi /Ot /Oy /Ob2 /GF /Gy |
| /Ox | Full optimization (a subset of /O2) | /Og /Oi /Ot /Oy /Ob2 |
| /Og | Enable global (cross-statement) optimizations | — |
| /Oi | Replace eligible function calls with intrinsics | — |
| /Os | Favor small code over fast code | — |
| /Ot | Favor fast code over small code | — |
| /Ob0 | Disable all inlining | — |
| /Ob1 | Inline only functions marked __inline or inline | — |
| /Ob2 | Inline any function the compiler deems suitable | — |
| /Oy | Omit frame pointer (x86 only, frees one register) | — |
| /GF | Eliminate duplicate string literals (string pooling) | — |
| /Gy | Enable function-level linking (COMDATs) | — |
/O1 — Minimize Size
/O1 is equivalent to /Og /Os /Oy /Ob2 /GF /Gy. It prioritizes code size over speed. Use this when binary footprint matters more than raw throughput — for example, embedded targets or components where cache effects from a smaller binary outweigh raw instruction throughput.
/O2 — Maximize Speed
/O2 is the standard release build optimization level and the default for Visual Studio Release configurations. Relative to /O1, it replaces /Os (favor small code) with /Ot (favor fast code) and adds /Oi (intrinsic functions).
/Ox — Maximum Optimization
/Ox enables /Og /Oi /Ot /Oy /Ob2 but does not include /GF (string pooling) or /Gy (function-level linking). In practice, /O2 /Gy is usually preferred over /Ox because function-level linking enables the linker to eliminate unreferenced functions with /OPT:REF.
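The presets above can be compared side by side on the command line. This is a minimal sketch, assuming a single hypothetical source file named main.cpp and a Visual Studio developer command prompt where cl.exe is on the PATH; /Zi (debug information) is added to the debug line only as a typical companion to /Od.

```cmd
rem Unoptimized build with debug information (typical Debug-style settings).
cl /c /Od /Zi main.cpp

rem Size-oriented preset.
cl /c /O1 main.cpp

rem Speed-oriented preset (typical Release-style setting).
cl /c /O2 main.cpp

rem /Ox plus the two options it omits relative to /O2.
cl /c /Ox /GF /Gy main.cpp
```

Each command only compiles (/c) and leaves main.obj for a later link step, which makes it easy to compare object sizes between presets.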
/O1 and /O2 are mutually exclusive. The last one specified wins.
Whole-Program Optimization: /GL
/GL (Whole Program Optimization) tells the compiler to emit intermediate MSVC IR rather than native code into .obj files. Native code generation is deferred to the linker, which can then see and optimize across all translation units simultaneously. With that whole-program view, the compiler can:
- Inline functions across translation-unit boundaries
- Allocate registers more effectively across function calls
- Reduce redundant loads and stores to global data
- Track possible modifications through pointer dereferences
Link-Time Code Generation: /LTCG
/LTCG is the linker-side counterpart to /GL. When the linker detects one or more /GL-compiled modules, it invokes the compiler back-end to perform native code generation across all of them at once. If you don’t specify /LTCG explicitly, the linker detects /GL modules and restarts itself with LTCG automatically — but specifying it explicitly gives the fastest build performance.
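The /GL and /LTCG pairing can be sketched as a two-step build. The file names main.cpp, util.cpp, and myapp.exe are hypothetical placeholders, and the commands assume a Visual Studio developer command prompt.

```cmd
rem Compile every translation unit with whole-program optimization enabled.
rem The resulting .obj files hold intermediate representation, not native code.
cl /c /O2 /GL main.cpp util.cpp

rem Request link-time code generation explicitly so the linker does not
rem have to detect the /GL modules and restart itself.
link /LTCG /OUT:myapp.exe main.obj util.obj
```

Because code generation moves to the link step, most of the optimization time shows up in link.exe rather than cl.exe.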
Incremental LTCG
/LTCG:INCREMENTAL reoptimizes only the modules that changed since the previous link, which can substantially reduce link time during development of release builds.
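A minimal sketch of an edit-and-relink cycle with incremental LTCG follows. The file names are hypothetical, and the second half assumes that only util.cpp was edited between the two links.

```cmd
rem First full build: compile with /GL and link with incremental LTCG.
cl /c /O2 /GL main.cpp util.cpp
link /LTCG:INCREMENTAL /OUT:myapp.exe main.obj util.obj

rem After editing util.cpp only: recompile that file and relink.
rem The linker can reuse earlier code generation for unaffected modules.
cl /c /O2 /GL util.cpp
link /LTCG:INCREMENTAL /OUT:myapp.exe main.obj util.obj
```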
If you remove /LTCG:INCREMENTAL, also remove any corresponding /LTCGOUT option to avoid stale incremental data increasing build times.
Profile-Guided Optimization (PGO)
Profile-Guided Optimization uses runtime data collected from real workloads to guide final code generation. PGO can produce binaries 10–25% faster than /O2 /GL /LTCG alone for workloads that have measurable hot/cold patterns.
The PGO workflow has three distinct phases:
Instrument — Build a profiling binary
Compile with /GL and link with /LTCG /GENPROFILE. This produces an instrumented executable that records branch frequencies, function call counts, and indirect call targets at runtime. A .pgd database file is created alongside the binary.
Train — Run representative workloads
Execute the instrumented binary with scenarios representative of real usage. Each run writes its profiling data to a .pgc file in the same directory as the binary, so you can run multiple workloads. After training, the directory contains myapp_instr!N.pgc files (one per run). Use pgomgr.exe to merge them into the .pgd database.
Optimize — Build the optimized binary
Link the final binary using /LTCG /USEPROFILE. The compiler uses the collected profile data to:
- Reorder code so hot paths have better cache locality
- Inline hot call sites more aggressively
- Optimize indirect calls by specializing for the most common target
- Move cold code (error paths, rarely-called functions) to separate sections
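The three phases can be sketched end to end as follows. This is an illustration under stated assumptions, not a definitive recipe: the source files, the scenario arguments passed to the instrumented binary, and the output names are hypothetical, and the instrumented binary is named myapp_instr.exe only to match the .pgc naming used above.

```cmd
rem Phase 1 (Instrument): compile with /GL, then link an instrumented binary.
rem The linker also creates myapp_instr.pgd next to the executable.
cl /c /O2 /GL main.cpp util.cpp
link /LTCG /GENPROFILE /OUT:myapp_instr.exe main.obj util.obj

rem Phase 2 (Train): run representative scenarios (arguments are hypothetical).
rem Each run leaves a myapp_instr!N.pgc file in this directory.
myapp_instr.exe scenario1.dat
myapp_instr.exe scenario2.dat

rem Merge the .pgc files that share the database's base name into the .pgd.
pgomgr /merge myapp_instr.pgd

rem Phase 3 (Optimize): relink the same .obj files using the profile data.
link /LTCG /USEPROFILE:PGD=myapp_instr.pgd /OUT:myapp.exe main.obj util.obj
```

The object files are compiled once and linked twice, so the source does not need to be recompiled between the instrumented and optimized builds as long as it has not changed.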
PGO in MSBuild / Visual Studio
In a Visual Studio project, PGO is exposed in Property Pages → Configuration Properties → C/C++ → Optimization → Profile Guided Optimization. The IDE provides three commands under Build → Profile Guided Optimization:
- Instrument — builds the instrumented binary
- Run Instrumented Application — shortcut to run it
- Optimize — links the final optimized binary
PGO Tips
Use /FASTGENPROFILE instead of /GENPROFILE for lower-overhead instrumentation
/FASTGENPROFILE still instruments the binary, but its defaults favor speed and lower memory use over precision, which reduces the overhead of the instrumented binary compared with /GENPROFILE. Use it when running the full workload with precise instrumentation is impractical.
Cover all code paths during training
PGO is only as good as the training data. Code paths not exercised during training are treated as cold. Provide a representative mix of inputs: normal usage, edge cases, and performance-critical scenarios. If your product has automated benchmarks, run them during the training phase.
Re-run PGO when behavior changes significantly
PGO profiles can become stale after major refactoring. A profile collected from an older version of the code may produce suboptimal results if function layouts have changed significantly. Re-instrument and re-train when the hot paths change.
Architecture-Specific Tuning: /favor
The /favor option selects micro-architecture-specific code generation tuning without changing the ISA target. It is an x86/x64-only option.
| Option | Target | Description |
|---|---|---|
| /favor:blend | x86 and x64 | Optimizes for a broad mix of AMD and Intel micro-architectures (default) |
| /favor:ATOM | x86 and x64 | Optimizes for Intel Atom/Centrino Atom; may generate SSSE3/SSE3 instructions |
| /favor:AMD64 | x64 only | Tunes for AMD Opteron/Athlon 64; may perform worse on Intel processors |
| /favor:INTEL64 | x64 only | Tunes for Intel processors supporting Intel64; may perform worse on AMD |
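As a brief illustration, /favor is passed to cl.exe alongside the optimization level. The sketch below assumes an x64 developer command prompt (the AMD64 and INTEL64 variants are x64 only) and a hypothetical main.cpp.

```cmd
rem Default tuning made explicit: balanced across AMD and Intel processors.
cl /c /O2 /favor:blend main.cpp

rem Tune for Intel64 processors; the code still runs on AMD64 hardware,
rem but may perform worse there.
cl /c /O2 /favor:INTEL64 main.cpp
```

When the deployment hardware is mixed or unknown, leaving the default in place is the safer choice.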
Linker-Side Optimizations
MSVC’s linker performs several optimizations that complement compiler-level work.
/OPT:REF — Remove Unreferenced Code
/OPT:REF removes functions and data that are never referenced in the final binary. This requires that object files were compiled with /Gy (function-level linking, enabled by /O1 and /O2) so that each function is in its own COMDAT section.
/OPT:ICF — Identical COMDAT Folding
/OPT:ICF merges functions or read-only data that have identical binary content, reducing the binary size.
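Both linker optimizations can be requested together. A minimal sketch, assuming hypothetical source and output names; /Gy is already implied by /O2 and is spelled out here only to make the COMDAT requirement visible.

```cmd
rem Compile so that each function is packaged in its own COMDAT section.
cl /c /O2 /Gy main.cpp util.cpp

rem Drop unreferenced COMDATs and fold identical ones at link time.
link /OPT:REF /OPT:ICF /OUT:myapp.exe main.obj util.obj
```

Because /OPT:ICF can give two distinct functions the same address, code that relies on comparing function pointers for identity deserves a second look before enabling it.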
Complete Optimization Option Table
| Option | Tool | Type | Description |
|---|---|---|---|
| /Od | cl.exe | Compiler | Disable optimization (Debug) |
| /O1 | cl.exe | Compiler | Minimize size |
| /O2 | cl.exe | Compiler | Maximize speed (Release default) |
| /Ox | cl.exe | Compiler | Full optimization (a subset of /O2) |
| /Og | cl.exe | Compiler | Global optimizations |
| /Oi | cl.exe | Compiler | Intrinsic functions |
| /Os | cl.exe | Compiler | Favor small code |
| /Ot | cl.exe | Compiler | Favor fast code |
| /Ob2 | cl.exe | Compiler | Aggressive inlining |
| /Gy | cl.exe | Compiler | Function-level linking (COMDATs) |
| /GF | cl.exe | Compiler | String pooling |
| /GL | cl.exe | Compiler | Whole-program optimization (deferred codegen) |
| /favor:blend | cl.exe | Compiler | Balanced AMD/Intel tuning (default) |
| /favor:INTEL64 | cl.exe | Compiler | Intel64-specific tuning |
| /favor:AMD64 | cl.exe | Compiler | AMD64-specific tuning |
| /LTCG | link.exe | Linker | Link-time code generation |
| /LTCG:INCREMENTAL | link.exe | Linker | Incremental LTCG |
| /OPT:REF | link.exe | Linker | Eliminate unreferenced functions/data |
| /OPT:ICF | link.exe | Linker | Fold identical COMDATs |
| /GENPROFILE | link.exe | Linker | PGO Phase 1: Instrument |
| /FASTGENPROFILE | link.exe | Linker | PGO Phase 1: Lower-overhead instrument |
| /USEPROFILE | link.exe | Linker | PGO Phase 3: Optimize from profile data |
Recommended Configurations
- Standard Release
- Maximum Performance (PGO)
- Minimum Size
A solid release configuration suitable for most applications combines /O2 and /GL at compile time with /LTCG, /OPT:REF, and /OPT:ICF at link time.
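The configurations named above might look like this on the command line. This is a sketch assembled from the options covered in this guide rather than an official recommendation; the file names are hypothetical, and the minimum-size variant simply swaps /O2 for /O1.

```cmd
rem Standard Release: speed-optimized, whole-program, with linker cleanup.
cl /c /O2 /GL main.cpp util.cpp
link /LTCG /OPT:REF /OPT:ICF /OUT:myapp.exe main.obj util.obj

rem Minimum Size: same pipeline, but with the size-oriented preset.
cl /c /O1 /GL main.cpp util.cpp
link /LTCG /OPT:REF /OPT:ICF /OUT:myapp_small.exe main.obj util.obj
```

For the Maximum Performance (PGO) configuration, the same compile step feeds the /GENPROFILE and /USEPROFILE link steps described in the PGO section.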