The MSVC compiler and linker expose a rich set of optimization features ranging from basic per-file optimizations to cross-module whole-program optimization and profile-guided feedback. Understanding these options — and how they interact — lets you extract the best possible performance or the smallest possible binary from the MSVC toolchain. This guide covers each optimization tier in depth, with real command-line examples and a PGO workflow walkthrough.

Compiler Optimization Levels

The /O family of options selects a predefined combination of lower-level optimization flags.

Option Summary

| Option | Purpose | Expands To |
|--------|---------|------------|
| /Od | Disable all optimization (Debug default) | |
| /O1 | Minimize code size | /Og /Os /Oy /Ob2 /GF /Gy |
| /O2 | Maximize execution speed (Release default) | /Og /Oi /Ot /Oy /Ob2 /GF /Gy |
| /Ox | Full optimization (subset of /O2; omits /GF and /Gy) | /Og /Oi /Ot /Oy /Ob2 |
| /Og | Enable global (cross-statement) optimizations | |
| /Oi | Replace eligible function calls with intrinsics | |
| /Os | Favor small code over fast code | |
| /Ot | Favor fast code over small code | |
| /Ob0 | Disable all inlining | |
| /Ob1 | Inline only functions marked __inline or inline | |
| /Ob2 | Inline any function the compiler deems suitable | |
| /Oy | Omit frame pointer (x86 only, frees one register) | |
| /GF | Eliminate duplicate string literals (string pooling) | |
| /Gy | Enable function-level linking (COMDATs) | |
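To make the /GF row concrete, here is a minimal C++ sketch of code that stays correct whether or not identical literals are pooled. The names kGreetingA, kGreetingB, same_text, and copy_text are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <cstring>

// Two literals with identical text. With /GF the compiler may pool them into
// one read-only copy, so the two pointers may or may not compare equal.
const char* const kGreetingA = "hello";
const char* const kGreetingB = "hello";

// Compare by content; the result does not depend on string pooling.
bool same_text(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}

// Pooled literals may live in read-only memory, so copy before modifying.
// Copies at most dst_size - 1 characters and always null-terminates.
void copy_text(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return;
    std::strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

The design point is simply that code should depend neither on two equal literals sharing an address nor on them having different addresses.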

/O1 — Minimize Size

/O1 is equivalent to /Og /Os /Oy /Ob2 /GF /Gy. It prioritizes code size over speed. Use this when binary footprint matters more than raw throughput — for example, embedded targets or components where cache effects from a smaller binary outweigh raw instruction throughput.
cl /O1 /EHsc /MD /DNDEBUG src\engine.cpp

/O2 — Maximize Speed

/O2 is the standard release build optimization level and the default for Visual Studio Release configurations. Compared with /O1, it adds /Oi (intrinsic functions) and replaces /Os (favor small code) with /Ot (favor fast code).
cl /O2 /EHsc /MD /DNDEBUG src\engine.cpp
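As an illustration of what /Oi and /Ob2 act on, the following sketch shows a fixed-size std::memcpy (a call with an intrinsic form) and a small helper that is a natural inlining candidate. The function names are hypothetical, and the code assumes that float is a 32-bit IEEE 754 type, which holds on the x86 and x64 targets MSVC supports.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw bit pattern of a float (assumes 32-bit IEEE 754 floats).
// The fixed-size std::memcpy is the kind of call an optimizer can replace
// with a plain register move instead of a library call.
std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Small helper that is a natural inlining candidate under /Ob2.
inline bool is_negative(float value) {
    return (float_bits(value) >> 31) != 0;
}
```

Whether the compiler actually emits a register move or inlines the helper depends on the compiler version and options; inspecting the assembly listing (for example with /FA) is the way to confirm.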

/Ox — Maximum Optimization

/Ox enables /Og /Oi /Ot /Oy /Ob2 but does not include /GF (string pooling) or /Gy (function-level linking). In practice, /O2 is usually preferred over /Ox because it also enables /Gy, and function-level linking lets the linker eliminate unreferenced functions with /OPT:REF. If you do use /Ox, add /GF and /Gy explicitly:
cl /Ox /GF /Gy /EHsc /MD /DNDEBUG src\engine.cpp
/O1 and /O2 are mutually exclusive. The last one specified wins.

Whole-Program Optimization: /GL

/GL (Whole Program Optimization) tells the compiler to emit intermediate MSVC IR rather than native code into .obj files. The actual native code generation is deferred to the linker, which can then see and optimize across all translation units simultaneously.
rem Compile only (/c) with /GL: the .obj files contain MSVC IR, not native code
cl /c /GL /O2 /EHsc /MD /DNDEBUG ^
   src\main.cpp src\engine.cpp src\utils.cpp

rem Link with /LTCG: triggers cross-module native code generation
link /LTCG /OPT:REF /OPT:ICF ^
     /OUT:bin\myapp.exe ^
     main.obj engine.obj utils.obj ^
     kernel32.lib msvcrt.lib
What /GL enables across modules:
  • Inlining of functions across translation-unit boundaries
  • Better register allocation that spans function calls
  • Fewer redundant loads and stores of global data
  • More precise tracking of possible modifications through pointer dereferences
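The cross-module inlining case can be sketched as follows. The listing is a single file for brevity; in a real project the two parts would live in separate .cpp files, and the file names and function names here are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// ---- utils.cpp (hypothetical) ----
// Without /GL, callers in other translation units see only a declaration of
// clamp_to_byte, so the call cannot be inlined at compile time.
int clamp_to_byte(int v) {
    if (v < 0) return 0;
    if (v > 255) return 255;
    return v;
}

// ---- engine.cpp (hypothetical) ----
int clamp_to_byte(int v);  // declaration only, as a header would provide

// Hot loop calling across the notional module boundary. With /GL and
// /LTCG the back end may inline clamp_to_byte here.
long sum_clamped(const std::vector<int>& samples) {
    long total = 0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        total += clamp_to_byte(samples[i]);
    }
    return total;
}
```

Whether the call is actually inlined is up to the optimizer's heuristics; the point is only that /GL makes the function body visible at the call site.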
.obj files compiled with /GL are not usable by DUMPBIN.exe or EDITBIN.exe, and are not compatible with LINK from a different version of Visual Studio. Do not ship static libraries (.lib) built entirely from /GL object files unless you also ship corresponding binaries for each Visual Studio version you support.
/LTCG is the linker-side counterpart to /GL. When the linker detects one or more /GL-compiled modules, it invokes the compiler back-end to perform native code generation across all of them at once. If you don’t specify /LTCG explicitly, the linker detects /GL modules and restarts itself with LTCG automatically — but specifying it explicitly gives the fastest build performance.
link /LTCG /OPT:REF /OPT:ICF /OUT:myapp.exe *.obj kernel32.lib

Incremental LTCG

/LTCG:INCREMENTAL reoptimizes only the modules that changed since the previous link, which can substantially reduce link time during development of release builds:
link /LTCG:INCREMENTAL /OPT:REF /OPT:ICF /OUT:myapp.exe *.obj kernel32.lib
If you remove /LTCG:INCREMENTAL, also remove any corresponding /LTCGOUT option to avoid stale incremental data increasing build times.

Profile-Guided Optimization (PGO)

Profile-Guided Optimization uses runtime data collected from real workloads to guide final code generation. PGO can produce binaries 10–25% faster than /O2 /GL /LTCG alone for workloads that have measurable hot/cold patterns. The PGO workflow has three distinct phases:
Step 1. Instrument: Build a profiling binary

Compile with /GL and link with /LTCG /GENPROFILE. This produces an instrumented executable that records branch frequencies, function call counts, and indirect call targets at runtime. A .pgd database file is created alongside the binary.
rem Compile all sources with whole-program optimization (/c = compile only)
cl /c /GL /O2 /EHsc /MD /DNDEBUG ^
   src\main.cpp src\engine.cpp src\utils.cpp

rem Link the instrumented binary (PGD= names the profile database)
link /LTCG /GENPROFILE:PGD=myapp.pgd ^
     /OUT:myapp_instr.exe ^
     main.obj engine.obj utils.obj ^
     kernel32.lib msvcrt.lib
Step 2. Train: Run representative workloads

Execute the instrumented binary with scenarios representative of real usage. Each run writes its profiling data to a new .pgc file. You can run multiple workloads:
rem Run with different representative inputs
myapp_instr.exe --scenario benchmark_http
myapp_instr.exe --scenario benchmark_crypto
myapp_instr.exe --scenario benchmark_parsing
After training, the directory contains one .pgc file per run, each named with a !N suffix. Use pgomgr.exe to merge them. When given only a .pgd argument, pgomgr /merge picks up the .pgc files in the current directory whose base name matches that .pgd file:
pgomgr /merge myapp.pgd
Step 3. Optimize: Build the optimized binary

Link the final binary using /LTCG /USEPROFILE. The compiler uses the collected profile data to:
  • Reorder code so hot paths have better cache locality
  • Inline hot call sites more aggressively
  • Optimize indirect calls by specializing for the most common target
  • Move cold code (error paths, rarely-called functions) to separate sections
link /LTCG /USEPROFILE:PGD=myapp.pgd ^
     /OPT:REF /OPT:ICF ^
     /OUT:myapp_opt.exe ^
     main.obj engine.obj utils.obj ^
     kernel32.lib msvcrt.lib
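To show the kind of code these profile-driven decisions apply to, here is a small C++ sketch with a hot loop, an indirect (virtual) call, and a rarely taken error branch. The Codec types and decode_all are hypothetical, and the hot and cold labels in the comments are assumptions about a training run, not something the compiler knows without profile data.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical codec interface; calls through it are indirect.
struct Codec {
    virtual ~Codec() = default;
    virtual int decode(unsigned char byte) const = 0;
};

// Assumed to handle nearly all traffic in the training runs.
struct FastCodec : Codec {
    int decode(unsigned char byte) const override { return byte & 0x7F; }
};

// Rarely used alternative.
struct LegacyCodec : Codec {
    int decode(unsigned char byte) const override { return byte >> 1; }
};

// Hot path: tight loop with an indirect call.
// Cold path: the empty-input error branch, assumed rare in practice.
long decode_all(const Codec& codec, const std::string& input) {
    if (input.empty()) {
        throw std::invalid_argument("empty input");  // cold
    }
    long total = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        total += codec.decode(static_cast<unsigned char>(input[i]));  // hot
    }
    return total;
}
```

If the training runs call decode_all almost exclusively with FastCodec, the profile gives the optimizer grounds to specialize the indirect call for that target and to place the throwing branch away from the loop.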

PGO in MSBuild / Visual Studio

In a Visual Studio project, PGO is exposed in Property Pages → Configuration Properties → C/C++ → Optimization → Profile Guided Optimization. The IDE provides three commands under Build → Profile Guided Optimization:
  • Instrument — builds the instrumented binary
  • Run Instrumented Application — shortcut to run it
  • Optimize — links the final optimized binary

PGO Tips

/FASTGENPROFILE accepts the same arguments as /GENPROFILE but uses defaults that favor lower run-time overhead and memory use over precision, which makes the instrumented binary cheaper to run. Use it when running the full workload with precise instrumentation is impractical.
link /LTCG /FASTGENPROFILE:PGD=myapp.pgd /OUT:myapp_instr.exe *.obj
PGO is only as good as the training data. Code paths not exercised during training are treated as cold. Provide a representative mix of inputs: normal usage, edge cases, and performance-critical scenarios. If your product has automated benchmarks, run them during the training phase.
PGO profiles can become stale after major refactoring. A profile collected from an older version of the code may produce suboptimal results if function layouts have changed significantly. Re-instrument and re-train when the hot paths change.

Architecture-Specific Tuning: /favor

The /favor option selects micro-architecture-specific code generation tuning without changing the ISA target. It is an x86/x64-only option.
| Option | Target | Description |
|--------|--------|-------------|
| /favor:blend | x86 and x64 | Optimizes for a broad mix of AMD and Intel micro-architectures (default) |
| /favor:ATOM | x86 and x64 | Optimizes for Intel Atom/Centrino Atom; may generate SSSE3/SSE3 instructions |
| /favor:AMD64 | x64 only | Tunes for AMD Opteron/Athlon 64; may perform worse on Intel processors |
| /favor:INTEL64 | x64 only | Tunes for Intel processors supporting Intel64; may perform worse on AMD |
rem Tune for AMD processors (e.g., for a dedicated AMD-based cluster)
cl /O2 /favor:AMD64 /EHsc /MD /DNDEBUG src\engine.cpp

rem Tune for Intel processors
cl /O2 /favor:INTEL64 /EHsc /MD /DNDEBUG src\engine.cpp

rem Default: balanced across AMD and Intel (most portable)
cl /O2 /favor:blend /EHsc /MD /DNDEBUG src\engine.cpp
Code generated with /favor:AMD64 may perform significantly worse on Intel processors and vice versa. Unless you control the deployment target hardware, use the default /favor:blend.

Linker-Side Optimizations

MSVC’s linker performs several optimizations that complement compiler-level work.

/OPT:REF — Remove Unreferenced Code

/OPT:REF removes functions and data that are never referenced in the final binary. This requires that object files were compiled with /Gy (function-level linking, enabled by /O1 and /O2) so that each function is in its own COMDAT section.
link /OPT:REF /OUT:myapp.exe *.obj kernel32.lib
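A minimal sketch of what /OPT:REF acts on, assuming a translation unit compiled with /Gy. Both function names are hypothetical; the second function is deliberately never called.

```cpp
#include <cstddef>
#include <cstdint>

// Referenced by the program, so the linker keeps it.
std::uint32_t checksum(const unsigned char* data, std::size_t size) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum = sum * 31u + data[i];
    }
    return sum;
}

// Never called anywhere. Compiled with /Gy, this function gets its own
// COMDAT section, which lets /OPT:REF drop it from the final image.
std::uint32_t debug_checksum_verbose(const unsigned char* data, std::size_t size) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum ^= static_cast<std::uint32_t>(data[i]) << (i % 24);
    }
    return sum;
}
```

Without /Gy, both functions would typically share one section of the object file, and the linker could only keep or discard them together.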

/OPT:ICF — Identical COMDAT Folding

/OPT:ICF merges functions or read-only data that have identical binary content, reducing the binary size.
link /OPT:REF /OPT:ICF /OUT:myapp.exe *.obj kernel32.lib

rem Run additional folding passes (default is 1)
link /OPT:REF /OPT:ICF=3 /OUT:myapp.exe *.obj kernel32.lib
/OPT:ICF can assign the same address to different functions. Code that relies on function pointer inequality (e.g., plugin registries keyed on function address) may break. Use /OPT:NOICF in such cases.
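The hazard can be sketched in C++ as follows. The two handlers have identical bodies, so they are candidates for folding; the registry avoids the problem by keying on an explicit name rather than on the function address. HandlerRegistry, on_open, and on_close are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Two handlers with identical bodies. After /OPT:ICF they may be folded
// into one function and then share a single address.
int on_open(int x)  { return x + 1; }
int on_close(int x) { return x + 1; }

using Handler = int (*)(int);

// A registry keyed on function address could conflate on_open and on_close
// after folding. Keying on a caller-chosen name avoids that dependency.
class HandlerRegistry {
public:
    void add(const std::string& name, Handler handler) { handlers_[name] = handler; }

    // Returns nullptr when no handler is registered under name.
    Handler find(const std::string& name) const {
        auto it = handlers_.find(name);
        return it == handlers_.end() ? nullptr : it->second;
    }

    std::size_t size() const { return handlers_.size(); }

private:
    std::map<std::string, Handler> handlers_;
};
```

With this design the program behaves the same with /OPT:ICF and /OPT:NOICF, so the size savings from folding can be kept.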

Complete Optimization Option Table

| Option | Tool | Type | Description |
|--------|------|------|-------------|
| /Od | cl.exe | Compiler | Disable optimization (Debug) |
| /O1 | cl.exe | Compiler | Minimize size |
| /O2 | cl.exe | Compiler | Maximize speed (Release default) |
| /Ox | cl.exe | Compiler | Full optimization |
| /Og | cl.exe | Compiler | Global optimizations |
| /Oi | cl.exe | Compiler | Intrinsic functions |
| /Os | cl.exe | Compiler | Favor small code |
| /Ot | cl.exe | Compiler | Favor fast code |
| /Ob2 | cl.exe | Compiler | Aggressive inlining |
| /Gy | cl.exe | Compiler | Function-level linking (COMDATs) |
| /GF | cl.exe | Compiler | String pooling |
| /GL | cl.exe | Compiler | Whole-program optimization (deferred codegen) |
| /favor:blend | cl.exe | Compiler | Balanced AMD/Intel tuning (default) |
| /favor:INTEL64 | cl.exe | Compiler | Intel64-specific tuning |
| /favor:AMD64 | cl.exe | Compiler | AMD64-specific tuning |
| /LTCG | link.exe | Linker | Link-time code generation |
| /LTCG:INCREMENTAL | link.exe | Linker | Incremental LTCG |
| /OPT:REF | link.exe | Linker | Eliminate unreferenced functions/data |
| /OPT:ICF | link.exe | Linker | Fold identical COMDATs |
| /GENPROFILE | link.exe | Linker | PGO Phase 1: Instrument |
| /FASTGENPROFILE | link.exe | Linker | PGO Phase 1: Instrument with lower-overhead defaults |
| /USEPROFILE | link.exe | Linker | PGO Phase 3: Optimize from profile data |
A solid release configuration suitable for most applications:
rem Compile only (/c); object files go to obj\release\
cl /c /O2 /GL /EHsc /GR /MD /DNDEBUG /W4 ^
   /std:c++17 /Zi ^
   /Fo:obj\release\ ^
   src\*.cpp

rem Link
link /LTCG /OPT:REF /OPT:ICF ^
     /DEBUG ^
     /OUT:bin\myapp.exe ^
     /PDB:symbols\myapp.pdb ^
     /SUBSYSTEM:CONSOLE ^
     obj\release\*.obj ^
     kernel32.lib msvcrt.lib
