When the Linux kernel encounters an error—a NULL pointer dereference, a corrupted data structure, an unexpected code path—it prints a diagnostic message to the kernel log. Learning to read these messages and use the kernel’s tracing infrastructure turns what looks like an opaque crash into a traceable sequence of events you can understand and fix.
A kernel oops is a recoverable error: the kernel kills the offending process and attempts to continue, though it may be left in an inconsistent state. A panic is unrecoverable: the system halts or reboots. Both produce a stack trace that identifies exactly where in the kernel code the fault occurred.
Debug symbols (CONFIG_DEBUG_INFO) significantly increase build size: a typical x86 kernel built with localmodconfig grows from under 1 GB to roughly 5 GB of build artifacts. Disable them when space is constrained and you do not need to decode stack traces.
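A quick way to check for debug info and, if needed, drop it before a rebuild; a minimal sketch assuming a Debian-style /boot/config-* file and a kernel source tree (on kernels 5.18 and later the debug-info level is a choice group, so enabling DEBUG_INFO_NONE is the reliable switch):

```
# Check whether the running kernel was built with debug info
grep CONFIG_DEBUG_INFO /boot/config-$(uname -r)

# In the kernel source tree: turn debug info off before rebuilding
scripts/config --disable DEBUG_INFO
# On kernels >= 5.18, select the explicit "none" level instead
scripts/config --enable DEBUG_INFO_NONE
make olddefconfig
```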
Dynamic debug lets you enable or disable individual pr_debug() and dev_dbg() callsites at runtime without recompiling. The kernel must be built with CONFIG_DYNAMIC_DEBUG=y.
The control interface is at /proc/dynamic_debug/control (and /sys/kernel/debug/dynamic_debug/control if debugfs is mounted):
```
# Enable all pr_debug messages in a specific file
echo 'file svcsock.c +p' > /proc/dynamic_debug/control

# Enable all messages in a module
echo 'module nfsd +p' > /proc/dynamic_debug/control

# Enable messages in a specific function
echo 'func svc_process +p' > /proc/dynamic_debug/control

# Add function name and line number to each message
echo 'module e1000e +pfl' > /proc/dynamic_debug/control

# Disable everything you enabled
echo '-p' > /proc/dynamic_debug/control
```
```
# Show all enabled debug callsites (=p means enabled)
grep '=p' /proc/dynamic_debug/control

# Show the beginning of the full catalog
head -20 /proc/dynamic_debug/control
```
The output format is filename:lineno [module]function flags format:
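Each callsite appears as one line; the entry below is fabricated purely to show the field layout (real file names, line numbers, and format strings depend on your kernel):

```
# filename:lineno [module]function flags "format"
net/sunrpc/svcsock.c:1178 [sunrpc]svc_tcp_sendto =p "%s: sending %d bytes\012"
```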
ftrace is the kernel’s built-in tracing framework, accessible through the tracefs filesystem (usually at /sys/kernel/tracing or /sys/kernel/debug/tracing).
```
# Mount tracefs if not already mounted
mount -t tracefs tracefs /sys/kernel/tracing
cd /sys/kernel/tracing

# List available tracers
cat available_tracers
# function function_graph blk nop ...

# Enable the function tracer
echo function > current_tracer

# Start tracing
echo 1 > tracing_on

# Run your workload, then stop
echo 0 > tracing_on

# Read the trace
head -50 trace
```
```
# Trace only functions matching a pattern
echo 'ext4_*' > set_ftrace_filter

# Trace a function and everything it calls
echo function_graph > current_tracer
echo ext4_write_begin > set_graph_function
echo 1 > tracing_on
```
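When you are done, it is worth resetting the tracer so filters do not silently affect later sessions (all paths relative to /sys/kernel/tracing, as above):

```
# Stop tracing and return to the no-op tracer
echo 0 > tracing_on
echo nop > current_tracer

# Clear any filters and empty the trace buffer
echo > set_ftrace_filter
echo > set_graph_function
echo > trace
```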
perf accesses the hardware performance monitoring units (PMUs) in the CPU to profile where time is spent, count cache misses, and trace kernel events.
1. Install perf:

   ```
   sudo apt install linux-perf
   ```

2. Record a CPU profile:

   ```
   # Profile system-wide for 10 seconds
   sudo perf record -a -g sleep 10

   # Profile a specific command
   sudo perf record -g ./my-program
   ```

3. View the report:

   ```
   sudo perf report
   ```
This opens an interactive TUI showing the hottest functions with their call chains.
Common perf commands:
```
# Count hardware events during a command
perf stat ./my-program

# Record and show a flame graph (requires FlameGraph scripts)
perf record -F 99 -a -g -- sleep 30
perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flames.svg

# List available events
perf list

# Trace specific kernel tracepoints
sudo perf trace -e sched:sched_switch
```
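To count the cache misses mentioned earlier, name the events explicitly. These are perf's generic hardware event aliases; which ones are actually available depends on the CPU's PMU, so check perf list first:

```
# Count cycles, instructions, and cache behavior for one command
perf stat -e cycles,instructions,cache-references,cache-misses ./my-program
```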
KGDB allows GDB to connect to a running kernel, most commonly over a serial port via the kgdboc driver. It requires CONFIG_KGDB=y and, for serial operation, CONFIG_KGDB_SERIAL_CONSOLE=y (which builds kgdboc).
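A minimal session sketch, assuming the target exposes a serial console on ttyS0 and the host sees the same line as /dev/ttyS0 (device names and baud rate will differ on your hardware):

```
# On the target: bind kgdb to the serial console at runtime
echo ttyS0,115200 > /sys/module/kgdboc/parameters/kgdboc

# Drop into the debugger; the kernel now waits for gdb
echo g > /proc/sysrq-trigger

# On the host: point gdb at the matching vmlinux and the serial line
gdb ./vmlinux
(gdb) target remote /dev/ttyS0
```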
Several /proc/sys/kernel/ entries affect how the kernel behaves around panics:
```
# Reboot 30 seconds after a panic
sysctl -w kernel.panic=30

# Panic on oops instead of only killing the offending process
sysctl -w kernel.panic_on_oops=1

# Log fatal signals delivered to user processes
sysctl -w kernel.print-fatal-signals=1
```
Make these permanent by adding them to /etc/sysctl.d/:
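For example (the file name here is hypothetical; any file ending in .conf in that directory is applied at boot):

```
# /etc/sysctl.d/90-panic.conf
kernel.panic = 30
kernel.panic_on_oops = 1
```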
kdump captures a memory dump when the kernel panics. A secondary kernel (the “capture kernel”) is loaded at boot into reserved memory; when the primary kernel crashes, the capture kernel boots and saves the dump to disk.
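The reserved region is set with the crashkernel= kernel command-line parameter; the 256M shown is only a starting point, and the service that loads the capture kernel (e.g. kdump-tools on Debian/Ubuntu, kexec-tools elsewhere) varies by distribution:

```
# Add to the kernel command line, e.g. GRUB_CMDLINE_LINUX in /etc/default/grub
crashkernel=256M

# After rebooting, confirm the reservation took effect
cat /proc/cmdline
dmesg | grep -i crashkernel
```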
```
crash> bt     # backtrace of the crashing task
crash> log    # print the kernel message buffer
crash> ps     # list processes at crash time
crash> quit
```
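A session like the one above is started by handing crash a vmlinux with debug symbols plus the saved dump; the paths below are illustrative and vary by distribution:

```
# Debug vmlinux (from the distro's debug-symbol package) + saved vmcore
crash /usr/lib/debug/boot/vmlinux-$(uname -r) /var/crash/vmcore
```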
For automated crash collection in CI or on remote servers where interactive debugging is not practical, consider configuring kdump to write dumps to a network share via SSH or NFS, then analyze them offline with the crash utility.