
When the Linux kernel encounters an error—a NULL pointer dereference, a corrupted data structure, an unexpected code path—it prints a diagnostic message to the kernel log. Learning to read these messages and use the kernel’s tracing infrastructure turns what looks like an opaque crash into a traceable sequence of events you can understand and fix.

Reading kernel oops and panics

A kernel oops is a recoverable error; the offending process is killed and the kernel continues. A panic is unrecoverable; the system halts or reboots. Both produce a stack trace that identifies exactly where in the kernel code the fault occurred.

Anatomy of an oops

Here is a typical oops:
------------[ cut here ]------------
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO)
CPU: 1 PID: 28102 Comm: rmmod Tainted: P        WC O 4.8.4-build.1 #1
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
Call Trace:
 [<c12ba080>] ? dump_stack+0x44/0x64
 [<c103ed6a>] ? __warn+0xfa/0x120
 [<c109e8a7>] ? module_put+0x57/0x70
 [<c103ee33>] ? warn_slowpath_null+0x23/0x30
---[ end trace 6ebc60ef3981792f ]---
Key fields to read:
Field                 | What it tells you
----------------------|-----------------------------------------------------------
CPU: 1 PID: 28102     | Which CPU and process triggered the fault
Comm: rmmod           | The name of the faulting process
Tainted: P WC O       | Taint flags (proprietary module, prior warning, staging driver, out-of-tree module)
module_put+0x57/0x70  | Faulting function and byte offset within it (0x57 into a function 0x70 bytes long)
Call Trace            | The full call stack at the time of the fault

Finding the source line with gdb

If the kernel was compiled with CONFIG_DEBUG_INFO=y, you can resolve any address in the call trace to a source file and line number:
gdb vmlinux
(gdb) l *module_put+0x57
Or use the helper script that automates this for the whole trace:
# Pipe a saved oops through the decoder
cat oops.txt | scripts/decode_stacktrace.sh vmlinux
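The kernel source tree also ships scripts/faddr2line, which resolves a single symbol+offset from a trace to a source line without starting gdb (run it from the tree that built the vmlinux):
# Resolve one symbol+offset to a source file and line
./scripts/faddr2line vmlinux module_put+0x57/0x70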
To enable debug symbols for future builds:
./scripts/config -d DEBUG_INFO_NONE -e DEBUG_KERNEL \
  -e DEBUG_INFO -e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT \
  -e KALLSYMS -e KALLSYMS_ALL
make olddefconfig
Debug symbols significantly increase build size—a typical x86 kernel with localmodconfig grows from under 1 GB to roughly 5 GB of build artifacts. Disable them when space is constrained and you do not need to decode stack traces.

Finding the oops message

On most systems, oops messages are captured by the syslog daemon and written to /var/log/messages, or can be read from the journal with journalctl:
# Recent kernel messages including any oops
dmesg | tail -100

# Via journald
journalctl -k

# Save ring buffer to a file
dmesg > /tmp/kernel-log.txt

Dynamic debug

Dynamic debug lets you enable or disable individual pr_debug() and dev_dbg() callsites at runtime without recompiling. The kernel must be built with CONFIG_DYNAMIC_DEBUG=y.
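You can check whether the running kernel was built with this option; a quick check, assuming your distribution installs the kernel config under /boot:
# Confirm dynamic debug support in the running kernel
grep CONFIG_DYNAMIC_DEBUG= /boot/config-$(uname -r)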

Enabling debug messages

The control interface is at /proc/dynamic_debug/control (and /sys/kernel/debug/dynamic_debug/control if debugfs is mounted):
# Enable all pr_debug messages in a specific file
echo 'file svcsock.c +p' > /proc/dynamic_debug/control

# Enable all messages in a module
echo 'module nfsd +p' > /proc/dynamic_debug/control

# Enable messages in a specific function
echo 'func svc_process +p' > /proc/dynamic_debug/control

# Add function name and line number to each message
echo 'module e1000e +pfl' > /proc/dynamic_debug/control

# Remove the p flag from every callsite (disables all pr_debug output, not just what you enabled)
echo '-p' > /proc/dynamic_debug/control

Enabling at boot time

For built-in code that emits messages early in boot, before the control file can be written, use the dyndbg= kernel parameter:
dyndbg="file ec.c +p"
btrfs.dyndbg="+p"
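To make such a setting persistent across reboots, append it to the kernel command line in your bootloader configuration; a minimal sketch assuming a GRUB-based Debian/Ubuntu system (quoting a space-containing query inside GRUB variables is fiddly, so the module-parameter form is shown):
# /etc/default/grub -- keep your existing options in place of "..."
GRUB_CMDLINE_LINUX="... btrfs.dyndbg=+p"

# Regenerate the GRUB config, then reboot
sudo update-grub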

Viewing active debug sites

# Show all enabled debug callsites (=p means enabled)
grep '=p' /proc/dynamic_debug/control

# Show the full catalog
cat /proc/dynamic_debug/control | head -20
The output format is filename:lineno [module]function flags format:
net/sunrpc/svcsock.c:1603 [sunrpc]svc_tcp_accept =p "accept failed..."

ftrace: function tracing

ftrace is the kernel’s built-in tracing framework, accessible through the tracefs filesystem (usually at /sys/kernel/tracing or /sys/kernel/debug/tracing).

Basic function tracing

# Mount tracefs if not already mounted
mount -t tracefs tracefs /sys/kernel/tracing

cd /sys/kernel/tracing

# List available tracers
cat available_tracers
# function function_graph blk nop ...

# Enable function tracer
echo function > current_tracer

# Start tracing
echo 1 > tracing_on

# Run your workload, then stop
echo 0 > tracing_on

# Read the trace
cat trace | head -50

Tracing specific functions

# Trace only functions matching a pattern
echo 'ext4_*' > set_ftrace_filter

# Trace a function and everything it calls
echo function_graph > current_tracer
echo ext4_write_begin > set_graph_function
echo 1 > tracing_on
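When you are finished, clear the filters and return to the nop tracer so the tracing overhead goes away:
# Reset ftrace to its idle state
echo 0 > tracing_on
echo nop > current_tracer
echo > set_ftrace_filter
echo > set_graph_function
echo > trace           # clear the trace buffer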

Using trace-cmd

trace-cmd wraps the tracefs interface into a convenient command-line tool:
# Record a trace for 5 seconds
trace-cmd record -p function_graph -g ext4_write_begin sleep 5

# Display the recorded trace
trace-cmd report
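trace-cmd can also record static tracepoints instead of functions, for example scheduler events:
# List available tracepoint events
trace-cmd list -e | head

# Record sched_switch events while a command runs, then view them
trace-cmd record -e sched:sched_switch sleep 5
trace-cmd report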

perf: performance counters and profiling

perf accesses the hardware performance monitoring units (PMUs) in the CPU to profile where time is spent, count cache misses, and trace kernel events.

1. Install perf

sudo apt install linux-perf

2. Record a CPU profile

# Profile system-wide for 10 seconds
sudo perf record -a -g sleep 10

# Profile a specific command
sudo perf record -g ./my-program

3. View the report

sudo perf report
This opens an interactive TUI showing the hottest functions with their call chains.
Common perf commands:
# Count hardware events during a command
perf stat ./my-program

# Record and show a flame graph (requires FlameGraph scripts)
perf record -F 99 -a -g -- sleep 30
perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flames.svg

# List available events
perf list

# Trace specific kernel tracepoints
sudo perf trace -e sched:sched_switch
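For a live view without recording first, perf top samples continuously; perf annotate drills into a single function from a recorded profile (replace the placeholder symbol with one from your own report):
# Live system-wide profile with call graphs
sudo perf top -g

# Show annotated source/assembly for one hot function from perf.data
sudo perf annotate my_hot_function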

KGDB: kernel debugger over serial

KGDB allows GDB to connect to a running kernel over a serial line, using the kgdboc (kgdb over console) driver. It requires CONFIG_KGDB=y and CONFIG_KGDB_SERIAL_CONSOLE=y.

Setup

Add to the kernel command line:
kgdboc=ttyS0,115200 kgdbwait
kgdbwait halts the kernel at boot until GDB connects. Remove it after initial setup.
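Without kgdbwait, you can drop a running system into the debugger on demand through magic SysRq (assuming sysrq is enabled):
# Break into KGDB from the running kernel; the machine halts until GDB resumes it
echo g > /proc/sysrq-trigger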

Connecting GDB

On the debugging host:
gdb vmlinux
(gdb) set serial baud 115200   # older GDB versions use 'set remotebaud'
(gdb) target remote /dev/ttyS0
(gdb) bt       # backtrace
(gdb) list     # show source
(gdb) continue
kgdbwait stops the kernel completely at boot. Do not use it on production systems where you need the system to come up unattended.

Bisecting regressions with git bisect

When a bug appears after a kernel upgrade, git bisect uses binary search to find the exact commit that introduced the regression in O(log n) steps.

1. Start the bisection

git bisect start
git bisect good v6.0    # last known-good version
git bisect bad v6.1     # first version with the bug

2. Build and test each commit

Git checks out a commit at the midpoint. Build and test it:
cp ~/prepared_kernel_.config .config
make olddefconfig
make -j $(nproc --all)
sudo make modules_install install
reboot

3. Mark the result

After testing, tell git whether this commit is good or bad:
git bisect good   # bug is absent
# or
git bisect bad    # bug is present
If a commit fails to compile, skip it:
git bisect skip
Repeat until git identifies the first bad commit.
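If the bug can be detected by a script that exits 0 when a kernel is good and non-zero when it is bad, git can drive the whole loop for you; test.sh here is a hypothetical build-boot-and-check script you would supply:
# Automate the bisection (an exit code of 125 in the script means "skip this commit")
git bisect run ./test.sh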

4. Finish and clean up

# Save the bisection log for your bug report
git bisect log > ~/bisection-log

# Reset to the original HEAD
git bisect reset
After the bisection, you can try reverting the culprit commit to validate the finding:
git revert --no-edit <bad-commit-id>

Controlling panic behavior with sysctl

Several /proc/sys/kernel/ entries affect how the kernel behaves around panics:
# Reboot 30 seconds after a panic
sysctl -w kernel.panic=30

# Panic on oops instead of killing only the process
sysctl -w kernel.panic_on_oops=1

# Log information about fatal signals delivered to user-space processes
sysctl -w kernel.print-fatal-signals=1
Make these permanent by adding them to /etc/sysctl.d/:
# /etc/sysctl.d/99-panic.conf
kernel.panic = 30
kernel.panic_on_oops = 1
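Files in /etc/sysctl.d/ are applied at boot; to load them immediately without rebooting:
# Re-read all sysctl configuration files
sudo sysctl --system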

kdump and crash for post-mortem analysis

kdump captures a memory dump when the kernel panics. A secondary kernel (the “capture kernel”) is loaded at boot into reserved memory; when the primary kernel crashes, the capture kernel boots and saves the dump to disk.

1. Install kdump tools

sudo apt install kdump-tools crash linux-crashdump

2. Reserve memory for the capture kernel

Add to the kernel command line:
crashkernel=256M
Then reboot.

3. Verify kdump is active

sudo systemctl status kdump-tools   # the service is named 'kdump' on Fedora/RHEL
cat /proc/iomem | grep Crash
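On a test machine (never on a production box), you can exercise the whole pipeline by forcing a panic with magic SysRq and then checking that a dump appears under /var/crash:
# WARNING: this crashes the kernel immediately
echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger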

4. Analyze a saved vmcore

After a crash, the dump is saved to /var/crash/ (Debian/Ubuntu) or /var/crash/<date>/ (Fedora/RHEL). Open it with crash:
sudo crash /usr/lib/debug/boot/vmlinux-$(uname -r) \
           /var/crash/vmcore
Inside the crash shell:
crash> bt         # backtrace of the crashing task
crash> log        # print the kernel message buffer
crash> ps         # list processes at crash time
crash> quit
For automated crash collection in CI or on remote servers where interactive debugging is not practical, consider configuring kdump to write dumps to a network share via SSH or NFS, then analyze them offline with the crash utility.
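As a sketch only: with Debian's kdump-tools the remote target is configured in /etc/default/kdump-tools, roughly as below; the variable names are assumptions to verify against your distribution's documentation.
# /etc/default/kdump-tools (Debian/Ubuntu; variable names assumed, check your version)
SSH="kdump@dump-server.example.com"
SSH_KEY="/root/.ssh/kdump_id_rsa"
# or, to write over NFS instead:
# NFS="dump-server.example.com:/export/crashdumps"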
