Kernel self-protection is the design and implementation of mechanisms within the Linux kernel that reduce the impact of security flaws in the kernel itself. The core insight, from the kernel’s own documentation, is that the worst-case attacker has arbitrary read and write access to kernel memory. If defenses hold under that assumption, they will also hold under the more limited access that most real-world bugs provide. The goals for a self-protection system are that it is effective, on by default, requires no opt-in by developers, has no measurable performance impact, does not impede kernel debugging, and has tests. Meeting all of these simultaneously is uncommon, but they remain the standard against which proposals are evaluated.
Defense-in-depth philosophy
No single mechanism is sufficient. Kernel hardening is applied in layers: some mechanisms prevent bugs from being exploitable at all (read-only memory, strict RWX), some raise the cost of exploitation (KASLR, stack canaries), some detect exploitation in progress (KASAN, KFENCE), and some limit what a successful exploit can do (seccomp, namespaces, LSM policy). This document covers the mechanisms applied at the kernel level itself.
Kernel hardening options are independent of userspace access control. They protect the kernel from exploitation of its own bugs, whether triggered by an unprivileged local attacker or a privileged one who has loaded a malicious module.
Strict kernel memory permissions
Executable code and read-only data
The most direct way to prevent an attacker from redirecting kernel execution is to ensure that kernel code pages are never writable and kernel data pages are never executable.
CONFIG_STRICT_KERNEL_RWX enforces this split:
- Kernel text (.text) and read-only data (.rodata) are mapped non-writable.
- Kernel data (.data, .bss) is mapped non-executable.
- Module code and data are treated the same way via CONFIG_STRICT_MODULE_RWX.
Most architectures enable these options by default and do not let users turn them off. Architectures that want them to be selectable can select ARCH_OPTIONAL_KERNEL_RWX.
Immutable function pointers
Kernel data structures contain many function pointer tables (file operations, network protocol handlers, descriptor tables). These are prime targets for overwrite attacks. Variables that are set once at boot can be marked __ro_after_init, which places them in a region that becomes read-only after kernel initialization completes.
Any variable that is written only at __init time and then constant for the rest of the kernel’s life should use this attribute. It prevents runtime overwrite without the overhead of cryptographic integrity checking.
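The attribute itself only works inside the kernel tree, so the following is a userspace analogy rather than kernel code. A hypothetical `demo_ops` table is filled in during an init phase and then sealed with `mprotect()`, which is roughly the effect the kernel achieves for the `__ro_after_init` section once boot finishes. All `demo_*` names are invented for illustration, and the writability probe relies on `read()` failing when asked to store into a read-only buffer.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical ops table, standing in for a kernel function pointer table. */
struct demo_ops {
    int (*open)(void);
    int (*close)(void);
};

static int demo_open(void)  { return 1; }
static int demo_close(void) { return 2; }

/* "Init phase": fill the table on its own page, then make the page read-only. */
static struct demo_ops *demo_ops_init(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    struct demo_ops *ops = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (ops == MAP_FAILED)
        return NULL;
    ops->open = demo_open;
    ops->close = demo_close;
    if (mprotect(ops, len, PROT_READ) != 0) {   /* seal after init */
        munmap(ops, len);
        return NULL;
    }
    return ops;
}

/*
 * Probe whether one byte at addr can be written without risking a crash:
 * push the byte through a pipe and ask the kernel to store it back.
 * Returns 1 if writable, 0 if not, -1 if the pipe could not be created.
 */
static int demo_is_writable(void *addr)
{
    int p[2];
    int ok = 0;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], addr, 1) == 1)
        ok = (read(p[0], addr, 1) == 1);   /* fails with EFAULT if read-only */
    close(p[0]);
    close(p[1]);
    return ok;
}
```

After `demo_ops_init()` returns, calls through `ops->open` still work, while `demo_is_writable(ops)` reports 0: the table can be used but no longer overwritten.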
KASLR — kernel address space layout randomization
Since knowing the address of kernel code or data structures is a prerequisite for most kernel exploits, randomizing those addresses at boot raises the cost of an attack significantly.
CONFIG_RANDOMIZE_BASE randomizes the physical and virtual load address of the kernel at each boot. Even if an attacker knows the kernel version, they cannot assume the base address. The module loading address is offset separately, so a fixed module load order does not reveal the kernel base.
Other parts of the memory layout are randomized as well:
- Stack base: the kernel stack base varies between processes and can vary between syscalls, making stack-targeted attacks harder to aim.
- Dynamic memory base: the base addresses for kmalloc and vmalloc regions are randomized between boots, frustrating layout-dependent heap exploits.
- Structure layout: with CONFIG_RANDSTRUCT, the field order of sensitive kernel structures is randomized per build. An exploit tuned to one kernel build will fail on another.
Stack protection
Stack canaries
The classic stack buffer overflow overwrites the saved return address on the stack. A stack canary is a secret value placed between local variables and the return address; in every protected function the compiler inserts a check before the return. If the canary has been overwritten, the kernel panics rather than executing attacker-controlled code.
CONFIG_STACKPROTECTOR enables basic stack canaries, which cover functions with character arrays on the stack. CONFIG_STACKPROTECTOR_STRONG extends protection to functions with a local array of any type or length (including a union containing one) and to functions that take the address of a local variable, at a small performance cost. STRONG is the recommended setting for production kernels.
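To make the canary check concrete, here is a small userspace model. It does not use the compiler's real stack protector: the frame layout, the fixed `DEMO_CANARY` value, and all `demo_*` names are assumptions made for illustration (the kernel uses a random canary, not a constant).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of one protected stack frame, lowest address first. */
struct demo_frame {
    char buf[16];           /* local buffer an attacker can overflow */
    uint64_t canary;        /* secret between the locals and the return address */
    uint64_t return_addr;   /* saved return address */
};

/* Fixed value for the demo only; a real canary is random. */
#define DEMO_CANARY 0x00c0ffee12345678ULL

/* Function prologue: set up the frame and plant the canary. */
static void demo_frame_enter(struct demo_frame *f, uint64_t ret)
{
    memset(f->buf, 0, sizeof(f->buf));
    f->canary = DEMO_CANARY;
    f->return_addr = ret;
}

/* Unchecked copy starting at buf, as a buggy function might perform. */
static void demo_unsafe_copy(struct demo_frame *f, const void *src, size_t len)
{
    assert(len <= sizeof(*f));   /* keep the demo inside the modeled frame */
    memcpy((unsigned char *)f, src, len);
}

/* Function epilogue: 1 if it is safe to return, 0 if the canary was clobbered. */
static int demo_frame_leave_ok(const struct demo_frame *f)
{
    return f->canary == DEMO_CANARY;
}
```

A copy that stays within `buf` leaves the canary intact. A linear overflow long enough to reach `return_addr` must cross the canary first, so `demo_frame_leave_ok()` returns 0 before the corrupted address would be used.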
Shadow call stack
On AArch64 kernels with CONFIG_SHADOW_CALL_STACK, each function also saves its return address to a separate shadow stack whose location is held in a reserved register (x18) rather than in memory, and returns through that copy. Overwriting the return address on the regular stack therefore does not redirect execution.
Stack depth overflow
A stack depth overflow (unbounded recursion or large stack allocations) can write past the bottom of the preallocated kernel stack into adjacent memory. The thread_info structure has been moved off the stack on most architectures, and a faulting guard page (CONFIG_VMAP_STACK) catches overflows before they corrupt other objects.
Heap protection
KASAN
The Kernel Address Sanitizer (CONFIG_KASAN) instruments memory accesses to detect out-of-bounds reads and writes and use-after-free accesses at runtime. It maintains a shadow memory region that records which bytes of kernel memory are valid to access, and reports any access to invalid memory.
KASAN has a significant memory and performance overhead (typically 2-3x slowdown and substantial memory cost), so it is used in testing and CI environments rather than production. It is invaluable for finding heap memory bugs before they become exploits.
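The shadow-memory idea can be sketched in a few lines of userspace C. This model follows the encoding used by generic KASAN (one shadow byte per 8 bytes of memory; 0 means all 8 bytes are valid, 1 to 7 means only that many leading bytes are valid, a negative value means none are), but the arena, the single poison value, and the `demo_*` names are simplifications invented here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_GRANULE 8      /* bytes of memory described by one shadow byte */
#define DEMO_ARENA   256    /* size of the modeled heap, in bytes */

static signed char demo_shadow[DEMO_ARENA / DEMO_GRANULE];

/* Start with everything poisoned: nothing has been allocated yet. */
static void demo_shadow_init(void)
{
    memset(demo_shadow, -1, sizeof(demo_shadow));
}

/* "Allocate": mark [off, off + size) as valid. off must be granule-aligned. */
static void demo_unpoison(size_t off, size_t size)
{
    size_t g = off / DEMO_GRANULE;

    assert(off % DEMO_GRANULE == 0 && off + size <= DEMO_ARENA);
    for (; size >= DEMO_GRANULE; size -= DEMO_GRANULE)
        demo_shadow[g++] = 0;               /* whole granule valid */
    if (size)
        demo_shadow[g] = (signed char)size; /* only the first `size` bytes valid */
}

/* "Free": mark every granule touched by [off, off + size) as invalid. */
static void demo_poison(size_t off, size_t size)
{
    assert(off % DEMO_GRANULE == 0 && off + size <= DEMO_ARENA);
    for (size_t i = off; i < off + size; i += DEMO_GRANULE)
        demo_shadow[i / DEMO_GRANULE] = -1;
}

/* The check an instrumented access performs: 1 if valid, 0 if it is a bug. */
static int demo_access_ok(size_t off, size_t size)
{
    for (size_t i = off; i < off + size; i++) {
        signed char s;

        if (i >= DEMO_ARENA)
            return 0;
        s = demo_shadow[i / DEMO_GRANULE];
        if (s < 0)
            return 0;                       /* freed or never allocated */
        if (s > 0 && (i % DEMO_GRANULE) >= (size_t)s)
            return 0;                       /* past the end of the object */
    }
    return 1;
}
```

For a 13-byte object at offset 0, an access to bytes 0 through 12 passes, a 2-byte access at offset 12 is flagged as out of bounds, and any access after `demo_poison()` is flagged as use-after-free.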
KFENCE
CONFIG_KFENCE (Kernel Electric Fence) is a lightweight, production-usable alternative to KASAN. It uses a probabilistic sampling approach: a small fraction of allocations are placed in specially guarded pages so that any out-of-bounds access or use-after-free triggers an immediate trap. The performance overhead is negligible, making KFENCE suitable for enabling in production kernels.
Slab hardening
CONFIG_SLAB_FREELIST_RANDOM randomizes the order of free objects on newly created slab pages, frustrating predictable heap spray attacks. CONFIG_SLAB_FREELIST_HARDENED obfuscates the freelist pointers stored inside free objects with a per-cache random value and adds sanity checks, so that a corrupted or attacker-supplied pointer is unlikely to decode to a usable address.
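As a rough illustration of freelist pointer hardening, the sketch below mixes each stored "next free object" pointer with a per-cache secret and the byte-swapped address of the slot that holds it. It is modeled on the XOR scheme used by the SLUB allocator, but the struct, the function names, and the sample addresses are assumptions for the demo, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical slab cache: only the per-cache secret matters here. */
struct demo_cache {
    uint64_t random;    /* chosen randomly when the cache is created */
};

/*
 * Encode the pointer to the next free object before storing it at ptr_addr.
 * Mixing in the (byte-swapped) storage address means the same pointer is
 * stored as a different value in every slot.
 */
static uint64_t demo_freelist_encode(const struct demo_cache *c,
                                     uint64_t next, uint64_t ptr_addr)
{
    return next ^ c->random ^ __builtin_bswap64(ptr_addr);
}

/* Decoding is the same XOR; it only yields `next` for an untampered value. */
static uint64_t demo_freelist_decode(const struct demo_cache *c,
                                     uint64_t stored, uint64_t ptr_addr)
{
    return stored ^ c->random ^ __builtin_bswap64(ptr_addr);
}
```

An attacker who overwrites the stored value with a raw target address, without knowing `random`, gets back a different and most likely invalid pointer after decoding, which the allocator's sanity checks or a fault can then catch.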
Memory poisoning — clearing freed allocations with a known pattern — prevents use-after-free attacks from reading stale contents. CONFIG_KSTACK_ERASE clears the kernel stack on syscall return.
CPU-assisted protections
Modern x86 processors provide hardware enforcement of kernel/userspace separation:
SMEP — Supervisor Mode Execution Prevention
SMEP, enabled by the SMEP bit in CR4, causes a page fault if the CPU attempts to execute an instruction from a user-space page while in supervisor mode. This prevents the classic attack of mapping shellcode in user space and redirecting a kernel function pointer to it. Linux enables SMEP unconditionally on CPUs that support it.
SMAP — Supervisor Mode Access Prevention
SMAP, enabled by the SMAP bit in CR4, causes a fault if the kernel reads or writes user-space memory while the AC (alignment check) flag in EFLAGS is clear. The stac and clac instructions set and clear that flag around intentional user accesses. This prevents the kernel from being tricked into dereferencing attacker-controlled user-space pointers without explicit intent. copy_to_user() and copy_from_user() handle the flag manipulation correctly; direct dereferences of user pointers would fault. Linux enables SMAP unconditionally on supporting CPUs.
CET — Control-flow Enforcement Technology
Intel CET provides two complementary mechanisms: Indirect Branch Tracking (IBT), which ensures that indirect calls and jumps land only on ENDBR instructions, and Shadow Stack (SHSTK), which maintains a separate hardware-protected stack of return addresses. Linux supports IBT for the kernel itself (CONFIG_X86_KERNEL_IBT, since 5.18) and shadow stacks for user space (CONFIG_X86_USER_SHADOW_STACK, since 6.6).
ARM PXN/PAN
On ARM, Privileged Execute Never (PXN) prevents the kernel from executing user-space pages (equivalent to SMEP), and Privileged Access Never (PAN) prevents the kernel from directly accessing user-space memory (equivalent to SMAP). Linux enables both on supporting ARM hardware.
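On x86, whether the CPU advertises SMEP and SMAP can be read from the `flags` line of `/proc/cpuinfo`, where they appear as `smep` and `smap`. The helper below is a minimal sketch of a whole-word match on such a line; `demo_has_cpu_flag` is a name made up for this example, and the caller is assumed to have read the line already.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `flag` appears as a whole word in a /proc/cpuinfo style
 * flags line, 0 otherwise. Whole-word matching matters: "sme" must not
 * match inside "smep".
 */
static int demo_has_cpu_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    assert(n > 0);
    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

        if (starts && ends)
            return 1;
        p++;    /* keep scanning after a partial match */
    }
    return 0;
}
```

For example, given the line `flags : fpu vme smep erms smap`, the helper reports both `smep` and `smap` as present and rejects the prefix `sme`.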
Control-flow integrity
Control-flow integrity (CFI) enforces that indirect calls and returns can only target legitimate destinations. With CONFIG_CFI_CLANG, the Clang compiler instruments every indirect call to check that the target function’s type signature matches the call site. A type-confused function pointer dereference (a common exploitation technique) triggers a kernel panic instead of executing attacker-controlled code.
CFI requires the kernel to be built with Clang and is currently most mature on AArch64. It provides stronger guarantees than stack canaries for some classes of control-flow hijack.
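The type check can be pictured with a small model in plain C. Here each call target carries a hash of its function type, and the call site compares that hash with the one it expects before calling. The compiler does this automatically and stores the hash next to the function's code; the explicit `demo_target` descriptor, the hash constants, and the function names below are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*demo_generic_fn)(void);

/* Hypothetical descriptor: a type hash stored alongside the entry point. */
struct demo_target {
    uint32_t type_hash;
    demo_generic_fn entry;
};

/* Made-up hash values for two function types. */
#define DEMO_HASH_INT_INT   0x1a2b3c4du   /* int (*)(int) */
#define DEMO_HASH_VOID_VOID 0x99887766u   /* void (*)(void) */

static int demo_double(int x) { return 2 * x; }
static void demo_noop(void) { }

/*
 * Checked indirect call for the type int (*)(int).
 * Returns 0 and stores the result on success, or -1 on a type mismatch,
 * which is the point where a CFI-enabled kernel would trap.
 */
static int demo_call_int_int(const struct demo_target *t, int arg, int *result)
{
    if (t->type_hash != DEMO_HASH_INT_INT)
        return -1;
    *result = ((int (*)(int))t->entry)(arg);
    return 0;
}
```

Calling through a descriptor for `demo_double` succeeds, while a pointer that has been redirected to `demo_noop` (a function of a different type) is rejected before control is transferred.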
Seccomp filter deployment
Seccomp is covered in more detail in the security overview, but from a hardening perspective, the key deployment patterns are:
- Apply seccomp filters as early as possible in a process’s lifecycle, before any untrusted input is processed.
- Use SECCOMP_RET_KILL_PROCESS for truly disallowed syscalls to prevent partial-execution attacks.
- Always check the arch field in struct seccomp_data before matching on syscall number. On multi-ABI architectures (x86-64 with compat), syscall numbers overlap between calling conventions.
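The patterns above can be sketched as a classic BPF seccomp filter. This is a minimal example, assuming an x86-64 process and using ptrace as a stand-in for a disallowed syscall; `demo_filter` and `demo_install_filter` are hypothetical names, and a real policy would normally be an allowlist rather than a single denied call.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Assumption for the demo: ptrace is the one syscall we refuse. */
#define DEMO_DENIED_NR __NR_ptrace

static struct sock_filter demo_filter[] = {
    /* [0] Load the architecture token of the calling convention. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
    /* [1] x86-64: skip to [3]. Anything else: fall through to [2]. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_X86_64, 1, 0),
    /* [2] Foreign ABI: its syscall numbers mean something else, so kill. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    /* [3] Only now is it safe to look at the syscall number. */
    BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
    /* [4] Denied syscall: continue at [5]. Otherwise skip to [6]. */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, DEMO_DENIED_NR, 0, 1),
    /* [5] Kill the whole process, not just the calling thread. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
    /* [6] Everything else is allowed in this sketch. */
    BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

/* Install the filter; call this early, before handling untrusted input. */
int demo_install_filter(void)
{
    struct sock_fprog prog = {
        .len = (unsigned short)(sizeof(demo_filter) / sizeof(demo_filter[0])),
        .filter = demo_filter,
    };

    /* Required so an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

The order of instructions is the point: the arch comparison at [1] runs before the syscall number is loaded at [3], so a compat-mode caller can never reach the number comparison with a number from a different table.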
Preventing kernel pointer leaks
kptr_restrict
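The kptr_restrict sysctl controls whether the %pK printk format specifier prints kernel addresses or suppresses them:
- 0: the address is hashed before printing, the same as plain %p (the default).
- 1: the address is printed as zeros unless the reader has CAP_SYSLOG.
- 2: the address is always printed as zeros, regardless of privilege.
Set kptr_restrict=1 at minimum in production. This prevents an unprivileged attacker from learning kernel addresses via /proc/kallsyms or similar interfaces, preserving KASLR’s effectiveness.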
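A deployment check can read the current setting from `/proc/sys/kernel/kptr_restrict`. The sketch below separates parsing from file access so the parsing can be exercised on its own; both function names are made up for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the integer at the start of a sysctl file's contents.
 * Returns 0 and stores the value on success, -1 if there is no number. */
static int demo_parse_sysctl_int(const char *text, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(text, &end, 10);
    if (end == text || errno != 0)
        return -1;
    *out = (int)v;
    return 0;
}

/* Return the current kptr_restrict value, or -1 if it cannot be read. */
int demo_kptr_restrict(void)
{
    char buf[32];
    int value = -1;
    FILE *f = fopen("/proc/sys/kernel/kptr_restrict", "r");

    if (f == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), f) == NULL ||
        demo_parse_sysctl_int(buf, &value) != 0)
        value = -1;
    fclose(f);
    return value;
}
```

A hardening audit might treat a return value below 1 from `demo_kptr_restrict()` as a finding.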
Suppressing raw address printing
Code that writes to user-readable files (/proc, /sys, seq_file-backed files) should use %pK instead of %px or %p for addresses, and should avoid printing raw addresses unless the information is necessary and access is restricted to root.
As of kernel 4.15, the plain %p specifier hashes the address before printing. Use %px only in contexts where the raw address is genuinely needed for debugging and the file is not user-readable.
Hardening Kconfig checklist
Use this checklist when configuring a kernel for a security-sensitive environment. Options marked with a * are typically enabled by default on mainstream distributions.
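One way to audit an existing kernel configuration is to scan its text for the options discussed in this document. The sketch below assumes the config has already been read into a string (for example from a `.config` file); the `demo_required` list contains only options named above and is not an authoritative checklist.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Options named in this document; adjust to the target architecture. */
static const char *const demo_required[] = {
    "CONFIG_STRICT_KERNEL_RWX",
    "CONFIG_STRICT_MODULE_RWX",
    "CONFIG_RANDOMIZE_BASE",
    "CONFIG_STACKPROTECTOR_STRONG",
    "CONFIG_VMAP_STACK",
    "CONFIG_SLAB_FREELIST_RANDOM",
    "CONFIG_SLAB_FREELIST_HARDENED",
};
#define DEMO_REQUIRED_COUNT (sizeof(demo_required) / sizeof(demo_required[0]))

/* Return 1 if the config text has a line that is exactly "<name>=y". */
static int demo_config_enabled(const char *config, const char *name)
{
    size_t n = strlen(name);
    const char *line = config;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, name, n) == 0 && line[n] == '=' &&
            line[n + 1] == 'y' &&
            (line[n + 2] == '\n' || line[n + 2] == '\0'))
            return 1;
        line = strchr(line, '\n');  /* advance to the next line, if any */
        if (line != NULL)
            line++;
    }
    return 0;
}

/* Count how many of the listed options are not set to y. */
static size_t demo_count_missing(const char *config)
{
    size_t missing = 0;

    for (size_t i = 0; i < DEMO_REQUIRED_COUNT; i++)
        if (!demo_config_enabled(config, demo_required[i]))
            missing++;
    return missing;
}
```

Matching the full `NAME=y` line means that `CONFIG_STACKPROTECTOR=y` does not satisfy `CONFIG_STACKPROTECTOR_STRONG`, and a commented `# ... is not set` line counts as missing.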
Security architecture overview
The full security layer stack: DAC, capabilities, namespaces, seccomp, and LSMs.
LSM framework
How LSM hooks work, configuring SELinux and AppArmor, and writing a custom LSM.
