The Linux kernel provides several memory allocators tuned for different size classes, contiguity requirements, and calling contexts. Choosing the wrong allocator is one of the most common sources of kernel bugs: allocating with
GFP_ATOMIC where GFP_KERNEL would suffice wastes reserves, and calling a sleeping allocator from an interrupt handler causes an immediate BUG. This reference covers the full allocation stack from the slab allocator to the raw page allocator, along with GFP flag semantics and failure-handling patterns.
kmalloc / kzalloc
General-purpose slab allocation for objects smaller than a page.
vmalloc
Virtually contiguous allocations for large, non-DMA buffers.
Slab caches
High-frequency, fixed-size object pools with kmem_cache.
Page allocator
Raw page allocation via alloc_pages() and __get_free_pages().
GFP flags
Controlling reclaim behaviour, zones, and allocation semantics.
Failure handling
Patterns for detecting and recovering from allocation failures.
Kmalloc and kzalloc
kmalloc() is the general-purpose kernel allocator. It returns physically contiguous memory suitable for DMA and for objects up to roughly KMALLOC_MAX_SIZE (architecture-dependent; commonly 4 MiB). For objects smaller than a page, it satisfies requests from pre-built power-of-two slab caches, making it fast.
Function signatures
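The core prototypes, simplified from <linux/slab.h> (the real declarations carry additional attributes such as __alloc_size):

```c
#include <linux/slab.h>

/* Allocate size bytes of physically contiguous memory; NULL on failure. */
void *kmalloc(size_t size, gfp_t flags);

/* Same as kmalloc(), but the returned memory is zeroed. */
void *kzalloc(size_t size, gfp_t flags);

/* Free memory from kmalloc()/kzalloc(); kfree(NULL) is a no-op. */
void kfree(const void *objp);
```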
Arrays and size-overflow helpers
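For array allocations, prefer the overflow-checked helpers over an open-coded n * size multiplication, which can silently wrap:

```c
#include <linux/slab.h>
#include <linux/overflow.h>

/* Allocate an array of n elements of size bytes each; returns NULL
 * (rather than a too-small buffer) if n * size would overflow. */
void *kmalloc_array(size_t n, size_t size, gfp_t flags);

/* kmalloc_array() plus zeroing, analogous to userspace calloc(). */
void *kcalloc(size_t n, size_t size, gfp_t flags);

/* struct_size() computes sizeof(*p) plus a trailing flexible array
 * of n elements, with overflow checking:
 *
 *     p = kzalloc(struct_size(p, elems, n), GFP_KERNEL);
 */
```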
Typical usage
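A typical process-context allocation, sketched here with a hypothetical struct foo:

```c
#include <linux/list.h>
#include <linux/slab.h>

/* Hypothetical object, for illustration only. */
struct foo {
	int id;
	struct list_head list;
};

static struct foo *foo_create(void)
{
	/* GFP_KERNEL may sleep, so this is process context only. */
	struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return NULL;		/* always check for failure */

	f->id = 1;
	INIT_LIST_HEAD(&f->list);
	return f;
}
```

Using sizeof(*f) rather than sizeof(struct foo) keeps the allocation size correct even if the pointer's type later changes.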
The address returned by kmalloc() is aligned to at least ARCH_KMALLOC_MINALIGN bytes. For power-of-two sizes, alignment equals the size itself. This makes kmalloc safe for naturally-aligned scalar types without extra alignment specification.
Vmalloc and vfree
vmalloc() maps physically discontiguous pages into a single virtually contiguous region. It is slower than kmalloc() (requires page-table manipulation) and is not suitable for DMA, but it can satisfy much larger allocations.
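The basic interface, simplified from <linux/vmalloc.h>:

```c
#include <linux/vmalloc.h>

/* Allocate size bytes of virtually contiguous memory; may sleep,
 * so it must not be called from atomic context. NULL on failure. */
void *vmalloc(unsigned long size);

/* Zeroed variant. */
void *vzalloc(unsigned long size);

/* Free memory obtained from vmalloc()/vzalloc(). */
void vfree(const void *addr);
```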
When to use vmalloc vs kmalloc
kvmalloc: the adaptive allocator
When you do not know whether the size will fit in a kmalloc slab, use kvmalloc(). It tries kmalloc() first and falls back to vmalloc() if that fails; free the result with kvfree(), which handles either case.
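A sketch of the pattern (the wrapper function alloc_table is hypothetical):

```c
#include <linux/mm.h>
#include <linux/slab.h>

static void *alloc_table(size_t nr_entries, size_t entry_size)
{
	/* Overflow-checked; tries kmalloc() first, then falls back
	 * to vmalloc() for sizes the slab cannot satisfy. */
	void *table = kvmalloc_array(nr_entries, entry_size,
				     GFP_KERNEL | __GFP_ZERO);

	return table;	/* NULL on failure; release with kvfree(table) */
}
```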
Slab caches
When a subsystem allocates many objects of the same fixed size at high frequency (e.g., a packet descriptor, an inode, a request block), creating a dedicated slab cache with kmem_cache_create() is more efficient than repeated kmalloc() calls. The slab allocator batches construction and destruction and can colour objects to reduce cache-line conflicts.
Creating and destroying a cache
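A sketch using a hypothetical packet-descriptor cache:

```c
#include <linux/errno.h>
#include <linux/slab.h>

/* Hypothetical object type, for illustration only. */
struct pkt_desc {
	u32 len;
	void *data;
};

static struct kmem_cache *pkt_cache;

static int pkt_cache_init(void)
{
	pkt_cache = kmem_cache_create("pkt_desc",
				      sizeof(struct pkt_desc),
				      0,			/* default alignment */
				      SLAB_HWCACHE_ALIGN,	/* cache-line align */
				      NULL);			/* no constructor */
	return pkt_cache ? 0 : -ENOMEM;
}

static void pkt_cache_exit(void)
{
	/* Every object must be freed before the cache is destroyed. */
	kmem_cache_destroy(pkt_cache);
}
```

The KMEM_CACHE() macro is a common shorthand when the cache should be named after the struct, e.g. KMEM_CACHE(pkt_desc, SLAB_HWCACHE_ALIGN).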
Allocating and freeing objects
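Objects come from the cache with kmem_cache_alloc() and go back with kmem_cache_free(); pkt_cache and struct pkt_desc below are the hypothetical cache and type from the previous subsection:

```c
#include <linux/slab.h>

struct pkt_desc {
	u32 len;
	void *data;
};

static struct kmem_cache *pkt_cache;	/* created at init time */

static struct pkt_desc *pkt_desc_get(void)
{
	/* kmem_cache_zalloc() zeroes the object, like kzalloc(). */
	struct pkt_desc *d = kmem_cache_zalloc(pkt_cache, GFP_KERNEL);

	return d;	/* NULL on failure */
}

static void pkt_desc_put(struct pkt_desc *d)
{
	kmem_cache_free(pkt_cache, d);
}
```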
Common SLAB flags
| Flag | Effect |
|---|---|
| SLAB_HWCACHE_ALIGN | Align objects to CPU cache-line boundaries for better performance. |
| SLAB_PANIC | Panic on allocation failure during cache creation (for caches that must succeed). |
| SLAB_TYPESAFE_BY_RCU | Delay freeing of slab pages by one RCU grace period; enables RCU-protected lookups. |
| SLAB_POISON | Fill freed objects with a poison pattern to catch use-after-free. |
| SLAB_RED_ZONE | Add red zones around objects to catch out-of-bounds writes (debug). |
Page allocator
The page allocator is the lowest-level allocator in the kernel. It operates on physically contiguous orders (powers of two in pages: order 0 = 1 page, order 1 = 2 pages, order 10 = 1024 pages). Use this when you need guaranteed physical contiguity that kmalloc cannot provide.
Converting between pages and addresses
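A sketch of the two entry points and the page/address conversion helpers:

```c
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/mm.h>

static int demo_page_alloc(void)
{
	/* Allocate 4 physically contiguous pages (order 2). */
	struct page *pages = alloc_pages(GFP_KERNEL, 2);
	unsigned long buf;

	if (!pages)
		return -ENOMEM;

	/* struct page -> kernel virtual address (lowmem pages). */
	void *addr = page_address(pages);
	/* ... use addr ... */
	__free_pages(pages, 2);

	/* __get_free_pages() returns the virtual address directly. */
	buf = __get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
	if (buf) {
		struct page *pg = virt_to_page(buf);	/* back to struct page */
		(void)pg;
		free_pages(buf, 1);
	}
	return 0;
}
```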
Page allocator allocations at order > 0 must be physically contiguous. High-order allocations can fail under memory pressure even when total free memory is abundant, because fragmentation prevents a contiguous block from being assembled. Prefer the slab allocator for objects smaller than a page.
GFP flags
GFP (Get Free Pages) flags control which memory zones the allocator may use, whether it may block, whether it may trigger reclaim, and other policies.
Primary GFP flag combinations
GFP flags and reclaim behaviour
GFP_KERNEL — background and direct reclaim
Both background (kswapd) and direct (in-caller) reclaim are enabled. This is the default and correct choice for process-context allocations. Non-costly requests are effectively non-failing, but callers must still check the return value because OOM-killed tasks may see failures.
GFP_NOWAIT — no direct reclaim
Equivalent to GFP_KERNEL & ~__GFP_DIRECT_RECLAIM. The allocator may wake kswapd but will not block the caller. Use in interrupt-safe paths that have a fallback.
GFP_ATOMIC — reserve access, no sleep
(GFP_KERNEL | __GFP_HIGH) & ~__GFP_DIRECT_RECLAIM. Provides access to a small per-zone reserve pool. Use only in hard interrupt / softirq context. Overuse depletes reserves and degrades system stability.
GFP_KERNEL | __GFP_NORETRY — fail fast
Triggers one round of reclaim and returns NULL rather than retrying. Does not invoke the OOM killer. Useful when the caller has a cheaper fallback and does not want to stall.
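A common fail-fast pattern: attempt a cheap physically contiguous allocation first and fall back to vmalloc() if it cannot be satisfied quickly (essentially what kvmalloc() does internally; big_buf_alloc is a hypothetical wrapper):

```c
#include <linux/slab.h>
#include <linux/vmalloc.h>

static void *big_buf_alloc(size_t size)
{
	/* At most one reclaim pass; no OOM killer, no long stalls. */
	void *buf = kmalloc(size, GFP_KERNEL | __GFP_NORETRY);

	if (!buf)
		buf = vmalloc(size);	/* contiguity not required here */
	return buf;	/* free with kvfree(), which handles both cases */
}
```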
GFP_KERNEL | __GFP_NOFAIL — loop until success
The allocator retries indefinitely. This should be used only when failure is genuinely unacceptable and the kernel will be unable to continue (e.g., critical boot-time structures). Never use for high-order allocations.
When to use each allocator
| Situation | Recommended allocator |
|---|---|
| Small struct in process context | kzalloc(sizeof(*obj), GFP_KERNEL) |
| Small struct in interrupt context | kmalloc(sizeof(*obj), GFP_ATOMIC) |
| Large buffer (>1 page), non-DMA | vmalloc(size) |
| Unknown size, process context | kvmalloc(size, GFP_KERNEL) |
| Many identical objects, high rate | kmem_cache_alloc(cache, GFP_KERNEL) |
| DMA-capable buffer | dma_alloc_coherent(dev, size, &dma_addr, GFP_KERNEL) |
| Raw page(s) needed | alloc_pages(GFP_KERNEL, order) |
Allocation failure handling
Every allocation can fail. Ignoring the return value is a bug; when __must_check (the compiler's warn_unused_result attribute) is applied to an allocator prototype, the compiler warns about unchecked allocations.
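The standard pattern is to check for NULL immediately and propagate -ENOMEM to the caller (struct foo and setup_state below are hypothetical):

```c
#include <linux/errno.h>
#include <linux/slab.h>

struct foo {
	int id;
};

static int setup_state(struct foo **out)
{
	struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

	if (!f)
		return -ENOMEM;	/* propagate failure; never dereference NULL */

	*out = f;
	return 0;
}
```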
