Concurrency in the kernel is pervasive: interrupt handlers, softirqs, worker threads, and multiple CPU cores can all execute your code simultaneously. The kernel provides a layered set of synchronization primitives, each with distinct performance and context constraints. Picking the wrong primitive is a common source of deadlocks, priority inversions, and subtle data corruption. This reference covers the full set: atomic operations for lockless counters, spinlocks for short critical sections in any context, mutexes for sleeping locks in process context, RCU for high-read-frequency data, semaphores for resource counting, completions for event signaling, and memory barriers for ordering guarantees.
Atomic operations
Lock-free integer operations for counters and flags.
Spinlocks
Busy-wait locks safe in interrupt and atomic context.
Mutexes
Sleeping mutual exclusion for process context.
RCU
Read-Copy-Update for fast, scalable read-mostly data.
Semaphores
Counting semaphores for resource limiting.
Completions
One-shot event notification between kernel threads.
Choosing a primitive
The right primitive depends on two constraints: who holds the lock (only process context, or also interrupt handlers and softirqs?) and how long the critical section is (microseconds or potentially milliseconds?).
Atomic operations — no lock needed
Use for simple integer counters and boolean flags shared between contexts. No lock is taken, although a contended atomic read-modify-write still costs cache-line traffic between CPUs. Suitable in any context, including hardirq.
Spinlock — short critical section, any context
Disables preemption on the local CPU while held. Can be taken in hardirq context (with spin_lock_irqsave). The critical section must not sleep. Ideal for protecting small data structures for a handful of instructions.
Mutex — longer critical section, process context only
A task that finds the mutex contended sleeps until the holder releases it. Cannot be acquired in interrupt context. Use when the critical section might allocate memory, call copy_from_user(), or do I/O. Preferred over spinlocks when sleeping is safe.
RCU — read-mostly data, very high read frequency
Readers are never blocked. Writers make a copy, update it, then wait for a grace period before freeing the old version. Ideal for routing tables, device lists, and other data read on every packet or syscall.
Semaphore — resource counting or one-way signaling
Like a mutex but with a count > 1. Use for limiting concurrent access to a pool of resources (e.g., at most N concurrent DMA transfers). For binary signaling between threads, prefer struct completion.
Completion — one-shot event signaling
One thread waits; another signals. Cleaner than a semaphore initialised to 0 for this pattern. Use for “wait for hardware to finish” or “wait for thread to start”.
Atomic operations
atomic_t wraps a 32-bit integer with CPU-level atomic read-modify-write instructions. No lock is required; all operations are indivisible with respect to other CPUs.
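As a rough illustration of the read-modify-write pattern, the following is a userspace sketch using C11 `<stdatomic.h>` rather than the kernel API. The `struct obj` type and the `obj_*` helpers are hypothetical names; the comments note which kernel call each one loosely corresponds to.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object carrying a reference count (illustrative only). */
struct obj {
    atomic_int refs;
};

/* Loosely corresponds to atomic_set(&o->refs, 1). */
static void obj_init(struct obj *o)
{
    atomic_init(&o->refs, 1);
}

/* Loosely corresponds to atomic_inc(): one indivisible increment. */
static void obj_get(struct obj *o)
{
    atomic_fetch_add(&o->refs, 1);
}

/* Loosely corresponds to atomic_dec_and_test(): returns true only for
 * the caller that drops the count to zero, so exactly one caller
 * learns that it should free the object. */
static bool obj_put(struct obj *o)
{
    return atomic_fetch_sub(&o->refs, 1) == 1;
}

/* Loosely corresponds to atomic_read(). */
static int obj_refs(struct obj *o)
{
    return atomic_load(&o->refs);
}
```

Because the decrement and the zero test happen in a single atomic step, two CPUs dropping the last two references cannot both conclude that they released the final one.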
64-bit atomics
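The kernel offers atomic64_t, with atomic64_*() operations, for values that do not fit in 32 bits. The sketch below shows the same idea in userspace C11; the `rx_bytes` counter and helper names are hypothetical and the code is an analogue, not kernel code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical byte counter that would overflow 32 bits quickly. */
static _Atomic int64_t rx_bytes;

/* Loosely corresponds to atomic64_add(len, &rx_bytes). */
static void account_rx(int64_t len)
{
    atomic_fetch_add(&rx_bytes, len);
}

/* Loosely corresponds to atomic64_read(&rx_bytes). */
static int64_t rx_total(void)
{
    return atomic_load(&rx_bytes);
}
```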
Spinlocks
Spinlocks are the correct choice when a lock must be acquired from interrupt context, or when the critical section is very short (tens of instructions). A CPU waiting for the lock busy-waits instead of sleeping, so long critical sections waste cycles on every other CPU contending for the lock.
Choosing the right spinlock variant
| Interrupt handler takes the lock? | Softirq/tasklet takes it? | Use |
|---|---|---|
| No | No | spin_lock / spin_unlock |
| No | Yes | spin_lock_bh / spin_unlock_bh |
| Yes | — | spin_lock_irqsave / spin_unlock_irqrestore |
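To make the busy-wait behaviour concrete, here is a minimal test-and-set spinlock in userspace C11 with pthreads. It is a sketch of the general technique, not the kernel's implementation, and it cannot model the interrupt or bottom-half masking that distinguishes the variants in the table. All names (`tas_lock`, `run_workers`, and so on) are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Minimal test-and-set spinlock (illustrative, not the kernel's). */
struct tas_lock {
    atomic_flag held;
};

static struct tas_lock lock = { ATOMIC_FLAG_INIT };
static long counter;

static void tas_lock_acquire(struct tas_lock *l)
{
    /* Busy-wait: retry until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;
}

static void tas_lock_release(struct tas_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        tas_lock_acquire(&lock);
        counter++;              /* short critical section, never sleeps */
        tas_lock_release(&lock);
    }
    return NULL;
}

/* Runs four contending threads and returns the final counter value. */
static long run_workers(void)
{
    pthread_t t[4];

    counter = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Every thread that loses the race burns CPU time in the `while` loop, which is why the critical section has to stay short.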
Mutexes
A mutex serializes access to a resource in process context. If the mutex is already held when a thread calls mutex_lock(), the thread is put to sleep and only woken when the mutex is released. This makes mutexes unsuitable for interrupt handlers but ideal for protecting state that requires memory allocation, userspace copies, or device I/O.
A mutex must be released by the same task that acquired it. This is enforced in debug builds. If you need a lock that can be released by a different task (e.g., producer/consumer), use a semaphore.
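The ownership rule can be observed in userspace with a POSIX error-checking mutex, which is loosely comparable to a kernel built with mutex debugging. This is an analogue under that assumption, not kernel code; the function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t m;

/* Runs in a second thread that does not own the mutex. */
static void *foreign_unlock(void *result)
{
    *(int *)result = pthread_mutex_unlock(&m);
    return NULL;
}

/* Returns the error code seen by the non-owner's unlock attempt. */
static int unlock_from_other_thread(void)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int rc = 0;

    pthread_mutexattr_init(&attr);
    /* Error-checking type: unlock by a non-owner is reported, not ignored. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);                 /* owner: this thread */
    pthread_create(&t, NULL, foreign_unlock, &rc);
    pthread_join(t, NULL);
    pthread_mutex_unlock(&m);               /* owner releases: allowed */
    pthread_mutex_destroy(&m);
    return rc;
}
```

With an error-checking mutex, the non-owner's unlock fails with `EPERM` and the lock stays held by its owner.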
RCU (Read-Copy-Update)
RCU is a synchronization mechanism optimized for data that is read far more often than it is written. Readers acquire no lock and are never blocked. Writers make a copy of the data, modify it, atomically swap in the new version, and then wait for all pre-existing readers to finish before freeing the old version.
Reader side
Writer side
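The copy, update, and publish steps can be sketched in userspace with a C11 atomic pointer. This is only an analogue: it assumes a single writer and does not model grace periods, so deciding when the old version may be freed is left to the caller (in the kernel, synchronize_rcu() provides that guarantee). `struct config` and the helper names are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical read-mostly configuration object. */
struct config {
    int mtu;
};

static _Atomic(struct config *) current_cfg;

/* Reader: load the pointer once and use only that snapshot.
 * Loosely mirrors rcu_dereference() inside rcu_read_lock(). */
static int reader_mtu(void)
{
    struct config *c = atomic_load_explicit(&current_cfg, memory_order_acquire);

    return c ? c->mtu : -1;
}

/* Writer (single writer assumed): copy, modify the private copy, then
 * publish it with release ordering, loosely mirroring
 * rcu_assign_pointer(). Returns the previous version, or NULL if there
 * was none or the allocation failed (nothing is published then). */
static struct config *writer_set_mtu(int mtu)
{
    struct config *old = atomic_load_explicit(&current_cfg, memory_order_relaxed);
    struct config *new_cfg = malloc(sizeof(*new_cfg));

    if (!new_cfg)
        return NULL;
    if (old)
        *new_cfg = *old;        /* copy */
    new_cfg->mtu = mtu;         /* update the private copy */
    atomic_store_explicit(&current_cfg, new_cfg, memory_order_release);
    return old;                 /* free only once no reader can hold it */
}
```

Readers never block and always see either the complete old version or the complete new one, never a half-updated structure.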
Asynchronous callback (call_rcu)
When the writer cannot sleep (e.g., it holds a spinlock), use call_rcu() to schedule the free callback asynchronously after the grace period.
RCU-protected lists
Semaphores
Semaphores maintain an integer count. down() decrements and blocks if the count would go negative; up() increments and wakes a waiter. Binary semaphores (count = 1) behave like sleeping mutexes but without the ownership constraint: any task can call up().
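The counting behaviour can be sketched with a POSIX semaphore in userspace. This is an analogue rather than kernel code: `sem_wait()`, `sem_trywait()`, and `sem_post()` stand in for down(), down_trylock(), and up(), and the "DMA slot" pool with its helper names is hypothetical.

```c
#include <semaphore.h>
#include <stdbool.h>

/* Hypothetical pool of DMA slots guarded by a counting semaphore. */
static sem_t dma_slots;

/* Start with n free slots. */
static void pool_init(unsigned int n)
{
    sem_init(&dma_slots, 0, n);
}

/* Blocking acquire: sleeps while no slot is free. */
static void slot_get(void)
{
    sem_wait(&dma_slots);
}

/* Non-blocking acquire: returns false instead of sleeping. */
static bool slot_try_get(void)
{
    return sem_trywait(&dma_slots) == 0;
}

/* Release: any thread may call this, not only the one that acquired. */
static void slot_put(void)
{
    sem_post(&dma_slots);
}
```

With a pool of two slots, a third `slot_try_get()` fails until some thread calls `slot_put()`.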
For the common case of mutual exclusion in process context, prefer struct mutex over a binary semaphore. Mutex has better semantics (ownership tracking, priority inheritance on RT), more debugging support, and is faster on most architectures.
Completions
struct completion is the idiomatic way to signal a one-shot event from one kernel thread (or interrupt handler) to another. It is cleaner and more obvious in intent than a semaphore initialised to 0.
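The wait-and-signal pattern can be sketched in userspace with a mutex, a condition variable, and a flag. This is a stand-in for struct completion, not the kernel implementation; the comments note the loosely corresponding kernel calls, and all type and function names here are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for struct completion (illustrative only). */
struct completion_sketch {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

static struct completion_sketch hw_done = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};
static int result;

/* Loosely corresponds to wait_for_completion(). */
static void sketch_wait(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)                        /* guards against spurious wakeups */
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Loosely corresponds to complete(): record the event, wake a waiter. */
static void sketch_complete(struct completion_sketch *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Hypothetical "hardware" thread: produce a result, then signal. */
static void *fake_hw(void *arg)
{
    (void)arg;
    result = 42;
    sketch_complete(&hw_done);
    return NULL;
}

/* Starts the worker, waits for its signal, and returns its result. */
static int start_and_wait(void)
{
    pthread_t t;

    pthread_create(&t, NULL, fake_hw, NULL);
    sketch_wait(&hw_done);      /* blocks until fake_hw() signals */
    pthread_join(t, NULL);
    return result;
}
```

Because the flag records that the event happened, the waiter does not miss the signal even if the worker finishes before the wait begins.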
Memory barriers
The CPU and compiler may reorder memory operations for performance. Memory barriers enforce ordering constraints where the architecture’s relaxed memory model would otherwise allow reordering visible to other CPUs.
When barriers are needed
rcu_assign_pointer() and rcu_dereference() already include the appropriate barriers for their usage context. spin_lock() and mutex_lock() provide acquire ordering, and spin_unlock() and mutex_unlock() provide release ordering; this keeps accesses inside the critical section but does not amount to a full memory barrier. Manual barriers are needed mainly in lockless code that uses none of these helpers.
Barrier summary
| Barrier | Orders | Use case |
|---|---|---|
| smp_mb() | All loads and stores | General producer/consumer flag protocols |
| smp_rmb() | Loads only | Checking a flag then reading the data it guards |
| smp_wmb() | Stores only | Writing data then publishing a pointer to it |
| READ_ONCE(x) | Single load | Reading a variable shared without a lock |
| WRITE_ONCE(x, v) | Single store | Writing a variable shared without a lock |
| dma_wmb() | Stores to DMA-coherent memory | Writing descriptor fields before setting the bit that hands the descriptor to the device |
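The "write data, then publish a flag" protocol can be sketched in userspace with C11 fences. This is an analogue, not kernel code: the release fence stands in for smp_wmb(), the acquire fence for smp_rmb(), and the relaxed atomic accesses for WRITE_ONCE() and READ_ONCE(). The C11 fences are somewhat stronger than their kernel counterparts, and all names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;             /* plain data guarded by the flag */
static atomic_int ready;        /* publication flag */

/* Producer: write the data, then publish the flag. The release fence
 * keeps the payload store ordered before the flag store. */
static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

/* Consumer: observe the flag, then read the data. The acquire fence
 * keeps the payload load ordered after the flag load. */
static int consume(void)
{
    while (!atomic_load_explicit(&ready, memory_order_relaxed))
        ;                       /* spin until published */
    atomic_thread_fence(memory_order_acquire);
    return payload;
}

/* Runs one producer thread against a consumer on the calling thread. */
static int run_handoff(void)
{
    pthread_t t;
    int v;

    pthread_create(&t, NULL, producer, NULL);
    v = consume();
    pthread_join(t, NULL);
    return v;
}
```

The two fences work as a pair: dropping either one would permit the consumer to see the flag set while still reading a stale payload on a weakly ordered CPU.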
