From: Michael Bommarito <michael.bommarito@gmail.com>
To: Steffen Klassert <steffen.klassert@secunet.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
Eric Dumazet <edumazet@google.com>,
netdev@vger.kernel.org
Cc: "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Kuniyuki Iwashima <kuniyu@google.com>,
Maciej Zenczykowski <maze@google.com>,
Kees Cook <kees@kernel.org>, Jeff Layton <jlayton@kernel.org>,
"Gustavo A . R . Silva" <gustavoars@kernel.org>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Florian Westphal <fw@strlen.de>,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
linux-kernel@vger.kernel.org
Subject: [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets
Date: Tue, 12 May 2026 16:51:13 -0400
Message-ID: <cover.1778614451.git.michael.bommarito@gmail.com>
This series fixes a size_t underflow in net/ipv4/ah4.c:ah_output()
reachable when a raw IP_HDRINCL socket sends a packet with ihl < 5
through an xfrm AH policy. Originally triaged on security@kernel.org;
moving to netdev at Herbert's suggestion so nftables / netfilter
maintainers can weigh in on a related question (see "Open question"
below). Herbert also asked for the malformed packet to be rejected
upstream of AH rather than guarded at the AH consumer; that is
patch 1/2. v1's AH-side guard is kept here as 2/2 defense-in-depth.
Bug
---
In net/ipv4/ah4.c, ah_output_done() and ah_output() copy the IPv4
options area with

    if (top_iph->ihl != 5) {
            memcpy(dst, src, top_iph->ihl * 4 - sizeof(struct iphdr));
    }

The "!= 5" guard correctly excludes the no-options case but does
NOT exclude ihl < 5. For ihl in [0, 4], top_iph->ihl * 4 is less
than sizeof(struct iphdr) (20). Because sizeof yields a size_t, the
int value top_iph->ihl * 4 is converted to size_t before the
subtraction, which is then performed in unsigned arithmetic and
wraps instead of going negative. The resulting length is close to
SIZE_MAX and memcpy() walks off the slab allocation backing the
skb's network header.
The malformed packet arrives via raw_send_hdrinc() in net/ipv4/raw.c.
raw_send_hdrinc() validates "iphlen > length" but does not reject
"iphlen < sizeof(struct iphdr)". An IP_HDRINCL caller with
CAP_NET_RAW (acquirable in an unprivileged user+net namespace on a
distro kernel with CONFIG_USER_NS=y) can therefore craft an ihl < 5
packet; if a matching xfrm AH policy is installed on the outgoing
route, ah_output() runs on the crafted packet and panics the host
kernel.
The guard has been in place since 1da177e4c3f4 ("Linux-2.6.12-rc2",
2005). I found no prior fix in a lore search covering the last three
years, and no CVE recorded against the file.
Reproduction
------------
x86 + KASAN (QEMU KVM, net-next 7.1.0-rc2):
BUG: KASAN: out-of-bounds in ah_output+0x696/0x19e0
Read of size 18446744073709551596 at addr ffff88800bae9824 \
by task trigger_ah4_ihl/97
Call Trace:
__asan_memcpy+0x23/0x60
ah_output+0x696/0x19e0
xfrm_output_resume+0xdc8/0x6280
xfrm4_output+0xfe/0x4c0
raw_sendmsg+0x2531/0x26f0
__sys_sendto+0x32b/0x390
__x64_sys_sendto+0xdf/0x1f0
do_syscall_64+0xf3/0x6a0
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88800bae9800
which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 36 bytes inside of
1024-byte region [ffff88800bae9800, ffff88800bae9c00)
The read size 0xFFFFFFFFFFFFFFEC (SIZE_MAX - 19) is the
underflowed result of (top_iph->ihl * 4 - sizeof(struct iphdr))
for ihl = 0. Trigger: veth pair (loopback bypasses
xfrm_output), xfrm AH transport-mode policy, IP_HDRINCL
sendto() of a 128-byte packet with iph->ihl in [0, 4].
A container-only variant (CAP_NET_ADMIN container, no
--privileged, no host networking) panics the host kernel on a
stock distro kernel with CONFIG_INET_AH=m + module autoload.
Repro harness + container Dockerfile + console logs available
privately on request; not attached to this public posting.
Patches
-------
1/2 ipv4: raw: reject IP_HDRINCL packets with ihl < 5
Upstream-of-AH fix. An IPv4 header with ihl < 5 is malformed
by definition (RFC 791) and must not be allowed to continue
along the in-stack output path. This is the primary fix.
2/2 ipv4: ah: harden ah_output options-copy guard against ihl < 5
Defense-in-depth at the three memcpy sites in ah_output() and
ah_output_done(). Changes "if (top_iph->ihl != 5)" to
"if (top_iph->ihl > 5)" so a future path delivering an ihl < 5
packet cannot re-introduce the OOB access. With patch 1/2 in
place an IP_HDRINCL-crafted ihl < 5 packet should no longer
reach ah_output; this patch closes the OOB primitive
specifically at the AH consumer.
Open question for netfilter / netdev
------------------------------------
After patch 1/2 lands, a caller with CAP_NET_ADMIN can still
deliver an ihl < 5 packet into the post-LOCAL_OUT in-stack path by
attaching an nftables payload-set rule on NF_INET_LOCAL_OUT (or an
NFQUEUE reinject on the same hook) that rewrites byte 0 of the
IPv4 header after the raw_send_hdrinc / __ip_local_out validation
has run. Construction:

    nft add table ip mangle
    nft add chain ip mangle output { type filter hook output \
        priority -150 \; }
    nft add rule ip mangle output ip daddr <victim> \
        @nh,0,8 set 0x40

I reproduced this separately with nftables payload-set delivering an
ihl = 0 packet to xfrm4_output() and onward. Patch 2/2 covers the
AH consumer; other consumers that read iph->ihl after the LOCAL_OUT
hook may be similarly exposed and I have not enumerated them.
This is a question of direction rather than a fix proposal: does
basic iphdr re-sanitization after a header-mangling hook belong in the
netfilter machinery, in each in-stack consumer, or in both?
Michael Bommarito (2):
ipv4: raw: reject IP_HDRINCL packets with ihl < 5
ipv4: ah: harden ah_output options-copy guard against ihl < 5
net/ipv4/ah4.c | 6 +++---
net/ipv4/raw.c | 2 +-
2 files changed, 4 insertions(+), 4 deletions(-)
base-commit: 73d587ae684d176fac9db94173f77d78a794ea4f
--
2.53.0