From: Michael Bommarito <michael.bommarito@gmail.com>
To: Steffen Klassert <steffen.klassert@secunet.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Eric Dumazet <edumazet@google.com>,
	netdev@vger.kernel.org
Cc: "David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	Maciej Zenczykowski <maze@google.com>,
	Kees Cook <kees@kernel.org>, Jeff Layton <jlayton@kernel.org>,
	"Gustavo A . R . Silva" <gustavoars@kernel.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets
Date: Tue, 12 May 2026 16:51:13 -0400
Message-ID: <cover.1778614451.git.michael.bommarito@gmail.com>

This series fixes a size_t underflow in net/ipv4/ah4.c:ah_output()
reachable when a raw IP_HDRINCL socket sends a packet with ihl < 5
through an xfrm AH policy.  Originally triaged on security@kernel.org;
moving to netdev at Herbert's suggestion so nftables / netfilter
maintainers can weigh in on a related question (see "Open question"
below).  Herbert also asked for the malformed packet to be rejected
upstream of AH rather than guarded at the AH consumer; that is
patch 1/2.  v1's AH-side guard is kept here as 2/2 defense-in-depth.

Bug
---

In net/ipv4/ah4.c, ah_output_done() and ah_output() copy the IPv4
options area with

    if (top_iph->ihl != 5) {
        memcpy(dst, src, top_iph->ihl * 4 - sizeof(struct iphdr));
    }

The "!= 5" guard correctly excludes the no-options case but does
NOT exclude ihl < 5.  For ihl in [0, 4], top_iph->ihl * 4 is less
than sizeof(struct iphdr) (20).  Because sizeof yields a size_t, the
int operand is converted to size_t before the subtraction, so the
result wraps around instead of going negative.  The resulting length
is close to SIZE_MAX and memcpy walks off the slab allocation backing
the skb's network header.
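
To make the arithmetic concrete, here is a minimal userspace sketch of
the length expression.  It is not kernel code: options_copy_len() and
IPV4_MIN_HDR_LEN are hypothetical names, and the constant 20 stands in
for sizeof(struct iphdr).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: stand-in for sizeof(struct iphdr), which is 20 bytes. */
#define IPV4_MIN_HDR_LEN ((size_t)20)

/*
 * Hypothetical helper mirroring the length expression handed to memcpy().
 * ihl * 4 is an int, but IPV4_MIN_HDR_LEN is a size_t, so the int operand
 * is converted to size_t before the subtraction and the result wraps
 * around for ihl < 5.
 */
static size_t options_copy_len(int ihl)
{
	assert(ihl >= 0 && ihl <= 15);	/* ihl is a 4-bit header field */
	return ihl * 4 - IPV4_MIN_HDR_LEN;
}
```

Under these assumptions options_copy_len(0) evaluates to SIZE_MAX - 19
and options_copy_len(4) to SIZE_MAX - 3, while options_copy_len(5) is 0
and options_copy_len(6) is 4.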

The malformed packet arrives via raw_send_hdrinc() in net/ipv4/raw.c.
raw_send_hdrinc() rejects "iphlen > length" but does not reject
"iphlen < sizeof(struct iphdr)".  An IP_HDRINCL caller with
CAP_NET_RAW (acquirable in an unprivileged user+net namespace on a
distro kernel with CONFIG_USER_NS=y) can therefore craft an ihl < 5
packet; if a matching xfrm AH policy is installed on the outgoing
route, ah_output() runs on the crafted packet and panics the host
kernel.
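
The missing lower bound can be illustrated with a small userspace
sketch.  This is not the code from patch 1/2; hdrincl_ihl_ok() and
IPV4_MIN_HDR_LEN are hypothetical names, and the function only models
the two length checks discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for sizeof(struct iphdr), which is 20 bytes. */
#define IPV4_MIN_HDR_LEN ((size_t)20)

/*
 * Hypothetical check: accept a caller-supplied IPv4 header only if the
 * length it claims fits inside the buffer and covers the fixed header.
 */
static bool hdrincl_ihl_ok(unsigned int ihl, size_t length)
{
	size_t iphlen;

	assert(ihl <= 15);		/* ihl is a 4-bit header field */
	iphlen = (size_t)ihl * 4;

	if (iphlen > length)		/* upper bound: header fits the buffer */
		return false;
	if (iphlen < IPV4_MIN_HDR_LEN)	/* lower bound: the check that is missing */
		return false;
	return true;
}
```

With a 128-byte buffer this sketch rejects ihl 0 through 4 and accepts
ihl 5 through 15; with a 40-byte buffer it also rejects ihl = 15, whose
claimed 60-byte header would not fit.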

The guard has been in place since 1da177e4c3f4 ("Linux-2.6.12-rc2",
2005).  I found no prior fix on lore (3-year search window) and no
CVE against the file.

Reproduction
------------

x86 + KASAN (QEMU KVM, net-next 7.1.0-rc2):

  BUG: KASAN: out-of-bounds in ah_output+0x696/0x19e0
  Read of size 18446744073709551596 at addr ffff88800bae9824 \
      by task trigger_ah4_ihl/97
  Call Trace:
   __asan_memcpy+0x23/0x60
   ah_output+0x696/0x19e0
   xfrm_output_resume+0xdc8/0x6280
   xfrm4_output+0xfe/0x4c0
   raw_sendmsg+0x2531/0x26f0
   __sys_sendto+0x32b/0x390
   __x64_sys_sendto+0xdf/0x1f0
   do_syscall_64+0xf3/0x6a0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  The buggy address belongs to the object at ffff88800bae9800
   which belongs to the cache kmalloc-1k of size 1024
  The buggy address is located 36 bytes inside of
   1024-byte region [ffff88800bae9800, ffff88800bae9c00)

The read size 0xFFFFFFFFFFFFFFEC (SIZE_MAX - 19) is the
underflowed result of (top_iph->ihl * 4 - sizeof(struct iphdr))
for ihl = 0.  Trigger: veth pair (loopback bypasses
xfrm_output), xfrm AH transport-mode policy, IP_HDRINCL
sendto() of a 128-byte packet with iph->ihl in [0, 4].

A container-only variant (CAP_NET_ADMIN container, no
--privileged, no host networking) panics the host kernel on a
stock distro kernel with CONFIG_INET_AH=m + module autoload.
Repro harness + container Dockerfile + console logs available
privately on request; not attached to this public posting.

Patches
-------

1/2 ipv4: raw: reject IP_HDRINCL packets with ihl < 5

    Upstream-of-AH fix.  An IPv4 header with ihl < 5 is malformed
    by definition (RFC 791) and must not be allowed to continue
    along the in-stack output path.  This is the primary fix.

2/2 ipv4: ah: harden ah_output options-copy guard against ihl < 5

    Defense-in-depth at the three memcpy sites in ah_output() and
    ah_output_done().  Changes "if (top_iph->ihl != 5)" to
    "if (top_iph->ihl > 5)" so a future path delivering an ihl < 5
    packet cannot re-introduce the OOB access.  With patch 1/2 in
    place an IP_HDRINCL-crafted ihl < 5 packet should no longer
    reach ah_output; this patch closes the OOB primitive
    specifically at the AH consumer.
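
The difference between the two guards can be sketched as a pair of
predicates.  The function names are hypothetical and the code is a
userspace illustration only, not an excerpt from ah4.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Guard as quoted from the existing code: true for any ihl other than 5. */
static bool copies_options_old(unsigned int ihl)
{
	assert(ihl <= 15);	/* ihl is a 4-bit header field */
	return ihl != 5;
}

/* Guard as described for patch 2/2: true only when options are present. */
static bool copies_options_new(unsigned int ihl)
{
	assert(ihl <= 15);
	return ihl > 5;
}
```

The two predicates agree for ihl >= 5 and differ only for ihl in
[0, 4], where the old guard lets the copy proceed and the new one
skips it.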

Open question for netfilter / netdev
------------------------------------

After patch 1/2 lands, a caller with CAP_NET_ADMIN can still
deliver an ihl < 5 packet into the post-LOCAL_OUT in-stack path by
attaching an nftables payload-set rule on NF_INET_LOCAL_OUT (or an
NFQUEUE reinject on the same hook) that rewrites byte 0 of the
IPv4 header after the raw_send_hdrinc / __ip_local_out validation
has run.  Construction:

    nft add table ip mangle
    nft add chain ip mangle output { type filter hook output \
                                     priority -150 \; }
    nft add rule ip mangle output ip daddr <victim> \
                                  @nh,0,8 set 0x40

I reproduced this separately with nftables payload-set delivering an
ihl = 0 packet to xfrm4_output() and onward.  Patch 2/2 covers the
AH consumer; other consumers that read iph->ihl after the LOCAL_OUT
hook may be similarly exposed and I have not enumerated them.
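
For readers less familiar with the header layout, the sketch below
shows why writing 0x40 to byte 0 yields ihl = 0: the version sits in
the high nibble and ihl in the low nibble (RFC 791).  The helper names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Byte 0 of an IPv4 header: version in the high nibble, ihl in the low. */
static unsigned int ipv4_version(uint8_t byte0)
{
	return byte0 >> 4;
}

static unsigned int ipv4_ihl(uint8_t byte0)
{
	return byte0 & 0x0f;
}

/* Build byte 0 from its two 4-bit fields. */
static uint8_t ipv4_byte0(unsigned int version, unsigned int ihl)
{
	assert(version <= 15 && ihl <= 15);
	return (uint8_t)((version << 4) | ihl);
}
```

Here ipv4_version(0x40) is 4 and ipv4_ihl(0x40) is 0, whereas a
well-formed header without options starts with ipv4_byte0(4, 5), which
is 0x45.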

Direction question rather than a fix proposal: does basic iphdr
re-sanitization after a header-mangling hook belong in the netfilter
machinery, in each in-stack consumer, or both?

Michael Bommarito (2):
  ipv4: raw: reject IP_HDRINCL packets with ihl < 5
  ipv4: ah: harden ah_output options-copy guard against ihl < 5

 net/ipv4/ah4.c | 6 +++---
 net/ipv4/raw.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)


base-commit: 73d587ae684d176fac9db94173f77d78a794ea4f
--
2.53.0

