All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Bommarito <michael.bommarito@gmail.com>
To: Steffen Klassert <steffen.klassert@secunet.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Eric Dumazet <edumazet@google.com>,
	netdev@vger.kernel.org
Cc: "David S . Miller" <davem@davemloft.net>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Kuniyuki Iwashima <kuniyu@google.com>,
	Maciej Zenczykowski <maze@google.com>,
	Kees Cook <kees@kernel.org>, Jeff Layton <jlayton@kernel.org>,
	"Gustavo A . R . Silva" <gustavoars@kernel.org>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Florian Westphal <fw@strlen.de>,
	netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets
Date: Tue, 12 May 2026 16:51:13 -0400	[thread overview]
Message-ID: <cover.1778614451.git.michael.bommarito@gmail.com> (raw)

This series fixes a size_t underflow in net/ipv4/ah4.c:ah_output()
reachable when a raw IP_HDRINCL socket sends a packet with ihl < 5
through an xfrm AH policy.  Originally triaged on security@kernel.org;
moving to netdev at Herbert's suggestion so nftables / netfilter
maintainers can weigh in on a related question (see "Open question"
below).  Herbert also asked for the malformed packet to be rejected
upstream of AH rather than guarded at the AH consumer; that is
patch 1/2.  v1's AH-side guard is kept here as 2/2 defense-in-depth.

Bug
---

In net/ipv4/ah4.c, ah_output_done() and ah_output() copy the IPv4
options area with

    if (top_iph->ihl != 5) {
        memcpy(dst, src, top_iph->ihl * 4 - sizeof(struct iphdr));
    }

The "!= 5" guard correctly excludes the no-options case but does
NOT exclude ihl < 5.  For ihl in [0, 4], top_iph->ihl * 4 is less
than sizeof(struct iphdr) (20); the subtraction is computed as int
and becomes negative, then is implicitly converted to size_t at the
memcpy() call.  The resulting length is close to SIZE_MAX and
memcpy walks off the slab allocation backing the skb's network
header.

The malformed packet arrives via raw_send_hdrinc() in net/ipv4/raw.c.
raw_send_hdrinc() validates "iphlen > length" but does not reject
"iphlen < sizeof(struct iphdr)".  An IP_HDRINCL caller with
CAP_NET_RAW (acquirable in an unprivileged user+net namespace on a
distro kernel with CONFIG_USER_NS=y) can therefore craft an ihl < 5
packet; if a matching xfrm AH policy is installed on the outgoing
route, ah_output() runs on the crafted packet and panics the host
kernel.

The guard has been in place since 1da177e4c3f4 ("Linux-2.6.12-rc2",
2005).  No prior fix on lore (3-year window) and no CVE on the file.

Reproduction
------------

x86 + KASAN (QEMU KVM, net-next 7.1.0-rc2):

  BUG: KASAN: out-of-bounds in ah_output+0x696/0x19e0
  Read of size 18446744073709551596 at addr ffff88800bae9824 \
      by task trigger_ah4_ihl/97
  Call Trace:
   __asan_memcpy+0x23/0x60
   ah_output+0x696/0x19e0
   xfrm_output_resume+0xdc8/0x6280
   xfrm4_output+0xfe/0x4c0
   raw_sendmsg+0x2531/0x26f0
   __sys_sendto+0x32b/0x390
   __x64_sys_sendto+0xdf/0x1f0
   do_syscall_64+0xf3/0x6a0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  The buggy address belongs to the object at ffff88800bae9800
   which belongs to the cache kmalloc-1k of size 1024
  The buggy address is located 36 bytes inside of
   1024-byte region [ffff88800bae9800, ffff88800bae9c00)

The read size 0xFFFFFFFFFFFFFFEC (SIZE_MAX - 19) is the
underflowed result of (top_iph->ihl * 4 - sizeof(struct iphdr))
for ihl = 0.  Trigger: veth pair (loopback bypasses
xfrm_output), xfrm AH transport-mode policy, IP_HDRINCL
sendto() of a 128-byte packet with iph->ihl in [0, 4].

A container-only variant (CAP_NET_ADMIN container, no
--privileged, no host networking) panics the host kernel on a
stock distro kernel with CONFIG_INET_AH=m + module autoload.
Repro harness + container Dockerfile + console logs available
privately on request; not attached to this public posting.

Patches
-------

1/2 ipv4: raw: reject IP_HDRINCL packets with ihl < 5

    Upstream-of-AH fix.  An IPv4 header with ihl < 5 is malformed
    by definition (RFC 791) and must not be allowed to continue
    along the in-stack output path.  This is the primary fix.

2/2 ipv4: ah: harden ah_output options-copy guard against ihl < 5

    Defense-in-depth at the three memcpy sites in ah_output() and
    ah_output_done().  Changes "if (top_iph->ihl != 5)" to
    "if (top_iph->ihl > 5)" so a future path delivering an ihl < 5
    packet cannot re-introduce the OOB access.  With patch 1/2 in
    place an IP_HDRINCL-crafted ihl < 5 packet should no longer
    reach ah_output; this patch closes the OOB primitive
    specifically at the AH consumer.

Open question for netfilter / netdev
------------------------------------

After patch 1/2 lands, a caller with CAP_NET_ADMIN can still
deliver an ihl < 5 packet into the post-LOCAL_OUT in-stack path by
attaching an nftables payload-set rule on NF_INET_LOCAL_OUT (or an
NFQUEUE reinject on the same hook) that rewrites byte 0 of the
IPv4 header after the raw_send_hdrinc / __ip_local_out validation
has run.  Construction:

    nft add table ip mangle
    nft add chain ip mangle output { type filter hook output \
                                     priority -150 \; }
    nft add rule ip mangle output ip daddr <victim> \
                                  @nh,0,8 set 0x40

I reproduced this separately with nftables payload-set delivering an
ihl = 0 packet to xfrm4_output() and onward.  Patch 2/2 covers the
AH consumer; other consumers that read iph->ihl after the LOCAL_OUT
hook may be similarly exposed and I have not enumerated them.

Direction question rather than a fix proposal: does basic iphdr
re-sanitization after a header-mangling hook belong in the netfilter
machinery, in each in-stack consumer, or both?

Michael Bommarito (2):
  ipv4: raw: reject IP_HDRINCL packets with ihl < 5
  ipv4: ah: harden ah_output options-copy guard against ihl < 5

 net/ipv4/ah4.c | 6 +++---
 net/ipv4/raw.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)


base-commit: 73d587ae684d176fac9db94173f77d78a794ea4f
--
2.53.0


             reply	other threads:[~2026-05-12 20:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-12 20:51 Michael Bommarito [this message]
2026-05-12 20:51 ` [PATCH net 1/2] ipv4: raw: reject IP_HDRINCL packets with ihl < 5 Michael Bommarito
2026-05-12 20:51 ` [PATCH net 2/2] ipv4: ah: harden ah_output options-copy guard against " Michael Bommarito
2026-05-15  4:20   ` Herbert Xu
2026-05-12 22:34 ` [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets Pablo Neira Ayuso
2026-05-12 23:05   ` Michael Bommarito
2026-05-15  4:22   ` Herbert Xu
2026-05-15 22:59 ` Jakub Kicinski
2026-05-15 23:00 ` patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1778614451.git.michael.bommarito@gmail.com \
    --to=michael.bommarito@gmail.com \
    --cc=coreteam@netfilter.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=gustavoars@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=jlayton@kernel.org \
    --cc=kees@kernel.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maze@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.