Netdev List
* [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets
From: Michael Bommarito @ 2026-05-12 20:51 UTC
  To: Steffen Klassert, Herbert Xu, Eric Dumazet, netdev
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima,
	Maciej Zenczykowski, Kees Cook, Jeff Layton,
	Gustavo A . R . Silva, Pablo Neira Ayuso, Florian Westphal,
	netfilter-devel, coreteam, linux-kernel

This series fixes a size_t underflow in net/ipv4/ah4.c:ah_output()
reachable when a raw IP_HDRINCL socket sends a packet with ihl < 5
through an xfrm AH policy.  Originally triaged on security@kernel.org;
moving to netdev at Herbert's suggestion so nftables / netfilter
maintainers can weigh in on a related question (see "Open question"
below).  Herbert also asked for the malformed packet to be rejected
upstream of AH rather than guarded at the AH consumer; that is
patch 1/2.  v1's AH-side guard is kept here as 2/2 defense-in-depth.

Bug
---

In net/ipv4/ah4.c, ah_output_done() and ah_output() copy the IPv4
options area with

    if (top_iph->ihl != 5) {
        memcpy(dst, src, top_iph->ihl * 4 - sizeof(struct iphdr));
    }

The "!= 5" guard correctly excludes the no-options case but does
NOT exclude ihl < 5.  For ihl in [0, 4], top_iph->ihl * 4 is less
than sizeof(struct iphdr) (20).  Because sizeof yields a size_t, the
int product is converted to size_t before the subtraction, which
then wraps around instead of going negative.  The resulting memcpy()
length is close to SIZE_MAX and memcpy walks off the slab allocation
backing the skb's network header.
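
A minimal userspace sketch of that length computation, assuming a
20-byte stand-in for sizeof(struct iphdr).  options_copy_len() and
IPHDR_MIN_LEN are hypothetical names for illustration, not kernel
identifiers:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stands in for sizeof(struct iphdr), i.e. 20 bytes. */
#define IPHDR_MIN_LEN ((size_t)20)

/*
 * Hypothetical helper mirroring the length expression at the memcpy()
 * sites: ihl * 4 - sizeof(struct iphdr).
 */
static size_t options_copy_len(unsigned int ihl)
{
	/*
	 * The left operand is converted to size_t before the subtraction,
	 * so a mathematically negative result wraps modulo SIZE_MAX + 1.
	 */
	return (size_t)(ihl * 4) - IPHDR_MIN_LEN;
}
```

For ihl = 0 the helper returns SIZE_MAX - 19, for ihl = 5 it returns
0, and for ihl = 6 it returns 4 (one 32-bit word of options).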

The malformed packet arrives via raw_send_hdrinc() in net/ipv4/raw.c.
raw_send_hdrinc() validates "iphlen > length" but does not reject
"iphlen < sizeof(struct iphdr)".  An IP_HDRINCL caller with
CAP_NET_RAW (acquirable in an unprivileged user+net namespace on a
distro kernel with CONFIG_USER_NS=y) can therefore craft an ihl < 5
packet; if a matching xfrm AH policy is installed on the outgoing
route, ah_output() runs on the crafted packet and panics the host
kernel.
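
The validation gap can be modeled in userspace as two bounds checks.
This is a sketch under assumptions, not the raw.c code:
hdrincl_len_ok() and IPHDR_MIN_LEN are hypothetical names, the first
test stands for the existing "iphlen > length" check, and the second
stands for the lower bound that is currently missing:

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stands in for sizeof(struct iphdr), i.e. 20 bytes. */
#define IPHDR_MIN_LEN ((size_t)20)

/*
 * Hypothetical model of header-length validation for an IP_HDRINCL
 * packet of `length` bytes whose header claims `ihl` 32-bit words.
 */
static bool hdrincl_len_ok(unsigned int ihl, size_t length)
{
	size_t iphlen = (size_t)ihl * 4;

	if (iphlen > length)		/* header claims more than was sent */
		return false;
	if (iphlen < IPHDR_MIN_LEN)	/* header shorter than the fixed part */
		return false;
	return true;
}
```

With only the first test, a 128-byte packet with ihl = 0 passes; with
both tests it is rejected while ihl = 5 still passes.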

The guard has been in place since 1da177e4c3f4 ("Linux-2.6.12-rc2",
2005).  I found no prior fix on lore (3-year search window) and no
CVE against the file.

Reproduction
------------

x86 + KASAN (QEMU KVM, net-next 7.1.0-rc2):

  BUG: KASAN: out-of-bounds in ah_output+0x696/0x19e0
  Read of size 18446744073709551596 at addr ffff88800bae9824 \
      by task trigger_ah4_ihl/97
  Call Trace:
   __asan_memcpy+0x23/0x60
   ah_output+0x696/0x19e0
   xfrm_output_resume+0xdc8/0x6280
   xfrm4_output+0xfe/0x4c0
   raw_sendmsg+0x2531/0x26f0
   __sys_sendto+0x32b/0x390
   __x64_sys_sendto+0xdf/0x1f0
   do_syscall_64+0xf3/0x6a0
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  The buggy address belongs to the object at ffff88800bae9800
   which belongs to the cache kmalloc-1k of size 1024
  The buggy address is located 36 bytes inside of
   1024-byte region [ffff88800bae9800, ffff88800bae9c00)

The read size 0xFFFFFFFFFFFFFFEC (SIZE_MAX - 19) is the
underflowed result of (top_iph->ihl * 4 - sizeof(struct iphdr))
for ihl = 0.  Trigger: veth pair (loopback bypasses
xfrm_output), xfrm AH transport-mode policy, IP_HDRINCL
sendto() of a 128-byte packet with iph->ihl in [0, 4].

A container-only variant (CAP_NET_ADMIN container, no
--privileged, no host networking) panics the host kernel on a
stock distro kernel with CONFIG_INET_AH=m + module autoload.
Repro harness + container Dockerfile + console logs available
privately on request; not attached to this public posting.

Patches
-------

1/2 ipv4: raw: reject IP_HDRINCL packets with ihl < 5

    Upstream-of-AH fix.  An IPv4 header with ihl < 5 is malformed
    by definition (RFC 791) and must not be allowed to continue
    along the in-stack output path.  This is the primary fix.

2/2 ipv4: ah: harden ah_output options-copy guard against ihl < 5

    Defense-in-depth at the three memcpy sites in ah_output() and
    ah_output_done().  Changes "if (top_iph->ihl != 5)" to
    "if (top_iph->ihl > 5)" so a future path delivering an ihl < 5
    packet cannot re-introduce the OOB access.  With patch 1/2 in
    place an IP_HDRINCL-crafted ihl < 5 packet should no longer
    reach ah_output; this patch closes the OOB primitive
    specifically at the AH consumer.
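
The effect of the guard change in 2/2 can be illustrated by comparing
the two predicates over the 4-bit ihl range.  This is a userspace
sketch; copies_options_old() and copies_options_new() are
hypothetical names, not functions in ah4.c:

```c
#include <stdbool.h>

/* Guard as described for the current code: "ihl != 5". */
static bool copies_options_old(unsigned int ihl)
{
	return ihl != 5;
}

/* Guard as described for patch 2/2: "ihl > 5". */
static bool copies_options_new(unsigned int ihl)
{
	return ihl > 5;
}
```

The predicates agree for ihl in [5, 15] and differ only for ihl in
[0, 4], where the old guard takes the copy path and the new one
skips it.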

Open question for netfilter / netdev
------------------------------------

After patch 1/2 lands, a caller with CAP_NET_ADMIN can still
deliver an ihl < 5 packet into the post-LOCAL_OUT in-stack path by
attaching an nftables payload-set rule on NF_INET_LOCAL_OUT (or an
NFQUEUE reinject on the same hook) that rewrites byte 0 of the
IPv4 header after the raw_send_hdrinc / __ip_local_out validation
has run.  Construction:

    nft add table ip mangle
    nft add chain ip mangle output { type filter hook output \
                                     priority -150 \; }
    nft add rule ip mangle output ip daddr <victim> \
                                  @nh,0,8 set 0x40

I reproduced this separately with nftables payload-set delivering an
ihl = 0 packet to xfrm4_output() and onward.  Patch 2/2 covers the
AH consumer; other consumers that read iph->ihl after the LOCAL_OUT
hook may be similarly exposed and I have not enumerated them.
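
For readers less familiar with the header layout: byte 0 of an IPv4
header carries the version in the high nibble and ihl in the low
nibble, so writing 0x40 there keeps version 4 and sets ihl to 0.  A
small sketch of the decoding, with hypothetical helper names:

```c
#include <stdint.h>

/* Version is the high nibble of the first header byte. */
static unsigned int ip_version_from_byte0(uint8_t b0)
{
	return b0 >> 4;
}

/* ihl (header length in 32-bit words) is the low nibble. */
static unsigned int ip_ihl_from_byte0(uint8_t b0)
{
	return b0 & 0x0f;
}
```

A conventional option-less header starts with 0x45 (version 4,
ihl 5); the rule above rewrites that byte to 0x40 (version 4, ihl 0).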

Direction question rather than a fix proposal: does basic iphdr
re-sanitization after a header-mangling hook belong in the netfilter
machinery, in each in-stack consumer, or both?

Michael Bommarito (2):
  ipv4: raw: reject IP_HDRINCL packets with ihl < 5
  ipv4: ah: harden ah_output options-copy guard against ihl < 5

 net/ipv4/ah4.c | 6 +++---
 net/ipv4/raw.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)


base-commit: 73d587ae684d176fac9db94173f77d78a794ea4f
--
2.53.0



Thread overview: 5+ messages
2026-05-12 20:51 [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets Michael Bommarito
2026-05-12 20:51 ` [PATCH net 1/2] ipv4: raw: reject IP_HDRINCL packets with ihl < 5 Michael Bommarito
2026-05-12 20:51 ` [PATCH net 2/2] ipv4: ah: harden ah_output options-copy guard against " Michael Bommarito
2026-05-12 22:34 ` [PATCH net 0/2] ipv4: harden against ihl < 5 IP_HDRINCL packets Pablo Neira Ayuso
2026-05-12 23:05   ` Michael Bommarito
