* [PATCH net v2 1/4] ipv4: validate ip_options length in __ip_options_echo() against skb tail
2026-05-24 4:14 [PATCH net v2 0/4] net: trust-after-modification fixes for IPv4 options + netlabel Qi Tang
@ 2026-05-24 4:14 ` Qi Tang
2026-05-25 14:47 ` Ido Schimmel
2026-05-24 4:14 ` [PATCH net v2 2/4] ipv4: ipmr: clamp ip_hdrlen against skb_headlen in ipmr_cache_report Qi Tang
` (2 subsequent siblings)
3 siblings, 1 reply; 6+ messages in thread
From: Qi Tang @ 2026-05-24 4:14 UTC
To: davem, kuba, pabeni, edumazet
Cc: netdev, fw, lyutoon, stable, Qi Tang, David Ahern, Ido Schimmel,
Simon Horman

__ip_options_echo() re-reads each option length byte (RR/TS/SRR/CIPSO)
from skb->data when building the echoed options into a 40-byte
__data[] buffer. __ip_options_compile() saved only the option offset
into IPCB(skb)->opt, not the length. An nftables LOCAL_IN payload
write reachable from an unprivileged user namespace can mutate the
length byte between parse and recvmsg, turning a parse-time validated
7-byte option into a 255-byte read.

    unsigned char optbuf[sizeof(struct ip_options) + 40];
    /* in __ip_options_echo: */
    optlen = sptr[sopt->rr + 1];            /* re-read; nft can mutate */
    memcpy(dptr, sptr + sopt->rr, optlen);  /* into 40-byte buffer */

The destination is a stack buffer in ip_cmsg_recv_retopts() and a
DEFINE_RAW_FLEX() buffer in icmp.c / ip_output.c sized
IP_OPTIONS_DATA_FIXED_SIZE (40). KASAN reports a stack-out-of-bounds
write of size 255:

    BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x7fc/0x1310
    Write of size 255 at addr ffff88800a657950
     __asan_memcpy+0x3c/0x60
     __ip_options_echo+0x7fc/0x1310
     ip_cmsg_recv_offset+0x58b/0xd10
     udp_recvmsg+0x8da/0xc20
     ____sys_recvmsg+0x1b1/0x620

Validate that each re-read option length stays within
skb_tail_pointer(skb) before the memcpy.
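
To make the stale-validation pattern concrete, here is a minimal
userspace sketch in plain C, not kernel code. struct parsed_opt,
parse_opt() and reread_len() are hypothetical names invented for this
illustration; they only model the fact that the parser remembers the
offset while the consumer reads the length byte again.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the parse result: only the offset is remembered. */
struct parsed_opt {
    size_t off;
};

/*
 * Parse-time validation: the option [type][len][data...] at 'off' must lie
 * entirely inside an options area of 'area_len' bytes.
 * Returns 0 and records the offset on success, -1 otherwise.
 */
static int parse_opt(const uint8_t *area, size_t area_len, size_t off,
                     struct parsed_opt *out)
{
    size_t len;

    if (off >= area_len || area_len - off < 2)
        return -1;
    len = area[off + 1];
    if (len < 2 || len > area_len - off)
        return -1;
    out->off = off; /* the validated length itself is not stored */
    return 0;
}

/* Consume-time re-read: trusts whatever the length byte holds now. */
static size_t reread_len(const uint8_t *area, const struct parsed_opt *p)
{
    return area[p->off + 1];
}
```

If a write to the buffer lands between the two calls, reread_len()
reports the mutated value even though parse_opt() accepted the
original one, which is why the consumer needs its own bounds check.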
Cc: stable@vger.kernel.org
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/ipv4/ip_options.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index be8815ce3ac24..1cc6096e6dd9d 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -91,6 +91,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
 
 	if (sopt->rr) {
 		optlen = sptr[sopt->rr+1];
+		if (sptr + sopt->rr + optlen > skb_tail_pointer(skb))
+			return -EINVAL;
 		soffset = sptr[sopt->rr+2];
 		dopt->rr = dopt->optlen + sizeof(struct iphdr);
 		memcpy(dptr, sptr+sopt->rr, optlen);
@@ -105,6 +107,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
 	}
 	if (sopt->ts) {
 		optlen = sptr[sopt->ts+1];
+		if (sptr + sopt->ts + optlen > skb_tail_pointer(skb))
+			return -EINVAL;
 		soffset = sptr[sopt->ts+2];
 		dopt->ts = dopt->optlen + sizeof(struct iphdr);
 		memcpy(dptr, sptr+sopt->ts, optlen);
@@ -145,6 +149,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
 		__be32 faddr;
 
 		optlen = start[1];
+		if (start + optlen > skb_tail_pointer(skb))
+			return -EINVAL;
 		soffset = start[2];
 		doffset = 0;
 		if (soffset > optlen)
@@ -174,6 +180,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
 	}
 	if (sopt->cipso) {
 		optlen = sptr[sopt->cipso+1];
+		if (sptr + sopt->cipso + optlen > skb_tail_pointer(skb))
+			return -EINVAL;
 		dopt->cipso = dopt->optlen+sizeof(struct iphdr);
 		memcpy(dptr, sptr+sopt->cipso, optlen);
 		dptr += optlen;
--
2.47.3
* Re: [PATCH net v2 1/4] ipv4: validate ip_options length in __ip_options_echo() against skb tail
From: Ido Schimmel @ 2026-05-25 14:47 UTC
To: Qi Tang
Cc: davem, kuba, pabeni, edumazet, netdev, fw, lyutoon, stable,
David Ahern, Simon Horman
On Sun, May 24, 2026 at 12:14:35PM +0800, Qi Tang wrote:
> __ip_options_echo() re-reads each option length byte (RR/TS/SRR/CIPSO)
> from skb->data when building the echoed options into a 40-byte
> __data[] buffer. __ip_options_compile() saved only the option offset
> into IPCB(skb)->opt, not the length. An nftables LOCAL_IN payload
> write reachable from an unprivileged user namespace can mutate the
> length byte between parse and recvmsg, turning a parse-time validated
> 7-byte option into a 255-byte read.
>
>     unsigned char optbuf[sizeof(struct ip_options) + 40];
>     /* in __ip_options_echo: */
>     optlen = sptr[sopt->rr + 1];            /* re-read; nft can mutate */
>     memcpy(dptr, sptr + sopt->rr, optlen);  /* into 40-byte buffer */
>
> The destination is a stack buffer in ip_cmsg_recv_retopts() and a
> DEFINE_RAW_FLEX() buffer in icmp.c / ip_output.c sized
> IP_OPTIONS_DATA_FIXED_SIZE (40). KASAN reports a stack-out-of-bounds
> write of size 255:
>
>     BUG: KASAN: stack-out-of-bounds in __ip_options_echo+0x7fc/0x1310
>     Write of size 255 at addr ffff88800a657950
>      __asan_memcpy+0x3c/0x60
>      __ip_options_echo+0x7fc/0x1310
>      ip_cmsg_recv_offset+0x58b/0xd10
>      udp_recvmsg+0x8da/0xc20
>      ____sys_recvmsg+0x1b1/0x620
>
> Validate that each re-read option length stays within
> skb_tail_pointer(skb) before the memcpy.
>
> Cc: stable@vger.kernel.org
> Reported-by: Qi Tang <tpluszz77@gmail.com>
> Reported-by: Tong Liu <lyutoon@gmail.com>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Qi Tang <tpluszz77@gmail.com>
> ---
> net/ipv4/ip_options.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
> index be8815ce3ac24..1cc6096e6dd9d 100644
> --- a/net/ipv4/ip_options.c
> +++ b/net/ipv4/ip_options.c
> @@ -91,6 +91,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
>
>  	if (sopt->rr) {
>  		optlen = sptr[sopt->rr+1];
> +		if (sptr + sopt->rr + optlen > skb_tail_pointer(skb))
> +			return -EINVAL;

Both Sashiko instances flag valid issues. Please go over them. The most
obvious issues are:

1. This check only avoids reading past the skb's linear buffer. The
memcpy() below can still overflow the destination buffer which is only
40 bytes.

2. There is no validation against the original IP options length
(sopt->optlen), so we might be echoing bytes from the skb payload (past
the IP options).

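As a rough illustration of the two bounds listed above, a userspace
sketch (not a proposed kernel change) could look like the following.
echo_one_option(), OPT_DATA_MAX and all parameters are hypothetical
names. Offsets are taken relative to the start of the options area
purely for simplicity, which is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_DATA_MAX 40 /* hypothetical name for the 40-byte destination */

/*
 * Copy one option of the form [type][len][data...] found at src[off].
 * src_optlen models the options length validated at parse time (the role
 * of sopt->optlen in point 2); dst_used is how much of dst is already
 * filled (point 1).
 * Returns the number of bytes copied, or -1 if either bound is violated.
 */
static int echo_one_option(uint8_t *dst, size_t dst_used,
                           const uint8_t *src, size_t src_optlen, size_t off)
{
    size_t len;

    /* The type and length bytes themselves must be inside the options. */
    if (off > src_optlen || src_optlen - off < 2)
        return -1;

    len = src[off + 1]; /* re-read length, possibly mutated after parse */

    /* Point 2: the option must not extend past the parsed options area. */
    if (len < 2 || len > src_optlen - off)
        return -1;

    /* Point 1: the option must fit in what is left of the destination. */
    if (dst_used > OPT_DATA_MAX || len > OPT_DATA_MAX - dst_used)
        return -1;

    memcpy(dst + dst_used, src + off, len);
    return (int)len;
}
```

Note that a check against the end of the source buffer alone would
accept a 41-byte option sitting in a 64-byte buffer, while the
destination bound in this sketch rejects it.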
>  		soffset = sptr[sopt->rr+2];
>  		dopt->rr = dopt->optlen + sizeof(struct iphdr);
>  		memcpy(dptr, sptr+sopt->rr, optlen);
> @@ -105,6 +107,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
>  	}
>  	if (sopt->ts) {
>  		optlen = sptr[sopt->ts+1];
> +		if (sptr + sopt->ts + optlen > skb_tail_pointer(skb))
> +			return -EINVAL;
>  		soffset = sptr[sopt->ts+2];
>  		dopt->ts = dopt->optlen + sizeof(struct iphdr);
>  		memcpy(dptr, sptr+sopt->ts, optlen);
> @@ -145,6 +149,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
>  		__be32 faddr;
>
>  		optlen = start[1];
> +		if (start + optlen > skb_tail_pointer(skb))
> +			return -EINVAL;
>  		soffset = start[2];
>  		doffset = 0;
>  		if (soffset > optlen)
> @@ -174,6 +180,8 @@ int __ip_options_echo(struct net *net, struct ip_options *dopt,
>  	}
>  	if (sopt->cipso) {
>  		optlen = sptr[sopt->cipso+1];
> +		if (sptr + sopt->cipso + optlen > skb_tail_pointer(skb))
> +			return -EINVAL;
>  		dopt->cipso = dopt->optlen+sizeof(struct iphdr);
>  		memcpy(dptr, sptr+sopt->cipso, optlen);
>  		dptr += optlen;
> --
> 2.47.3
>
* [PATCH net v2 2/4] ipv4: ipmr: clamp ip_hdrlen against skb_headlen in ipmr_cache_report
From: Qi Tang @ 2026-05-24 4:14 UTC
To: davem, kuba, pabeni, edumazet
Cc: netdev, fw, lyutoon, stable, Qi Tang, David Ahern, Ido Schimmel,
Simon Horman

ipmr_cache_report() copies ip_hdrlen(pkt) bytes from pkt->data into
a freshly allocated 128-byte skb that is delivered to userspace via
the mrouted IGMP raw socket and via igmpmsg_netlink_event:

    const int ihl = ip_hdrlen(pkt);
    ...
    skb_put(skb, ihl);
    skb_copy_to_linear_data(skb, pkt->data, ihl);

ip_rcv_core() validates iph->ihl and pskb_may_pull()s ihl*4 bytes at
parse time. An nftables PRE_ROUTING payload write reachable from an
unprivileged user namespace can flip the ihl nibble from 5 to 15
between parse and ipmr_cache_report(). When the original skb is
non-linear (received via a NIC driver that uses paged frags), only
the parse-time ihl*4 = 20 bytes are in the linear region; the
consumer copies 60 bytes, and the extra 40 bytes are read from
skb_shared_info or adjacent slab memory and queued back to userspace,
a kernel heap-content infoleak. PoC observation: recvfrom on the
mroute socket returns 28 bytes without mutation, 68 bytes with
mutation (40 extra bytes leaked).

Clamp ihl against skb_headlen(pkt) so only bytes actually present
in the linear region are copied.
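
The arithmetic behind the clamp can be checked with a small userspace
sketch. This is an illustration only, not the kernel helpers:
claimed_ihl_bytes() and clamped_copy_len() are hypothetical names, and
linear_len stands in for what skb_headlen() reports.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Header length in bytes as claimed by the first byte of an IPv4 header:
 * the low nibble is the IHL field, counted in 32-bit words.
 */
static size_t claimed_ihl_bytes(const uint8_t *hdr)
{
    return (size_t)(hdr[0] & 0x0f) * 4;
}

/*
 * Number of header bytes that may safely be copied: the claimed length,
 * clamped to the bytes actually present in the linear region.
 */
static size_t clamped_copy_len(const uint8_t *hdr, size_t linear_len)
{
    size_t ihl = claimed_ihl_bytes(hdr);

    return ihl < linear_len ? ihl : linear_len;
}
```

With a first byte of 0x45 the claimed length is 20 bytes. Flipping the
nibble to 0x4f raises the claim to 60 bytes, while the clamp keeps the
copy at the 20 bytes that are really in the linear region.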
Cc: stable@vger.kernel.org
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/ipv4/ipmr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c
index 2628cd3a93a68..b40f3dd8f650f 100644
--- a/net/ipv4/ipmr.c
+++ b/net/ipv4/ipmr.c
@@ -1056,7 +1056,7 @@ static void ipmr_cache_resolve(struct net *net, struct mr_table *mrt,
 static int ipmr_cache_report(const struct mr_table *mrt,
 			     struct sk_buff *pkt, vifi_t vifi, int assert)
 {
-	const int ihl = ip_hdrlen(pkt);
+	const int ihl = min_t(int, ip_hdrlen(pkt), skb_headlen(pkt));
 	struct sock *mroute_sk;
 	struct igmphdr *igmp;
 	struct igmpmsg *msg;
--
2.47.3
* [PATCH net v2 3/4] netlabel: validate CALIPSO option against skb tail in netlbl_skbuff_getattr
From: Qi Tang @ 2026-05-24 4:14 UTC
To: davem, kuba, pabeni, edumazet
Cc: netdev, fw, lyutoon, stable, Qi Tang, Paul Moore, Simon Horman,
Huw Davies, linux-security-module

netlbl_skbuff_getattr() locates the CALIPSO option in the IPv6 HBH
header via calipso_optptr() and hands the bare pointer to
calipso_getattr() -> calipso_opt_getattr(). The consumer re-reads
calipso[1] (option data length) and calipso[6] (cat_len/4) and walks
calipso + 10 for cat_len bytes via netlbl_bitmap_walk().

ipv6_hop_calipso() validates these bytes only at parse time inside
ipv6_parse_hopopts(). An nftables PRE_ROUTING payload write reachable
from an unprivileged user namespace can rewrite both bytes between
parse and the SELinux peer-label consume path
(selinux_sock_rcv_skb_compat -> selinux_netlbl_sock_rcv_skb ->
netlbl_skbuff_getattr). The self-consistency check
(cat_len + 8 > len) inside calipso_opt_getattr() is defeated by
mutating both bytes consistently, allowing a ~232-byte
slab-out-of-bounds read from calipso + 10 whose set bits become MLS
categories driving the access decision.

netlbl_skbuff_getattr() has the skb; gate the consume on the option
fitting within skb_tail_pointer(). The IPv6 option layout is
type(1) + length(1) + length bytes of data, so requiring
ptr + 2 + ptr[1] <= skb_tail covers the option and its embedded
bitmap. When the bounds check fails the packet has been mutated
after parse, so return -EINVAL rather than fall through to the
unlabeled path.

Runtime confirmation (SELinux compat path with selinux=1 enforcing=0
and a CALIPSO DOI added via netlabelctl): Udp6InDatagrams increments
to 1 with the mutated cat_len, showing
selinux_socket_sock_rcv_skb -> netlbl_skbuff_getattr ->
calipso_opt_getattr -> netlbl_bitmap_walk runs end-to-end past the
option's true bound; with this patch the consume path returns
-EINVAL at the bounds check and the counter stays 0.
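
The ptr + 2 + ptr[1] <= skb_tail condition described above can be
modelled in userspace as follows. This is only a sketch under stated
assumptions: tlv_fits() is a hypothetical name, and buf/buf_len stand
in for the skb data and its tail.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Check that a type(1) + length(1) + data option starting at buf[off]
 * lies entirely inside a buffer of buf_len bytes.
 * Returns 1 if the option fits, 0 otherwise.
 */
static int tlv_fits(const uint8_t *buf, size_t buf_len, size_t off)
{
    /* The two header bytes must be readable before the length is used. */
    if (off > buf_len || buf_len - off < 2)
        return 0;

    /* The data length read from the option must fit in what remains. */
    return (size_t)buf[off + 1] <= buf_len - off - 2;
}
```

The header check comes first so that the length byte is never read
from beyond the buffer, matching the two-step order used in the patch.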
Cc: stable@vger.kernel.org
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Fixes: 2917f57b6bc1 ("calipso: Allow the lsm to label the skbuff directly.")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/netlabel/netlabel_kapi.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/net/netlabel/netlabel_kapi.c b/net/netlabel/netlabel_kapi.c
index 3583fa63dd01f..d0d6220b8d59d 100644
--- a/net/netlabel/netlabel_kapi.c
+++ b/net/netlabel/netlabel_kapi.c
@@ -1399,11 +1399,22 @@ int netlbl_skbuff_getattr(const struct sk_buff *skb,
 			return 0;
 		break;
 #if IS_ENABLED(CONFIG_IPV6)
-	case AF_INET6:
+	case AF_INET6: {
+		const unsigned char *tail = skb_tail_pointer(skb);
+		u8 opt_data_len;
+
 		ptr = calipso_optptr(skb);
-		if (ptr && calipso_getattr(ptr, secattr) == 0)
+		if (!ptr)
+			break;
+		if (ptr + 2 > tail)
+			return -EINVAL;
+		opt_data_len = ptr[1]; /* IPv6 option data length */
+		if (ptr + 2 + opt_data_len > tail)
+			return -EINVAL;
+		if (calipso_getattr(ptr, secattr) == 0)
 			return 0;
 		break;
+	}
 #endif /* IPv6 */
 	}
 
--
2.47.3
* [PATCH net v2 4/4] netlabel: validate CIPSO option against skb tail in netlbl_skbuff_getattr
From: Qi Tang @ 2026-05-24 4:14 UTC
To: davem, kuba, pabeni, edumazet
Cc: netdev, fw, lyutoon, stable, Qi Tang, Paul Moore, Simon Horman,
linux-security-module

netlbl_skbuff_getattr() locates the CIPSO option in the IPv4 IP header
via cipso_v4_optptr() and hands the bare pointer to cipso_v4_getattr().
The consumer re-reads cipso[1] (option length), cipso[6] (tag type),
and then cipso_v4_parsetag_*() re-reads further bytes from the skb.

__ip_options_compile() validates these bytes only at parse time. An
nftables LOCAL_IN payload write reachable from an unprivileged user
namespace can rewrite them after parse and before the SELinux/Smack
peer-label consume path (selinux_sock_rcv_skb_compat ->
selinux_netlbl_sock_rcv_skb -> netlbl_skbuff_getattr). This is the
IPv4 analogue of the CALIPSO IPv6 trust-after-modification fixed in
the previous patch: the tag parsers walk the option using
attacker-controlled length bytes, producing slab-out-of-bounds reads
whose contents feed into the MLS access decision.

Validate the option fits within skb_tail_pointer(skb) before invoking
cipso_v4_getattr(). The pre-tag-walk guard "ptr + 8 > tail" covers
the CIPSO option header (type + length + DOI = 6 bytes) plus the
first tag header (type + length = 2 bytes), which are the bytes
cipso_v4_getattr() reads to dispatch on the tag. When the bounds
check fails the packet has been mutated after parse, so return
-EINVAL rather than fall through to the unlabeled path.

Runtime confirmation (Smack peer-label policy + nft LOCAL_IN
mutation of tag_len): UdpInDatagrams increments to 1 and recvfrom
returns the payload, showing netlbl_skbuff_getattr ->
cipso_v4_getattr -> cipso_v4_parsetag_rbm -> netlbl_bitmap_walk runs
end-to-end past the option's true bound; with this patch the
consume path returns -EINVAL at the bounds check and the counter
stays 0.
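
The guard described above can be mirrored in a userspace sketch. It is
an illustration only: cipso_first_tag_fits(), CIPSO_HDR_LEN and
CIPSO_TAG_HDR_LEN are hypothetical names, and avail stands in for the
number of bytes between the option pointer and the skb tail.

```c
#include <stddef.h>
#include <stdint.h>

#define CIPSO_HDR_LEN     6 /* hypothetical: type + length + 4-byte DOI */
#define CIPSO_TAG_HDR_LEN 2 /* hypothetical: tag type + tag length */

/*
 * Mirror of the two conditions in the patch: the whole option and the
 * first tag must both end within 'avail' bytes counted from opt.
 * Returns 1 if both fit, 0 otherwise.
 */
static int cipso_first_tag_fits(const uint8_t *opt, size_t avail)
{
    size_t opt_len, tag_len;

    /* Option header plus first tag header must be readable. */
    if (avail < CIPSO_HDR_LEN + CIPSO_TAG_HDR_LEN)
        return 0;

    opt_len = opt[1]; /* total CIPSO option length */
    tag_len = opt[7]; /* first tag length */

    if (opt_len > avail)
        return 0;
    if (CIPSO_HDR_LEN + tag_len > avail)
        return 0;
    return 1;
}
```

Like the patch, this sketch bounds both lengths against the end of the
buffer only; it does not check that the first tag lies within the
option length.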
Cc: stable@vger.kernel.org
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Fixes: 04f81f0154e4 ("cipso: don't use IPCB() to locate the CIPSO IP option")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/netlabel/netlabel_kapi.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/net/netlabel/netlabel_kapi.c b/net/netlabel/netlabel_kapi.c
index d0d6220b8d59d..c2d3ea751f4e1 100644
--- a/net/netlabel/netlabel_kapi.c
+++ b/net/netlabel/netlabel_kapi.c
@@ -1393,11 +1393,24 @@ int netlbl_skbuff_getattr(const struct sk_buff *skb,
 	unsigned char *ptr;
 
 	switch (family) {
-	case AF_INET:
+	case AF_INET: {
+		const unsigned char *tail = skb_tail_pointer(skb);
+		u8 opt_len, tag_len;
+
 		ptr = cipso_v4_optptr(skb);
-		if (ptr && cipso_v4_getattr(ptr, secattr) == 0)
+		if (!ptr)
+			break;
+		/* CIPSO header (type+len+DOI = 6) + first tag header (type+len = 2) */
+		if (ptr + 8 > tail)
+			return -EINVAL;
+		opt_len = ptr[1]; /* total CIPSO option length */
+		tag_len = ptr[7]; /* first tag length */
+		if (ptr + opt_len > tail || ptr + 6 + tag_len > tail)
+			return -EINVAL;
+		if (cipso_v4_getattr(ptr, secattr) == 0)
 			return 0;
 		break;
+	}
 #if IS_ENABLED(CONFIG_IPV6)
 	case AF_INET6: {
 		const unsigned char *tail = skb_tail_pointer(skb);
--
2.47.3