* [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
@ 2026-05-28 11:12 Qi Tang
2026-05-28 13:48 ` Jiayuan Chen
0 siblings, 1 reply; 3+ messages in thread
From: Qi Tang @ 2026-05-28 11:12 UTC (permalink / raw)
To: davem, kuba, pabeni, edumazet
Cc: netdev, dsahern, idosch, horms, lyutoon, stable, Qi Tang
ip_forward_options() re-reads the RR/SRR/TS option length byte
optptr[1] and pointer byte optptr[2] from the skb on the forwarding
path and uses them as indexes for 4-byte writes via
ip_rt_get_source() (and a memcmp walk in the SRR branch).
__ip_options_compile() validates those bytes at parse time but stores
only the option's offset into IPCB(skb)->opt.{rr,srr,ts}. An nftables
FORWARD-chain payload mutation between parse and consume can rewrite
the bytes, driving the indexed writes out of bounds and overlapping
skb_shared_info. With optptr[2] mutated the write can land in
skb_shared_info.frag_list; the next time the skb is dropped
kfree_skb_list_reason() walks the forged list and frees an
attacker-controlled pointer, an arbitrary-free primitive (R15 below
is the corrupted frag_list):
BUG: unable to handle page fault for address: ffffed10195fd757
Oops: 0000 [#1] SMP KASAN NOPTI
RIP: 0010:kfree_skb_list_reason+0x167/0x5f0
RAX: 1ffff110195fd757 RBX: dffffc0000000000
R15: ffff8880cafebabe
CR2: ffffed10195fd757
Call Trace:
skb_release_data+0x565/0x820
sk_skb_reason_drop+0xc1/0x350
ip_rcv_core+0x7a8/0xcd0
ip_rcv+0x97/0x270
__netif_receive_skb_one_core+0x161/0x1b0
process_backlog+0x1c4/0x5b0
net_rx_action+0x934/0xfa0
Bound optptr[2] within optptr[1] before the RR and TS writes, and
clamp the SRR walk to the bytes actually present in the skb. Match
the existing error handling in this function: skip the malformed
option in place rather than returning, so the single ip_send_check()
at the end still recomputes the checksum for any option that was
updated earlier.
Cc: stable@vger.kernel.org
Reported-by: Qi Tang <tpluszz77@gmail.com>
Reported-by: Tong Liu <lyutoon@gmail.com>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/ipv4/ip_options.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index be8815ce3ac24..36a4e3cc39dd1 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -544,18 +544,26 @@ void ip_forward_options(struct sk_buff *skb)
if (opt->rr_needaddr) {
optptr = (unsigned char *)raw + opt->rr;
- ip_rt_get_source(&optptr[optptr[2]-5], skb, rt);
- opt->is_changed = 1;
+ if (optptr + optptr[1] <= skb_tail_pointer(skb) &&
+ optptr[2] >= 5 && optptr[2] <= optptr[1] + 1) {
+ ip_rt_get_source(&optptr[optptr[2] - 5], skb, rt);
+ opt->is_changed = 1;
+ }
}
if (opt->srr_is_hit) {
int srrptr, srrspace;
optptr = raw + opt->srr;
- for ( srrptr = optptr[2], srrspace = optptr[1];
- srrptr <= srrspace;
- srrptr += 4
- ) {
+ /* optptr[1] (option length) may have been rewritten after the
+ * parse-time check; if it now runs past the skb the option is
+ * malformed, so skip the source-route rewrite below.
+ */
+ srrspace = optptr[1];
+ if (optptr + srrspace > skb_tail_pointer(skb))
+ srrspace = 0;
+
+ for (srrptr = optptr[2]; srrptr <= srrspace; srrptr += 4) {
if (srrptr + 3 > srrspace)
break;
if (memcmp(&opt->nexthop, &optptr[srrptr-1], 4) == 0)
@@ -572,8 +580,11 @@ void ip_forward_options(struct sk_buff *skb)
}
if (opt->ts_needaddr) {
optptr = raw + opt->ts;
- ip_rt_get_source(&optptr[optptr[2]-9], skb, rt);
- opt->is_changed = 1;
+ if (optptr + optptr[1] <= skb_tail_pointer(skb) &&
+ optptr[2] >= 9 && optptr[2] <= optptr[1] + 5) {
+ ip_rt_get_source(&optptr[optptr[2] - 9], skb, rt);
+ opt->is_changed = 1;
+ }
}
}
if (opt->is_changed) {
--
2.47.3
^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-28 11:12 [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Qi Tang
@ 2026-05-28 13:48 ` Jiayuan Chen
2026-05-28 16:32 ` Qi Tang
0 siblings, 1 reply; 3+ messages in thread
From: Jiayuan Chen @ 2026-05-28 13:48 UTC (permalink / raw)
To: Qi Tang, davem, kuba, pabeni, edumazet
Cc: netdev, dsahern, idosch, horms, lyutoon, stable
On 5/28/26 7:12 PM, Qi Tang wrote:
> ip_forward_options() re-reads the RR/SRR/TS option length byte
> optptr[1] and pointer byte optptr[2] from the skb on the forwarding
> path and uses them as indexes for 4-byte writes via
> ip_rt_get_source() (and a memcmp walk in the SRR branch).
>
> __ip_options_compile() validates those bytes at parse time but stores
> only the option's offset into IPCB(skb)->opt.{rr,srr,ts}. An nftables
> FORWARD-chain payload mutation between parse and consume can rewrite
> the bytes, driving the indexed writes out of bounds and overlapping
> skb_shared_info. With optptr[2] mutated the write can land in
> skb_shared_info.frag_list; the next time the skb is dropped
> kfree_skb_list_reason() walks the forged list and frees an
> attacker-controlled pointer, an arbitrary-free primitive (R15 below
> is the corrupted frag_list):
>
> BUG: unable to handle page fault for address: ffffed10195fd757
> Oops: 0000 [#1] SMP KASAN NOPTI
> RIP: 0010:kfree_skb_list_reason+0x167/0x5f0
> RAX: 1ffff110195fd757 RBX: dffffc0000000000
> R15: ffff8880cafebabe
> CR2: ffffed10195fd757
> Call Trace:
> skb_release_data+0x565/0x820
> sk_skb_reason_drop+0xc1/0x350
> ip_rcv_core+0x7a8/0xcd0
> ip_rcv+0x97/0x270
> __netif_receive_skb_one_core+0x161/0x1b0
> process_backlog+0x1c4/0x5b0
> net_rx_action+0x934/0xfa0
The bug is real, but I'm curious what kernel version and driver you're on.
On my side the skb falls into SKB_SMALL_HEAD_CACHE_SIZE (704), so the
linear area
is pretty long, and optptr[2] maxes out at 255, which doesn't look like
it can reach frag_list.
May the driver use alloc_skb to allocate small liner buffer?
> Bound optptr[2] within optptr[1] before the RR and TS writes, and
> clamp the SRR walk to the bytes actually present in the skb. Match
> the existing error handling in this function: skip the malformed
> option in place rather than returning, so the single ip_send_check()
> at the end still recomputes the checksum for any option that was
> updated earlier.
>
> Cc: stable@vger.kernel.org
> Reported-by: Qi Tang <tpluszz77@gmail.com>
> Reported-by: Tong Liu <lyutoon@gmail.com>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Qi Tang <tpluszz77@gmail.com>
> ---
> net/ipv4/ip_options.c | 27 +++++++++++++++++++--------
> 1 file changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
> index be8815ce3ac24..36a4e3cc39dd1 100644
> --- a/net/ipv4/ip_options.c
> +++ b/net/ipv4/ip_options.c
> @@ -544,18 +544,26 @@ void ip_forward_options(struct sk_buff *skb)
>
> if (opt->rr_needaddr) {
> optptr = (unsigned char *)raw + opt->rr;
> - ip_rt_get_source(&optptr[optptr[2]-5], skb, rt);
> - opt->is_changed = 1;
> + if (optptr + optptr[1] <= skb_tail_pointer(skb) &&
> + optptr[2] >= 5 && optptr[2] <= optptr[1] + 1) {
> + ip_rt_get_source(&optptr[optptr[2] - 5], skb, rt);
> + opt->is_changed = 1;
> + }
> }
> if (opt->srr_is_hit) {
> int srrptr, srrspace;
>
> optptr = raw + opt->srr;
>
> - for ( srrptr = optptr[2], srrspace = optptr[1];
> - srrptr <= srrspace;
> - srrptr += 4
> - ) {
> + /* optptr[1] (option length) may have been rewritten after the
> + * parse-time check; if it now runs past the skb the option is
> + * malformed, so skip the source-route rewrite below.
> + */
> + srrspace = optptr[1];
> + if (optptr + srrspace > skb_tail_pointer(skb))
> + srrspace = 0;
> +
> + for (srrptr = optptr[2]; srrptr <= srrspace; srrptr += 4) {
> if (srrptr + 3 > srrspace)
> break;
> if (memcmp(&opt->nexthop, &optptr[srrptr-1], 4) == 0)
> @@ -572,8 +580,11 @@ void ip_forward_options(struct sk_buff *skb)
> }
> if (opt->ts_needaddr) {
> optptr = raw + opt->ts;
> - ip_rt_get_source(&optptr[optptr[2]-9], skb, rt);
> - opt->is_changed = 1;
> + if (optptr + optptr[1] <= skb_tail_pointer(skb) &&
> + optptr[2] >= 9 && optptr[2] <= optptr[1] + 5) {
> + ip_rt_get_source(&optptr[optptr[2] - 9], skb, rt);
> + opt->is_changed = 1;
> + }
> }
> }
> if (opt->is_changed) {
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-28 13:48 ` Jiayuan Chen
@ 2026-05-28 16:32 ` Qi Tang
0 siblings, 0 replies; 3+ messages in thread
From: Qi Tang @ 2026-05-28 16:32 UTC (permalink / raw)
To: jiayuan.chen
Cc: davem, kuba, pabeni, edumazet, netdev, dsahern, idosch, horms,
lyutoon, stable, Qi Tang
On 5/28/26 9:48 PM, Jiayuan Chen wrote:
> The bug is real, but I'm curious what kernel version and driver you're on.
> On my side the skb falls into SKB_SMALL_HEAD_CACHE_SIZE (704), so the
> linear area is pretty long, and optptr[2] maxes out at 255, which doesn't
> look like it can reach frag_list.
>
> May the driver use alloc_skb to allocate small liner buffer?
net.git at e1914add2799 (7.1-rc3), x86_64 + KASAN, plain QEMU, no special
driver. You're right that with a normal small nh_off the +250 write stays in
the linear area. We get the reach from a large nh_off instead.
The packet is forwarded over a VXLAN-over-IPv6 tunnel, so after decap the
inner IP packet still has the outer eth/IPv6/UDP/VXLAN/inner-eth in front of
it in the same head (nh_off ~112 here). Inner options are 12 NOPs + RR, so
opt->rr = 32, and nft rewrites the RR pointer byte to 0xff on the forward
hook:
nft add rule ip filter forward @nh,272,8 set 0xff
so ip_forward_options() does
write = head + nh_off + opt->rr + (0xff - 5)
= head + 112 + 32 + 250 = head + 394
with end = 384 that lands at shinfo+10, inside frag_list. ip_rt_get_source()
writes the route source there, and kfree_skb_list_reason() walks the corrupted
frag_list when the skb is dropped.
VXLAN was just convenient. Other paths likely work too: any encap that pushes
the options deeper, or a smaller head like you suggested. Pre-6.3 without
skb_small_head_cache a plain forwarded packet already has end=192. I can send
the PoC off-list if you want to repro.
Thanks,
Qi
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2026-05-28 16:32 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-28 11:12 [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Qi Tang
2026-05-28 13:48 ` Jiayuan Chen
2026-05-28 16:32 ` Qi Tang
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox