* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
[not found] ` <20260528163226.573363-1-tpluszz77@gmail.com>
@ 2026-05-29 2:55 ` Jiayuan Chen
2026-05-29 9:40 ` Florian Westphal
0 siblings, 1 reply; 5+ messages in thread
From: Jiayuan Chen @ 2026-05-29 2:55 UTC (permalink / raw)
To: Qi Tang, netfilter-devel, Pablo Neira Ayuso, Florian Westphal
Cc: davem, kuba, pabeni, edumazet, netdev, dsahern, idosch, horms,
lyutoon, stable
On 5/29/26 12:32 AM, Qi Tang wrote:
> On 5/28/26 9:48 PM, Jiayuan Chen wrote:
>> The bug is real, but I'm curious what kernel version and driver you're on.
>> On my side the skb falls into SKB_SMALL_HEAD_CACHE_SIZE (704), so the
>> linear area is pretty long, and optptr[2] maxes out at 255, which doesn't
>> look like it can reach frag_list.
>>
>> May the driver use alloc_skb to allocate small liner buffer?
> net.git at e1914add2799 (7.1-rc3), x86_64 + KASAN, plain QEMU, no special
> driver. You're right that with a normal small nh_off the +250 write stays in
> the linear area. We get the reach from a large nh_off instead.
>
> The packet is forwarded over a VXLAN-over-IPv6 tunnel, so after decap the
> inner IP packet still has the outer eth/IPv6/UDP/VXLAN/inner-eth in front of
> it in the same head (nh_off ~112 here). Inner options are 12 NOPs + RR, so
> opt->rr = 32, and nft rewrites the RR pointer byte to 0xff on the forward
> hook:
>
> nft add rule ip filter forward @nh,272,8 set 0xff
>
> so ip_forward_options() does
>
> write = head + nh_off + opt->rr + (0xff - 5)
> = head + 112 + 32 + 250 = head + 394
>
> with end = 384 that lands at shinfo+10, inside frag_list. ip_rt_get_source()
> writes the route source there, and kfree_skb_list_reason() walks the corrupted
> frag_list when the skb is dropped.
>
> VXLAN was just convenient. Other paths likely work too: any encap that pushes
> the options deeper, or a smaller head like you suggested. Pre-6.3 without
> skb_small_head_cache a plain forwarded packet already has end=192. I can send
> the PoC off-list if you want to repro.
>
> Thanks,
> Qi
An alternative would be to re-validate the options by calling
__ip_options_compile()
for writes targeting NFT_PAYLOAD_NETWORK_HEADER. Let's wait for the
netfilter maintainers' opinion.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-29 2:55 ` [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Jiayuan Chen
@ 2026-05-29 9:40 ` Florian Westphal
2026-05-29 10:43 ` Qi Tang
0 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2026-05-29 9:40 UTC (permalink / raw)
To: Jiayuan Chen
Cc: Qi Tang, netfilter-devel, Pablo Neira Ayuso, davem, kuba, pabeni,
edumazet, netdev, dsahern, idosch, horms, lyutoon, stable
Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> > VXLAN was just convenient. Other paths likely work too: any encap that pushes
> > the options deeper, or a smaller head like you suggested. Pre-6.3 without
> > skb_small_head_cache a plain forwarded packet already has end=192. I can send
> > the PoC off-list if you want to repro.
> >
> > Thanks,
> > Qi
>
>
> An alternative would be to re-validate the options by calling
> __ip_options_compile()
> for writes targeting NFT_PAYLOAD_NETWORK_HEADER. Let's wait for the
> netfilter maintainers' opinion.
I'm not sure netfilter is the only facility that can munge data this
way nowadays. The plan is to disable arbitrary network header rewrites:
https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-29 9:40 ` Florian Westphal
@ 2026-05-29 10:43 ` Qi Tang
2026-05-31 12:17 ` Ido Schimmel
0 siblings, 1 reply; 5+ messages in thread
From: Qi Tang @ 2026-05-29 10:43 UTC (permalink / raw)
To: fw
Cc: jiayuan.chen, pablo, netfilter-devel, davem, kuba, pabeni,
edumazet, netdev, dsahern, idosch, horms, lyutoon, stable,
Qi Tang
Florian Westphal <fw@strlen.de> wrote:
> I'm not sure netfilter is the only facility that can munge data this
> way nowadays. The plan is to disable arbitrary network header rewrites:
>
> https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
Agreed, the source side is the better place for this on mainline.
I went looking for other ways into the window between option compile
(ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
ip_forward_options(), and only found nft_payload and nfqueue at the
FORWARD hook. tc/cls-act run before compile (ingress) or after
ip_forward_options (egress), BPF at the netfilter hook can't write the
packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
path is blocked by the verifier. So your two-part restriction closes the
only in-tree triggers I could find.
This is just one consumer of the pattern; __ip_options_echo(),
ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
are the same, posted as a series here:
https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/
so if the source-side restriction is the way to go it probably makes
more sense to drop these consumer-side checks than to fix each site.
Your call.
Thanks,
Qi
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-29 10:43 ` Qi Tang
@ 2026-05-31 12:17 ` Ido Schimmel
2026-06-04 8:46 ` Paolo Abeni
0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2026-05-31 12:17 UTC (permalink / raw)
To: Qi Tang
Cc: fw, jiayuan.chen, pablo, netfilter-devel, davem, kuba, pabeni,
edumazet, netdev, dsahern, horms, lyutoon, stable
On Fri, May 29, 2026 at 06:43:56PM +0800, Qi Tang wrote:
> Florian Westphal <fw@strlen.de> wrote:
> > I'm not sure netfilter is the only facility that can munge data this
> > way nowadays. The plan is to disable arbitrary network header rewrites:
> >
> > https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
>
> Agreed, the source side is the better place for this on mainline.
>
> I went looking for other ways into the window between option compile
> (ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
> ip_forward_options(), and only found nft_payload and nfqueue at the
> FORWARD hook. tc/cls-act run before compile (ingress) or after
> ip_forward_options (egress), BPF at the netfilter hook can't write the
> packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
> path is blocked by the verifier. So your two-part restriction closes the
> only in-tree triggers I could find.
>
> This is just one consumer of the pattern; __ip_options_echo(),
> ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
> are the same, posted as a series here:
>
> https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/
>
> so if the source-side restriction is the way to go it probably makes
> more sense to drop these consumer-side checks than to fix each site.
> Your call.
FWIW, I agree that it would be better to go with Florian's patches
rather than always assuming that we can't trust the data that was parsed
from the IP options.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
2026-05-31 12:17 ` Ido Schimmel
@ 2026-06-04 8:46 ` Paolo Abeni
0 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2026-06-04 8:46 UTC (permalink / raw)
To: Ido Schimmel, Qi Tang
Cc: fw, jiayuan.chen, pablo, netfilter-devel, davem, kuba, edumazet,
netdev, dsahern, horms, lyutoon, stable
31/26 2:17 PM, Ido Schimmel wrote:
> On Fri, May 29, 2026 at 06:43:56PM +0800, Qi Tang wrote:
>> Florian Westphal <fw@strlen.de> wrote:
>>> I'm not sure netfilter is the only facility that can munge data this
>>> way nowadays. The plan is to disable arbitrary network header rewrites:
>>>
>>> https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
>>
>> Agreed, the source side is the better place for this on mainline.
>>
>> I went looking for other ways into the window between option compile
>> (ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
>> ip_forward_options(), and only found nft_payload and nfqueue at the
>> FORWARD hook. tc/cls-act run before compile (ingress) or after
>> ip_forward_options (egress), BPF at the netfilter hook can't write the
>> packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
>> path is blocked by the verifier. So your two-part restriction closes the
>> only in-tree triggers I could find.
>>
>> This is just one consumer of the pattern; __ip_options_echo(),
>> ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
>> are the same, posted as a series here:
>>
>> https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/
>>
>> so if the source-side restriction is the way to go it probably makes
>> more sense to drop these consumer-side checks than to fix each site.
>> Your call.
>
> FWIW, I agree that it would be better to go with Florian's patches
> rather than always assuming that we can't trust the data that was parsed
> from the IP options.
FTR, I agree with the above plan.
/P
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-06-04 8:46 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <b1447f76-0ca4-49b1-a1ba-2670dbbe5eea@linux.dev>
[not found] ` <20260528163226.573363-1-tpluszz77@gmail.com>
2026-05-29 2:55 ` [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Jiayuan Chen
2026-05-29 9:40 ` Florian Westphal
2026-05-29 10:43 ` Qi Tang
2026-05-31 12:17 ` Ido Schimmel
2026-06-04 8:46 ` Paolo Abeni
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox