Linux Netfilter development
 help / color / mirror / Atom feed
* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
       [not found] ` <20260528163226.573363-1-tpluszz77@gmail.com>
@ 2026-05-29  2:55   ` Jiayuan Chen
  2026-05-29  9:40     ` Florian Westphal
  0 siblings, 1 reply; 5+ messages in thread
From: Jiayuan Chen @ 2026-05-29  2:55 UTC (permalink / raw)
  To: Qi Tang, netfilter-devel, Pablo Neira Ayuso, Florian Westphal
  Cc: davem, kuba, pabeni, edumazet, netdev, dsahern, idosch, horms,
	lyutoon, stable


On 5/29/26 12:32 AM, Qi Tang wrote:
> On 5/28/26 9:48 PM, Jiayuan Chen wrote:
>> The bug is real, but I'm curious what kernel version and driver you're on.
>> On my side the skb falls into SKB_SMALL_HEAD_CACHE_SIZE (704), so the
>> linear area is pretty long, and optptr[2] maxes out at 255, which doesn't
>> look like it can reach frag_list.
>>
>> May the driver use alloc_skb to allocate small liner buffer?
> net.git at e1914add2799 (7.1-rc3), x86_64 + KASAN, plain QEMU, no special
> driver. You're right that with a normal small nh_off the +250 write stays in
> the linear area. We get the reach from a large nh_off instead.
>
> The packet is forwarded over a VXLAN-over-IPv6 tunnel, so after decap the
> inner IP packet still has the outer eth/IPv6/UDP/VXLAN/inner-eth in front of
> it in the same head (nh_off ~112 here). Inner options are 12 NOPs + RR, so
> opt->rr = 32, and nft rewrites the RR pointer byte to 0xff on the forward
> hook:
>
>    nft add rule ip filter forward @nh,272,8 set 0xff
>
> so ip_forward_options() does
>
>    write = head + nh_off + opt->rr + (0xff - 5)
>          = head + 112 + 32 + 250 = head + 394
>
> with end = 384 that lands at shinfo+10, inside frag_list. ip_rt_get_source()
> writes the route source there, and kfree_skb_list_reason() walks the corrupted
> frag_list when the skb is dropped.
>
> VXLAN was just convenient. Other paths likely work too: any encap that pushes
> the options deeper, or a smaller head like you suggested. Pre-6.3 without
> skb_small_head_cache a plain forwarded packet already has end=192. I can send
> the PoC off-list if you want to repro.
>
> Thanks,
> Qi


An alternative would be to re-validate the options by calling 
__ip_options_compile()
for writes targeting NFT_PAYLOAD_NETWORK_HEADER. Let's wait for the 
netfilter maintainers' opinion.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
  2026-05-29  2:55   ` [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Jiayuan Chen
@ 2026-05-29  9:40     ` Florian Westphal
  2026-05-29 10:43       ` Qi Tang
  0 siblings, 1 reply; 5+ messages in thread
From: Florian Westphal @ 2026-05-29  9:40 UTC (permalink / raw)
  To: Jiayuan Chen
  Cc: Qi Tang, netfilter-devel, Pablo Neira Ayuso, davem, kuba, pabeni,
	edumazet, netdev, dsahern, idosch, horms, lyutoon, stable

Jiayuan Chen <jiayuan.chen@linux.dev> wrote:
> > VXLAN was just convenient. Other paths likely work too: any encap that pushes
> > the options deeper, or a smaller head like you suggested. Pre-6.3 without
> > skb_small_head_cache a plain forwarded packet already has end=192. I can send
> > the PoC off-list if you want to repro.
> >
> > Thanks,
> > Qi
> 
> 
> An alternative would be to re-validate the options by calling 
> __ip_options_compile()
> for writes targeting NFT_PAYLOAD_NETWORK_HEADER. Let's wait for the 
> netfilter maintainers' opinion.

I'm not sure netfilter is the only facility that can munge data this
way nowadays.  The plan is to disable arbitrary network header rewrites:

https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
  2026-05-29  9:40     ` Florian Westphal
@ 2026-05-29 10:43       ` Qi Tang
  2026-05-31 12:17         ` Ido Schimmel
  0 siblings, 1 reply; 5+ messages in thread
From: Qi Tang @ 2026-05-29 10:43 UTC (permalink / raw)
  To: fw
  Cc: jiayuan.chen, pablo, netfilter-devel, davem, kuba, pabeni,
	edumazet, netdev, dsahern, idosch, horms, lyutoon, stable,
	Qi Tang

Florian Westphal <fw@strlen.de> wrote:
> I'm not sure netfilter is the only facility that can munge data this
> way nowadays.  The plan is to disable arbitrary network header rewrites:
>
> https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/

Agreed, the source side is the better place for this on mainline.

I went looking for other ways into the window between option compile
(ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
ip_forward_options(), and only found nft_payload and nfqueue at the
FORWARD hook. tc/cls-act run before compile (ingress) or after
ip_forward_options (egress), BPF at the netfilter hook can't write the
packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
path is blocked by the verifier. So your two-part restriction closes the
only in-tree triggers I could find.

This is just one consumer of the pattern; __ip_options_echo(),
ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
are the same, posted as a series here:

  https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/

so if the source-side restriction is the way to go it probably makes
more sense to drop these consumer-side checks than to fix each site.
Your call.

Thanks,
Qi

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
  2026-05-29 10:43       ` Qi Tang
@ 2026-05-31 12:17         ` Ido Schimmel
  2026-06-04  8:46           ` Paolo Abeni
  0 siblings, 1 reply; 5+ messages in thread
From: Ido Schimmel @ 2026-05-31 12:17 UTC (permalink / raw)
  To: Qi Tang
  Cc: fw, jiayuan.chen, pablo, netfilter-devel, davem, kuba, pabeni,
	edumazet, netdev, dsahern, horms, lyutoon, stable

On Fri, May 29, 2026 at 06:43:56PM +0800, Qi Tang wrote:
> Florian Westphal <fw@strlen.de> wrote:
> > I'm not sure netfilter is the only facility that can munge data this
> > way nowadays.  The plan is to disable arbitrary network header rewrites:
> >
> > https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
> 
> Agreed, the source side is the better place for this on mainline.
> 
> I went looking for other ways into the window between option compile
> (ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
> ip_forward_options(), and only found nft_payload and nfqueue at the
> FORWARD hook. tc/cls-act run before compile (ingress) or after
> ip_forward_options (egress), BPF at the netfilter hook can't write the
> packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
> path is blocked by the verifier. So your two-part restriction closes the
> only in-tree triggers I could find.
> 
> This is just one consumer of the pattern; __ip_options_echo(),
> ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
> are the same, posted as a series here:
> 
>   https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/
> 
> so if the source-side restriction is the way to go it probably makes
> more sense to drop these consumer-side checks than to fix each site.
> Your call.

FWIW, I agree that it would be better to go with Florian's patches
rather than always assuming that we can't trust the data that was parsed
from the IP options.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail
  2026-05-31 12:17         ` Ido Schimmel
@ 2026-06-04  8:46           ` Paolo Abeni
  0 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2026-06-04  8:46 UTC (permalink / raw)
  To: Ido Schimmel, Qi Tang
  Cc: fw, jiayuan.chen, pablo, netfilter-devel, davem, kuba, edumazet,
	netdev, dsahern, horms, lyutoon, stable

31/26 2:17 PM, Ido Schimmel wrote:
> On Fri, May 29, 2026 at 06:43:56PM +0800, Qi Tang wrote:
>> Florian Westphal <fw@strlen.de> wrote:
>>> I'm not sure netfilter is the only facility that can munge data this
>>> way nowadays.  The plan is to disable arbitrary network header rewrites:
>>>
>>> https://lore.kernel.org/netfilter-devel/20260527121147.22076-1-fw@strlen.de/
>>
>> Agreed, the source side is the better place for this on mainline.
>>
>> I went looking for other ways into the window between option compile
>> (ip_rcv_options() in ip_rcv_finish_core, after PREROUTING) and
>> ip_forward_options(), and only found nft_payload and nfqueue at the
>> FORWARD hook. tc/cls-act run before compile (ingress) or after
>> ip_forward_options (egress), BPF at the netfilter hook can't write the
>> packet (base helpers only, no bpf_skb_store_bytes), and the LWT_IN BPF
>> path is blocked by the verifier. So your two-part restriction closes the
>> only in-tree triggers I could find.
>>
>> This is just one consumer of the pattern; __ip_options_echo(),
>> ipmr_cache_report() and the CIPSO/CALIPSO netlbl_skbuff_getattr() path
>> are the same, posted as a series here:
>>
>>   https://lore.kernel.org/netdev/20260524041442.2432071-1-tpluszz77@gmail.com/
>>
>> so if the source-side restriction is the way to go it probably makes
>> more sense to drop these consumer-side checks than to fix each site.
>> Your call.
> 
> FWIW, I agree that it would be better to go with Florian's patches
> rather than always assuming that we can't trust the data that was parsed
> from the IP options.


FTR, I agree with the above plan.

/P


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-06-04  8:46 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <b1447f76-0ca4-49b1-a1ba-2670dbbe5eea@linux.dev>
     [not found] ` <20260528163226.573363-1-tpluszz77@gmail.com>
2026-05-29  2:55   ` [PATCH net] ipv4: validate ip_forward_options() option fields against skb tail Jiayuan Chen
2026-05-29  9:40     ` Florian Westphal
2026-05-29 10:43       ` Qi Tang
2026-05-31 12:17         ` Ido Schimmel
2026-06-04  8:46           ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox