* [PATCH net v3 1/2] net: clear the dst when changing skb protocol
From: Jakub Kicinski @ 2025-06-10 0:12 UTC (permalink / raw)
To: davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms,
willemdebruijn.kernel, maze, daniel, Jakub Kicinski, stable,
martin.lau, john.fastabend, eddyz87, sdf, haoluo, willemb,
william.xuanziyang, alan.maguire, bpf, shuah, linux-kselftest,
yonghong.song

A not-so-careful NAT46 BPF program can crash the kernel
if it indiscriminately flips ingress packets from v4 to v6:

  BUG: kernel NULL pointer dereference, address: 0000000000000000
    ip6_rcv_core (net/ipv6/ip6_input.c:190:20)
    ipv6_rcv (net/ipv6/ip6_input.c:306:8)
    process_backlog (net/core/dev.c:6186:4)
    napi_poll (net/core/dev.c:6906:9)
    net_rx_action (net/core/dev.c:7028:13)
    do_softirq (kernel/softirq.c:462:3)
    netif_rx (net/core/dev.c:5326:3)
    dev_loopback_xmit (net/core/dev.c:4015:2)
    ip_mc_finish_output (net/ipv4/ip_output.c:363:8)
    NF_HOOK (./include/linux/netfilter.h:314:9)
    ip_mc_output (net/ipv4/ip_output.c:400:5)
    dst_output (./include/net/dst.h:459:9)
    ip_local_out (net/ipv4/ip_output.c:130:9)
    ip_send_skb (net/ipv4/ip_output.c:1496:8)
    udp_send_skb (net/ipv4/udp.c:1040:8)
    udp_sendmsg (net/ipv4/udp.c:1328:10)

The output interface has a 4->6 program attached at ingress.
We try to loop the multicast skb back to the sending socket.
Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr
and changes skb->protocol to v6. We enter ip6_rcv_core which
tries to use skb_dst(). But the dst is still an IPv4 one left
after IPv4 mcast output.

Clear the dst in all BPF helpers which change the protocol.
Try to preserve metadata dsts, those may carry non-routing
metadata.

Cc: stable@vger.kernel.org
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: d219df60a70e ("bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()")
Fixes: 1b00e0dfe7d0 ("bpf: update skb->protocol in bpf_skb_net_grow")
Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v3:
 - go back to v1, the encap / decap which don't change proto
   will be added in -next
 - split out the test
v2: https://lore.kernel.org/20250607204734.1588964-1-kuba@kernel.org
 - drop on encap/decap
 - fix typo (protcol)
 - add the test to the Makefile
v1: https://lore.kernel.org/20250604210604.257036-1-kuba@kernel.org

I wonder if we should not skip ingress (tc_skip_classify?)
for looped back packets in the first place. But that doesn't
seem robust enough vs multiple redirections to solve the crash.

Ignoring LOOPBACK packets (like the NAT46 prog should) doesn't
work either, since BPF can change pkt_type arbitrarily.

CC: martin.lau@linux.dev
CC: daniel@iogearbox.net
CC: john.fastabend@gmail.com
CC: eddyz87@gmail.com
CC: sdf@fomichev.me
CC: haoluo@google.com
CC: willemb@google.com
CC: william.xuanziyang@huawei.com
CC: alan.maguire@oracle.com
CC: bpf@vger.kernel.org
CC: edumazet@google.com
CC: maze@google.com
CC: shuah@kernel.org
CC: linux-kselftest@vger.kernel.org
CC: yonghong.song@linux.dev
---
net/core/filter.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/net/core/filter.c b/net/core/filter.c
index 327ca73f9cd7..7a72f766aacf 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3233,6 +3233,13 @@ static const struct bpf_func_proto bpf_skb_vlan_pop_proto = {
 	.arg1_type = ARG_PTR_TO_CTX,
 };
 
+static void bpf_skb_change_protocol(struct sk_buff *skb, u16 proto)
+{
+	skb->protocol = htons(proto);
+	if (skb_valid_dst(skb))
+		skb_dst_drop(skb);
+}
+
 static int bpf_skb_generic_push(struct sk_buff *skb, u32 off, u32 len)
 {
 	/* Caller already did skb_cow() with len as headroom,
@@ -3329,7 +3336,7 @@ static int bpf_skb_proto_4_to_6(struct sk_buff *skb)
 		}
 	}
 
-	skb->protocol = htons(ETH_P_IPV6);
+	bpf_skb_change_protocol(skb, ETH_P_IPV6);
 	skb_clear_hash(skb);
 
 	return 0;
@@ -3359,7 +3366,7 @@ static int bpf_skb_proto_6_to_4(struct sk_buff *skb)
 		}
 	}
 
-	skb->protocol = htons(ETH_P_IP);
+	bpf_skb_change_protocol(skb, ETH_P_IP);
 	skb_clear_hash(skb);
 
 	return 0;
@@ -3550,10 +3557,10 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
 		/* Match skb->protocol to new outer l3 protocol */
 		if (skb->protocol == htons(ETH_P_IP) &&
 		    flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
-			skb->protocol = htons(ETH_P_IPV6);
+			bpf_skb_change_protocol(skb, ETH_P_IPV6);
 		else if (skb->protocol == htons(ETH_P_IPV6) &&
 			 flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4)
-			skb->protocol = htons(ETH_P_IP);
+			bpf_skb_change_protocol(skb, ETH_P_IP);
 	}
 
 	if (skb_is_gso(skb)) {
@@ -3606,10 +3613,10 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
 	/* Match skb->protocol to new outer l3 protocol */
 	if (skb->protocol == htons(ETH_P_IP) &&
 	    flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
-		skb->protocol = htons(ETH_P_IPV6);
+		bpf_skb_change_protocol(skb, ETH_P_IPV6);
 	else if (skb->protocol == htons(ETH_P_IPV6) &&
 		 flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
-		skb->protocol = htons(ETH_P_IP);
+		bpf_skb_change_protocol(skb, ETH_P_IP);
 
 	if (skb_is_gso(skb)) {
 		struct skb_shared_info *shinfo = skb_shinfo(skb);
--
2.49.0
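
For readers who want to poke at the logic outside the kernel, here is a rough userspace sketch of what the new helper does. It is not kernel code: `toy_skb`, `toy_dst`, the `TOY_ETH_P_*` constants and every `toy_*` function are hypothetical stand-ins invented for illustration, and the real `struct sk_buff` / `struct dst_entry` layouts and refcounting are deliberately left out. The only behaviour it tries to mirror is the one described in the commit message: changing the protocol drops a routing dst but keeps a metadata dst.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htons() */

/* EtherType values, renamed to avoid clashing with system headers. */
#define TOY_ETH_P_IP	0x0800
#define TOY_ETH_P_IPV6	0x86DD

/* Hypothetical stand-in for struct dst_entry; not the real layout. */
struct toy_dst {
	bool is_metadata;	/* models DST_METADATA being set in dst->flags */
	int family;		/* 4 or 6: which stack created this route */
};

/* Hypothetical stand-in for struct sk_buff; not the real layout. */
struct toy_skb {
	uint16_t protocol;	/* network byte order, like skb->protocol */
	struct toy_dst *dst;	/* models what skb_dst(skb) would return */
};

/* Models skb_valid_dst(): a dst is attached and it is not a metadata dst. */
static inline bool toy_valid_dst(const struct toy_skb *skb)
{
	return skb->dst != NULL && !skb->dst->is_metadata;
}

/* Models the patched helper: set the protocol, then drop a routing dst. */
static inline void toy_change_protocol(struct toy_skb *skb, uint16_t proto)
{
	skb->protocol = htons(proto);
	if (toy_valid_dst(skb))
		skb->dst = NULL;	/* the kernel calls skb_dst_drop() here */
}

/* Models the old behaviour: only the protocol field is rewritten. */
static inline void toy_change_protocol_old(struct toy_skb *skb, uint16_t proto)
{
	skb->protocol = htons(proto);
}

/*
 * True when a routing dst built by one address family is still attached
 * to an skb that now claims the other family. This is the state the
 * IPv6 receive path hit in the trace above.
 */
static inline bool toy_has_stale_dst(const struct toy_skb *skb)
{
	int family = skb->protocol == htons(TOY_ETH_P_IPV6) ? 6 : 4;

	return toy_valid_dst(skb) && skb->dst->family != family;
}
```

In this model, an skb carrying an IPv4 routing dst that goes through `toy_change_protocol_old(&skb, TOY_ETH_P_IPV6)` ends up with `toy_has_stale_dst()` returning true, while `toy_change_protocol()` leaves it with no dst at all. An skb whose dst has `is_metadata` set keeps its dst in both cases.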
* Re: [PATCH net v3 1/2] net: clear the dst when changing skb protocol
From: Willem de Bruijn @ 2025-06-10 13:21 UTC (permalink / raw)
To: Jakub Kicinski, davem
Cc: netdev, edumazet, pabeni, andrew+netdev, horms,
willemdebruijn.kernel, maze, daniel, Jakub Kicinski, stable,
martin.lau, john.fastabend, eddyz87, sdf, haoluo, willemb,
william.xuanziyang, alan.maguire, bpf, shuah, linux-kselftest,
yonghong.song
Jakub Kicinski wrote:
> A not-so-careful NAT46 BPF program can crash the kernel
> if it indiscriminately flips ingress packets from v4 to v6:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> ip6_rcv_core (net/ipv6/ip6_input.c:190:20)
> ipv6_rcv (net/ipv6/ip6_input.c:306:8)
> process_backlog (net/core/dev.c:6186:4)
> napi_poll (net/core/dev.c:6906:9)
> net_rx_action (net/core/dev.c:7028:13)
> do_softirq (kernel/softirq.c:462:3)
> netif_rx (net/core/dev.c:5326:3)
> dev_loopback_xmit (net/core/dev.c:4015:2)
> ip_mc_finish_output (net/ipv4/ip_output.c:363:8)
> NF_HOOK (./include/linux/netfilter.h:314:9)
> ip_mc_output (net/ipv4/ip_output.c:400:5)
> dst_output (./include/net/dst.h:459:9)
> ip_local_out (net/ipv4/ip_output.c:130:9)
> ip_send_skb (net/ipv4/ip_output.c:1496:8)
> udp_send_skb (net/ipv4/udp.c:1040:8)
> udp_sendmsg (net/ipv4/udp.c:1328:10)
>
> The output interface has a 4->6 program attached at ingress.
> We try to loop the multicast skb back to the sending socket.
> Ingress BPF runs as part of netif_rx(), pushes a valid v6 hdr
> and changes skb->protocol to v6. We enter ip6_rcv_core which
> tries to use skb_dst(). But the dst is still an IPv4 one left
> after IPv4 mcast output.
>
> Clear the dst in all BPF helpers which change the protocol.
> Try to preserve metadata dsts, those may carry non-routing
> metadata.
>
> Cc: stable@vger.kernel.org
> Reviewed-by: Maciej Żenczykowski <maze@google.com>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
> Fixes: d219df60a70e ("bpf: Add ipip6 and ip6ip decap support for bpf_skb_adjust_room()")
> Fixes: 1b00e0dfe7d0 ("bpf: update skb->protocol in bpf_skb_net_grow")
> Fixes: 6578171a7ff0 ("bpf: add bpf_skb_change_proto helper")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
* Re: [PATCH net v3 1/2] net: clear the dst when changing skb protocol
From: patchwork-bot+netdevbpf @ 2025-06-12 0:40 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms,
willemdebruijn.kernel, maze, daniel, stable, martin.lau,
john.fastabend, eddyz87, sdf, haoluo, willemb, william.xuanziyang,
alan.maguire, bpf, shuah, linux-kselftest, yonghong.song
Hello:
This series was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 9 Jun 2025 17:12:44 -0700 you wrote:
> A not-so-careful NAT46 BPF program can crash the kernel
> if it indiscriminately flips ingress packets from v4 to v6:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> ip6_rcv_core (net/ipv6/ip6_input.c:190:20)
> ipv6_rcv (net/ipv6/ip6_input.c:306:8)
> process_backlog (net/core/dev.c:6186:4)
> napi_poll (net/core/dev.c:6906:9)
> net_rx_action (net/core/dev.c:7028:13)
> do_softirq (kernel/softirq.c:462:3)
> netif_rx (net/core/dev.c:5326:3)
> dev_loopback_xmit (net/core/dev.c:4015:2)
> ip_mc_finish_output (net/ipv4/ip_output.c:363:8)
> NF_HOOK (./include/linux/netfilter.h:314:9)
> ip_mc_output (net/ipv4/ip_output.c:400:5)
> dst_output (./include/net/dst.h:459:9)
> ip_local_out (net/ipv4/ip_output.c:130:9)
> ip_send_skb (net/ipv4/ip_output.c:1496:8)
> udp_send_skb (net/ipv4/udp.c:1040:8)
> udp_sendmsg (net/ipv4/udp.c:1328:10)
>
> [...]
Here is the summary with links:
  - [net,v3,1/2] net: clear the dst when changing skb protocol
    https://git.kernel.org/netdev/net/c/ba9db6f907ac
  - [net,v3,2/2] selftests: net: add test case for NAT46 looping back dst
    https://git.kernel.org/netdev/net/c/567766954b2d

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html