* [PATCH net V1] veth: reduce XDP no_direct return section to fix race
@ 2025-11-19 16:28 Jesper Dangaard Brouer
From: Jesper Dangaard Brouer @ 2025-11-19 16:28 UTC (permalink / raw)
To: netdev, bigeasy
Cc: Jesper Dangaard Brouer, bpf, Eric Dumazet, David S. Miller,
Jakub Kicinski, Paolo Abeni, makita.toshiaki, toshiaki.makita1,
kernel-team, mfleming, maciej.fijalkowski, dtatulea, edumazet,
sdf, andrew+netdev, john.fastabend, ast, daniel
As explained in commit fa349e396e48 ("veth: Fix race with AF_XDP exposing
old or uninitialized descriptors"), for veth there is a chance that,
after napi_complete_done(), another CPU can manage to start another NAPI
instance running veth_poll(). For NAPI itself this is handled correctly,
as the napi_schedule_prep() check prevents multiple instances from being
scheduled, but the remaining code in veth_poll() can run concurrently
with the newly started NAPI instance.

The problem is that the xdp_set_return_frame_no_direct() /
xdp_clear_return_frame_no_direct() pair is not designed to be nested.
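
For reference, the helper pair amounts to setting and clearing a single
flag bit in the task's bpf_redirect_info. A paraphrased sketch (based on
include/net/xdp.h in recent kernels; the exact flag and accessor names
may differ between versions) shows why nesting cannot work -- the first
clear wipes the bit for everyone:

	/* Paraphrased sketch of include/net/xdp.h helpers -- illustration only */
	static inline void xdp_set_return_frame_no_direct(void)
	{
		struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();

		ri->kern_flags |= BPF_RI_F_RI_NO_DIRECT;  /* plain bit set ... */
	}

	static inline void xdp_clear_return_frame_no_direct(void)
	{
		struct bpf_redirect_info *ri = bpf_net_ctx_get_ri();

		ri->kern_flags &= ~BPF_RI_F_RI_NO_DIRECT; /* ... not a nesting counter */
	}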

Prior to commit 401cb7dae813 ("net: Reference bpf_redirect_info via
task_struct on PREEMPT_RT.") the temporary BPF net context
bpf_redirect_info was stored per CPU, where this wasn't an issue. Since
that commit the BPF context is stored in the 'current' task_struct.
When running veth in threaded-NAPI mode, the kthread becomes the
storage area. Now a race exists between two concurrent veth_poll()
calls, one exiting NAPI and one running a new NAPI instance, both using
the same BPF net context.
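
Schematically (not verbatim kernel code), the storage lookup changed
roughly as follows:

	/* Before 401cb7dae813: per-CPU -- each CPU saw its own instance,
	 * so overlap across CPUs touched different copies of the flags. */
	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);

	/* After 401cb7dae813: per-task -- in threaded-NAPI mode every
	 * veth_poll() run on behalf of the same NAPI kthread resolves to
	 * the same context (schematic; the real accessor is
	 * bpf_net_ctx_get_ri()). */
	struct bpf_redirect_info *ri = &current->bpf_net_context->ri;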

The race hits when another CPU gets inside the
xdp_set_return_frame_no_direct() section before the exiting veth_poll()
has called the clear function xdp_clear_return_frame_no_direct().
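
An illustrative (simplified) interleaving of the two poll instances:

	/*
	 *   poll instance A (exiting)           poll instance B (new)
	 *   -------------------------           ---------------------
	 *   napi_complete_done()
	 *                                       napi_schedule_prep() succeeds
	 *                                       veth_poll() starts
	 *                                       xdp_set_return_frame_no_direct()
	 *   xdp_clear_return_frame_no_direct()
	 *     ^ wipes the flag B just set; B's XDP frames can now be
	 *       returned via the direct path the flag was meant to forbid
	 */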
Fixes: 401cb7dae8130 ("net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.")
Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org>
---
We are seeing crashes that look like use-after-free (UAF) in production
for an AF_XDP application running on veth devices in threaded-NAPI mode.

This patch has already been deployed to production and we are anxiously
waiting to see if it resolves those crashes.

We believe this is a variation of the issue fixed in commit fa349e396e48,
described in great detail in the Cloudflare blog post:
https://blog.cloudflare.com/a-debugging-story-corrupt-packets-in-af_xdp-kernel-bug-or-user-error/
---
drivers/net/veth.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 35dd89aff4a9..cc502bf022d5 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -975,6 +975,9 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	if (stats.xdp_redirect > 0)
 		xdp_do_flush();
 
+	if (stats.xdp_tx > 0)
+		veth_xdp_flush(rq, &bq);
+	xdp_clear_return_frame_no_direct();
 	if (done < budget && napi_complete_done(napi, done)) {
 		/* Write rx_notify_masked before reading ptr_ring */
 		smp_store_mb(rq->rx_notify_masked, false);
@@ -987,10 +990,6 @@ static int veth_poll(struct napi_struct *napi, int budget)
 		}
 	}
 
-	if (stats.xdp_tx > 0)
-		veth_xdp_flush(rq, &bq);
-	xdp_clear_return_frame_no_direct();
-
 	/* Release backpressure per NAPI poll */
 	smp_rmb(); /* Paired with netif_tx_stop_queue set_bit */
 	if (peer_txq && netif_tx_queue_stopped(peer_txq)) {
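
For readability, the resulting tail of veth_poll() after this patch
looks roughly like this (abridged sketch, not verbatim source):

	if (stats.xdp_redirect > 0)
		xdp_do_flush();
	if (stats.xdp_tx > 0)
		veth_xdp_flush(rq, &bq);
	xdp_clear_return_frame_no_direct();	/* now before napi_complete_done() */

	if (done < budget && napi_complete_done(napi, done)) {
		/* From here a concurrent veth_poll() may start; it sets the
		 * no_direct flag for itself and nothing below clears it. */
	}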
* Re: [PATCH net V1] veth: reduce XDP no_direct return section to fix race
From: patchwork-bot+netdevbpf @ 2025-11-21 2:50 UTC (permalink / raw)
To: Jesper Dangaard Brouer
Cc: netdev, bigeasy, bpf, eric.dumazet, davem, kuba, pabeni,
makita.toshiaki, toshiaki.makita1, kernel-team, mfleming,
maciej.fijalkowski, dtatulea, edumazet, sdf, andrew+netdev,
john.fastabend, ast, daniel
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 19 Nov 2025 17:28:36 +0100 you wrote:
> As explained in commit fa349e396e48 ("veth: Fix race with AF_XDP exposing
> old or uninitialized descriptors"), for veth there is a chance that,
> after napi_complete_done(), another CPU can manage to start another NAPI
> instance running veth_poll(). For NAPI itself this is handled correctly,
> as the napi_schedule_prep() check prevents multiple instances from being
> scheduled, but the remaining code in veth_poll() can run concurrently
> with the newly started NAPI instance.
>
> [...]
Here is the summary with links:
- [net,V1] veth: reduce XDP no_direct return section to fix race
https://git.kernel.org/netdev/net/c/a14602fcae17
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html