BPF List
 help / color / mirror / Atom feed
* [PATCH] net: stmmac: prevent kernel panic during XDP program and XSK pool transitions
@ 2026-06-05  7:56 Carlos Fangmeier
  2026-06-05  8:06 ` sashiko-bot
  0 siblings, 1 reply; 2+ messages in thread
From: Carlos Fangmeier @ 2026-06-05  7:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, David S. Miller,
	Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend,
	Stanislav Fomichev, Andrew Lunn, Eric Dumazet, Paolo Abeni,
	Maxime Coquelin, Alexandre Torgue, Ong Boon Leong
  Cc: netdev, bpf, linux-stm32, linux-arm-kernel, linux-kernel,
	Carlos Fangmeier

stmmac_xdp_set_prog() tears down and rebuilds all DMA channels via
stmmac_xdp_release()/stmmac_xdp_open() without pausing the netdev
TX path. Similarly, stmmac_xdp_enable_pool() and
stmmac_xdp_disable_pool() reconfigure individual queue DMA rings
while TX remains active.

If the kernel transmits a frame during these windows — for example an
MLD report queued by the IPv6 stack — stmmac_xmit() calls
dwmac4_set_addr() against an MMIO register whose mapping has been
torn down, triggering a level-3 translation fault:

  Unable to handle kernel paging request at virtual address ffff8000840ec000
  pc : dwmac4_set_addr+0x8/0x18
  lr : stmmac_xmit+0x64c/0xb60
  Call trace:
   dwmac4_set_addr+0x8/0x18
   dev_hard_start_xmit+0xb0/0x220
   sch_direct_xmit+0x108/0x3f0
   __dev_queue_xmit+0x844/0xd00
   ip6_finish_output2+0x2d8/0x610
   mld_sendpack+0x180/0x2e0
   mld_ifc_work+0x1dc/0x480

The existing netif_tx_disable() in stmmac_xdp_release() is not
sufficient because stmmac_xdp_open() re-enables TX via
netif_tx_start_all_queues() before the caller regains control, leaving
a window where the freshly rebuilt rings can race with pending TX work.

Fix this by wrapping each reconfiguration path with
netif_tx_disable()/netif_tx_wake_all_queues():

 - stmmac_xdp_set_prog(): hold TX disabled across the full
   stmmac_xdp_release() + stmmac_xdp_open() sequence, only waking
   TX after stmmac_xdp_open() returns.

 - stmmac_xdp_enable_pool(): disable TX before tearing down the
   queue, re-enable after the queue is rebuilt and NAPI is active.

 - stmmac_xdp_disable_pool(): same pattern around the pool teardown
   and queue rebuild.

Tested on Cortex-A55 (stmmac/dwmac4, kernel 6.6.60) with AF_XDP
zero-copy and IPv6 active — no panics observed across repeated
XDP attach/detach and XSK pool setup/teardown cycles.

Fixes: 132c32ee5bc0 ("net: stmmac: Add TX via XDP zero-copy socket")
Signed-off-by: Carlos Fangmeier <carlos.fangmeier@gmail.com>
---
 drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
index d7e4db7224b0..a6611aee687f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
@@ -34,6 +34,7 @@ static int stmmac_xdp_enable_pool(struct stmmac_priv *priv,
 	need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
 
 	if (need_update) {
+		netif_tx_disable(priv->dev);
 		napi_disable(&ch->rx_napi);
 		napi_disable(&ch->tx_napi);
 		stmmac_disable_rx_queue(priv, queue);
@@ -46,6 +47,7 @@ static int stmmac_xdp_enable_pool(struct stmmac_priv *priv,
 		stmmac_enable_rx_queue(priv, queue);
 		stmmac_enable_tx_queue(priv, queue);
 		napi_enable(&ch->rxtx_napi);
+		netif_tx_wake_all_queues(priv->dev);
 
 		err = stmmac_xsk_wakeup(priv->dev, queue, XDP_WAKEUP_RX);
 		if (err)
@@ -72,6 +74,7 @@ static int stmmac_xdp_disable_pool(struct stmmac_priv *priv, u16 queue)
 	need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
 
 	if (need_update) {
+		netif_tx_disable(priv->dev);
 		napi_disable(&ch->rxtx_napi);
 		stmmac_disable_rx_queue(priv, queue);
 		stmmac_disable_tx_queue(priv, queue);
@@ -87,6 +90,7 @@ static int stmmac_xdp_disable_pool(struct stmmac_priv *priv, u16 queue)
 		stmmac_enable_tx_queue(priv, queue);
 		napi_enable(&ch->rx_napi);
 		napi_enable(&ch->tx_napi);
+		netif_tx_wake_all_queues(priv->dev);
 	}
 
 	return 0;
@@ -121,8 +125,10 @@ int stmmac_xdp_set_prog(struct stmmac_priv *priv, struct bpf_prog *prog,
 		xdp_features_clear_redirect_target(dev);
 
 	need_update = !!priv->xdp_prog != !!prog;
-	if (if_running && need_update)
+	if (if_running && need_update) {
+		netif_tx_disable(dev);
 		stmmac_xdp_release(dev);
+	}
 
 	old_prog = xchg(&priv->xdp_prog, prog);
 	if (old_prog)
@@ -131,8 +137,10 @@ int stmmac_xdp_set_prog(struct stmmac_priv *priv, struct bpf_prog *prog,
 	/* Disable RX SPH for XDP operation */
 	priv->sph_active = priv->sph_capable && !stmmac_xdp_is_enabled(priv);
 
-	if (if_running && need_update)
+	if (if_running && need_update) {
 		stmmac_xdp_open(dev);
+		netif_tx_wake_all_queues(dev);
+	}
 
 	if (prog)
 		xdp_features_set_redirect_target(dev, false);

---
base-commit: 4aacf509e537a711fa71bca9f234e5eb6968850e
change-id: 20260604-main-f69f9564a74b

Best regards,
--  
Carlos Fangmeier <carlos.fangmeier@gmail.com>


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] net: stmmac: prevent kernel panic during XDP program and XSK pool transitions
  2026-06-05  7:56 [PATCH] net: stmmac: prevent kernel panic during XDP program and XSK pool transitions Carlos Fangmeier
@ 2026-06-05  8:06 ` sashiko-bot
  0 siblings, 0 replies; 2+ messages in thread
From: sashiko-bot @ 2026-06-05  8:06 UTC (permalink / raw)
  To: Carlos Fangmeier; +Cc: bpf

Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:

New issues:
- [High] Unconditional `netif_tx_wake_all_queues()` on error paths forces early transmit, leading to NULL pointer dereferences.
- [High] Global TX queue operations during single-queue XSK pool reconfiguration destroy flow control state for unaffected queues.
- [Medium] The added synchronization logic in `stmmac_xdp_set_prog()` is ineffective, redundant, and based on a flawed premise.

Pre-existing issues:
- [High] Missing error handling for `stmmac_enable_tx_queue` and `stmmac_enable_rx_queue` allocations leads to crashes during XSK pool setup.
--

commit 5f08fcd5650e1c25484ffcb83c996e73d9df4bd4
Author: Carlos Fangmeier <carlos.fangmeier@gmail.com>
Subject: net: stmmac: prevent kernel panic during XDP program and XSK pool transitions

This commit attempts to prevent races with pending TX work during XDP and
XSK pool reconfigurations by wrapping the operations with netif_tx_disable()
and netif_tx_wake_all_queues().

> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
> index d7e4db7224b0c..a6611aee687fd 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_xdp.c
> @@ -34,6 +34,7 @@ static int stmmac_xdp_enable_pool(struct stmmac_priv *priv,
>  	need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
>  
>  	if (need_update) {
> +		netif_tx_disable(priv->dev);
>  		napi_disable(&ch->rx_napi);
>  		napi_disable(&ch->tx_napi);
>  		stmmac_disable_rx_queue(priv, queue);
> @@ -46,6 +47,7 @@ static int stmmac_xdp_enable_pool(struct stmmac_priv *priv,
>  		stmmac_enable_rx_queue(priv, queue);
>  		stmmac_enable_tx_queue(priv, queue);
>  		napi_enable(&ch->rxtx_napi);

[Severity: High]
This isn't a bug introduced by this patch, but does the lack of error returns
from stmmac_enable_tx_queue() and stmmac_enable_rx_queue() risk kernel panics
if descriptor allocations fail?

Currently, if __alloc_dma_tx_desc_resources() fails inside
stmmac_enable_tx_queue(), it prints an error and returns early without
initializing the rings.

Since the function returns void, stmmac_xdp_enable_pool() blindly proceeds
to call napi_enable(), which could cause a crash when NAPI or the TX path
dereferences uninitialized pointers.

> +		netif_tx_wake_all_queues(priv->dev);

[Severity: High]
Does calling netif_tx_wake_all_queues() here unintentionally clear the flow
control state for other unaffected transmit queues?

Since stmmac_xdp_enable_pool() only reconfigures a single DMA queue, waking
all queues will clear the off state for every queue on the netdev.

If another queue was intentionally stopped due to actual backpressure because
its DMA ring was full during stmmac_xmit(), the network stack might
erroneously resume transmitting on that full queue, leading to dropped packets.

[ ... ]

> @@ -72,6 +74,7 @@ static int stmmac_xdp_disable_pool(struct stmmac_priv *priv, u16 queue)
>  	need_update = netif_running(priv->dev) && stmmac_xdp_is_enabled(priv);
>  
>  	if (need_update) {
> +		netif_tx_disable(priv->dev);
>  		napi_disable(&ch->rxtx_napi);
>  		stmmac_disable_rx_queue(priv, queue);
>  		stmmac_disable_tx_queue(priv, queue);
> @@ -87,6 +90,7 @@ static int stmmac_xdp_disable_pool(struct stmmac_priv *priv, u16 queue)
>  		stmmac_enable_tx_queue(priv, queue);
>  		napi_enable(&ch->rx_napi);
>  		napi_enable(&ch->tx_napi);
> +		netif_tx_wake_all_queues(priv->dev);
>  	}

[Severity: High]
Does this also suffer from the same issue as stmmac_xdp_enable_pool(), where
a global queue wake overrides the flow control state for all other unaffected
TX queues?

[ ... ]

> @@ -121,8 +125,10 @@ int stmmac_xdp_set_prog(struct stmmac_priv *priv, struct bpf_prog *prog,
>  		xdp_features_clear_redirect_target(dev);
>  
>  	need_update = !!priv->xdp_prog != !!prog;
> -	if (if_running && need_update)
> +	if (if_running && need_update) {
> +		netif_tx_disable(dev);
>  		stmmac_xdp_release(dev);
> +	}

[Severity: Medium]
Is this netif_tx_disable() redundant, and does it actually prevent the race
during transitions?

Looking at stmmac_xdp_release(), it already calls netif_tx_disable()
internally.

Furthermore, on the success path, stmmac_xdp_open() calls
netif_tx_start_all_queues() before returning, meaning TX is re-enabled early
regardless of the wrapper added here. It appears the perceived race window
might remain identical.

>  	old_prog = xchg(&priv->xdp_prog, prog);
>  	if (old_prog)
> @@ -131,8 +137,10 @@ int stmmac_xdp_set_prog(struct stmmac_priv *priv, struct bpf_prog *prog,
>  	/* Disable RX SPH for XDP operation */
>  	priv->sph_active = priv->sph_capable && !stmmac_xdp_is_enabled(priv);
>  
> -	if (if_running && need_update)
> +	if (if_running && need_update) {
>  		stmmac_xdp_open(dev);
> +		netif_tx_wake_all_queues(dev);
> +	}

[Severity: High]
Could this netif_tx_wake_all_queues() force an early transmit and lead to a
panic if stmmac_xdp_open() fails?

If stmmac_xdp_open() returns an error, for example due to DMA descriptor
allocation failure, the DMA rings are not initialized.

Because this code unconditionally wakes all TX queues regardless of success,
the network stack will immediately resume transmission, and stmmac_xmit()
may dereference freed or uninitialized rings.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260605-main-v1-1-aed15b1cf1af@gmail.com?part=1

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-05  8:06 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-05  7:56 [PATCH] net: stmmac: prevent kernel panic during XDP program and XSK pool transitions Carlos Fangmeier
2026-06-05  8:06 ` sashiko-bot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox