* [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-11 16:50 UTC (permalink / raw)
To: netdev
Cc: linux-kernel, andrew+netdev, davem, edumazet, kuba, pabeni,
mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
Jakub Raczynski, Chang-Sub Lee
When XDP forwarding is running and the interface gets shut down, the kernel
might panic or show SLUB "Poison overwritten" errors due to a race condition
between NAPI polling and resource freeing.
The observed error is one of the following:
- Poison overwritten
[ 1889.547746] eth1: Link is Down
[ 1889.549940] =============================================================================
[ 1889.549954] BUG kmalloc-4k (Tainted: G B ): Poison overwritten
[ 1889.549959] -----------------------------------------------------------------------------
[ 1889.549963] 0xffffff882dcc4d80-0xffffff882dcc4da7 @offset=19840. First byte 0x0 instead of 0x6b
[ 1889.549969] Allocated in __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac] age=169 cpu=7 pid=27759
[ 1889.550020] __kmem_cache_alloc_node+0x100/0x2e8
[ 1889.550032] __kmalloc+0x58/0x1a0
[ 1889.550039] __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac]
[ 1889.550052] alloc_dma_desc_resources+0xec/0x164 [stmmac]
[ 1889.550064] stmmac_setup_dma_desc+0xec/0x1e4 [stmmac]
[ 1889.550076] stmmac_open+0x28/0x94 [stmmac]
[...]
- Wrong memory address
[ 1901.546692] Unable to handle kernel paging request at virtual address dead000000000122
[...]
[ 1902.964068] Call trace:
[ 1902.967193] free_to_partial_list+0x560/0x600
[ 1902.972227] __slab_free+0x1a8/0x420
[ 1902.976480] __kmem_cache_free+0x204/0x218
[ 1902.981254] kfree+0x6c/0x128
[ 1902.984900] kvfree+0x3c/0x4c
[ 1902.988545] page_pool_release+0x234/0x27c
[ 1902.993320] page_pool_destroy+0xcc/0x190
[ 1902.998006] __free_dma_rx_desc_resources+0x100/0x360 [stmmac]
[ 1903.004516] free_dma_desc_resources+0x8c/0xac [stmmac]
[ 1903.010419] stmmac_release+0x1c0/0x2b4 [stmmac]
[...]
The root cause is that stmmac_release() stops DMA and frees the TX/RX ring
buffers and page pools while NAPI/XDP could still be accessing these resources
in the background.
Fix this as follows:
- Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop
- Call synchronize_rcu() after stopping DMA but before freeing resources to
ensure all ongoing NAPI operations complete
- Add STMMAC_DOWN flag checks in the XDP code paths (XDP_TX and XDP_REDIRECT) to
drop packets when the interface is going down. This has already been done for
stmmac_xdp_xmit(), so make it consistent
- Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
Until now this was only done in stmmac_reset_subtask() during abnormal operation,
which is not enough. This does not affect normal operation, as this flag is
used only for XDP apps
Co-developed-by: Chang-Sub Lee <cs0617.lee@samsung.com>
Signed-off-by: Chang-Sub Lee <cs0617.lee@samsung.com>
Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3591755ea30b..3b7b7b0cab9b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4147,6 +4147,9 @@ static int __stmmac_open(struct net_device *dev,
stmmac_reset_queues_param(priv);
+ /* Clear DOWN flag when opening the interface */
+ clear_bit(STMMAC_DOWN, &priv->state);
+
ret = stmmac_hw_setup(dev);
if (ret < 0) {
netdev_err(priv->dev, "%s: Hw setup failed\n", __func__);
@@ -4251,9 +4254,18 @@ static void __stmmac_release(struct net_device *dev)
/* Free the IRQ lines */
stmmac_free_irq(dev, REQ_IRQ_ERR_ALL, 0);
+ /* Set DOWN flag to prevent XDP from processing new packets */
+ set_bit(STMMAC_DOWN, &priv->state);
+
/* Stop TX/RX DMA and clear the descriptors */
stmmac_stop_all_dma(priv);
+ /* Ensure NAPI has finished before freeing resources.
+ * This prevents use-after-free when NAPI is mid-execution
+ * accessing TX/RX ring buffers and page pool during ifconfig down.
+ */
+ synchronize_rcu();
+
/* Release and free the Rx/Tx resources */
free_dma_desc_resources(priv, &priv->dma_conf);
@@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
if (unlikely(!xdpf))
return STMMAC_XDP_CONSUMED;
+ if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+ return -ENETDOWN;
+
queue = stmmac_xdp_get_tx_queue(priv, cpu);
nq = netdev_get_tx_queue(priv->dev, queue);
@@ -5308,7 +5323,9 @@ static int __stmmac_xdp_run_prog(struct stmmac_priv *priv,
res = stmmac_xdp_xmit_back(priv, xdp);
break;
case XDP_REDIRECT:
- if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
+ if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+ res = STMMAC_XDP_CONSUMED;
+ else if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
res = STMMAC_XDP_CONSUMED;
else
res = STMMAC_XDP_REDIRECT;
--
2.34.1
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Paolo Abeni @ 2026-05-14 12:01 UTC (permalink / raw)
To: Jakub Raczynski, netdev
Cc: linux-kernel, andrew+netdev, davem, edumazet, kuba,
mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
Chang-Sub Lee
On 5/11/26 6:50 PM, Jakub Raczynski wrote:
> When running XDP forwarding and interface gets shut down, kernel might panic
> or show SLUB "poison overwritten" errors due to a race condition between
> NAPI polling and resource freeing.
>
> Observed error is one of following:
> - Poison overwritten
> [ 1889.547746] eth1: Link is Down
> [ 1889.549940] =============================================================================
> [ 1889.549954] BUG kmalloc-4k (Tainted: G B ): Poison overwritten
> [ 1889.549959] -----------------------------------------------------------------------------
> [ 1889.549963] 0xffffff882dcc4d80-0xffffff882dcc4da7 @offset=19840. First byte 0x0 instead of 0x6b
> [ 1889.549969] Allocated in __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac] age=169 cpu=7 pid=27759
> [ 1889.550020] __kmem_cache_alloc_node+0x100/0x2e8
> [ 1889.550032] __kmalloc+0x58/0x1a0
> [ 1889.550039] __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac]
> [ 1889.550052] alloc_dma_desc_resources+0xec/0x164 [stmmac]
> [ 1889.550064] stmmac_setup_dma_desc+0xec/0x1e4 [stmmac]
> [ 1889.550076] stmmac_open+0x28/0x94 [stmmac]
> [...]
>
> - Wrong memory address
> [ 1901.546692] Unable to handle kernel paging request at virtual address dead000000000122
> [...]
> [ 1902.964068] Call trace:
> [ 1902.967193] free_to_partial_list+0x560/0x600
> [ 1902.972227] __slab_free+0x1a8/0x420
> [ 1902.976480] __kmem_cache_free+0x204/0x218
> [ 1902.981254] kfree+0x6c/0x128
> [ 1902.984900] kvfree+0x3c/0x4c
> [ 1902.988545] page_pool_release+0x234/0x27c
> [ 1902.993320] page_pool_destroy+0xcc/0x190
> [ 1902.998006] __free_dma_rx_desc_resources+0x100/0x360 [stmmac]
> [ 1903.004516] free_dma_desc_resources+0x8c/0xac [stmmac]
> [ 1903.010419] stmmac_release+0x1c0/0x2b4 [stmmac]
> [...]
>
> Root cause is stmmac_release() stops DMA and frees TX/RX ring buffers and
> page pools while NAPI/XDP could still be accessing these resources in the
> background.
>
> Fix this by following:
> - Set STMMAC_DOWN flag before stopping DMA to signal XDP to stop
> - Call synchronize_rcu() after stopping DMA but before freeing resources to
> ensure all ongoing NAPI operations complete
> - Add STMMAC_DOWN flag checks in XDP code paths (XDP_TX and XDP_REDIRECT) to
> drop packets when interface is going down. This has already been done for
> stmmac_xdp_xmit() so make it consistent
> - Clear STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
> This was only done for stmmac_reset_subtask() during abnormal operation,
> which is not enough. This does not affect normal operation as this flag is
> used only for XDP apps
The above looks racy. I think instead you should just use
napi_synchronize() in __stmmac_release.
> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> if (unlikely(!xdpf))
> return STMMAC_XDP_CONSUMED;
>
> + if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> + return -ENETDOWN;
Sashiko noted here you should return STMMAC_XDP_CONSUMED
/P
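The return-value remark above can be illustrated with a minimal userspace sketch. It assumes that the XDP verdicts are bit flags and that the caller tests them with bitwise AND; the MODEL_* names and their values are hypothetical stand-ins chosen for illustration, not copied from the driver. Under those assumptions, a negative errno does not set the "consumed" bit and can be mistaken for a forwarding verdict, so the buffer would not be dropped as intended.

```c
#include <assert.h>

/* Hypothetical mirror of the driver's XDP verdict flags (values assumed). */
#define MODEL_XDP_PASS     0
#define MODEL_XDP_CONSUMED (1 << 0)
#define MODEL_XDP_TX       (1 << 1)
#define MODEL_XDP_REDIRECT (1 << 2)
/* Assumed generic Linux errno value for ENETDOWN. */
#define MODEL_ENETDOWN     100

/* Non-zero when the verdict tells the caller to drop and recycle the buffer. */
static int model_is_consumed(int res)
{
	return (res & MODEL_XDP_CONSUMED) != 0;
}

/* Non-zero when the verdict tells the caller the buffer was handed off. */
static int model_is_forwarded(int res)
{
	return (res & (MODEL_XDP_TX | MODEL_XDP_REDIRECT)) != 0;
}

/* Self-check: a negative errno is misread, the flag value is not. */
static void model_selfcheck(void)
{
	/* -100 is ...10011100 in two's complement: bit 0 clear, bit 2 set. */
	assert(!model_is_consumed(-MODEL_ENETDOWN));
	assert(model_is_forwarded(-MODEL_ENETDOWN));
	assert(model_is_consumed(MODEL_XDP_CONSUMED));
	assert(!model_is_forwarded(MODEL_XDP_CONSUMED));
}
```

Calling model_selfcheck() passes silently. In this model, returning the CONSUMED flag is the only value that takes the drop path.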
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-15 8:29 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
Chang-Sub Lee
On Thu, May 14, 2026 at 02:01:20PM +0200, Paolo Abeni wrote:
> >
> > Fix this by following:
> > - Set STMMAC_DOWN flag before stopping DMA to signal XDP to stop
> > - Call synchronize_rcu() after stopping DMA but before freeing resources to
> > ensure all ongoing NAPI operations complete
> > - Add STMMAC_DOWN flag checks in XDP code paths (XDP_TX and XDP_REDIRECT) to
> > drop packets when interface is going down. This has already been done for
> > stmmac_xdp_xmit() so make it consistent
> > - Clear STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
> > This was only done for stmmac_reset_subtask() during abnormal operation,
> > which is not enough. This does not affect normal operation as this flag is
> > used only for XDP apps
>
> The above looks racy. I think instead you should just use
> napi_synchronize() in __stmmac_release.
>
You put this after the whole section, but I assume you are talking about
synchronize_rcu()? Currently there are 0 checks and it is a pure race
condition. synchronize_rcu() does secure it in some way, but you are correct,
the proper way to ensure that NAPI has finished is napi_synchronize().
Will fix in v2.
>
> > @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> > if (unlikely(!xdpf))
> > return STMMAC_XDP_CONSUMED;
> >
> > + if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> > + return -ENETDOWN;
>
> Sashiko noted here you should return STMMAC_XDP_CONSUMED
>
> /P
Seems good, will fix in v2.
Thanks
Jakub Raczynski
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Paolo Abeni @ 2026-05-15 9:24 UTC (permalink / raw)
To: Jakub Raczynski
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
Chang-Sub Lee
On 5/15/26 10:29 AM, Jakub Raczynski wrote:
> On Thu, May 14, 2026 at 02:01:20PM +0200, Paolo Abeni wrote:
>>>
>>> Fix this by following:
>>> - Set STMMAC_DOWN flag before stopping DMA to signal XDP to stop
>>> - Call synchronize_rcu() after stopping DMA but before freeing resources to
>>> ensure all ongoing NAPI operations complete
>>> - Add STMMAC_DOWN flag checks in XDP code paths (XDP_TX and XDP_REDIRECT) to
>>> drop packets when interface is going down. This has already been done for
>>> stmmac_xdp_xmit() so make it consistent
>>> - Clear STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
>>> This was only done for stmmac_reset_subtask() during abnormal operation,
>>> which is not enough. This does not affect normal operation as this flag is
>>> used only for XDP apps
>>
>> The above looks racy. I think instead you should just use
>> napi_synchronize() in __stmmac_release.
>>
> You put this after whole section, but I assume you are talking about
> synchronize_rcu()? Because currently there are 0 checks and it is pure race
> condition. synchronize_rcu() does secure it in some way, but you are correct,
> proper ensuring that napi has finished is napi_synchronize().
> Will fix in v2.
>>
>>> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
>>> if (unlikely(!xdpf))
>>> return STMMAC_XDP_CONSUMED;
>>>
>>> + if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
>>> + return -ENETDOWN;
>>
>> Sashiko noted here you should return STMMAC_XDP_CONSUMED
>>
>> /P
> Seems good, will fix in v2.
If you use napi_synchronize(), I think you can avoid setting STMMAC_DOWN
and testing it in the fast path: the run-to-completion after irq
disabling should ensure that no tx could happen after that
napi_synchronize() completes.
Side note: it's not clear to me where/when irq disabling takes place?!?
/P
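The run-to-completion argument above can be sketched as a small userspace model. This is not driver code: struct model_napi and every model_* function are hypothetical, a pthread stands in for the NAPI poll, and the irq_on flag stands in for the interrupt that schedules it. The model only shows the ordering being discussed: once nothing can schedule a new pass, waiting for the in-flight pass to finish (the idea behind napi_synchronize(), which sleeps while the scheduled bit is set) makes the subsequent free safe without a flag check in the fast path.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical userspace stand-in for one NAPI instance and its ring. */
struct model_napi {
	atomic_bool sched;   /* models NAPI_STATE_SCHED: a poll pass is in flight */
	atomic_bool irq_on;  /* models the interrupt source that schedules polls */
	atomic_long polls;   /* number of completed poll passes */
	unsigned int *ring;  /* models the descriptor ring freed on release */
	int ring_len;
};

/* Poller: claim the pass first, then check whether polling is still allowed. */
static void *model_poll_thread(void *arg)
{
	struct model_napi *n = arg;

	for (;;) {
		atomic_store(&n->sched, true);
		if (!atomic_load(&n->irq_on)) {
			atomic_store(&n->sched, false);
			return NULL;
		}
		for (int i = 0; i < n->ring_len; i++)
			n->ring[i]++;               /* touches memory owned by the ring */
		atomic_fetch_add(&n->polls, 1);
		atomic_store(&n->sched, false); /* pass complete */
	}
}

/* Same idea as napi_synchronize(): sleep until no poll pass is in flight. */
static void model_napi_synchronize(struct model_napi *n)
{
	const struct timespec ms = { .tv_sec = 0, .tv_nsec = 1000000 };

	while (atomic_load(&n->sched))
		nanosleep(&ms, NULL);
}

/* Release path: quiesce the source, wait for the poller, only then free. */
static long model_release(struct model_napi *n, pthread_t poller)
{
	atomic_store(&n->irq_on, false);
	model_napi_synchronize(n);
	free(n->ring);
	n->ring = NULL;
	pthread_join(poller, NULL);
	return atomic_load(&n->polls);
}

/* Run the poller for at least min_polls passes, then tear it down. */
static long model_run(int ring_len, long min_polls)
{
	struct model_napi n;
	pthread_t poller;

	assert(ring_len > 0);
	atomic_init(&n.sched, false);
	atomic_init(&n.irq_on, true);
	atomic_init(&n.polls, 0);
	n.ring_len = ring_len;
	n.ring = calloc((size_t)ring_len, sizeof(*n.ring));
	if (!n.ring)
		return -1;
	if (pthread_create(&poller, NULL, model_poll_thread, &n) != 0) {
		free(n.ring);
		return -1;
	}
	while (atomic_load(&n.polls) < min_polls)
		sched_yield();
	return model_release(&n, poller);
}
```

For example, model_run(64, 100) returns the number of completed passes, which is at least 100. The model relies on the source being quiesced before the wait, which is why the side note about where IRQ disabling happens matters: if polls can still be scheduled after the wait returns, the free is no longer safe.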
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-15 11:16 UTC (permalink / raw)
To: Paolo Abeni
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
Chang-Sub Lee
On Fri, May 15, 2026 at 11:24:01AM +0200, Paolo Abeni wrote:
> > You put this after whole section, but I assume you are talking about
> > synchronize_rcu()? Because currently there are 0 checks and it is pure race
> > condition. synchronize_rcu() does secure it in some way, but you are correct,
> > proper ensuring that napi has finished is napi_synchronize().
> > Will fix in v2.
> >>
> >>> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> >>> if (unlikely(!xdpf))
> >>> return STMMAC_XDP_CONSUMED;
> >>>
> >>> + if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> >>> + return -ENETDOWN;
> >>
> >> Sashiko noted here you should return STMMAC_XDP_CONSUMED
> >>
> >> /P
> > Seems good, will fix in v2.
>
> If you use napi_synchronize(), I think you can avoid setting STMMAC_DOWN
> and testing it in the fast path: the run-to-completion after irq
> disabling should ensure that no tx could happen after that
> napi_synchronize() completes.
>
> Side note: it's not clear to me where/when irq disabling take place?!?
>
> /P
>
STMMAC_DOWN is currently in a strange place: it is only used in XDP, it is
only ever set/cleared in stmmac_reset_subtask(), and it is only applied in
one path. So for starters we wanted to unify it across XDP paths.
But it could probably be replaced by detecting netlink state,
but that is a topic for a different patch.
That aside, in stmmac_disable_all_queues() there is already a measure for
XDP to drain, which runs after phylink stop but before stopping IRQs in
stmmac_free_irq() (or rather freeing them, which should stop them,
unless shared), but it seems this is not enough, as we managed to crash in
a test environment.
So there is a moment between phylink_stop() and stop_all_dma() where
packets are processed, even though there was a synchronize_rcu() in between.
Using napi_synchronize() or synchronize_rcu() AFTER stop_all_dma() should be
sufficient, but we will need to retest as you said.
The question is whether we actually care about scraps of data when
killing the interface.
But that generally still makes the STMMAC_DOWN flag almost completely useless.
I mean, it should still be added to other paths, but it doesn't do much, other
than during abnormal behavior.
Meanwhile stmmac_xdp_release() handles the above in a different order, so there
should be no issue there.
So we will retest.
BR
Jakub Raczynski