* [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-11 16:50 UTC
To: netdev
Cc: linux-kernel, andrew+netdev, davem, edumazet, kuba, pabeni,
	mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
	Jakub Raczynski, Chang-Sub Lee

When running XDP forwarding and the interface gets shut down, the kernel might
panic or show SLUB "Poison overwritten" errors due to a race condition between
NAPI polling and resource freeing.

The observed error is one of the following:
- Poison overwritten
[ 1889.547746] eth1: Link is Down
[ 1889.549940] =============================================================================
[ 1889.549954] BUG kmalloc-4k (Tainted: G B ): Poison overwritten
[ 1889.549959] -----------------------------------------------------------------------------
[ 1889.549963] 0xffffff882dcc4d80-0xffffff882dcc4da7 @offset=19840. First byte 0x0 instead of 0x6b
[ 1889.549969] Allocated in __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac] age=169 cpu=7 pid=27759
[ 1889.550020]  __kmem_cache_alloc_node+0x100/0x2e8
[ 1889.550032]  __kmalloc+0x58/0x1a0
[ 1889.550039]  __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac]
[ 1889.550052]  alloc_dma_desc_resources+0xec/0x164 [stmmac]
[ 1889.550064]  stmmac_setup_dma_desc+0xec/0x1e4 [stmmac]
[ 1889.550076]  stmmac_open+0x28/0x94 [stmmac]
[...]

- Wrong memory address
[ 1901.546692] Unable to handle kernel paging request at virtual address dead000000000122
[...]
[ 1902.964068] Call trace:
[ 1902.967193]  free_to_partial_list+0x560/0x600
[ 1902.972227]  __slab_free+0x1a8/0x420
[ 1902.976480]  __kmem_cache_free+0x204/0x218
[ 1902.981254]  kfree+0x6c/0x128
[ 1902.984900]  kvfree+0x3c/0x4c
[ 1902.988545]  page_pool_release+0x234/0x27c
[ 1902.993320]  page_pool_destroy+0xcc/0x190
[ 1902.998006]  __free_dma_rx_desc_resources+0x100/0x360 [stmmac]
[ 1903.004516]  free_dma_desc_resources+0x8c/0xac [stmmac]
[ 1903.010419]  stmmac_release+0x1c0/0x2b4 [stmmac]
[...]

The root cause is that stmmac_release() stops DMA and frees the TX/RX ring
buffers and page pools while NAPI/XDP could still be accessing these resources
in the background.

Fix this as follows:
- Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop
- Call synchronize_rcu() after stopping DMA but before freeing resources to
  ensure all ongoing NAPI operations complete
- Add STMMAC_DOWN flag checks in the XDP code paths (XDP_TX and XDP_REDIRECT)
  to drop packets when the interface is going down. This has already been done
  for stmmac_xdp_xmit(), so make it consistent
- Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
  This was only done for stmmac_reset_subtask() during abnormal operation,
  which is not enough. This does not affect normal operation as this flag is
  used only for XDP apps

Co-developed-by: Chang-Sub Lee <cs0617.lee@samsung.com>
Signed-off-by: Chang-Sub Lee <cs0617.lee@samsung.com>
Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
---
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 3591755ea30b..3b7b7b0cab9b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4147,6 +4147,9 @@ static int __stmmac_open(struct net_device *dev,
 
 	stmmac_reset_queues_param(priv);
 
+	/* Clear DOWN flag when opening the interface */
+	clear_bit(STMMAC_DOWN, &priv->state);
+
 	ret = stmmac_hw_setup(dev);
 	if (ret < 0) {
 		netdev_err(priv->dev, "%s: Hw setup failed\n", __func__);
@@ -4251,9 +4254,18 @@ static void __stmmac_release(struct net_device *dev)
 	/* Free the IRQ lines */
 	stmmac_free_irq(dev, REQ_IRQ_ERR_ALL, 0);
 
+	/* Set DOWN flag to prevent XDP from processing new packets */
+	set_bit(STMMAC_DOWN, &priv->state);
+
 	/* Stop TX/RX DMA and clear the descriptors */
 	stmmac_stop_all_dma(priv);
 
+	/* Ensure NAPI has finished before freeing resources.
+	 * This prevents use-after-free when NAPI is mid-execution
+	 * accessing TX/RX ring buffers and page pool during ifconfig down.
+	 */
+	synchronize_rcu();
+
 	/* Release and free the Rx/Tx resources */
 	free_dma_desc_resources(priv, &priv->dma_conf);
 
@@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
 	if (unlikely(!xdpf))
 		return STMMAC_XDP_CONSUMED;
 
+	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+		return -ENETDOWN;
+
 	queue = stmmac_xdp_get_tx_queue(priv, cpu);
 	nq = netdev_get_tx_queue(priv->dev, queue);
 
@@ -5308,7 +5323,9 @@ static int __stmmac_xdp_run_prog(struct stmmac_priv *priv,
 		res = stmmac_xdp_xmit_back(priv, xdp);
 		break;
 	case XDP_REDIRECT:
-		if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
+		if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
+			res = STMMAC_XDP_CONSUMED;
+		else if (xdp_do_redirect(priv->dev, xdp, prog) < 0)
 			res = STMMAC_XDP_CONSUMED;
 		else
 			res = STMMAC_XDP_REDIRECT;
-- 
2.34.1
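
Editorial sketch: the flag lifecycle described in the commit message (open clears the flag, release sets it, the XDP paths drop frames while it is set) can be modeled in userspace roughly as below. This is not driver code. `struct model_priv`, `MODEL_DOWN`, the verdict names, and all `model_*` helpers are hypothetical stand-ins for `struct stmmac_priv`, `STMMAC_DOWN`, and the kernel's `set_bit()`/`clear_bit()`/`test_bit()`, built here on C11 atomics.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's state bit and XDP verdicts. */
enum { MODEL_DOWN = 0 };
enum model_verdict { MODEL_XDP_CONSUMED, MODEL_XDP_TX, MODEL_XDP_REDIRECT };
enum model_action { MODEL_ACT_TX, MODEL_ACT_REDIRECT };

struct model_priv {
	atomic_ulong state; /* bit MODEL_DOWN set: interface is going down */
};

/* Userspace approximations of the kernel bit helpers. */
static void model_set_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_or(addr, 1UL << nr);
}

static void model_clear_bit(int nr, atomic_ulong *addr)
{
	atomic_fetch_and(addr, ~(1UL << nr));
}

static bool model_test_bit(int nr, atomic_ulong *addr)
{
	return (atomic_load(addr) >> nr) & 1UL;
}

/* Open path: clear the flag so the XDP paths transmit again. */
static void model_open(struct model_priv *p)
{
	model_clear_bit(MODEL_DOWN, &p->state);
}

/* Release path: set the flag before the (not modeled) DMA stop. */
static void model_release(struct model_priv *p)
{
	model_set_bit(MODEL_DOWN, &p->state);
}

/* XDP_TX / XDP_REDIRECT handling: drop the frame while the flag is set. */
static enum model_verdict model_xdp_run(struct model_priv *p,
					enum model_action act)
{
	if (model_test_bit(MODEL_DOWN, &p->state))
		return MODEL_XDP_CONSUMED;
	return act == MODEL_ACT_TX ? MODEL_XDP_TX : MODEL_XDP_REDIRECT;
}
```

Note that a check like this only stops polls that look at the flag after it was set. A poll that passed the check earlier can still be running, so the teardown path still has to wait for in-flight polls before freeing anything.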
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Paolo Abeni @ 2026-05-14 12:01 UTC
To: Jakub Raczynski, netdev
Cc: linux-kernel, andrew+netdev, davem, edumazet, kuba, mcoquelin.stm32,
	alexandre.torgue, k.domagalski, k.tegowski, Chang-Sub Lee

On 5/11/26 6:50 PM, Jakub Raczynski wrote:
> When running XDP forwarding and the interface gets shut down, the kernel might
> panic or show SLUB "Poison overwritten" errors due to a race condition between
> NAPI polling and resource freeing.
>
> The observed error is one of the following:
> - Poison overwritten
> [ 1889.547746] eth1: Link is Down
> [ 1889.549940] =============================================================================
> [ 1889.549954] BUG kmalloc-4k (Tainted: G B ): Poison overwritten
> [ 1889.549959] -----------------------------------------------------------------------------
> [ 1889.549963] 0xffffff882dcc4d80-0xffffff882dcc4da7 @offset=19840. First byte 0x0 instead of 0x6b
> [ 1889.549969] Allocated in __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac] age=169 cpu=7 pid=27759
> [ 1889.550020]  __kmem_cache_alloc_node+0x100/0x2e8
> [ 1889.550032]  __kmalloc+0x58/0x1a0
> [ 1889.550039]  __alloc_dma_tx_desc_resources+0x60/0x10c [stmmac]
> [ 1889.550052]  alloc_dma_desc_resources+0xec/0x164 [stmmac]
> [ 1889.550064]  stmmac_setup_dma_desc+0xec/0x1e4 [stmmac]
> [ 1889.550076]  stmmac_open+0x28/0x94 [stmmac]
> [...]
>
> - Wrong memory address
> [ 1901.546692] Unable to handle kernel paging request at virtual address dead000000000122
> [...]
> [ 1902.964068] Call trace:
> [ 1902.967193]  free_to_partial_list+0x560/0x600
> [ 1902.972227]  __slab_free+0x1a8/0x420
> [ 1902.976480]  __kmem_cache_free+0x204/0x218
> [ 1902.981254]  kfree+0x6c/0x128
> [ 1902.984900]  kvfree+0x3c/0x4c
> [ 1902.988545]  page_pool_release+0x234/0x27c
> [ 1902.993320]  page_pool_destroy+0xcc/0x190
> [ 1902.998006]  __free_dma_rx_desc_resources+0x100/0x360 [stmmac]
> [ 1903.004516]  free_dma_desc_resources+0x8c/0xac [stmmac]
> [ 1903.010419]  stmmac_release+0x1c0/0x2b4 [stmmac]
> [...]
>
> The root cause is that stmmac_release() stops DMA and frees the TX/RX ring
> buffers and page pools while NAPI/XDP could still be accessing these resources
> in the background.
>
> Fix this as follows:
> - Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop
> - Call synchronize_rcu() after stopping DMA but before freeing resources to
>   ensure all ongoing NAPI operations complete
> - Add STMMAC_DOWN flag checks in the XDP code paths (XDP_TX and XDP_REDIRECT)
>   to drop packets when the interface is going down. This has already been done
>   for stmmac_xdp_xmit(), so make it consistent
> - Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
>   This was only done for stmmac_reset_subtask() during abnormal operation,
>   which is not enough. This does not affect normal operation as this flag is
>   used only for XDP apps

The above looks racy. I think instead you should just use
napi_synchronize() in __stmmac_release.

> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
>  	if (unlikely(!xdpf))
>  		return STMMAC_XDP_CONSUMED;
>
> +	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> +		return -ENETDOWN;

Sashiko noted here you should return STMMAC_XDP_CONSUMED.

/P
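
Editorial sketch of why a negative errno is a poor fit for this return value. It assumes that the `STMMAC_XDP_*` verdicts are bit flags (PASS = 0, CONSUMED = bit 0, TX = bit 1, REDIRECT = bit 2) and that the caller tests them with bitwise AND. Both points are assumptions about the driver, not something stated in this thread. The `MODEL_*` names are hypothetical, and `MODEL_ENETDOWN` is hard-coded to 100, the value ENETDOWN is assumed to have on Linux.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of the verdict flags (hypothetical MODEL_* names). */
#define MODEL_BIT(n)		(1u << (n))
#define MODEL_XDP_PASS		0u
#define MODEL_XDP_CONSUMED	MODEL_BIT(0)
#define MODEL_XDP_TX		MODEL_BIT(1)
#define MODEL_XDP_REDIRECT	MODEL_BIT(2)
#define MODEL_ENETDOWN		100 /* assumed value of ENETDOWN on Linux */

/* How a caller that treats the result as a bitmask would read it. */
static bool model_is_consumed(int res)
{
	return ((unsigned int)res & MODEL_XDP_CONSUMED) != 0;
}

static bool model_is_tx(int res)
{
	return ((unsigned int)res & MODEL_XDP_TX) != 0;
}

static bool model_is_redirect(int res)
{
	return ((unsigned int)res & MODEL_XDP_REDIRECT) != 0;
}
```

Under these assumptions, -100 has bit 2 set and bits 0 and 1 clear, so the error would not be read as a drop and would instead match the redirect flag. That is consistent with the request to return STMMAC_XDP_CONSUMED.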
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-15 8:29 UTC
To: Paolo Abeni
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
	mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
	Chang-Sub Lee

On Thu, May 14, 2026 at 02:01:20PM +0200, Paolo Abeni wrote:
> >
> > Fix this as follows:
> > - Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop
> > - Call synchronize_rcu() after stopping DMA but before freeing resources to
> >   ensure all ongoing NAPI operations complete
> > - Add STMMAC_DOWN flag checks in the XDP code paths (XDP_TX and XDP_REDIRECT)
> >   to drop packets when the interface is going down. This has already been done
> >   for stmmac_xdp_xmit(), so make it consistent
> > - Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
> >   This was only done for stmmac_reset_subtask() during abnormal operation,
> >   which is not enough. This does not affect normal operation as this flag is
> >   used only for XDP apps
>
> The above looks racy. I think instead you should just use
> napi_synchronize() in __stmmac_release.
>

You put this after the whole section, but I assume you are talking about
synchronize_rcu()? Currently there are no checks at all and it is a pure race
condition. synchronize_rcu() does secure it to some extent, but you are
correct: the proper way to ensure that NAPI has finished is napi_synchronize().
Will fix in v2.

> > @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> >  	if (unlikely(!xdpf))
> >  		return STMMAC_XDP_CONSUMED;
> >
> > +	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> > +		return -ENETDOWN;
>
> Sashiko noted here you should return STMMAC_XDP_CONSUMED.
>
> /P

Seems good, will fix in v2.

Thanks
Jakub Raczynski
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Paolo Abeni @ 2026-05-15 9:24 UTC
To: Jakub Raczynski
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
	mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
	Chang-Sub Lee

On 5/15/26 10:29 AM, Jakub Raczynski wrote:
> On Thu, May 14, 2026 at 02:01:20PM +0200, Paolo Abeni wrote:
>>>
>>> Fix this as follows:
>>> - Set the STMMAC_DOWN flag before stopping DMA to signal XDP to stop
>>> - Call synchronize_rcu() after stopping DMA but before freeing resources to
>>>   ensure all ongoing NAPI operations complete
>>> - Add STMMAC_DOWN flag checks in the XDP code paths (XDP_TX and XDP_REDIRECT)
>>>   to drop packets when the interface is going down. This has already been done
>>>   for stmmac_xdp_xmit(), so make it consistent
>>> - Clear the STMMAC_DOWN flag in __stmmac_open() to restore normal operation.
>>>   This was only done for stmmac_reset_subtask() during abnormal operation,
>>>   which is not enough. This does not affect normal operation as this flag is
>>>   used only for XDP apps
>>
>> The above looks racy. I think instead you should just use
>> napi_synchronize() in __stmmac_release.
>>
> You put this after the whole section, but I assume you are talking about
> synchronize_rcu()? Currently there are no checks at all and it is a pure race
> condition. synchronize_rcu() does secure it to some extent, but you are
> correct: the proper way to ensure that NAPI has finished is napi_synchronize().
> Will fix in v2.
>>
>>> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
>>>  	if (unlikely(!xdpf))
>>>  		return STMMAC_XDP_CONSUMED;
>>>
>>> +	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
>>> +		return -ENETDOWN;
>>
>> Sashiko noted here you should return STMMAC_XDP_CONSUMED.
>>
>> /P
> Seems good, will fix in v2.

If you use napi_synchronize(), I think you can avoid setting STMMAC_DOWN
and testing it in the fast path: the run-to-completion after irq
disabling should ensure that no tx can happen after that
napi_synchronize() completes.

Side note: it's not clear to me where/when irq disabling takes place?!?

/P
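
Editorial sketch of the run-to-completion argument, as a single-threaded userspace model. It is not the kernel implementation. `struct model_napi` and the `model_*` functions are hypothetical, the `sched` field stands in for the "a poll is in flight" state that the wait observes, and the model assumes nothing can schedule a new poll once the wait has started (the "after irq disabling" condition above). The other CPU is simulated by advancing the poll one step at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_napi {
	bool sched;	/* a poll is currently in flight */
	int steps_left;	/* work remaining in the in-flight poll */
	int *ring;	/* resource the poll touches; NULL once freed */
	int uaf;	/* accesses made after the ring was freed */
};

/* One slice of the poll routine, as if it ran on another CPU. */
static void model_poll_step(struct model_napi *n)
{
	if (!n->sched)
		return;
	if (n->ring)
		n->ring[0]++;
	else
		n->uaf++;		/* ring already freed */
	if (--n->steps_left == 0)
		n->sched = false;	/* poll ran to completion */
}

/* Models the wait: let the poll run until it is no longer in flight. */
static void model_synchronize(struct model_napi *n)
{
	while (n->sched)
		model_poll_step(n);
}

static void model_free_ring(struct model_napi *n)
{
	free(n->ring);
	n->ring = NULL;
}

/* Returns the number of post-free accesses, or -1 on allocation failure. */
static int model_teardown(bool wait_for_poll)
{
	struct model_napi n = { .sched = true, .steps_left = 3, .uaf = 0 };

	n.ring = calloc(4, sizeof(*n.ring));
	if (!n.ring)
		return -1;
	model_poll_step(&n);		/* the poll is now mid-execution */
	if (wait_for_poll)
		model_synchronize(&n);
	model_free_ring(&n);
	model_synchronize(&n);		/* drain whatever is still running */
	return n.uaf;
}
```

In this model, `model_teardown(true)` frees the ring only after the poll has finished and records no late accesses, while `model_teardown(false)` lets the remaining two poll steps run against a freed ring. No per-packet flag test is needed in the first case.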
* Re: [PATCH net] net/stmmac: Fix free-after-use panic when interface goes down with XDP
From: Jakub Raczynski @ 2026-05-15 11:16 UTC
To: Paolo Abeni
Cc: netdev, linux-kernel, andrew+netdev, davem, edumazet, kuba,
	mcoquelin.stm32, alexandre.torgue, k.domagalski, k.tegowski,
	Chang-Sub Lee

On Fri, May 15, 2026 at 11:24:01AM +0200, Paolo Abeni wrote:
> > You put this after the whole section, but I assume you are talking about
> > synchronize_rcu()? Currently there are no checks at all and it is a pure race
> > condition. synchronize_rcu() does secure it to some extent, but you are
> > correct: the proper way to ensure that NAPI has finished is napi_synchronize().
> > Will fix in v2.
> >>
> >>> @@ -5267,6 +5279,9 @@ static int stmmac_xdp_xmit_back(struct stmmac_priv *priv,
> >>>  	if (unlikely(!xdpf))
> >>>  		return STMMAC_XDP_CONSUMED;
> >>>
> >>> +	if (unlikely(test_bit(STMMAC_DOWN, &priv->state)))
> >>> +		return -ENETDOWN;
> >>
> >> Sashiko noted here you should return STMMAC_XDP_CONSUMED.
> >>
> >> /P
> > Seems good, will fix in v2.
>
> If you use napi_synchronize(), I think you can avoid setting STMMAC_DOWN
> and testing it in the fast path: the run-to-completion after irq
> disabling should ensure that no tx can happen after that
> napi_synchronize() completes.
>
> Side note: it's not clear to me where/when irq disabling takes place?!?
>
> /P
>

STMMAC_DOWN is currently in a strange place: it is only used in XDP, is only
ever set/cleared in stmmac_reset_subtask(), and is only applied in one path.
So for starters we wanted to unify it across the XDP paths. It could probably
be replaced by detecting netlink state, but that is a topic for a different
patch.

That aside, stmmac_disable_all_queues() already has a measure for XDP to
drain. It runs after phylink stop but before the IRQs are stopped in
stmmac_free_irq() (or rather freed, which should stop them, unless shared).
It seems that this is not enough, as we managed to crash in the test
environment. So there is a moment between phylink_stop() and stop_all_dma()
where packets are processed, even though there was a synchronize_rcu() in
between. Using napi_synchronize() or synchronize_rcu() AFTER stop_all_dma()
should be sufficient, but we will need to retest as you said. The question is
whether we actually care about scraps of data when killing the interface.

That generally still makes the STMMAC_DOWN flag almost completely useless.
I mean, it should still be added to the other paths, but it doesn't do much
other than during abnormal behavior. Meanwhile, stmmac_xdp_release() handles
the above in a different order and there should be no issue there.

So we will retest.

BR
Jakub Raczynski
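
Editorial sketch of the ordering argument above, as a toy event model rather than a statement about what the driver actually does. All names are hypothetical. The model assumes that a packet arriving while DMA is still running starts a poll, that a synchronization step waits for any in-flight poll to finish, and that freeing the rings while a poll is in flight is the bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_event {
	MODEL_EV_PACKET,	/* a frame arrives */
	MODEL_EV_SYNC,		/* wait for in-flight polls to finish */
	MODEL_EV_STOP_DMA,	/* DMA stopped: no new frames are delivered */
	MODEL_EV_FREE,		/* rings and page pools are freed */
};

/* Returns true if the sequence frees resources while a poll is in flight. */
static bool model_frees_under_poll(const enum model_event *ev, size_t n)
{
	bool dma_running = true;
	bool poll_in_flight = false;

	for (size_t i = 0; i < n; i++) {
		switch (ev[i]) {
		case MODEL_EV_PACKET:
			if (dma_running)
				poll_in_flight = true;
			break;
		case MODEL_EV_SYNC:
			poll_in_flight = false;
			break;
		case MODEL_EV_STOP_DMA:
			dma_running = false;
			break;
		case MODEL_EV_FREE:
			if (poll_in_flight)
				return true;
			break;
		}
	}
	return false;
}
```

With the sequence PACKET, SYNC, PACKET, STOP_DMA, FREE the model reports a free under an in-flight poll, because the second packet arrives after the only synchronization point. Adding a second SYNC after STOP_DMA makes the same sequence safe in this model, which mirrors the claim that synchronizing after stop_all_dma() should be sufficient.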