* [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Oscar Maes @ 2026-05-28 14:03 UTC
To: netdev; +Cc: edumazet, kuba, pabeni, andrew, Oscar Maes

napi_complete_done may call gro_flush_normal (though not currently, as GRO
is unsupported at the moment), which may result in packet TX. This will
eventually result in calling pcnet32_start_xmit - resulting in a deadlock
while trying to re-acquire the already locked spin lock.

It is safe to split the spinlock block into two, because the hardware
registers are still protected from concurrent access, and the two blocks
perform unrelated operations that don't need to happen atomically.

Fixes: 5b2ec6f2be51 ("pcnet32: use napi_complete_done()")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
---
NOTE: This patch was a part of the following net-next series:
https://lore.kernel.org/netdev/20260525125437.4061-2-oscmaes92@gmail.com/

 drivers/net/ethernet/amd/pcnet32.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c
index 911808ab13a7..4f3076d4ea34 100644
--- a/drivers/net/ethernet/amd/pcnet32.c
+++ b/drivers/net/ethernet/amd/pcnet32.c
@@ -1407,8 +1407,10 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
 		pcnet32_restart(dev, CSR0_START);
 		netif_wake_queue(dev);
 	}
+	spin_unlock_irqrestore(&lp->lock, flags);
 
 	if (work_done < budget && napi_complete_done(napi, work_done)) {
+		spin_lock_irqsave(&lp->lock, flags);
 		/* clear interrupt masks */
 		val = lp->a->read_csr(ioaddr, CSR3);
 		val &= 0x00ff;
@@ -1416,9 +1418,9 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
 
 		/* Set interrupt enable. */
 		lp->a->write_csr(ioaddr, CSR0, CSR0_INTEN);
+		spin_unlock_irqrestore(&lp->lock, flags);
 	}
 
-	spin_unlock_irqrestore(&lp->lock, flags);
 	return work_done;
 }
--
2.39.5
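
The deadlock described in the commit message can be modelled in user space. The following is only a minimal sketch under stated assumptions: `struct fake_priv`, `fake_start_xmit()`, `fake_napi_complete_done()` and the two `poll_*` functions are hypothetical stand-ins, not driver or kernel code, and a C11 `atomic_flag` plays the role of the non-recursive `lp->lock`. The fake transmit path does a single test-and-set instead of spinning, so the would-be deadlock is counted rather than hanging the program.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private data: just the lock
 * plus counters so the outcome can be inspected. */
struct fake_priv {
	atomic_flag lock;   /* models lp->lock as a non-recursive spinlock */
	int tx_attempts;    /* times the fake xmit path was entered */
	int tx_blocked;     /* times it found the lock already held */
};

static void fake_priv_init(struct fake_priv *lp)
{
	atomic_flag_clear(&lp->lock);
	lp->tx_attempts = 0;
	lp->tx_blocked = 0;
}

static void fake_lock(struct fake_priv *lp)
{
	while (atomic_flag_test_and_set(&lp->lock))
		; /* spin, as a real spinlock would */
}

static void fake_unlock(struct fake_priv *lp)
{
	atomic_flag_clear(&lp->lock);
}

/* Models pcnet32_start_xmit(), which takes the device lock. A single
 * test-and-set is used instead of spinning so the demo records the
 * would-be deadlock rather than hanging. */
static void fake_start_xmit(struct fake_priv *lp)
{
	lp->tx_attempts++;
	if (atomic_flag_test_and_set(&lp->lock)) {
		lp->tx_blocked++;   /* a real spin_lock would never return */
		return;
	}
	fake_unlock(lp);
}

/* Models napi_complete_done() ending up in the TX path. Assumption:
 * this happens on every call, the worst case for the demo. */
static bool fake_napi_complete_done(struct fake_priv *lp)
{
	fake_start_xmit(lp);
	return true;
}

/* Lock layout before the patch: one section spanning everything. */
static void poll_before_patch(struct fake_priv *lp)
{
	fake_lock(lp);
	/* ring processing would go here */
	if (fake_napi_complete_done(lp)) {
		/* interrupt unmasking would go here */
	}
	fake_unlock(lp);
}

/* Lock layout after the patch: two sections, none around the NAPI call. */
static void poll_after_patch(struct fake_priv *lp)
{
	fake_lock(lp);
	/* ring processing would go here */
	fake_unlock(lp);

	if (fake_napi_complete_done(lp)) {
		fake_lock(lp);
		/* interrupt unmasking would go here */
		fake_unlock(lp);
	}
}
```

In this model, calling poll_before_patch() on a freshly initialised struct records one blocked transmit attempt, while poll_after_patch() records none. It says nothing about interrupt context or other CPUs, which the real spin_lock_irqsave() also has to handle.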

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Alexander Lobakin @ 2026-05-28 14:55 UTC
To: Oscar Maes; +Cc: netdev, edumazet, kuba, pabeni, andrew

From: Oscar Maes <oscmaes92@gmail.com>
Date: Thu, 28 May 2026 16:03:20 +0200

> napi_complete_done may call gro_flush_normal (though not currently, as GRO
> is unsupported at the moment), which may result in packet TX. This will
> eventually result in calling pcnet32_start_xmit - resulting in a deadlock
> while trying to re-acquire the already locked spin lock.
>
> It is safe to split the spinlock block into two, because the hardware
> registers are still protected from concurrent access, and the two blocks
> perform unrelated operations that don't need to happen atomically.
>
> Fixes: 5b2ec6f2be51 ("pcnet32: use napi_complete_done()")
> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Oscar Maes <oscmaes92@gmail.com>
> ---
> NOTE: This patch was a part of the following net-next series:
> https://lore.kernel.org/netdev/20260525125437.4061-2-oscmaes92@gmail.com/
>
>  drivers/net/ethernet/amd/pcnet32.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c
> index 911808ab13a7..4f3076d4ea34 100644
> --- a/drivers/net/ethernet/amd/pcnet32.c
> +++ b/drivers/net/ethernet/amd/pcnet32.c
> @@ -1407,8 +1407,10 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
>  		pcnet32_restart(dev, CSR0_START);
>  		netif_wake_queue(dev);
>  	}
> +	spin_unlock_irqrestore(&lp->lock, flags);
>
>  	if (work_done < budget && napi_complete_done(napi, work_done)) {
> +		spin_lock_irqsave(&lp->lock, flags);
>  		/* clear interrupt masks */
>  		val = lp->a->read_csr(ioaddr, CSR3);
>  		val &= 0x00ff;
> @@ -1416,9 +1418,9 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
>
>  		/* Set interrupt enable. */
>  		lp->a->write_csr(ioaddr, CSR0, CSR0_INTEN);
> +		spin_unlock_irqrestore(&lp->lock, flags);
>  	}
>
> -	spin_unlock_irqrestore(&lp->lock, flags);
>  	return work_done;

While this fix is valid, I'm wondering whether this needs a deeper
rework as it's generally a very bad idea to have IRQs disabled when
NAPI-polling (except for very short sections).

Could these irqoff sections get narrowed down to reading/writing
interrupt registers only?

Thanks,
Olek

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Oscar Maes @ 2026-05-28 17:07 UTC
To: Alexander Lobakin; +Cc: netdev, edumazet, kuba, pabeni, andrew

On Thu, May 28, 2026 at 04:55:55PM +0200, Alexander Lobakin wrote:
> From: Oscar Maes <oscmaes92@gmail.com>
> Date: Thu, 28 May 2026 16:03:20 +0200
>
> > [...]
>
> While this fix is valid, I'm wondering whether this needs a deeper
> rework as it's generally a very bad idea to have IRQs disabled when
> NAPI-polling (except for very short sections).
>
> Could these irqoff sections get narrowed down to reading/writing
> interrupt registers only?
>
> Thanks,
> Olek

Sounds like a good idea, however, I think it's out of scope for this bug fix.

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Alexander Lobakin @ 2026-05-29 15:21 UTC
To: Oscar Maes; +Cc: netdev, edumazet, kuba, pabeni, andrew

From: Oscar Maes <oscmaes92@gmail.com>
Date: Thu, 28 May 2026 19:07:47 +0200

> On Thu, May 28, 2026 at 04:55:55PM +0200, Alexander Lobakin wrote:
>> From: Oscar Maes <oscmaes92@gmail.com>
>> Date: Thu, 28 May 2026 16:03:20 +0200
>>
>>> [...]
>>
>> While this fix is valid, I'm wondering whether this needs a deeper
>> rework as it's generally a very bad idea to have IRQs disabled when
>> NAPI-polling (except for very short sections).
>>
>> Could these irqoff sections get narrowed down to reading/writing
>> interrupt registers only?
>>
>> Thanks,
>> Olek
>
> Sounds like a good idea, however, I think it's out of scope for this bug fix.

Up to you, you might want to improve these sections later and send to
net-next, since (as far as I can see) they didn't cause any issues
except this one.
Just keep in mind that NAPI callbacks should generally either avoid
irqoff sections entirely or keep them to a minimum. In particular, they
should not pass skbs to the netstack with IRQs disabled, and should not
call any NAPI/networking-related functions under irqoff.

For this particular fix:

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>

How long has this issue been present in the kernel? You may want to add
"Cc: stable@vger.kernel.org" after the "Fixes:" tag so that the patch is
marked as a candidate for backporting to LTSes.

Thanks,
Olek

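
One possible shape of the narrowing suggested above, where the lock covers only the interrupt register accesses, is sketched below as a user-space model. Everything in it is an assumption for illustration: `struct fake_dev`, the `FAKE_*` constants, `fake_unmask_irqs()` and `fake_poll()` are hypothetical names, the register file is a plain array, and the napi_complete_done() step is left out. In particular, the sketch does not show that the real pcnet32 rings could be processed without the lock; the thread leaves that question open.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* All names and values below are illustrative, not the driver's. */
enum { FAKE_CSR0 = 0, FAKE_CSR3 = 3, FAKE_NUM_CSR = 4 };
#define FAKE_CSR0_INTEN 0x0040u

struct fake_dev {
	atomic_flag lock;            /* models lp->lock */
	uint16_t csr[FAKE_NUM_CSR];  /* models the CSR register file */
	int pending;                 /* packets waiting in the fake ring */
};

static void fake_dev_init(struct fake_dev *d, int pending, uint16_t csr3)
{
	atomic_flag_clear(&d->lock);
	for (int i = 0; i < FAKE_NUM_CSR; i++)
		d->csr[i] = 0;
	d->csr[FAKE_CSR3] = csr3;
	d->pending = pending;
}

/* The only locked section: the read-modify-write of the mask register
 * and the interrupt-enable write, mirroring the second block of the
 * patched pcnet32_poll(). */
static void fake_unmask_irqs(struct fake_dev *d)
{
	while (atomic_flag_test_and_set(&d->lock))
		; /* spin */
	d->csr[FAKE_CSR3] &= 0x00ff;          /* clear interrupt masks */
	d->csr[FAKE_CSR0] = FAKE_CSR0_INTEN;  /* set interrupt enable */
	atomic_flag_clear(&d->lock);
}

static int fake_poll(struct fake_dev *d, int budget)
{
	int work_done = 0;

	/* Ring processing runs unlocked in this sketch (an assumption). */
	while (work_done < budget && d->pending > 0) {
		d->pending--;
		work_done++;
	}

	/* Only re-enable interrupts when the ring was drained within budget. */
	if (work_done < budget)
		fake_unmask_irqs(d);

	return work_done;
}
```

The point of the layout is that the time spent with the lock held no longer depends on how many packets the poll handles; it is bounded by two register accesses.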
* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Jakub Kicinski @ 2026-06-02 2:44 UTC
To: oscmaes92; +Cc: Jakub Kicinski, netdev, edumazet, pabeni, andrew

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
pcnet32: stop holding device spin lock during napi_complete_done

This commit removes the lp->lock spinlock hold around napi_complete_done()
to avoid a potential deadlock with pcnet32_start_xmit() if a GRO flush
were to occur.

> diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c
> index 911808ab13a79..4f3076d4ea34e 100644
> --- a/drivers/net/ethernet/amd/pcnet32.c
> +++ b/drivers/net/ethernet/amd/pcnet32.c
> @@ -1407,8 +1407,10 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
>  		pcnet32_restart(dev, CSR0_START);
>  		netif_wake_queue(dev);
>  	}
> +	spin_unlock_irqrestore(&lp->lock, flags);
>
>  	if (work_done < budget && napi_complete_done(napi, work_done)) {
> +		spin_lock_irqsave(&lp->lock, flags);

[Severity: High]
Does dropping the spinlock before napi_complete_done() introduce a race
condition with shared IRQs that breaks NAPI interrupt mitigation?

If napi_complete_done() clears the NAPIF_STATE_SCHED bit, marking the
NAPI poll as complete, a shared IRQ could fire on another CPU.

If pcnet32_interrupt() then acquires lp->lock and checks csr0 without
checking the INTR bit, could it process pending events and successfully
call napi_schedule_prep()?

If pcnet32_interrupt() schedules NAPI and releases the lock, this code
would then resume, re-acquire lp->lock, and unconditionally unmask
device interrupts.

This could leave the system in a state where NAPI is scheduled but
hardware interrupts are enabled, resulting in an interrupt storm on
every subsequent packet.

>  		/* clear interrupt masks */
>  		val = lp->a->read_csr(ioaddr, CSR3);
>  		val &= 0x00ff;
> @@ -1416,9 +1418,9 @@ static int pcnet32_poll(struct napi_struct *napi, int budget)
>
>  		/* Set interrupt enable. */
>  		lp->a->write_csr(ioaddr, CSR0, CSR0_INTEN);
> +		spin_unlock_irqrestore(&lp->lock, flags);
>  	}
>
> -	spin_unlock_irqrestore(&lp->lock, flags);
>  	return work_done;
>  }
--
pw-bot: cr

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Oscar Maes @ 2026-06-02 16:41 UTC
To: Jakub Kicinski; +Cc: netdev, edumazet, pabeni, andrew

On Mon, Jun 01, 2026 at 07:44:28PM -0700, Jakub Kicinski wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> pcnet32: stop holding device spin lock during napi_complete_done
>
> This commit removes the lp->lock spinlock hold around napi_complete_done()
> to avoid a potential deadlock with pcnet32_start_xmit() if a GRO flush
> were to occur.
>
> > [...]
>
> [Severity: High]
> Does dropping the spinlock before napi_complete_done() introduce a race
> condition with shared IRQs that breaks NAPI interrupt mitigation?
>
> If napi_complete_done() clears the NAPIF_STATE_SCHED bit, marking the
> NAPI poll as complete, a shared IRQ could fire on another CPU.
>
> If pcnet32_interrupt() then acquires lp->lock and checks csr0 without
> checking the INTR bit, could it process pending events and successfully
> call napi_schedule_prep()?
>
> If pcnet32_interrupt() schedules NAPI and releases the lock, this code
> would then resume, re-acquire lp->lock, and unconditionally unmask
> device interrupts.
>
> This could leave the system in a state where NAPI is scheduled but
> hardware interrupts are enabled, resulting in an interrupt storm on
> every subsequent packet.
>
> > [...]
> --
> pw-bot: cr

The suggested race condition is impossible, even after
calling napi_complete_done all pcnet32 interrupts are masked via CSR3.
pcnet32_interrupt would never be called until the second locked
section in pcnet32_poll, which unmasks the interrupts.

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: Jakub Kicinski @ 2026-06-02 18:34 UTC
To: Oscar Maes; +Cc: netdev, edumazet, pabeni, andrew

On Tue, 2 Jun 2026 18:41:58 +0200 Oscar Maes wrote:
> The suggested race condition is impossible, even after
> calling napi_complete_done all pcnet32 interrupts are masked via CSR3.
> pcnet32_interrupt would never be called until the second locked
> section in pcnet32_poll, which unmasks the interrupts.

Not very clear to me that pcnet32_interrupt() will not trigger until
we unmask, but okay, not worth the effort..

* Re: [PATCH net] pcnet32: stop holding device spin lock during napi_complete_done
From: patchwork-bot+netdevbpf @ 2026-06-02 18:40 UTC
To: Oscar Maes; +Cc: netdev, edumazet, kuba, pabeni, andrew

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 28 May 2026 16:03:20 +0200 you wrote:
> napi_complete_done may call gro_flush_normal (though not currently, as GRO
> is unsupported at the moment), which may result in packet TX. This will
> eventually result in calling pcnet32_start_xmit - resulting in a deadlock
> while trying to re-acquire the already locked spin lock.
>
> It is safe to split the spinlock block into two, because the hardware
> registers are still protected from concurrent access, and the two blocks
> perform unrelated operations that don't need to happen atomically.
>
> [...]

Here is the summary with links:
  - [net] pcnet32: stop holding device spin lock during napi_complete_done
    https://git.kernel.org/netdev/net/c/73bf3cca7de6

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html