* [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
@ 2025-12-22 1:56 Xiaolei Wang
2025-12-30 10:30 ` patchwork-bot+netdevbpf
2026-02-19 14:34 ` Alexander Dahl
0 siblings, 2 replies; 5+ messages in thread
From: Xiaolei Wang @ 2025-12-22 1:56 UTC (permalink / raw)
To: pabeni, nicolas.ferre, claudiu.beznea, andrew+netdev, davem,
edumazet, kuba, rmk+kernel, Kexin.Hao, Xiaolei.Wang
Cc: netdev, stable, linux-kernel
In the non-RT kernel, local_bh_disable() merely disables preemption,
whereas it maps to an actual spin lock in the RT kernel. Consequently,
when attempting to refill RX buffers via netdev_alloc_skb() in
macb_mac_link_up(), a deadlock scenario arises as follows:
WARNING: possible circular locking dependency detected
6.18.0-08691-g2061f18ad76e #39 Not tainted
------------------------------------------------------
kworker/0:0/8 is trying to acquire lock:
ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c
but task is already holding lock:
ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&queue->tx_ptr_lock){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x148/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #2 (_xmit_ETHER#2){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
sch_direct_xmit+0x11c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}:
lock_release+0x250/0x348
__local_bh_enable_ip+0x7c/0x240
__netdev_alloc_skb+0x1b4/0x1d8
gem_rx_refill+0xdc/0x240
gem_init_rings+0xb4/0x108
macb_mac_link_up+0x9c/0x2b4
phylink_resolve+0x170/0x614
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #0 (&bp->lock){+.+.}-{3:3}:
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
&bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&queue->tx_ptr_lock);
lock(_xmit_ETHER#2);
lock(&queue->tx_ptr_lock);
lock(&bp->lock);
*** DEADLOCK ***
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0xa0/0xf0
dump_stack+0x18/0x24
print_circular_bug+0x28c/0x370
check_noncircular+0x198/0x1ac
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
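For readers less familiar with lockdep reports: the chain above can be read as a directed graph of "lock A was held while lock B was taken" edges, and the warning fires when a new acquisition would close a loop. The following is a minimal user-space sketch of that idea in C, not lockdep's actual algorithm. The names lock_class, lock_edge, reaches and closes_cycle are made up for illustration, and the edge lists are one reading of entries #0 to #3 of the report, assuming that macb_mac_link_up() holds bp->lock while gem_init_rings() runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Lock classes named in the lockdep report (illustrative enum, not kernel code). */
enum lock_class {
	BP_LOCK,	/* &bp->lock */
	SOFTIRQ_CTRL,	/* (softirq_ctrl.lock), a real lock only on PREEMPT_RT */
	XMIT_ETHER,	/* _xmit_ETHER#2 */
	TX_PTR_LOCK,	/* &queue->tx_ptr_lock */
	NR_LOCK_CLASSES
};

/* One recorded dependency: 'acquired' was taken while 'held' was held. */
struct lock_edge {
	enum lock_class held;
	enum lock_class acquired;
};

/* Assumed reading of report entries #1..#3 on an RT kernel before the patch. */
static const struct lock_edge edges_rt_link_up[] = {
	{ BP_LOCK, SOFTIRQ_CTRL },	/* #1: skb allocation in the link-up path */
	{ SOFTIRQ_CTRL, XMIT_ETHER },	/* #2: transmit path with BHs disabled */
	{ XMIT_ETHER, TX_PTR_LOCK },	/* #3: macb_start_xmit() */
};

/* Same graph without the allocation under bp->lock. */
static const struct lock_edge edges_without_link_up_alloc[] = {
	{ SOFTIRQ_CTRL, XMIT_ETHER },
	{ XMIT_ETHER, TX_PTR_LOCK },
};

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reaches(const struct lock_edge *edges, size_t n,
		    enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	if (seen[from])
		return false;
	seen[from] = true;
	for (size_t i = 0; i < n; i++) {
		if (edges[i].held == from &&
		    reaches(edges, n, edges[i].acquired, to, seen))
			return true;
	}
	return false;
}

/* A new edge closes a cycle if its target already leads back to its source. */
static bool closes_cycle(const struct lock_edge *edges, size_t n,
			 struct lock_edge new_edge)
{
	bool seen[NR_LOCK_CLASSES] = { false };

	assert(new_edge.held < NR_LOCK_CLASSES);
	assert(new_edge.acquired < NR_LOCK_CLASSES);
	return reaches(edges, n, new_edge.acquired, new_edge.held, seen);
}
```

In this model, entry #0 (taking bp->lock under tx_ptr_lock) corresponds to the edge { TX_PTR_LOCK, BP_LOCK }. Checking it against edges_rt_link_up reports a cycle, while checking it against edges_without_link_up_alloc does not.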
Notably, invoking the mog_init_rings() callback upon link establishment
is unnecessary. Instead, we can exclusively call mog_init_rings() within
the ndo_open() callback. This adjustment resolves the deadlock issue.
Furthermore, since MACB_CAPS_MACB_IS_EMAC cases do not use mog_init_rings()
when opening the network interface via at91ether_open(), moving
mog_init_rings() to macb_open() also eliminates the MACB_CAPS_MACB_IS_EMAC
check.
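To make the effect of the move concrete, here is a toy model in C. It is only a sketch: toy_nic, toy_init_rings, toy_open and toy_link_up are hypothetical names, not driver functions, and the "patched" flag merely selects between the old and the new call site. It assumes that the link-up path runs with bp->lock held, as the lockdep chain suggests, and it only counts ring initialisations; it does not model the hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the real struct macb. */
struct toy_nic {
	bool opened;
	bool bp_lock_held;		/* models holding bp->lock */
	int ring_inits;			/* how often the rings were (re)built */
	int allocs_under_bp_lock;	/* ring refills done while bp->lock was held */
};

/* Models mog_init_rings(): refills RX buffers, i.e. allocates skbs. */
static void toy_init_rings(struct toy_nic *nic)
{
	nic->ring_inits++;
	if (nic->bp_lock_held)
		nic->allocs_under_bp_lock++;
}

/* Models the open path; with the patch the rings are set up here. */
static void toy_open(struct toy_nic *nic, bool patched)
{
	assert(!nic->opened);
	if (patched)
		toy_init_rings(nic);
	nic->opened = true;
}

/* Models the link-up path, which is assumed to run under bp->lock. */
static void toy_link_up(struct toy_nic *nic, bool patched)
{
	assert(nic->opened);
	nic->bp_lock_held = true;
	if (!patched)
		toy_init_rings(nic);
	nic->bp_lock_held = false;
}
```

Running one open followed by two link-up events with patched set to false performs two ring initialisations, both under the modelled bp->lock. With patched set to true there is a single initialisation at open time and none under the lock.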
Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")
Cc: stable@vger.kernel.org
Suggested-by: Kevin Hao <kexin.hao@windriver.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
---
V1: https://patchwork.kernel.org/project/netdevbpf/patch/20251128103647.351259-1-xiaolei.wang@windriver.com/
V2: Correct the lock dependency chain and add the Fixes tag.
V3: Update the commit log: add the full deadlock log and explain that, because MACB_CAPS_MACB_IS_EMAC cases do not
use mog_init_rings(), the MACB_CAPS_MACB_IS_EMAC check is not needed when moving mog_init_rings() to macb_open().
drivers/net/ethernet/cadence/macb_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index ca2386b83473..064fccdcf699 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -744,7 +744,6 @@ static void macb_mac_link_up(struct phylink_config *config,
/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
* cleared the pipeline and control registers.
*/
- bp->macbgem_ops.mog_init_rings(bp);
macb_init_buffers(bp);
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
@@ -2991,6 +2990,8 @@ static int macb_open(struct net_device *dev)
goto pm_exit;
}
+ bp->macbgem_ops.mog_init_rings(bp);
+
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
napi_enable(&queue->napi_rx);
napi_enable(&queue->napi_tx);
--
2.43.0
* Re: [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
2025-12-22 1:56 [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open() Xiaolei Wang
@ 2025-12-30 10:30 ` patchwork-bot+netdevbpf
2026-02-19 14:34 ` Alexander Dahl
1 sibling, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-12-30 10:30 UTC (permalink / raw)
To: Xiaolei Wang
Cc: pabeni, nicolas.ferre, claudiu.beznea, andrew+netdev, davem,
edumazet, kuba, rmk+kernel, Kexin.Hao, Xiaolei.Wang, netdev, stable,
linux-kernel

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Mon, 22 Dec 2025 09:56:24 +0800 you wrote:
> In the non-RT kernel, local_bh_disable() merely disables preemption,
> whereas it maps to an actual spin lock in the RT kernel. Consequently,
> when attempting to refill RX buffers via netdev_alloc_skb() in
> macb_mac_link_up(), a deadlock scenario arises as follows:
>
> WARNING: possible circular locking dependency detected
> 6.18.0-08691-g2061f18ad76e #39 Not tainted
>
> [...]

Here is the summary with links:
- [net,v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
https://git.kernel.org/netdev/net/c/99537d5c476c

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
2025-12-22 1:56 [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open() Xiaolei Wang
2025-12-30 10:30 ` patchwork-bot+netdevbpf
@ 2026-02-19 14:34 ` Alexander Dahl
2026-02-19 14:52 ` Kevin Hao
1 sibling, 1 reply; 5+ messages in thread
From: Alexander Dahl @ 2026-02-19 14:34 UTC (permalink / raw)
To: Xiaolei Wang
Cc: pabeni, nicolas.ferre, claudiu.beznea, andrew+netdev, davem,
edumazet, kuba, rmk+kernel, Kexin.Hao, netdev, stable, linux-kernel,
linux-rt-users

Hello,

this change leads to a system lockup, see below.

On Mon, Dec 22, 2025 at 09:56:24AM +0800, Xiaolei Wang wrote:
> In the non-RT kernel, local_bh_disable() merely disables preemption,
> whereas it maps to an actual spin lock in the RT kernel. Consequently,
> when attempting to refill RX buffers via netdev_alloc_skb() in
> macb_mac_link_up(), a deadlock scenario arises as follows:
>
> WARNING: possible circular locking dependency detected
> 6.18.0-08691-g2061f18ad76e #39 Not tainted
> ------------------------------------------------------
> kworker/0:0/8 is trying to acquire lock:
> ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c
>
> but task is already holding lock:
> ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
>
> which lock already depends on the new lock.
>
> [...]
>
> Notably, invoking the mog_init_rings() callback upon link establishment
> is unnecessary. Instead, we can exclusively call mog_init_rings() within
> the ndo_open() callback. This adjustment resolves the deadlock issue.
> Furthermore, since MACB_CAPS_MACB_IS_EMAC cases do not use mog_init_rings()
> when opening the network interface via at91ether_open(), moving
> mog_init_rings() to macb_open() also eliminates the MACB_CAPS_MACB_IS_EMAC
> check.
>
> Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")
> Cc: stable@vger.kernel.org
> Suggested-by: Kevin Hao <kexin.hao@windriver.com>
> Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
> ---
>
> V1: https://patchwork.kernel.org/project/netdevbpf/patch/20251128103647.351259-1-xiaolei.wang@windriver.com/
> V2: Update the correct lock dependency chain and add the Fix tag.
> V3: update commit log, Add full deadlock log added explanations: because MACB_CAPS_MACB_IS_EMAC cases do not
> use mog_init_rings(), we don't need the MACB_CAPS_MACB_IS_EMAC check when moving mog_init_rings() to macb_open().

After upgrading from 6.12.57-rt14 to 6.12.66-rt15 on a custom at91
sam9x60 based board with PREEMPT_RT patch, we noticed a complete
system lockup, which I bisected to this changeset.

After unplugging and plugging the ethernet cable, while
running PROFINET, system does not respond to anything anymore.
Last message in kernel log is:

[ +8.621919] macb f802c000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off

Heartbeat LED does not blink anymore, no network communication,
serial console does not respond anymore.

Reverting that change locally prevents the system lockup for me, but
what is the proper course of action on kernel side now? Send a revert
to stable? Send a revert to master? Please advise.

(I'm aware there were at least two more patches on netdev referencing
this change, but if I'm not mistaken none of those made it to stable,
right?)

Greets
Alex

P.S.: adding linux-rt-users to Cc

>
> drivers/net/ethernet/cadence/macb_main.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
> index ca2386b83473..064fccdcf699 100644
> --- a/drivers/net/ethernet/cadence/macb_main.c
> +++ b/drivers/net/ethernet/cadence/macb_main.c
> @@ -744,7 +744,6 @@ static void macb_mac_link_up(struct phylink_config *config,
> /* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
> * cleared the pipeline and control registers.
> */
> - bp->macbgem_ops.mog_init_rings(bp);
> macb_init_buffers(bp);
>
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
> @@ -2991,6 +2990,8 @@ static int macb_open(struct net_device *dev)
> goto pm_exit;
> }
>
> + bp->macbgem_ops.mog_init_rings(bp);
> +
> for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
> napi_enable(&queue->napi_rx);
> napi_enable(&queue->napi_tx);
> --
> 2.43.0
>
* Re: [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
2026-02-19 14:34 ` Alexander Dahl
@ 2026-02-19 14:52 ` Kevin Hao
2026-02-19 15:36 ` Alexander Dahl
0 siblings, 1 reply; 5+ messages in thread
From: Kevin Hao @ 2026-02-19 14:52 UTC (permalink / raw)
To: Xiaolei Wang, pabeni, nicolas.ferre, claudiu.beznea,
andrew+netdev, davem, edumazet, kuba, rmk+kernel, netdev, stable,
linux-kernel, linux-rt-users

On Thu, Feb 19, 2026 at 03:34:54PM +0100, Alexander Dahl wrote:
> After upgrading from 6.12.57-rt14 to 6.12.66-rt15 on a custom at91
> sam9x60 based board with PREEMPT_RT patch, we noticed a complete
> system lockup, which I bisected to this changeset.
>
> After unplugging and plugging the ethernet cable, while
> running PROFINET, system does not respond to anything anymore.
> Last message in kernel log is:
>
> [ +8.621919] macb f802c000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
>
> Heartbeat LED does not blink anymore, no network communication,
> serial console does not respond anymore.
>
> Reverting that change locally prevents the system lockup for me, but
> what is the proper course of action on kernel side now? Send a revert
> to stable? Send a revert to master? Please advise.
>
> (I'm aware there were at least two more patches on netdev referencing
> this change, but if I'm not mistaken none of those made it to stable,
> right?)

A fix for this commit is available in the latest mainline kernel. Could you
please verify whether it resolves the issue you encountered?
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bf9cf80cab81e39701861a42877a28295ade266f

Thanks,
Kevin
* Re: [PATCH net v3] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
2026-02-19 14:52 ` Kevin Hao
@ 2026-02-19 15:36 ` Alexander Dahl
0 siblings, 0 replies; 5+ messages in thread
From: Alexander Dahl @ 2026-02-19 15:36 UTC (permalink / raw)
To: Kevin Hao
Cc: Xiaolei Wang, pabeni, nicolas.ferre, claudiu.beznea, andrew+netdev,
davem, edumazet, kuba, rmk+kernel, netdev, stable, linux-kernel,
linux-rt-users

Hello Kevin,

On Thu, Feb 19, 2026 at 10:52:24PM +0800, Kevin Hao wrote:
> On Thu, Feb 19, 2026 at 03:34:54PM +0100, Alexander Dahl wrote:
> > After upgrading from 6.12.57-rt14 to 6.12.66-rt15 on a custom at91
> > sam9x60 based board with PREEMPT_RT patch, we noticed a complete
> > system lockup, which I bisected to this changeset.
> >
> > After unplugging and plugging the ethernet cable, while
> > running PROFINET, system does not respond to anything anymore.
> > Last message in kernel log is:
> >
> > [ +8.621919] macb f802c000.ethernet eth0: Link is Up - 100Mbps/Full - flow control off
> >
> > Heartbeat LED does not blink anymore, no network communication,
> > serial console does not respond anymore.
> >
> > Reverting that change locally prevents the system lockup for me, but
> > what is the proper course of action on kernel side now? Send a revert
> > to stable? Send a revert to master? Please advise.
> >
> > (I'm aware there were at least two more patches on netdev referencing
> > this change, but if I'm not mistaken none of those made it to stable,
> > right?)
>
> A fix for this commit is available in the latest mainline kernel. Could you
> please verify whether it resolves the issue you encountered?
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=bf9cf80cab81e39701861a42877a28295ade266f

Verified by replacing the revert with your patch backported to my
6.12.66-rt15 based tree. I replied to your patch mail, so hopefully
that change hits stable soon. Thanks for your effort.

Greets
Alex

>
> Thanks,
> Kevin