* [net PATCH] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
@ 2025-11-28 10:36 Xiaolei Wang
From: Xiaolei Wang @ 2025-11-28 10:36 UTC (permalink / raw)
To: nicolas.ferre, claudiu.beznea, andrew+netdev, davem, edumazet,
kuba, pabeni, Kexin.Hao, Xiaolei.Wang
Cc: netdev, linux-kernel
On a non-RT kernel, local_bh_disable() merely disables preemption, whereas
on an RT kernel it maps to an actual spinlock. Consequently, refilling RX
buffers via netdev_alloc_skb() in macb_mac_link_up() can deadlock, as follows:
The dependency chain created by macb_mac_link_up():
  &bp->lock --> (softirq_ctrl.lock) --> _xmit_ETHER#2
The dependency chain created by macb_start_xmit():
  _xmit_ETHER#2 --> &bp->lock
Notably, invoking the mog_init_rings() callback on link-up is unnecessary;
it is sufficient to call mog_init_rings() only from the ndo_open() callback,
which resolves the deadlock. Since mog_init_rings() applies only to the
non-MACB_CAPS_MACB_IS_EMAC case, we can simply move it to macb_open() and
drop the MACB_CAPS_MACB_IS_EMAC check.
Suggested-by: Kevin Hao <kexin.hao@windriver.com>
Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
---
drivers/net/ethernet/cadence/macb_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c
index 65bc73f6fa30..7df2ad093bab 100644
--- a/drivers/net/ethernet/cadence/macb_main.c
+++ b/drivers/net/ethernet/cadence/macb_main.c
@@ -775,7 +775,6 @@ static void macb_mac_link_up(struct phylink_config *config,
/* Initialize rings & buffers as clearing MACB_BIT(TE) in link down
* cleared the pipeline and control registers.
*/
- bp->macbgem_ops.mog_init_rings(bp);
macb_init_buffers(bp);
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue)
@@ -3049,6 +3048,8 @@ static int macb_open(struct net_device *dev)
goto pm_exit;
}
+ bp->macbgem_ops.mog_init_rings(bp);
+
for (q = 0, queue = bp->queues; q < bp->num_queues; ++q, ++queue) {
napi_enable(&queue->napi_rx);
napi_enable(&queue->napi_tx);
--
2.43.0
* Re: [net PATCH] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
From: Paolo Abeni @ 2025-12-04 9:25 UTC (permalink / raw)
To: Xiaolei Wang, nicolas.ferre, claudiu.beznea, andrew+netdev, davem,
edumazet, kuba, Kexin.Hao
Cc: netdev, linux-kernel
On 11/28/25 11:36 AM, Xiaolei Wang wrote:
> In the non-RT kernel, local_bh_disable() merely disables preemption,
> whereas it maps to an actual spin lock in the RT kernel. Consequently,
> when attempting to refill RX buffers via netdev_alloc_skb() in
> macb_mac_link_up(), a deadlock scenario arises as follows:
> The dependency chain caused by macb_mac_link_up():
> &bp->lock --> (softirq_ctrl.lock) --> _xmit_ETHER#2
I'm sorry, but I can't see how this dependency chain is caused by
mog_init_rings(). Please extend the information above, pinpointing the
function/code that actually acquires the lock and how it is reached.
> The dependency chain caused by macb_start_xmit():
> _xmit_ETHER#2 --> &bp->lock
>
> Notably, invoking the mog_init_rings() callback upon link establishment
> is unnecessary. Instead, we can exclusively call mog_init_rings() within
> the ndo_open() callback. This adjustment resolves the deadlock issue.
> Given that mog_init_rings() is only applicable to
> non-MACB_CAPS_MACB_IS_EMAC cases, we can simply move it to macb_open()
> and simultaneously eliminate the MACB_CAPS_MACB_IS_EMAC check.
>
> Suggested-by: Kevin Hao <kexin.hao@windriver.com>
> Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
Please include a suitable fixes tag.
Thanks,
Paolo
* Re: [net PATCH] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
From: Kevin Hao @ 2025-12-05 7:36 UTC (permalink / raw)
To: Paolo Abeni
Cc: Xiaolei Wang, nicolas.ferre, claudiu.beznea, andrew+netdev, davem,
edumazet, kuba, netdev, linux-kernel
On Thu, Dec 04, 2025 at 10:25:04AM +0100, Paolo Abeni wrote:
>
> On 11/28/25 11:36 AM, Xiaolei Wang wrote:
> > In the non-RT kernel, local_bh_disable() merely disables preemption,
> > whereas it maps to an actual spin lock in the RT kernel. Consequently,
> > when attempting to refill RX buffers via netdev_alloc_skb() in
> > macb_mac_link_up(), a deadlock scenario arises as follows:
> > The dependency chain caused by macb_mac_link_up():
> > &bp->lock --> (softirq_ctrl.lock) --> _xmit_ETHER#2
>
> I'm sorry, but I can't see how this dependency chain is caused by
> mog_init_rings(), please extend the above info pin pointing the
> function/code effectively acquiring the lock and how it's reached.
Apologies for the confusion. It appears I made an error in the lock chain description.
The correct lock dependency chain is as follows:
Chain caused by macb_mac_link_up():
&bp->lock --> (softirq_ctrl.lock)
Chain caused by macb_start_xmit():
(softirq_ctrl.lock) --> _xmit_ETHER#2 --> &bp->lock
Below is the complete log output from lockdep:
======================================================
WARNING: possible circular locking dependency detected
6.18.0-08691-g2061f18ad76e #39 Not tainted
------------------------------------------------------
kworker/0:0/8 is trying to acquire lock:
ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c
but task is already holding lock:
ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #3 (&queue->tx_ptr_lock){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x148/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #2 (_xmit_ETHER#2){+...}-{3:3}:
rt_spin_lock+0x50/0x1f0
sch_direct_xmit+0x11c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}:
lock_release+0x250/0x348
__local_bh_enable_ip+0x7c/0x240
__netdev_alloc_skb+0x1b4/0x1d8
gem_rx_refill+0xdc/0x240
gem_init_rings+0xb4/0x108
macb_mac_link_up+0x9c/0x2b4
phylink_resolve+0x170/0x614
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
-> #0 (&bp->lock){+.+.}-{3:3}:
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of:
&bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&queue->tx_ptr_lock);
lock(_xmit_ETHER#2);
lock(&queue->tx_ptr_lock);
lock(&bp->lock);
*** DEADLOCK ***
16 locks held by kworker/0:0/8:
#0: ffff000800905b38 ((wq_completion)mld){+.+.}-{0:0}, at: process_one_work+0x190/0x5f8
#1: ffff800082943d80 ((work_completion)(&(&idev->mc_ifc_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1b8/0x5f8
#2: ffff0008036a1620 (&idev->mc_lock){+.+.}-{4:4}, at: mld_ifc_work+0x34/0x4f4
#3: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: mld_sendpack+0x0/0x68c
#4: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: ip6_output+0x44/0x4a4
#5: ffff800081b7d158 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x2c/0x2ac
#6: ffff00087f6e4588 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x1d0/0x2ac
#7: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
#8: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0x1b8/0x2ac
#9: ffff800081bc0260 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x64/0x1120
#10: ffff000803711328 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{3:3}, at: __dev_queue_xmit+0x694/0x1120
#11: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x40/0x170
#12: ffff0008001e3558 (_xmit_ETHER#2){+...}-{3:3}, at: sch_direct_xmit+0x11c/0x37c
#13: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
#14: ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
#15: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
stack backtrace:
CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.18.0-08691-g2061f18ad76e #39 PREEMPT_RT
Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
Workqueue: mld mld_ifc_work
Call trace:
show_stack+0x18/0x24 (C)
dump_stack_lvl+0xa0/0xf0
dump_stack+0x18/0x24
print_circular_bug+0x28c/0x370
check_noncircular+0x198/0x1ac
__lock_acquire+0x15a8/0x2084
lock_acquire+0x1cc/0x350
rt_spin_lock+0x50/0x1f0
macb_start_xmit+0x808/0xb7c
dev_hard_start_xmit+0x94/0x284
sch_direct_xmit+0x8c/0x37c
__dev_queue_xmit+0x708/0x1120
neigh_resolve_output+0x148/0x28c
ip6_finish_output2+0x2c0/0xb2c
__ip6_finish_output+0x114/0x308
ip6_output+0xc4/0x4a4
mld_sendpack+0x220/0x68c
mld_ifc_work+0x2a8/0x4f4
process_one_work+0x20c/0x5f8
worker_thread+0x1b0/0x35c
kthread+0x144/0x200
ret_from_fork+0x10/0x20
>
> > The dependency chain caused by macb_start_xmit():
> > _xmit_ETHER#2 --> &bp->lock
> >
> > Notably, invoking the mog_init_rings() callback upon link establishment
> > is unnecessary. Instead, we can exclusively call mog_init_rings() within
> > the ndo_open() callback. This adjustment resolves the deadlock issue.
> > Given that mog_init_rings() is only applicable to
> > non-MACB_CAPS_MACB_IS_EMAC cases, we can simply move it to macb_open()
> > and simultaneously eliminate the MACB_CAPS_MACB_IS_EMAC check.
> >
> > Suggested-by: Kevin Hao <kexin.hao@windriver.com>
> > Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
>
> Please include a suitable fixes tag.
This issue has been present since commit 633e98a711ac ("net: macb: use
resolved link config in mac_link_up()"), which introduced the acquisition
of bp->lock in macb_mac_link_up(). Therefore, I will add:
Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")
Thanks,
Kevin