public inbox for netdev@vger.kernel.org
From: Kevin Hao <kexin.hao@windriver.com>
To: Paolo Abeni <pabeni@redhat.com>
Cc: Xiaolei Wang <xiaolei.wang@windriver.com>,
	nicolas.ferre@microchip.com, claudiu.beznea@tuxon.dev,
	andrew+netdev@lunn.ch, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [net PATCH] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open()
Date: Fri, 5 Dec 2025 15:36:01 +0800	[thread overview]
Message-ID: <aTKLYc7-lc7lfmvI@pek-khao-d3> (raw)
In-Reply-To: <1b9ade5b-bfa9-4bcd-9bc4-6457dffcd887@redhat.com>


On Thu, Dec 04, 2025 at 10:25:04AM +0100, Paolo Abeni wrote:
> 
> On 11/28/25 11:36 AM, Xiaolei Wang wrote:
> > In the non-RT kernel, local_bh_disable() merely disables preemption,
> > whereas it maps to an actual spin lock in the RT kernel. Consequently,
> > when attempting to refill RX buffers via netdev_alloc_skb() in
> > macb_mac_link_up(), a deadlock scenario arises as follows:
> >   The dependency chain caused by macb_mac_link_up():
> >   &bp->lock --> (softirq_ctrl.lock) --> _xmit_ETHER#2
> 
> I'm sorry, but I can't see how this dependency chain is caused by
> mog_init_rings(). Please extend the above info, pinpointing the
> function/code effectively acquiring the lock and how it's reached.

Apologies for the confusion. It appears I made an error in the lock chain description.

The correct lock dependency chain is as follows:

   Chain caused by macb_mac_link_up():
   &bp->lock --> (softirq_ctrl.lock)

   Chain caused by macb_start_xmit():
   (softirq_ctrl.lock) --> _xmit_ETHER#2 --> &bp->lock

Below is the complete log output from lockdep:
   ======================================================
   WARNING: possible circular locking dependency detected
   6.18.0-08691-g2061f18ad76e #39 Not tainted
   ------------------------------------------------------
   kworker/0:0/8 is trying to acquire lock:
   ffff00080369bbe0 (&bp->lock){+.+.}-{3:3}, at: macb_start_xmit+0x808/0xb7c
   
   but task is already holding lock:
   ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
   
   which lock already depends on the new lock.
   
   
   the existing dependency chain (in reverse order) is:
   
   -> #3 (&queue->tx_ptr_lock){+...}-{3:3}:
          rt_spin_lock+0x50/0x1f0
          macb_start_xmit+0x148/0xb7c
          dev_hard_start_xmit+0x94/0x284
          sch_direct_xmit+0x8c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20
   
   -> #2 (_xmit_ETHER#2){+...}-{3:3}:
          rt_spin_lock+0x50/0x1f0
          sch_direct_xmit+0x11c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20
   
   -> #1 ((softirq_ctrl.lock)){+.+.}-{3:3}:
          lock_release+0x250/0x348
          __local_bh_enable_ip+0x7c/0x240
          __netdev_alloc_skb+0x1b4/0x1d8
          gem_rx_refill+0xdc/0x240
          gem_init_rings+0xb4/0x108
          macb_mac_link_up+0x9c/0x2b4
          phylink_resolve+0x170/0x614
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20
   
   -> #0 (&bp->lock){+.+.}-{3:3}:
          __lock_acquire+0x15a8/0x2084
          lock_acquire+0x1cc/0x350
          rt_spin_lock+0x50/0x1f0
          macb_start_xmit+0x808/0xb7c
          dev_hard_start_xmit+0x94/0x284
          sch_direct_xmit+0x8c/0x37c
          __dev_queue_xmit+0x708/0x1120
          neigh_resolve_output+0x148/0x28c
          ip6_finish_output2+0x2c0/0xb2c
          __ip6_finish_output+0x114/0x308
          ip6_output+0xc4/0x4a4
          mld_sendpack+0x220/0x68c
          mld_ifc_work+0x2a8/0x4f4
          process_one_work+0x20c/0x5f8
          worker_thread+0x1b0/0x35c
          kthread+0x144/0x200
          ret_from_fork+0x10/0x20
   
   other info that might help us debug this:
   
   Chain exists of:
     &bp->lock --> _xmit_ETHER#2 --> &queue->tx_ptr_lock
   
    Possible unsafe locking scenario:
   
          CPU0                    CPU1
          ----                    ----
     lock(&queue->tx_ptr_lock);
                                  lock(_xmit_ETHER#2);
                                  lock(&queue->tx_ptr_lock);
     lock(&bp->lock);
   
    *** DEADLOCK ***
   
   16 locks held by kworker/0:0/8:
    #0: ffff000800905b38 ((wq_completion)mld){+.+.}-{0:0}, at: process_one_work+0x190/0x5f8
    #1: ffff800082943d80 ((work_completion)(&(&idev->mc_ifc_work)->work)){+.+.}-{0:0}, at: process_one_work+0x1b8/0x5f8
    #2: ffff0008036a1620 (&idev->mc_lock){+.+.}-{4:4}, at: mld_ifc_work+0x34/0x4f4
    #3: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: mld_sendpack+0x0/0x68c
    #4: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: ip6_output+0x44/0x4a4
    #5: ffff800081b7d158 (local_bh){.+.+}-{1:3}, at: __local_bh_disable_ip+0x2c/0x2ac
    #6: ffff00087f6e4588 ((softirq_ctrl.lock)){+.+.}-{3:3}, at: __local_bh_disable_ip+0x1d0/0x2ac
    #7: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
    #8: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: __local_bh_disable_ip+0x1b8/0x2ac
    #9: ffff800081bc0260 (rcu_read_lock_bh){....}-{1:3}, at: __dev_queue_xmit+0x64/0x1120
    #10: ffff000803711328 (dev->qdisc_tx_busylock ?: &qdisc_tx_busylock){+...}-{3:3}, at: __dev_queue_xmit+0x694/0x1120
    #11: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_trylock+0x40/0x170
    #12: ffff0008001e3558 (_xmit_ETHER#2){+...}-{3:3}, at: sch_direct_xmit+0x11c/0x37c
    #13: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
    #14: ffff000803698e58 (&queue->tx_ptr_lock){+...}-{3:3}, at: macb_start_xmit+0x148/0xb7c
    #15: ffff800081bc0238 (rcu_read_lock){....}-{1:3}, at: rt_spin_lock+0xd4/0x1f0
   
   stack backtrace:
   CPU: 0 UID: 0 PID: 8 Comm: kworker/0:0 Not tainted 6.18.0-08691-g2061f18ad76e #39 PREEMPT_RT 
   Hardware name: ZynqMP ZCU102 Rev1.0 (DT)
   Workqueue: mld mld_ifc_work
   Call trace:
    show_stack+0x18/0x24 (C)
    dump_stack_lvl+0xa0/0xf0
    dump_stack+0x18/0x24
    print_circular_bug+0x28c/0x370
    check_noncircular+0x198/0x1ac
    __lock_acquire+0x15a8/0x2084
    lock_acquire+0x1cc/0x350
    rt_spin_lock+0x50/0x1f0
    macb_start_xmit+0x808/0xb7c
    dev_hard_start_xmit+0x94/0x284
    sch_direct_xmit+0x8c/0x37c
    __dev_queue_xmit+0x708/0x1120
    neigh_resolve_output+0x148/0x28c
    ip6_finish_output2+0x2c0/0xb2c
    __ip6_finish_output+0x114/0x308
    ip6_output+0xc4/0x4a4
    mld_sendpack+0x220/0x68c
    mld_ifc_work+0x2a8/0x4f4
    process_one_work+0x20c/0x5f8
    worker_thread+0x1b0/0x35c
    kthread+0x144/0x200
    ret_from_fork+0x10/0x20

> 
> >   The dependency chain caused by macb_start_xmit():
> >   _xmit_ETHER#2 --> &bp->lock
> >
> > Notably, invoking the mog_init_rings() callback upon link establishment
> > is unnecessary. Instead, we can exclusively call mog_init_rings() within
> > the ndo_open() callback. This adjustment resolves the deadlock issue.
> > Given that mog_init_rings() is only applicable to
> > non-MACB_CAPS_MACB_IS_EMAC cases, we can simply move it to macb_open()
> > and simultaneously eliminate the MACB_CAPS_MACB_IS_EMAC check.
> >
> > Suggested-by: Kevin Hao <kexin.hao@windriver.com>
> > Signed-off-by: Xiaolei Wang <xiaolei.wang@windriver.com>
> 
> Please include a suitable fixes tag.

This issue has been present since commit 633e98a711ac ("net: macb: use resolved
link config in mac_link_up()") introduced the acquisition of bp->lock in
macb_mac_link_up(). Therefore, I will add:
  Fixes: 633e98a711ac ("net: macb: use resolved link config in mac_link_up()")

Thanks,
Kevin


      reply	other threads:[~2025-12-05  7:36 UTC|newest]

Thread overview: 3+ messages
2025-11-28 10:36 [net PATCH] net: macb: Relocate mog_init_rings() callback from macb_mac_link_up() to macb_open() Xiaolei Wang
2025-12-04  9:25 ` Paolo Abeni
2025-12-05  7:36   ` Kevin Hao [this message]
