linux-mediatek.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: Christian Marangi <ansuelsmth@gmail.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Rob Herring <robh@kernel.org>,
	Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Conor Dooley <conor+dt@kernel.org>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	Russell King <linux@armlinux.org.uk>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Christian Marangi <ansuelsmth@gmail.com>,
	Daniel Golle <daniel@makrotopia.org>,
	"Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>,
	netdev@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org,
	"Lei Wei (QUIC)" <quic_leiwei@quicinc.com>
Cc: stable@vger.kernel.org
Subject: [RFC PATCH net-next v2 01/11] net: phylink: fix possible circular locking dep with config in-band
Date: Mon,  7 Apr 2025 00:13:54 +0200	[thread overview]
Message-ID: <20250406221423.9723-2-ansuelsmth@gmail.com> (raw)
In-Reply-To: <20250406221423.9723-1-ansuelsmth@gmail.com>

Debug lock detection revealed a possible circular locking dependency between
phylink_major_config() and phylink_bringup_phy().

This was introduced by the addition of phy_config_inband(), which acquires
the phydev lock. This made the locking order in phylink_bringup_phy()
inconsistent, as it acquires the phydev lock before phylink's state_mutex.

A deadlock can occur when phylink_major_config() is called from
phylink_resolve(), where the state_mutex is taken first and then the phydev
lock. This is the reverse of the order used in phylink_bringup_phy().

To avoid this circular dependency, change the locking order in
phylink_bringup_phy() to match the pattern used in phylink_resolve(): take
state_mutex first, then the phydev lock.

A sample lockdep warning is included below for reference:

[  147.749178]
[  147.750682] ======================================================
[  147.756850] WARNING: possible circular locking dependency detected
[  147.763019] 6.14.0-next-20250404+ #0 Tainted: G           O
[  147.769189] ------------------------------------------------------
[  147.775356] kworker/u16:0/12 is trying to acquire lock:
[  147.780571] ffffff80ce9bcf08 (&dev->lock#2){+.+.}-{4:4}, at: phy_config_inband+0x44/0x90
[  147.788672]
[  147.788672] but task is already holding lock:
[  147.794492] ffffff80c0dfbda0 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x2c/0x6a8
[  147.802840]
[  147.802840] which lock already depends on the new lock.
[  147.802840]
[  147.811002]
[  147.811002] the existing dependency chain (in reverse order) is:
[  147.818472]
[  147.818472] -> #1 (&pl->state_mutex){+.+.}-{4:4}:
[  147.824647]        __mutex_lock+0x90/0x924
[  147.828738]        mutex_lock_nested+0x20/0x28
[  147.833173]        phylink_bringup_phy+0x210/0x700
[  147.837954]        phylink_fwnode_phy_connect+0xe0/0x124
[  147.843256]        phylink_of_phy_connect+0x18/0x20
[  147.848124]        dsa_user_create+0x210/0x414
[  147.852561]        dsa_port_setup+0xd4/0x14c
[  147.856823]        dsa_register_switch+0xbb0/0xe40
[  147.861605]        mt7988_probe+0xf8/0x140
[  147.865694]        platform_probe+0x64/0xbc
[  147.869869]        really_probe+0xbc/0x388
[  147.873955]        __driver_probe_device+0x78/0x154
[  147.878823]        driver_probe_device+0x3c/0xd4
[  147.883430]        __device_attach_driver+0xb0/0x144
[  147.888383]        bus_for_each_drv+0x6c/0xb0
[  147.892732]        __device_attach+0x9c/0x19c
[  147.897078]        device_initial_probe+0x10/0x18
[  147.901771]        bus_probe_device+0xa8/0xac
[  147.906118]        deferred_probe_work_func+0xb8/0x118
[  147.911245]        process_one_work+0x224/0x610
[  147.915769]        worker_thread+0x1b8/0x35c
[  147.920029]        kthread+0x11c/0x1e8
[  147.923769]        ret_from_fork+0x10/0x20
[  147.927857]
[  147.927857] -> #0 (&dev->lock#2){+.+.}-{4:4}:
[  147.933686]        __lock_acquire+0x12b8/0x1ff0
[  147.938209]        lock_acquire+0xf4/0x2d8
[  147.942295]        __mutex_lock+0x90/0x924
[  147.946383]        mutex_lock_nested+0x20/0x28
[  147.950817]        phy_config_inband+0x44/0x90
[  147.955252]        phylink_major_config+0x684/0xa64
[  147.960120]        phylink_resolve+0x24c/0x6a8
[  147.964554]        process_one_work+0x224/0x610
[  147.969075]        worker_thread+0x1b8/0x35c
[  147.973335]        kthread+0x11c/0x1e8
[  147.977075]        ret_from_fork+0x10/0x20
[  147.981162]
[  147.981162] other info that might help us debug this:
[  147.981162]
[  147.989150]  Possible unsafe locking scenario:
[  147.989150]
[  147.995056]        CPU0                    CPU1
[  147.999575]        ----                    ----
[  148.004094]   lock(&pl->state_mutex);
[  148.007748]                                lock(&dev->lock#2);
[  148.013572]                                lock(&pl->state_mutex);
[  148.019742]   lock(&dev->lock#2);
[  148.023051]
[  148.023051]  *** DEADLOCK ***
[  148.023051]
[  148.028958] 3 locks held by kworker/u16:0/12:
[  148.033304]  #0: ffffff80c0011d48 ((wq_completion)events_power_efficient){+.+.}-{0:0}, at: process_one_work+0x1a8/0x610
[  148.044082]  #1: ffffffc081ca3dd8 ((work_completion)(&pl->resolve)){+.+.}-{0:0}, at: process_one_work+0x1d0/0x610
[  148.054338]  #2: ffffff80c0dfbda0 (&pl->state_mutex){+.+.}-{4:4}, at: phylink_resolve+0x2c/0x6a8
[  148.063119]
[  148.063119] stack backtrace:
[  148.067465] CPU: 3 UID: 0 PID: 12 Comm: kworker/u16:0 Tainted: G           O        6.14.0-next-20250404+ #0 NONE
[  148.067472] Tainted: [O]=OOT_MODULE
[  148.067474] Hardware name: Bananapi BPI-R4 2.5GE PoE (DT)
[  148.067476] Workqueue: events_power_efficient phylink_resolve
[  148.067482] Call trace:
[  148.067484]  show_stack+0x14/0x1c (C)
[  148.067492]  dump_stack_lvl+0x84/0xc0
[  148.067497]  dump_stack+0x14/0x1c
[  148.067500]  print_circular_bug+0x330/0x43c
[  148.067505]  check_noncircular+0x124/0x134
[  148.067508]  __lock_acquire+0x12b8/0x1ff0
[  148.067512]  lock_acquire+0xf4/0x2d8
[  148.067516]  __mutex_lock+0x90/0x924
[  148.067521]  mutex_lock_nested+0x20/0x28
[  148.067527]  phy_config_inband+0x44/0x90
[  148.067531]  phylink_major_config+0x684/0xa64
[  148.067535]  phylink_resolve+0x24c/0x6a8
[  148.067539]  process_one_work+0x224/0x610
[  148.067544]  worker_thread+0x1b8/0x35c
[  148.067548]  kthread+0x11c/0x1e8
[  148.067552]  ret_from_fork+0x10/0x20

Cc: stable@vger.kernel.org
Fixes: 5fd0f1a02e75 ("net: phylink: add negotiation of in-band capabilities")
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
---
 drivers/net/phy/phylink.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 69ca765485db..4a1edf19dfad 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2072,8 +2072,8 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 		     dev_name(&phy->mdio.dev), phy->drv->name, irq_str);
 	kfree(irq_str);
 
-	mutex_lock(&phy->lock);
 	mutex_lock(&pl->state_mutex);
+	mutex_lock(&phy->lock);
 	pl->phydev = phy;
 	pl->phy_state.interface = interface;
 	pl->phy_state.pause = MLO_PAUSE_NONE;
@@ -2115,8 +2115,8 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 		phy_disable_eee(phy);
 	}
 
-	mutex_unlock(&pl->state_mutex);
 	mutex_unlock(&phy->lock);
+	mutex_unlock(&pl->state_mutex);
 
 	phylink_dbg(pl,
 		    "phy: %s setting supported %*pb advertising %*pb\n",
-- 
2.48.1



  reply	other threads:[~2025-04-06 22:18 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-06 22:13 [RFC PATCH net-next v2 00/11] net: pcs: Introduce support for fwnode PCS Christian Marangi
2025-04-06 22:13 ` Christian Marangi [this message]
2025-04-06 22:13 ` [RFC PATCH net-next v2 02/11] net: phylink: keep and use MAC supported_interfaces in phylink struct Christian Marangi
2025-04-06 22:13 ` [RFC PATCH net-next v2 03/11] net: phy: introduce phy_interface_copy helper Christian Marangi
2025-04-06 22:13 ` [RFC PATCH net-next v2 04/11] net: phylink: introduce internal phylink PCS handling Christian Marangi
2025-04-06 22:13 ` [RFC PATCH net-next v2 05/11] net: phylink: add phylink_release_pcs() to externally release a PCS Christian Marangi
2025-04-06 22:13 ` [RFC PATCH net-next v2 06/11] net: pcs: implement Firmware node support for PCS driver Christian Marangi
2025-04-06 22:14 ` [RFC PATCH net-next v2 07/11] net: phylink: support late PCS provider attach Christian Marangi
2025-04-06 22:14 ` [RFC PATCH net-next v2 08/11] dt-bindings: net: ethernet-controller: permit to define multiple PCS Christian Marangi
2025-04-06 22:14 ` [RFC PATCH net-next v2 09/11] net: pcs: airoha: add PCS driver for Airoha SoC Christian Marangi
2025-04-06 22:14 ` [RFC PATCH net-next v2 10/11] dt-bindings: net: pcs: Document support for Airoha Ethernet PCS Christian Marangi
2025-04-06 22:14 ` [RFC PATCH net-next v2 11/11] net: airoha: add phylink support for GDM2/3/4 Christian Marangi
2025-04-08  7:29   ` Lorenzo Bianconi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250406221423.9723-2-ansuelsmth@gmail.com \
    --to=ansuelsmth@gmail.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=conor+dt@kernel.org \
    --cc=daniel@makrotopia.org \
    --cc=davem@davemloft.net \
    --cc=devicetree@vger.kernel.org \
    --cc=edumazet@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=krzk+dt@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=lorenzo@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=p.zabel@pengutronix.de \
    --cc=pabeni@redhat.com \
    --cc=quic_leiwei@quicinc.com \
    --cc=rmk+kernel@armlinux.org.uk \
    --cc=robh@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).