netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
@ 2025-09-03 15:23 Vladimir Oltean
  2025-09-03 15:23 ` [PATCH v2 net 2/2] net: phy: transfer phy_config_inband() locking responsibility to phylink Vladimir Oltean
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Vladimir Oltean @ 2025-09-03 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

Currently phylink_resolve() protects itself against concurrent
phylink_bringup_phy() or phylink_disconnect_phy() calls which modify
pl->phydev by relying on pl->state_mutex.

The problem is that in phylink_resolve(), pl->state_mutex is in a lock
inversion state with pl->phydev->lock. So pl->phydev->lock needs to be
acquired prior to pl->state_mutex. But that requires dereferencing
pl->phydev in the first place, and without pl->state_mutex, that is
racy.

Hence the reason for the extra lock. Currently it is redundant, but it
will serve a functional purpose once mutex_lock(&phy->lock) will be
moved outside of the mutex_lock(&pl->state_mutex) section.

A less desirable alternative would have been to let phylink_resolve()
acquire the rtnl_mutex, which is also held when phylink_bringup_phy()
and phylink_disconnect_phy() are called. But this unnecessarily blocks
many other call paths as well in the entire kernel, so the smaller lock
was preferred.

Link: https://lore.kernel.org/netdev/aLb6puGVzR29GpPx@shell.armlinux.org.uk/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v1->v2: patch is new

 drivers/net/phy/phylink.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index c7f867b361dd..5bb0e1860a75 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -67,6 +67,8 @@ struct phylink {
 	struct timer_list link_poll;
 
 	struct mutex state_mutex;
+	/* Serialize updates to pl->phydev with phylink_resolve() */
+	struct mutex phy_lock;
 	struct phylink_link_state phy_state;
 	unsigned int phy_ib_mode;
 	struct work_struct resolve;
@@ -1582,8 +1584,11 @@ static void phylink_resolve(struct work_struct *w)
 	struct phylink_link_state link_state;
 	bool mac_config = false;
 	bool retrigger = false;
+	struct phy_device *phy;
 	bool cur_link_state;
 
+	mutex_lock(&pl->phy_lock);
+	phy = pl->phydev;
 	mutex_lock(&pl->state_mutex);
 	cur_link_state = phylink_link_is_up(pl);
 
@@ -1685,6 +1690,7 @@ static void phylink_resolve(struct work_struct *w)
 		queue_work(system_power_efficient_wq, &pl->resolve);
 	}
 	mutex_unlock(&pl->state_mutex);
+	mutex_unlock(&pl->phy_lock);
 }
 
 static void phylink_run_resolve(struct phylink *pl)
@@ -1820,6 +1826,7 @@ struct phylink *phylink_create(struct phylink_config *config,
 	if (!pl)
 		return ERR_PTR(-ENOMEM);
 
+	mutex_init(&pl->phy_lock);
 	mutex_init(&pl->state_mutex);
 	INIT_WORK(&pl->resolve, phylink_resolve);
 
@@ -2080,6 +2087,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 		     dev_name(&phy->mdio.dev), phy->drv->name, irq_str);
 	kfree(irq_str);
 
+	mutex_lock(&pl->phy_lock);
 	mutex_lock(&phy->lock);
 	mutex_lock(&pl->state_mutex);
 	pl->phydev = phy;
@@ -2125,6 +2133,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
 
 	mutex_unlock(&pl->state_mutex);
 	mutex_unlock(&phy->lock);
+	mutex_unlock(&pl->phy_lock);
 
 	phylink_dbg(pl,
 		    "phy: %s setting supported %*pb advertising %*pb\n",
@@ -2305,6 +2314,7 @@ void phylink_disconnect_phy(struct phylink *pl)
 
 	phy = pl->phydev;
 	if (phy) {
+		mutex_lock(&pl->phy_lock);
 		mutex_lock(&phy->lock);
 		mutex_lock(&pl->state_mutex);
 		pl->phydev = NULL;
@@ -2312,6 +2322,7 @@ void phylink_disconnect_phy(struct phylink *pl)
 		pl->mac_tx_clk_stop = false;
 		mutex_unlock(&pl->state_mutex);
 		mutex_unlock(&phy->lock);
+		mutex_unlock(&pl->phy_lock);
 		flush_work(&pl->resolve);
 
 		phy_disconnect(phy);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 net 2/2] net: phy: transfer phy_config_inband() locking responsibility to phylink
  2025-09-03 15:23 [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Vladimir Oltean
@ 2025-09-03 15:23 ` Vladimir Oltean
  2025-09-03 15:26 ` [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Russell King (Oracle)
  2025-09-03 18:48 ` Simon Horman
  2 siblings, 0 replies; 9+ messages in thread
From: Vladimir Oltean @ 2025-09-03 15:23 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

Problem description
===================

Lockdep reports a possible circular locking dependency (AB/BA) between
&pl->state_mutex and &phy->lock, as follows.

phylink_resolve() // acquires &pl->state_mutex
-> phylink_major_config()
   -> phy_config_inband() // acquires &pl->phydev->lock

whereas all the other call sites where &pl->state_mutex and
&pl->phydev->lock have the locking scheme reversed. Everywhere else,
&pl->phydev->lock is acquired at the top level, and &pl->state_mutex at
the lower level. A clear example is phylink_bringup_phy().

The outlier is the newly introduced phy_config_inband() and the existing
lock order is the correct one. To understand why it cannot be the other
way around, it is sufficient to consider phylink_phy_change(), phylink's
callback from the PHY device's phy->phy_link_change() virtual method,
invoked by the PHY state machine.

phy_link_up() and phy_link_down(), the (indirect) callers of
phylink_phy_change(), are called with &phydev->lock acquired.
Then phylink_phy_change() acquires its own &pl->state_mutex, to
serialize changes made to its pl->phy_state and pl->link_config.
So all other instances of &pl->state_mutex and &phydev->lock must be
consistent with this order.

Problem impact
==============

I think the kernel runs a serious deadlock risk if an existing
phylink_resolve() thread, which results in a phy_config_inband() call,
is concurrent with a phy_link_up() or phy_link_down() call, which will
deadlock on &pl->state_mutex in phylink_phy_change(). Practically
speaking, the impact may be limited by the slow speed of the medium
auto-negotiation protocol, which makes it unlikely for the current state
to still be unresolved when a new one is detected, but I think the
problem is there. Nonetheless, the problem was discovered using lockdep.

Proposed solution
=================

Practically speaking, the phy_config_inband() requirement of having
phydev->lock acquired must transfer to the caller (phylink is the only
caller). There, it must bubble up until immediately before
&pl->state_mutex is acquired, for the cases where that takes place.

Solution details, considerations, notes
=======================================

This is the phy_config_inband() call graph:

                          sfp_upstream_ops :: connect_phy()
                          |
                          v
                          phylink_sfp_connect_phy()
                          |
                          v
                          phylink_sfp_config_phy()
                          |
                          |   sfp_upstream_ops :: module_insert()
                          |   |
                          |   v
                          |   phylink_sfp_module_insert()
                          |   |
                          |   |   sfp_upstream_ops :: module_start()
                          |   |   |
                          |   |   v
                          |   |   phylink_sfp_module_start()
                          |   |   |
                          |   v   v
                          |   phylink_sfp_config_optical()
 phylink_start()          |   |
   |   phylink_resume()   v   v
   |   |  phylink_sfp_set_config()
   |   |  |
   v   v  v
 phylink_mac_initial_config()
   |   phylink_resolve()
   |   |  phylink_ethtool_ksettings_set()
   v   v  v
   phylink_major_config()
            |
            v
    phy_config_inband()

phylink_major_config() caller #1, phylink_mac_initial_config(), does not
acquire &pl->state_mutex nor do its callers. It must acquire
&pl->phydev->lock prior to calling phylink_major_config().

phylink_major_config() caller #2, phylink_resolve() acquires
&pl->state_mutex, thus also needs to acquire &pl->phydev->lock.

phylink_major_config() caller #3, phylink_ethtool_ksettings_set(), is
completely uninteresting, because it only calls phylink_major_config()
if pl->phydev is NULL (otherwise it calls phy_ethtool_ksettings_set()).
We need to change nothing there.

Other solutions
===============

The lock inversion between &pl->state_mutex and &pl->phydev->lock has
occurred at least once before, as seen in commit c718af2d00a3 ("net:
phylink: fix ethtool -A with attached PHYs"). The solution there was to
simply not call phy_set_asym_pause() under the &pl->state_mutex. That
cannot be extended to our case though, where the phy_config_inband()
call is much deeper inside the &pl->state_mutex section.

Fixes: 5fd0f1a02e75 ("net: phylink: add negotiation of in-band capabilities")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
v1->v2:
- rebase over new patch which introduces pl->phy_lock
- add "Other solutions" section

v1 at:
https://lore.kernel.org/netdev/20250902134141.2430896-1-vladimir.oltean@nxp.com/

 drivers/net/phy/phy.c     | 12 ++++--------
 drivers/net/phy/phylink.c | 13 +++++++++++--
 2 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 13df28445f02..c02da57a4da5 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1065,23 +1065,19 @@ EXPORT_SYMBOL_GPL(phy_inband_caps);
  */
 int phy_config_inband(struct phy_device *phydev, unsigned int modes)
 {
-	int err;
+	lockdep_assert_held(&phydev->lock);
 
 	if (!!(modes & LINK_INBAND_DISABLE) +
 	    !!(modes & LINK_INBAND_ENABLE) +
 	    !!(modes & LINK_INBAND_BYPASS) != 1)
 		return -EINVAL;
 
-	mutex_lock(&phydev->lock);
 	if (!phydev->drv)
-		err = -EIO;
+		return -EIO;
 	else if (!phydev->drv->config_inband)
-		err = -EOPNOTSUPP;
-	else
-		err = phydev->drv->config_inband(phydev, modes);
-	mutex_unlock(&phydev->lock);
+		return -EOPNOTSUPP;
 
-	return err;
+	return phydev->drv->config_inband(phydev, modes);
 }
 EXPORT_SYMBOL(phy_config_inband);
 
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 5bb0e1860a75..a3559f6fcd02 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1425,6 +1425,7 @@ static void phylink_get_fixed_state(struct phylink *pl,
 static void phylink_mac_initial_config(struct phylink *pl, bool force_restart)
 {
 	struct phylink_link_state link_state;
+	struct phy_device *phy = pl->phydev;
 
 	switch (pl->req_link_an_mode) {
 	case MLO_AN_PHY:
@@ -1448,7 +1449,11 @@ static void phylink_mac_initial_config(struct phylink *pl, bool force_restart)
 	link_state.link = false;
 
 	phylink_apply_manual_flow(pl, &link_state);
+	if (phy)
+		mutex_lock(&phy->lock);
 	phylink_major_config(pl, force_restart, &link_state);
+	if (phy)
+		mutex_unlock(&phy->lock);
 }
 
 static const char *phylink_pause_to_str(int pause)
@@ -1589,6 +1594,8 @@ static void phylink_resolve(struct work_struct *w)
 
 	mutex_lock(&pl->phy_lock);
 	phy = pl->phydev;
+	if (phy)
+		mutex_lock(&phy->lock);
 	mutex_lock(&pl->state_mutex);
 	cur_link_state = phylink_link_is_up(pl);
 
@@ -1622,11 +1629,11 @@ static void phylink_resolve(struct work_struct *w)
 		/* If we have a phy, the "up" state is the union of both the
 		 * PHY and the MAC
 		 */
-		if (pl->phydev)
+		if (phy)
 			link_state.link &= pl->phy_state.link;
 
 		/* Only update if the PHY link is up */
-		if (pl->phydev && pl->phy_state.link) {
+		if (phy && pl->phy_state.link) {
 			/* If the interface has changed, force a link down
 			 * event if the link isn't already down, and re-resolve.
 			 */
@@ -1690,6 +1697,8 @@ static void phylink_resolve(struct work_struct *w)
 		queue_work(system_power_efficient_wq, &pl->resolve);
 	}
 	mutex_unlock(&pl->state_mutex);
+	if (phy)
+		mutex_unlock(&phy->lock);
 	mutex_unlock(&pl->phy_lock);
 }
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 15:23 [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Vladimir Oltean
  2025-09-03 15:23 ` [PATCH v2 net 2/2] net: phy: transfer phy_config_inband() locking responsibility to phylink Vladimir Oltean
@ 2025-09-03 15:26 ` Russell King (Oracle)
  2025-09-03 15:31   ` Vladimir Oltean
  2025-09-03 18:48 ` Simon Horman
  2 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2025-09-03 15:26 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:
> @@ -2305,6 +2314,7 @@ void phylink_disconnect_phy(struct phylink *pl)
>  
>  	phy = pl->phydev;
>  	if (phy) {
> +		mutex_lock(&pl->phy_lock);

If we can, I think it would be better to place this a couple of lines
above and move the unlock.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 15:26 ` [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Russell King (Oracle)
@ 2025-09-03 15:31   ` Vladimir Oltean
  2025-09-03 15:52     ` Russell King (Oracle)
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Oltean @ 2025-09-03 15:31 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

On Wed, Sep 03, 2025 at 04:26:35PM +0100, Russell King (Oracle) wrote:
> On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:
> > @@ -2305,6 +2314,7 @@ void phylink_disconnect_phy(struct phylink *pl)
> >  
> >  	phy = pl->phydev;
> >  	if (phy) {
> > +		mutex_lock(&pl->phy_lock);
> 
> If we can, I think it would be better to place this a couple of lines
> above and move the unlock.

Sorry for potentially misunderstanding, do you mean like this?

	mutex_lock(&pl->phy_lock);
	phy = pl->phydev;
	if (phy) {
		mutex_lock(&phy->lock);
		mutex_lock(&pl->state_mutex);
		pl->phydev = NULL;
		pl->phy_enable_tx_lpi = false;
		pl->mac_tx_clk_stop = false;
		mutex_unlock(&pl->state_mutex);
		mutex_unlock(&phy->lock);
		mutex_unlock(&pl->phy_lock);
		flush_work(&pl->resolve);

		phy_disconnect(phy);
	} else {
		mutex_unlock(&pl->phy_lock);
	}

move the unlock where? because flush_work(&pl->resolve) needs to happen
unlocked, otherwise we'll deadlock with phylink_resolve().

Additionally, dereferincing pl->phydev under rtnl_lock() is already safe,
and doesn't need the secondary clock.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 15:31   ` Vladimir Oltean
@ 2025-09-03 15:52     ` Russell King (Oracle)
  2025-09-03 17:04       ` Vladimir Oltean
  0 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2025-09-03 15:52 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

On Wed, Sep 03, 2025 at 06:31:20PM +0300, Vladimir Oltean wrote:
> On Wed, Sep 03, 2025 at 04:26:35PM +0100, Russell King (Oracle) wrote:
> > On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:
> > > @@ -2305,6 +2314,7 @@ void phylink_disconnect_phy(struct phylink *pl)
> > >  
> > >  	phy = pl->phydev;
> > >  	if (phy) {
> > > +		mutex_lock(&pl->phy_lock);
> > 
> > If we can, I think it would be better to place this a couple of lines
> > above and move the unlock.
> 
> Sorry for potentially misunderstanding, do you mean like this?
> 
> 	mutex_lock(&pl->phy_lock);
> 	phy = pl->phydev;
> 	if (phy) {
> 		mutex_lock(&phy->lock);
> 		mutex_lock(&pl->state_mutex);
> 		pl->phydev = NULL;
> 		pl->phy_enable_tx_lpi = false;
> 		pl->mac_tx_clk_stop = false;
> 		mutex_unlock(&pl->state_mutex);
> 		mutex_unlock(&phy->lock);
> 		mutex_unlock(&pl->phy_lock);
> 		flush_work(&pl->resolve);
> 
> 		phy_disconnect(phy);
> 	} else {
> 		mutex_unlock(&pl->phy_lock);
> 	}
> 
> move the unlock where? because flush_work(&pl->resolve) needs to happen
> unlocked, otherwise we'll deadlock with phylink_resolve().
> 
> Additionally, dereferincing pl->phydev under rtnl_lock() is already safe,
> and doesn't need the secondary clock.

The reason I'm making the suggestion is for consistency. If the lock
is there to ensure that reading pl->phydev is done safely, having one
site where we read it and then take the lock makes it look confusing.
I've also been thinking that it should be called pl->phydev_mutex
(note that phylink uses _mutex for mutexes.)

To avoid it looking weird, what about this:

	mutex_lock(&pl->phy_lock);
	phy = pl->phydev;
	if (phy) {
		mutex_lock(&phy->lock);
		mutex_lock(&pl->state_mutex);
		pl->phydev = NULL;
		pl->phy_enable_tx_lpi = false;
		pl->mac_tx_clk_stop = false;
		mutex_unlock(&pl->state_mutex);
		mutex_unlock(&phy->lock);
	}
	mutex_unlock(&pl->phy_lock);

	if (phy) 
 		flush_work(&pl->resolve);
 
 		phy_disconnect(phy);
 	}


-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 15:52     ` Russell King (Oracle)
@ 2025-09-03 17:04       ` Vladimir Oltean
  0 siblings, 0 replies; 9+ messages in thread
From: Vladimir Oltean @ 2025-09-03 17:04 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

On Wed, Sep 03, 2025 at 04:52:09PM +0100, Russell King (Oracle) wrote:
> The reason I'm making the suggestion is for consistency. If the lock
> is there to ensure that reading pl->phydev is done safely, having one
> site where we read it and then take the lock makes it look confusing.
> I've also been thinking that it should be called pl->phydev_mutex
> (note that phylink uses _mutex for mutexes.)
> 
> To avoid it looking weird, what about this:
> 
> 	mutex_lock(&pl->phy_lock);
> 	phy = pl->phydev;
> 	if (phy) {
> 		mutex_lock(&phy->lock);
> 		mutex_lock(&pl->state_mutex);
> 		pl->phydev = NULL;
> 		pl->phy_enable_tx_lpi = false;
> 		pl->mac_tx_clk_stop = false;
> 		mutex_unlock(&pl->state_mutex);
> 		mutex_unlock(&phy->lock);
> 	}
> 	mutex_unlock(&pl->phy_lock);
> 
> 	if (phy) 
>  		flush_work(&pl->resolve);
>  
>  		phy_disconnect(phy);
>  	}

I can make these changes and repost tomorrow, after some extra testing.
Thanks for the feedback.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 15:23 [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Vladimir Oltean
  2025-09-03 15:23 ` [PATCH v2 net 2/2] net: phy: transfer phy_config_inband() locking responsibility to phylink Vladimir Oltean
  2025-09-03 15:26 ` [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Russell King (Oracle)
@ 2025-09-03 18:48 ` Simon Horman
  2025-09-03 19:01   ` Vladimir Oltean
  2 siblings, 1 reply; 9+ messages in thread
From: Simon Horman @ 2025-09-03 18:48 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Heiner Kallweit, Russell King,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:

...

> @@ -1582,8 +1584,11 @@ static void phylink_resolve(struct work_struct *w)
>  	struct phylink_link_state link_state;
>  	bool mac_config = false;
>  	bool retrigger = false;
> +	struct phy_device *phy;
>  	bool cur_link_state;
>  
> +	mutex_lock(&pl->phy_lock);
> +	phy = pl->phydev;

Hi Vladimir,

I guess this is an artifact of the development of this patchset.
Whatever the case, phy is set but otherwise unused in this function.

This makes CI lightup like a Christmas tree.
And it's a bit too early in the year for that.

>  	mutex_lock(&pl->state_mutex);
>  	cur_link_state = phylink_link_is_up(pl);
>  

...

-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 18:48 ` Simon Horman
@ 2025-09-03 19:01   ` Vladimir Oltean
  2025-09-04  8:42     ` Simon Horman
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Oltean @ 2025-09-03 19:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, Andrew Lunn, Heiner Kallweit, Russell King,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Wed, Sep 03, 2025 at 07:48:58PM +0100, Simon Horman wrote:
> On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:
> > @@ -1582,8 +1584,11 @@ static void phylink_resolve(struct work_struct *w)
> >  	struct phylink_link_state link_state;
> >  	bool mac_config = false;
> >  	bool retrigger = false;
> > +	struct phy_device *phy;
> >  	bool cur_link_state;
> >  
> > +	mutex_lock(&pl->phy_lock);
> > +	phy = pl->phydev;
> 
> Hi Vladimir,
> 
> I guess this is an artifact of the development of this patchset.
> Whatever the case, phy is set but otherwise unused in this function.
> 
> This makes CI lightup like a Christmas tree.
> And it's a bit too early in the year for that.

Thanks for letting me know. It's an artifact of moving patch 1 in front
of 2, and I'll address this for the next revision.

I downgraded to a slower computer for kernel compilation, and even
though I did compile patch by patch this submission, I had to stop
building with W=1 C=1 for some unrelated bisect and I forgot to turn
them back on.

I don't have a great solution to this, except I'll try next time to set
up a separate 'git worktree' for noisy stuff like bisection, and try to
keep the net-next environment separate and always with build warnings
and debug options enabled.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver
  2025-09-03 19:01   ` Vladimir Oltean
@ 2025-09-04  8:42     ` Simon Horman
  0 siblings, 0 replies; 9+ messages in thread
From: Simon Horman @ 2025-09-04  8:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Heiner Kallweit, Russell King,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Wed, Sep 03, 2025 at 10:01:45PM +0300, Vladimir Oltean wrote:
> On Wed, Sep 03, 2025 at 07:48:58PM +0100, Simon Horman wrote:
> > On Wed, Sep 03, 2025 at 06:23:47PM +0300, Vladimir Oltean wrote:
> > > @@ -1582,8 +1584,11 @@ static void phylink_resolve(struct work_struct *w)
> > >  	struct phylink_link_state link_state;
> > >  	bool mac_config = false;
> > >  	bool retrigger = false;
> > > +	struct phy_device *phy;
> > >  	bool cur_link_state;
> > >  
> > > +	mutex_lock(&pl->phy_lock);
> > > +	phy = pl->phydev;
> > 
> > Hi Vladimir,
> > 
> > I guess this is an artifact of the development of this patchset.
> > Whatever the case, phy is set but otherwise unused in this function.
> > 
> > This makes CI lightup like a Christmas tree.
> > And it's a bit too early in the year for that.
> 
> Thanks for letting me know. It's an artifact of moving patch 1 in front
> of 2, and I'll address this for the next revision.
> 
> I downgraded to a slower computer for kernel compilation, and even
> though I did compile patch by patch this submission, I had to stop
> building with W=1 C=1 for some unrelated bisect and I forgot to turn
> them back on.
> 
> I don't have a great solution to this, except I'll try next time to set
> up a separate 'git worktree' for noisy stuff like bisection, and try to
> keep the net-next environment separate and always with build warnings
> and debug options enabled.

Understood. It's a tricky problem.

FWIIW, while it's not perfect - e.g. it doesn't exercise linking - building
individual objects does catch problems like this one with low CPU time
requirements.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-09-04  8:42 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-03 15:23 [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Vladimir Oltean
2025-09-03 15:23 ` [PATCH v2 net 2/2] net: phy: transfer phy_config_inband() locking responsibility to phylink Vladimir Oltean
2025-09-03 15:26 ` [PATCH v2 net 1/2] net: phylink: add lock for serializing concurrent pl->phydev writes with resolver Russell King (Oracle)
2025-09-03 15:31   ` Vladimir Oltean
2025-09-03 15:52     ` Russell King (Oracle)
2025-09-03 17:04       ` Vladimir Oltean
2025-09-03 18:48 ` Simon Horman
2025-09-03 19:01   ` Vladimir Oltean
2025-09-04  8:42     ` Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).