From: Vladimir Oltean <vladimir.oltean@nxp.com>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: netdev@vger.kernel.org, Andrew Lunn <andrew@lunn.ch>,
Heiner Kallweit <hkallweit1@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH net] net: phy: transfer phy_config_inband() locking responsibility to phylink
Date: Tue, 2 Sep 2025 17:42:41 +0300
Message-ID: <20250902144241.avfiqpmqy7xhlwqa@skbuf>
In-Reply-To: <aLb6puGVzR29GpPx@shell.armlinux.org.uk>
On Tue, Sep 02, 2025 at 03:09:42PM +0100, Russell King (Oracle) wrote:
> On Tue, Sep 02, 2025 at 04:41:41PM +0300, Vladimir Oltean wrote:
> > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> > index c7f867b361dd..350905928d46 100644
> > --- a/drivers/net/phy/phylink.c
> > +++ b/drivers/net/phy/phylink.c
> > @@ -1580,10 +1585,13 @@ static void phylink_resolve(struct work_struct *w)
> >  {
> >  	struct phylink *pl = container_of(w, struct phylink, resolve);
> >  	struct phylink_link_state link_state;
> > +	struct phy_device *phy = pl->phydev;
> >  	bool mac_config = false;
> >  	bool retrigger = false;
> >  	bool cur_link_state;
> >
> > +	if (phy)
> > +		mutex_lock(&phy->lock);
>
> I don't think this is safe.
>
> The addition and removal of PHYs is protected by two locks:
>
> 1. RTNL, to prevent ethtool operations running concurrently with the
> addition or removal of PHYs.
>
> 2. The state_mutex which protects the resolver which doesn't take the
> RTNL.
>
> Given that the RTNL is not held in this path, dereferencing pl->phydev
> is unsafe as the PHY may go away (through e.g. SFP module removal)
> which means this mutex_lock() may end up operating on free'd memory.
>
> I'm not sure we want to be taking the RTNL on this path.
>
> At the moment, I'm not sure what the solution is here.

Rephrased and slightly expanded: phylink_disconnect_phy(), when called
from drivers, has the convention that phylink_stop() must have been
called prior, or phylink_start() must have never been called.

However, when called from phylink_sfp_disconnect_phy(),
phylink_disconnect_phy() does not benefit from the same guarantee that
phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED) ran.
Correct so far?

Can we disable the resolver from phylink_sfp_disconnect_phy(), to offer
a similar guarantee that phylink_disconnect_phy() never runs with a
concurrent resolver?

I don't have a local setup at the moment to test what happens when I
unplug an SFP module with the change I am proposing. I can test in a few
hours at the earliest. However, there's a chance testing won't reveal
why we don't stop the resolver during SFP module disconnection, hence
the reason for this possibly stupid question.

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 350905928d46..a8facc177f1f 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2313,17 +2313,13 @@ void phylink_disconnect_phy(struct phylink *pl)
 	ASSERT_RTNL();
+	WARN_ON(!test_bit(PHYLINK_DISABLE_STOPPED, &pl->phylink_disable_state));
+
 	phy = pl->phydev;
 	if (phy) {
-		mutex_lock(&phy->lock);
-		mutex_lock(&pl->state_mutex);
 		pl->phydev = NULL;
 		pl->phy_enable_tx_lpi = false;
 		pl->mac_tx_clk_stop = false;
-		mutex_unlock(&pl->state_mutex);
-		mutex_unlock(&phy->lock);
-		flush_work(&pl->resolve);
-
 		phy_disconnect(phy);
 	}
 }
@@ -3809,7 +3805,10 @@ static int phylink_sfp_connect_phy(void *upstream, struct phy_device *phy)
 static void phylink_sfp_disconnect_phy(void *upstream,
 				       struct phy_device *phydev)
 {
-	phylink_disconnect_phy(upstream);
+	struct phylink *pl = upstream;
+
+	phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED);
+	phylink_disconnect_phy(pl);
 }
 static const struct sfp_upstream_ops sfp_phylink_ops = {
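
For readers following the ordering argument, here is a minimal userspace
sketch of the guarantee being asked about: once the resolver has been
disabled, the teardown path can clear and free the PHY pointer without
taking any lock itself. This is an analogy only, not phylink code. Every
toy_* name is hypothetical, and a mutex-protected flag stands in for the
PHYLINK_DISABLE_STOPPED bit plus the work flush, which is an assumption
made to keep the model small.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct toy_phy {
	int link;
};

struct toy_link {
	pthread_mutex_t state_mutex;	/* loosely models pl->state_mutex */
	bool disabled;			/* loosely models PHYLINK_DISABLE_STOPPED */
	struct toy_phy *phy;		/* loosely models pl->phydev */
};

/* One resolver pass: touches the PHY only under the mutex and only while enabled. */
static void toy_resolve(struct toy_link *pl)
{
	pthread_mutex_lock(&pl->state_mutex);
	if (!pl->disabled && pl->phy)
		pl->phy->link ^= 1;
	pthread_mutex_unlock(&pl->state_mutex);
}

static void *toy_resolver_thread(void *arg)
{
	for (int i = 0; i < 100000; i++)
		toy_resolve(arg);
	return NULL;
}

/*
 * Taking the mutex waits for any pass already in flight. Every later pass
 * sees the flag and never looks at pl->phy again.
 */
static void toy_disable(struct toy_link *pl)
{
	pthread_mutex_lock(&pl->state_mutex);
	pl->disabled = true;
	pthread_mutex_unlock(&pl->state_mutex);
}

/* Teardown without locks: only valid because the caller disabled the resolver. */
static void toy_disconnect_phy(struct toy_link *pl)
{
	struct toy_phy *phy = pl->phy;

	assert(pl->disabled);	/* plays the role of the WARN_ON() in the diff */
	pl->phy = NULL;
	free(phy);
}

/* Runs the disable-then-disconnect sequence against a live resolver thread. */
int toy_run(void)
{
	struct toy_link pl = { .disabled = false };
	pthread_t resolver;

	pl.phy = calloc(1, sizeof(*pl.phy));
	if (!pl.phy)
		return 1;
	pthread_mutex_init(&pl.state_mutex, NULL);
	if (pthread_create(&resolver, NULL, toy_resolver_thread, &pl)) {
		free(pl.phy);
		pthread_mutex_destroy(&pl.state_mutex);
		return 1;
	}

	toy_disable(&pl);		/* analogous to the proposed disable step */
	toy_disconnect_phy(&pl);	/* safe: no pass can reach the PHY now */

	pthread_join(resolver, NULL);
	pthread_mutex_destroy(&pl.state_mutex);
	return pl.phy == NULL ? 0 : 1;
}
```

Calling toy_run() from a small main() and building with -pthread is enough
to try it. Dropping the toy_disable() call trips the assert, and with the
assert removed as well, the free() would race with the resolver thread,
which is the use-after-free concern raised in the quoted review.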