* [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
@ 2025-02-25 15:31 Vladimir Oltean
2025-03-03 17:31 ` Florian Fainelli
2025-03-03 18:06 ` Russell King (Oracle)
0 siblings, 2 replies; 4+ messages in thread
From: Vladimir Oltean @ 2025-02-25 15:31 UTC (permalink / raw)
To: netdev
Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Florian Fainelli,
Wei Fang
DSA has 2 kinds of drivers:
1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
2. Those who don't: all others. The above methods should be optional.
For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
These seem good candidates for setting mac_managed_pm = true because
that is essentially its definition, but that does not seem to be the
biggest problem for now, and is not what this change focuses on.
Talking strictly about the 2nd category of drivers here, I have noticed
that these also trigger the
WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
phydev->state != PHY_UP);
from mdio_bus_phy_resume(), because the PHY state machine is running.
It's running as a result of a previous dsa_user_open() -> ... ->
phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
supposed to have called phy_stop_machine(), but it didn't. So this is
why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
runs.
mdio_bus_phy_suspend() did not call phy_stop_machine() because for
phylink, the phydev->adjust_link function pointer is NULL. This seems a
technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
machine restart on resume"). That commit was written before phylink
existed, and was intended to avoid crashing with consumer drivers which
don't use the PHY state machine - phylink does.
Make the conditions dependent on the PHY device having a
phydev->phy_link_change() implementation equal to the default
phy_link_change() provided by phylib. Otherwise, just check that the
custom phydev->phy_link_change() has been provided and is non-NULL.
Phylink provides phylink_phy_change().
Thus, we will stop the state machine even for phylink-controlled PHYs
when using the MDIO bus PM ops.
Reported-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
I've only spent a few hours debugging this, and I'm unsure which patch
to even blame. I haven't noticed other issues apart from the WARN_ON()
originally added by commit 744d23c71af3 ("net: phy: Warn about incorrect
mdio_bus_phy_resume() state").
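To illustrate the reasoning in the commit message, here is a minimal
userspace sketch (not kernel code). It models the old and new conditions
guarding phy_stop_machine() and checks whether the resume-time WARN_ON()
would fire. The simplified struct layout and the helpers old_check(),
model_suspend(), resume_would_warn() and demo_adjust_link() are
hypothetical names made up for this illustration. The model also assumes
that phy_stop_machine() moves a started PHY back to PHY_UP.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: names mirror the kernel, types are simplified. */
enum phy_state { PHY_READY, PHY_HALTED, PHY_UP, PHY_NOLINK };

struct phy_device {
	enum phy_state state;
	void *attached_dev;                /* stands in for struct net_device * */
	void (*adjust_link)(void *netdev); /* NULL when phylink owns the PHY */
	void (*phy_link_change)(struct phy_device *phydev, bool up);
};

/* Stand-ins for the phylib default hook and the phylink hook. */
static void phy_link_change(struct phy_device *phydev, bool up)
{
	(void)phydev; (void)up;
}

static void phylink_phy_change(struct phy_device *phydev, bool up)
{
	(void)phydev; (void)up;
}

/* Hypothetical MAC driver callback, as a phylib consumer would provide. */
static void demo_adjust_link(void *netdev)
{
	(void)netdev;
}

/* Condition used before the patch (hypothetical helper name). */
static bool old_check(const struct phy_device *phydev)
{
	return phydev->attached_dev && phydev->adjust_link;
}

/* Condition proposed by the patch. */
static bool phy_has_attached_dev(const struct phy_device *phydev)
{
	if (phydev->phy_link_change == phy_link_change)
		return phydev->attached_dev && phydev->adjust_link;

	return phydev->phy_link_change != NULL;
}

typedef bool (*check_fn)(const struct phy_device *phydev);

/* Assumption: stopping the machine parks a started PHY in PHY_UP. */
static void model_suspend(struct phy_device *phydev, check_fn check)
{
	bool started = phydev->state == PHY_UP || phydev->state == PHY_NOLINK;

	if (check(phydev) && started)
		phydev->state = PHY_UP;
}

/* Mirrors the WARN_ON() condition quoted in the commit message. */
static bool resume_would_warn(const struct phy_device *phydev)
{
	return phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
	       phydev->state != PHY_UP;
}
```

In this model, a PHY in PHY_NOLINK with a NULL adjust_link and
phylink_phy_change() as its hook is skipped by old_check(), so it stays in
PHY_NOLINK and resume_would_warn() returns true. With
phy_has_attached_dev(), model_suspend() moves it to PHY_UP and the warning
condition is no longer met.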
drivers/net/phy/phy_device.c | 38 ++++++++++++++++++++++--------------
1 file changed, 23 insertions(+), 15 deletions(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 7c4e1ad1864c..e2996fe8c498 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -241,6 +241,27 @@ static bool phy_drv_wol_enabled(struct phy_device *phydev)
 	return wol.wolopts != 0;
 }
 
+static void phy_link_change(struct phy_device *phydev, bool up)
+{
+	struct net_device *netdev = phydev->attached_dev;
+
+	if (up)
+		netif_carrier_on(netdev);
+	else
+		netif_carrier_off(netdev);
+	phydev->adjust_link(netdev);
+	if (phydev->mii_ts && phydev->mii_ts->link_state)
+		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
+}
+
+static bool phy_has_attached_dev(struct phy_device *phydev)
+{
+	if (phydev->phy_link_change == phy_link_change)
+		return phydev->attached_dev && phydev->adjust_link;
+
+	return phydev->phy_link_change;
+}
+
 static bool mdio_bus_phy_may_suspend(struct phy_device *phydev)
 {
 	struct device_driver *drv = phydev->mdio.dev.driver;
@@ -307,7 +328,7 @@ static __maybe_unused int mdio_bus_phy_suspend(struct device *dev)
 	 * may call phy routines that try to grab the same lock, and that may
 	 * lead to a deadlock.
 	 */
-	if (phydev->attached_dev && phydev->adjust_link)
+	if (phy_has_attached_dev(phydev))
 		phy_stop_machine(phydev);
 
 	if (!mdio_bus_phy_may_suspend(phydev))
@@ -361,7 +382,7 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
 		}
 	}
 
-	if (phydev->attached_dev && phydev->adjust_link)
+	if (phy_has_attached_dev(phydev))
 		phy_start_machine(phydev);
 
 	return 0;
@@ -1052,19 +1073,6 @@ struct phy_device *phy_find_first(struct mii_bus *bus)
 }
 EXPORT_SYMBOL(phy_find_first);
 
-static void phy_link_change(struct phy_device *phydev, bool up)
-{
-	struct net_device *netdev = phydev->attached_dev;
-
-	if (up)
-		netif_carrier_on(netdev);
-	else
-		netif_carrier_off(netdev);
-	phydev->adjust_link(netdev);
-	if (phydev->mii_ts && phydev->mii_ts->link_state)
-		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
-}
-
 /**
  * phy_prepare_link - prepares the PHY layer to monitor link status
  * @phydev: target phy_device struct
--
2.43.0
* Re: [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
From: Florian Fainelli @ 2025-03-03 17:31 UTC (permalink / raw)
To: Vladimir Oltean, netdev
Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Wei Fang
On 2/25/25 07:31, Vladimir Oltean wrote:
> DSA has 2 kinds of drivers:
>
> 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
> their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
> 2. Those who don't: all others. The above methods should be optional.
>
> For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
> and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
> These seem good candidates for setting mac_managed_pm = true because
> that is essentially its definition, but that does not seem to be the
> biggest problem for now, and is not what this change focuses on.
>
> Talking strictly about the 2nd category of drivers here, I have noticed
> that these also trigger the
>
> WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> phydev->state != PHY_UP);
>
> from mdio_bus_phy_resume(), because the PHY state machine is running.
>
> It's running as a result of a previous dsa_user_open() -> ... ->
> phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
> supposed to have called phy_stop_machine(), but it didn't. So this is
> why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
> runs.
>
> mdio_bus_phy_suspend() did not call phy_stop_machine() because for
> phylink, the phydev->adjust_link function pointer is NULL. This seems a
> technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
> machine restart on resume"). That commit was written before phylink
> existed, and was intended to avoid crashing with consumer drivers which
> don't use the PHY state machine - phylink does.
>
> Make the conditions dependent on the PHY device having a
> phydev->phy_link_change() implementation equal to the default
> phy_link_change() provided by phylib. Otherwise, just check that the
> custom phydev->phy_link_change() has been provided and is non-NULL.
> Phylink provides phylink_phy_change().
>
> Thus, we will stop the state machine even for phylink-controlled PHYs
> when using the MDIO bus PM ops.
>
> Reported-by: Wei Fang <wei.fang@nxp.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Sorry for the lag in reviewing. This looks reasonable to me, though I
don't have a device to reason about whether that will be a problem or not.
As you say though, some drivers should switch to mac_managed_pm, let me
try to set some cycles aside to make that change for bcm_sf2.c at the
very least.
Thanks!
--
Florian
* Re: [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
From: Russell King (Oracle) @ 2025-03-03 18:06 UTC (permalink / raw)
To: Vladimir Oltean, Richard Cochran
Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Florian Fainelli,
Wei Fang
On Tue, Feb 25, 2025 at 05:31:56PM +0200, Vladimir Oltean wrote:
> DSA has 2 kinds of drivers:
>
> 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
> their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
> 2. Those who don't: all others. The above methods should be optional.
>
> For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
> and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
> These seem good candidates for setting mac_managed_pm = true because
> that is essentially its definition, but that does not seem to be the
> biggest problem for now, and is not what this change focuses on.
>
> Talking strictly about the 2nd category of drivers here, I have noticed
> that these also trigger the
>
> WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> phydev->state != PHY_UP);
>
> from mdio_bus_phy_resume(), because the PHY state machine is running.
> It's running as a result of a previous dsa_user_open() -> ... ->
> phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
> supposed to have called phy_stop_machine(), but it didn't. So this is
> why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
> runs.
>
> mdio_bus_phy_suspend() did not call phy_stop_machine() because for
> phylink, the phydev->adjust_link function pointer is NULL. This seems a
> technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
> machine restart on resume"). That commit was written before phylink
> existed, and was intended to avoid crashing with consumer drivers which
> don't use the PHY state machine - phylink does.
>
> Make the conditions dependent on the PHY device having a
> phydev->phy_link_change() implementation equal to the default
> phy_link_change() provided by phylib. Otherwise, just check that the
> custom phydev->phy_link_change() has been provided and is non-NULL.
> Phylink provides phylink_phy_change().
>
> Thus, we will stop the state machine even for phylink-controlled PHYs
> when using the MDIO bus PM ops.
>
> Reported-by: Wei Fang <wei.fang@nxp.com>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
> I've only spent a few hours debugging this, and I'm unsure which patch
> to even blame. I haven't noticed other issues apart from the WARN_ON()
> originally added by commit 744d23c71af3 ("net: phy: Warn about incorrect
> mdio_bus_phy_resume() state").
I think the commit looks correct to restore the intended behaviour,
but I'm puzzled why we haven't seen this before.
As for the right commit, you're correct that 744d23c71af3 brings the
warning. Phylink was never tested with suspend/resume initially, and
that's been something of an after-thought (I don't have platforms that
support suspend/resume and phylink, so this is something for other
people to test.)
However, your patch also brings up another concern:
commit 4715f65ffa0520af0680dbfbedbe349f175adaf4
Author: Richard Cochran <richardcochran@gmail.com>
Date: Wed Dec 25 18:16:15 2019 -0800
adding that call to MII timestamping stuff looks wrong to me - it means
MII timestamping doesn't get to know about link state if phylink is
being used. I'm not sure whether it needs to or not. Maybe Richard can
comment.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
* Re: [RFC PATCH net] net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
From: Vladimir Oltean @ 2025-04-02 13:43 UTC (permalink / raw)
To: Russell King (Oracle), Richard Cochran
Cc: netdev, Andrew Lunn, Heiner Kallweit, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Florian Fainelli,
Wei Fang
[-- Attachment #1: Type: text/plain, Size: 3879 bytes --]
On Mon, Mar 03, 2025 at 06:06:18PM +0000, Russell King (Oracle) wrote:
> On Tue, Feb 25, 2025 at 05:31:56PM +0200, Vladimir Oltean wrote:
> > DSA has 2 kinds of drivers:
> >
> > 1. Those who call dsa_switch_suspend() and dsa_switch_resume() from
> > their device PM ops: qca8k-8xxx, bcm_sf2, microchip ksz
> > 2. Those who don't: all others. The above methods should be optional.
> >
> > For type 1, dsa_switch_suspend() calls dsa_user_suspend() -> phylink_stop(),
> > and dsa_switch_resume() calls dsa_user_resume() -> phylink_start().
> > These seem good candidates for setting mac_managed_pm = true because
> > that is essentially its definition, but that does not seem to be the
> > biggest problem for now, and is not what this change focuses on.
> >
> > Talking strictly about the 2nd category of drivers here, I have noticed
> > that these also trigger the
> >
> > WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> > phydev->state != PHY_UP);
> >
> > from mdio_bus_phy_resume(), because the PHY state machine is running.
> > It's running as a result of a previous dsa_user_open() -> ... ->
> > phylink_start() -> phy_start(), and AFAICS, mdio_bus_phy_suspend() was
> > supposed to have called phy_stop_machine(), but it didn't. So this is
> > why the PHY is in state PHY_NOLINK by the time mdio_bus_phy_resume()
> > runs.
> >
> > mdio_bus_phy_suspend() did not call phy_stop_machine() because for
> > phylink, the phydev->adjust_link function pointer is NULL. This seems a
> > technicality introduced by commit fddd91016d16 ("phylib: fix PAL state
> > machine restart on resume"). That commit was written before phylink
> > existed, and was intended to avoid crashing with consumer drivers which
> > don't use the PHY state machine - phylink does.
> >
> > Make the conditions dependent on the PHY device having a
> > phydev->phy_link_change() implementation equal to the default
> > phy_link_change() provided by phylib. Otherwise, just check that the
> > custom phydev->phy_link_change() has been provided and is non-NULL.
> > Phylink provides phylink_phy_change().
> >
> > Thus, we will stop the state machine even for phylink-controlled PHYs
> > when using the MDIO bus PM ops.
> >
> > Reported-by: Wei Fang <wei.fang@nxp.com>
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> > ---
> > I've only spent a few hours debugging this, and I'm unsure which patch
> > to even blame. I haven't noticed other issues apart from the WARN_ON()
> > originally added by commit 744d23c71af3 ("net: phy: Warn about incorrect
> > mdio_bus_phy_resume() state").
>
> I think the commit looks correct to restore the intended behaviour,
> but I'm puzzled why we haven't seen this before.
>
> As for the right commit, you're correct that 744d23c71af3 brings the
> warning. Phylink was never tested with suspend/resume initially, and
> that's been something of an after-thought (I don't have platforms that
> support suspend/resume and phylink, so this is something for other
> people to test.)
>
> However, your patch also brings up another concern:
>
> commit 4715f65ffa0520af0680dbfbedbe349f175adaf4
> Author: Richard Cochran <richardcochran@gmail.com>
> Date: Wed Dec 25 18:16:15 2019 -0800
>
> adding that call to MII timestamping stuff looks wrong to me - it means
> MII timestamping doesn't get to know about link state if phylink is
> being used. I'm not sure whether it needs to or not. Maybe Richard can
> comment.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Thanks for the review and for pointing this out.
If Richard does not respond to the request for comment, I will submit the
attached patch to net-next once it reopens on Apr 7th. In any case, I will
resubmit the PM-related change above to "net" today, without the RFC tag.
[-- Attachment #2: 0001-net-phy-extend-MII-timestamper-link-state-update-to-.patch --]
[-- Type: text/x-diff, Size: 3870 bytes --]
From f1c1892e3ca21be9e241f6b8a3710a14f7ec304f Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Wed, 2 Apr 2025 16:20:33 +0300
Subject: [PATCH] net: phy: extend MII timestamper link state update to phylink
Context: since 2017, struct phy_device has a "phy_link_change" hook,
added by commit a81497bee70e ("net: phy: provide a hook for link up/link
down events"), with two implementations in the kernel:
- phylib's eponymous phy_link_change()
- phylink's phylink_phy_change()
Russell King points out here:
https://lore.kernel.org/netdev/Z8Xvmqp2sukNPzvt@shell.armlinux.org.uk/
that commit 4715f65ffa05 ("net: Introduce a new MII time stamping
interface.") from 2019 made the interesting design choice of placing the
additional phydev->mii_ts->link_state() hook in the phylib implementation,
but not in the phylink one, for an unknown reason.
As such, converting MAC drivers from phylib to phylink poses a
regression challenge if they use MII timestampers, because with phylink,
these will no longer be notified of link state changes (which is
something they may or may not care about).
The only upstream user of mii_ts->link_state is ptp_ines.c. I also don't
know in which systems it is integrated, and whether the attached MACs
use phylib or phylink.
In the absence of link state updates coming from phylink, the ptp_ines.c
driver retains the initial PORT_CONF setting, which assumes PHY_SPEED_1000 <<
PHY_SPEED_SHIFT. I'm unable to further assess the impact of this
mismatch between the real MII speed and the initial assumption.
Lacking a proper bug report, I am going to assume there is no breakage,
but going forward, we should equally support phylib and phylink. That
can be done by placing the mii_ts->link_state() hook at the
phy_device->phy_link_change() call sites (i.e. phy_link_down() and
phy_link_up()), rather than at the individual implementation sites.
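As a rough userspace illustration of the placement described above (a
sketch, not the kernel code itself): the timestamper notification is issued
by the callers phy_link_up() and phy_link_down(), so it fires whichever
phy_link_change implementation is installed. The simplified structs, the
"notifications" counter, the "link" field and count_link_state() are
hypothetical and exist only for this model.

```c
#include <stdbool.h>
#include <stddef.h>

struct phy_device;

/* Simplified model of the timestamper; the counter is model-only. */
struct mii_timestamper {
	void (*link_state)(struct mii_timestamper *mii_ts,
			   struct phy_device *phydev);
	int notifications;
};

struct phy_device {
	struct mii_timestamper *mii_ts;
	void (*phy_link_change)(struct phy_device *phydev, bool up);
	bool link; /* model-only record of the last reported link state */
};

/* Hypothetical timestamper callback that only counts invocations. */
static void count_link_state(struct mii_timestamper *mii_ts,
			     struct phy_device *phydev)
{
	(void)phydev;
	mii_ts->notifications++;
}

/* Same body as the helper added to include/linux/phy.h by the patch. */
static inline void phy_ts_link_change(struct phy_device *phydev)
{
	if (phydev->mii_ts && phydev->mii_ts->link_state)
		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
}

/* Stand-ins for the two hook implementations; neither one touches mii_ts. */
static void phy_link_change(struct phy_device *phydev, bool up)
{
	phydev->link = up;
}

static void phylink_phy_change(struct phy_device *phydev, bool up)
{
	phydev->link = up;
}

/* Call sites: the timestamper is notified here, after the hook runs. */
static void phy_link_up(struct phy_device *phydev)
{
	phydev->phy_link_change(phydev, true);
	phy_ts_link_change(phydev);
}

static void phy_link_down(struct phy_device *phydev)
{
	phydev->phy_link_change(phydev, false);
	phy_ts_link_change(phydev);
}
```

With this arrangement, a PHY whose hook is phylink_phy_change() produces
the same number of timestamper notifications as one using the phylib
default, and a PHY with no timestamper attached is simply skipped.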
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/phy/phy.c | 2 ++
drivers/net/phy/phy_device.c | 2 --
include/linux/phy.h | 10 ++++++++++
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 13df28445f02..77b1d2d002ab 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -77,12 +77,14 @@ static void phy_link_up(struct phy_device *phydev)
 {
 	phydev->phy_link_change(phydev, true);
 	phy_led_trigger_change_speed(phydev);
+	phy_ts_link_change(phydev);
 }
 
 static void phy_link_down(struct phy_device *phydev)
 {
 	phydev->phy_link_change(phydev, false);
 	phy_led_trigger_change_speed(phydev);
+	phy_ts_link_change(phydev);
 	WRITE_ONCE(phydev->link_down_events, phydev->link_down_events + 1);
 }
 
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 675fbd225378..f535a2862fc6 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1064,8 +1064,6 @@ static void phy_link_change(struct phy_device *phydev, bool up)
 	else
 		netif_carrier_off(netdev);
 	phydev->adjust_link(netdev);
-	if (phydev->mii_ts && phydev->mii_ts->link_state)
-		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
 }
 
 /**
diff --git a/include/linux/phy.h b/include/linux/phy.h
index a2bfae80c449..c6cc4403323c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1609,6 +1609,16 @@ static inline bool phy_polling_mode(struct phy_device *phydev)
 	return phydev->irq == PHY_POLL;
 }
 
+/**
+ * phy_ts_link_change: Notify MII timestamper of changes to PHY link state
+ * @phydev: the phy_device struct
+ */
+static inline void phy_ts_link_change(struct phy_device *phydev)
+{
+	if (phydev->mii_ts && phydev->mii_ts->link_state)
+		phydev->mii_ts->link_state(phydev->mii_ts, phydev);
+}
+
 /**
  * phy_has_hwtstamp - Tests whether a PHY time stamp configuration.
  * @phydev: the phy_device struct
--
2.34.1