* [PATCH net-next 1/9] net: mdio: add definition for clock stop capable bit
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 2/9] net: phy: add support for querying PHY clock stop capability Russell King (Oracle)
` (10 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Add a definition for the clock stop capable bit in the PCS MMD. This
bit indicates whether the MAC is able to stop the transmit xMII clock
while it is signalling LPI.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
include/uapi/linux/mdio.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/uapi/linux/mdio.h b/include/uapi/linux/mdio.h
index f0d3f268240d..6975f182b22c 100644
--- a/include/uapi/linux/mdio.h
+++ b/include/uapi/linux/mdio.h
@@ -125,6 +125,7 @@
#define MDIO_STAT1_LPOWERABLE 0x0002 /* Low-power ability */
#define MDIO_STAT1_LSTATUS BMSR_LSTATUS
#define MDIO_STAT1_FAULT 0x0080 /* Fault */
+#define MDIO_PCS_STAT1_CLKSTOP_CAP 0x0040
#define MDIO_AN_STAT1_LPABLE 0x0001 /* Link partner AN ability */
#define MDIO_AN_STAT1_ABLE BMSR_ANEGCAPABLE
#define MDIO_AN_STAT1_RFAULT BMSR_RFAULT
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 0/9] net: add phylink managed EEE support
@ 2025-01-15 20:42 Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 1/9] net: mdio: add definition for clock stop capable bit Russell King (Oracle)
` (11 more replies)
0 siblings, 12 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Hi,
Adding managed EEE support to phylink has been on the cards ever since
the idea in phylib was mooted. This overly large series attempts to do
so. I've included all the patches as it's important to get the driver
patches out there.
Patch 1 adds a definition for the clock stop capable bit in the PCS
MMD status register.
Patch 2 adds a phylib API to query whether the PHY allows the transmit
xMII clock to be stopped while in LPI mode. This capability is for MAC
drivers to save power when LPI is active, to allow them to stop their
transmit clock.
Patch 3 extracts a phylink internal helper for determining whether the
link is up.
Patch 4 adds basic phylink managed EEE support. Two new MAC APIs are
added, to enable and disable LPI. The enable method is passed the LPI
timer setting which it is expected to program into the hardware, and
also a flag ehther the transmit clock should be stopped.
I have taken the decision to make enable_tx_lpi() to return an error
code, but not do much with it other than report it - the intention
being that we can later use it to extend functionality if needed
without reworking loads of drivers.
I have also dropped the validation/limitation of the LPI timer, and
left that in the driver code prior to calling phylink_ethtool_set_eee().
The remainder of the patches convert mvneta, lan743x and stmmac, and
add support for mvneta.
Since yesterday's RFC:
- fixed the mvpp2 GENMASK()
- dropped the DSA patch
- changed how phylink restricts EEE advertisement, and the EEE support
reported to userspace which fixes a bug.
drivers/net/ethernet/marvell/mvneta.c | 107 ++++++++++------
drivers/net/ethernet/marvell/mvpp2/mvpp2.h | 5 +
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 86 +++++++++++++
drivers/net/ethernet/microchip/lan743x_ethtool.c | 21 ---
drivers/net/ethernet/microchip/lan743x_main.c | 46 ++++++-
drivers/net/ethernet/microchip/lan743x_main.h | 1 -
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++--
drivers/net/phy/phy.c | 20 +++
drivers/net/phy/phylink.c | 149 ++++++++++++++++++++--
include/linux/phy.h | 1 +
include/linux/phylink.h | 45 +++++++
include/uapi/linux/mdio.h | 1 +
12 files changed, 446 insertions(+), 93 deletions(-)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH net-next 2/9] net: phy: add support for querying PHY clock stop capability
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 1/9] net: mdio: add definition for clock stop capable bit Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 3/9] net: phylink: add phylink_link_is_up() helper Russell King (Oracle)
` (9 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Add support for querying whether the PHY allows the transmit xMII clock
to be stopped while in LPI mode. This will be used by phylink to pass
to the MAC driver so it can configure the generation of the xMII clock
appropriately.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/phy/phy.c | 20 ++++++++++++++++++++
include/linux/phy.h | 1 +
2 files changed, 21 insertions(+)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index c008fe050245..d0c1718e2b16 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1703,6 +1703,26 @@ void phy_mac_interrupt(struct phy_device *phydev)
}
EXPORT_SYMBOL(phy_mac_interrupt);
+/**
+ * phy_eee_tx_clock_stop_capable() - indicate whether the MAC can stop tx clock
+ * @phydev: target phy_device struct
+ *
+ * Indicate whether the MAC can disable the transmit xMII clock while in LPI
+ * state. Returns 1 if the MAC may stop the transmit clock, 0 if the MAC must
+ * not stop the transmit clock, or negative error.
+ */
+int phy_eee_tx_clock_stop_capable(struct phy_device *phydev)
+{
+ int stat1;
+
+ stat1 = phy_read_mmd(phydev, MDIO_MMD_PCS, MDIO_STAT1);
+ if (stat1 < 0)
+ return stat1;
+
+ return !!(stat1 & MDIO_PCS_STAT1_CLKSTOP_CAP);
+}
+EXPORT_SYMBOL_GPL(phy_eee_tx_clock_stop_capable);
+
/**
* phy_eee_rx_clock_stop() - configure PHY receive clock in LPI
* @phydev: target phy_device struct
diff --git a/include/linux/phy.h b/include/linux/phy.h
index afaae74d0949..244f747b3cd9 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -2146,6 +2146,7 @@ int phy_unregister_fixup(const char *bus_id, u32 phy_uid, u32 phy_uid_mask);
int phy_unregister_fixup_for_id(const char *bus_id);
int phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask);
+int phy_eee_tx_clock_stop_capable(struct phy_device *phydev);
int phy_eee_rx_clock_stop(struct phy_device *phydev, bool clk_stop_enable);
int phy_init_eee(struct phy_device *phydev, bool clk_stop_enable);
int phy_get_eee_err(struct phy_device *phydev);
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 3/9] net: phylink: add phylink_link_is_up() helper
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 1/9] net: mdio: add definition for clock stop capable bit Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 2/9] net: phy: add support for querying PHY clock stop capability Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 4/9] net: phylink: add EEE management Russell King (Oracle)
` (8 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Add a helper to determine whether the link is up or down. Currently
this is only used in one location, but becomes necessary to test
when reconfiguring EEE.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/phy/phylink.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 31754d5fd659..e3fc1d1be1ed 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1641,20 +1641,21 @@ static void phylink_link_down(struct phylink *pl)
phylink_info(pl, "Link is Down\n");
}
+static bool phylink_link_is_up(struct phylink *pl)
+{
+ return pl->netdev ? netif_carrier_ok(pl->netdev) : pl->old_link_state;
+}
+
static void phylink_resolve(struct work_struct *w)
{
struct phylink *pl = container_of(w, struct phylink, resolve);
struct phylink_link_state link_state;
- struct net_device *ndev = pl->netdev;
bool mac_config = false;
bool retrigger = false;
bool cur_link_state;
mutex_lock(&pl->state_mutex);
- if (pl->netdev)
- cur_link_state = netif_carrier_ok(ndev);
- else
- cur_link_state = pl->old_link_state;
+ cur_link_state = phylink_link_is_up(pl);
if (pl->phylink_disable_state) {
pl->link_failed = false;
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 4/9] net: phylink: add EEE management
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (2 preceding siblings ...)
2025-01-15 20:42 ` [PATCH net-next 3/9] net: phylink: add phylink_link_is_up() helper Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 5/9] net: mvneta: convert to phylink EEE implementation Russell King (Oracle)
` (7 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Add EEE management to phylink, making use of the phylib implementation.
This will only be used where a MAC driver populates the methods and
capabilities bitfield, otherwise we keep our old behaviour.
Phylink will keep track of the EEE configuration, including the clock
stop abilities at each end of the MAC to PHY link, programming the PHY
appropriately and preserving the LPI configuration should the PHY go
away.
Phylink will call into the MAC driver when LPI needs to be enabled or
disabled, with the requirement that the MAC have LPI disabled prior
to the netdev being brought up (in other words, it will only call
mac_disable_tx_lpi() if it has already called mac_enable_tx_lpi().)
Support for phylink managed EEE is enabled by populating both tx_lpi
MAC operations method pointers, and filling in both LPI interfaces
and capabilities. If the methods are provided but the LPI interfaces
or capabilities remain empty, this indicates to phylink that EEE is
implemented by the driver but the hardware it is driving does not
support EEE, and thus the ethtool set_eee() and get_eee() methods will
return EOPNOTSUPP.
No validation of the LPI timer value is performed by this patch.
For interface modes which do not support LPI, we make no attempt to
manipulate the phylib EEE advertisement, but instead refuse to
activate LPI at the MAC, noting it at debug message level.
We also restrict the advertisement and reported userspace support
linkmode masks according to the lpi_capabilities provided to
phylink by the MAC driver.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/phy/phylink.c | 138 ++++++++++++++++++++++++++++++++++++--
include/linux/phylink.h | 45 +++++++++++++
2 files changed, 178 insertions(+), 5 deletions(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index e3fc1d1be1ed..6a46463833f1 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -60,6 +60,7 @@ struct phylink {
u8 act_link_an_mode; /* Active MLO_AN_xxx mode */
u8 link_port; /* The current non-phy ethtool port */
__ETHTOOL_DECLARE_LINK_MODE_MASK(supported);
+ __ETHTOOL_DECLARE_LINK_MODE_MASK(supported_lpi);
/* The link configuration settings */
struct phylink_link_state link_config;
@@ -81,12 +82,20 @@ struct phylink {
unsigned int pcs_state;
bool link_failed;
+ bool mac_supports_eee_ops;
+ bool mac_supports_eee;
+ bool phy_enable_tx_lpi;
+ bool mac_enable_tx_lpi;
+ bool mac_tx_clk_stop;
+ u32 mac_tx_lpi_timer;
struct sfp_bus *sfp_bus;
bool sfp_may_have_phy;
DECLARE_PHY_INTERFACE_MASK(sfp_interfaces);
__ETHTOOL_DECLARE_LINK_MODE_MASK(sfp_support);
u8 sfp_port;
+
+ struct eee_config eee_cfg;
};
#define phylink_printk(level, pl, fmt, ...) \
@@ -1574,6 +1583,39 @@ static const char *phylink_pause_to_str(int pause)
}
}
+static void phylink_deactivate_lpi(struct phylink *pl)
+{
+ if (pl->mac_enable_tx_lpi) {
+ pl->mac_enable_tx_lpi = false;
+
+ phylink_dbg(pl, "disabling LPI\n");
+
+ pl->mac_ops->mac_disable_tx_lpi(pl->config);
+ }
+}
+
+static void phylink_activate_lpi(struct phylink *pl)
+{
+ int err;
+
+ if (!test_bit(pl->cur_interface, pl->config->lpi_interfaces)) {
+ phylink_dbg(pl, "MAC does not support LPI with %s\n",
+ phy_modes(pl->cur_interface));
+ return;
+ }
+
+ phylink_dbg(pl, "LPI timer %uus, tx clock stop %u\n",
+ pl->mac_tx_lpi_timer, pl->mac_tx_clk_stop);
+
+ err = pl->mac_ops->mac_enable_tx_lpi(pl->config, pl->mac_tx_lpi_timer,
+ pl->mac_tx_clk_stop);
+ if (!err)
+ pl->mac_enable_tx_lpi = true;
+ else
+ phylink_err(pl, "%ps() failed: %pe\n",
+ pl->mac_ops->mac_enable_tx_lpi, ERR_PTR(err));
+}
+
static void phylink_link_up(struct phylink *pl,
struct phylink_link_state link_state)
{
@@ -1620,6 +1662,9 @@ static void phylink_link_up(struct phylink *pl,
pl->cur_interface, speed, duplex,
!!(link_state.pause & MLO_PAUSE_TX), rx_pause);
+ if (pl->mac_supports_eee && pl->phy_enable_tx_lpi)
+ phylink_activate_lpi(pl);
+
if (ndev)
netif_carrier_on(ndev);
@@ -1636,6 +1681,9 @@ static void phylink_link_down(struct phylink *pl)
if (ndev)
netif_carrier_off(ndev);
+
+ phylink_deactivate_lpi(pl);
+
pl->mac_ops->mac_link_down(pl->config, pl->act_link_an_mode,
pl->cur_interface);
phylink_info(pl, "Link is Down\n");
@@ -1899,6 +1947,17 @@ struct phylink *phylink_create(struct phylink_config *config,
return ERR_PTR(-EINVAL);
}
+ pl->mac_supports_eee_ops = mac_ops->mac_disable_tx_lpi &&
+ mac_ops->mac_enable_tx_lpi;
+ pl->mac_supports_eee = pl->mac_supports_eee_ops &&
+ pl->config->lpi_capabilities &&
+ !phy_interface_empty(pl->config->lpi_interfaces);
+
+ /* Set the default EEE configuration */
+ pl->eee_cfg.eee_enabled = pl->config->eee_enabled_default;
+ pl->eee_cfg.tx_lpi_enabled = pl->eee_cfg.eee_enabled;
+ pl->eee_cfg.tx_lpi_timer = pl->config->lpi_timer_default;
+
pl->phy_state.interface = iface;
pl->link_interface = iface;
if (iface == PHY_INTERFACE_MODE_MOCA)
@@ -2003,16 +2062,22 @@ static void phylink_phy_change(struct phy_device *phydev, bool up)
pl->phy_state.link = up;
if (!up)
pl->link_failed = true;
+
+ /* Get the LPI state from phylib */
+ pl->phy_enable_tx_lpi = phydev->enable_tx_lpi;
+ pl->mac_tx_lpi_timer = phydev->eee_cfg.tx_lpi_timer;
mutex_unlock(&pl->state_mutex);
phylink_run_resolve(pl);
- phylink_dbg(pl, "phy link %s %s/%s/%s/%s/%s\n", up ? "up" : "down",
+ phylink_dbg(pl, "phy link %s %s/%s/%s/%s/%s/%slpi\n",
+ up ? "up" : "down",
phy_modes(phydev->interface),
phy_speed_to_str(phydev->speed),
phy_duplex_to_str(phydev->duplex),
phy_rate_matching_to_str(phydev->rate_matching),
- phylink_pause_to_str(pl->phy_state.pause));
+ phylink_pause_to_str(pl->phy_state.pause),
+ phydev->enable_tx_lpi ? "" : "no");
}
static int phylink_validate_phy(struct phylink *pl, struct phy_device *phy,
@@ -2142,6 +2207,36 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
/* Restrict the phy advertisement according to the MAC support. */
linkmode_copy(phy->advertising, config.advertising);
+
+ /* If the MAC supports phylink managed EEE, restrict the EEE
+ * advertisement according to the MAC's LPI capabilities.
+ */
+ if (pl->mac_supports_eee) {
+ /* If EEE is enabled, then we need to call phy_support_eee()
+ * to ensure that the advertising mask is appropriately set.
+ * This also enables EEE at the PHY.
+ */
+ if (pl->eee_cfg.eee_enabled)
+ phy_support_eee(phy);
+
+ phy->eee_cfg.tx_lpi_enabled = pl->eee_cfg.tx_lpi_enabled;
+ phy->eee_cfg.tx_lpi_timer = pl->eee_cfg.tx_lpi_timer;
+
+ /* Convert the MAC's LPI capabilities to linkmodes */
+ linkmode_zero(pl->supported_lpi);
+ phylink_caps_to_linkmodes(pl->supported_lpi,
+ pl->config->lpi_capabilities);
+
+ /* Restrict the PHYs EEE support/advertisement to the modes
+ * that the MAC supports.
+ */
+ linkmode_and(phy->advertising_eee, phy->advertising_eee,
+ pl->supported_lpi);
+ } else if (pl->mac_supports_eee_ops) {
+ /* MAC supports phylink EEE, but wants EEE always disabled. */
+ phy_disable_eee(phy);
+ }
+
mutex_unlock(&pl->state_mutex);
mutex_unlock(&phy->lock);
@@ -2157,7 +2252,13 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
if (pl->config->mac_managed_pm)
phy->mac_managed_pm = true;
- return 0;
+ /* Allow the MAC to stop its clock if the PHY has the capability */
+ pl->mac_tx_clk_stop = phy_eee_tx_clock_stop_capable(phy) > 0;
+
+ /* Explicitly configure whether the PHY is allowed to stop it's
+ * receive clock.
+ */
+ return phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
}
static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy,
@@ -2314,6 +2415,8 @@ void phylink_disconnect_phy(struct phylink *pl)
mutex_lock(&phy->lock);
mutex_lock(&pl->state_mutex);
pl->phydev = NULL;
+ pl->phy_enable_tx_lpi = false;
+ pl->mac_tx_clk_stop = false;
mutex_unlock(&pl->state_mutex);
mutex_unlock(&phy->lock);
flush_work(&pl->resolve);
@@ -3068,8 +3171,16 @@ int phylink_ethtool_get_eee(struct phylink *pl, struct ethtool_keee *eee)
ASSERT_RTNL();
- if (pl->phydev)
+ if (pl->mac_supports_eee_ops && !pl->mac_supports_eee)
+ return ret;
+
+ if (pl->phydev) {
ret = phy_ethtool_get_eee(pl->phydev, eee);
+ /* Restrict supported linkmode mask */
+ if (ret == 0 && pl->mac_supports_eee_ops)
+ linkmode_and(eee->supported, eee->supported,
+ pl->supported_lpi);
+ }
return ret;
}
@@ -3082,12 +3193,29 @@ EXPORT_SYMBOL_GPL(phylink_ethtool_get_eee);
*/
int phylink_ethtool_set_eee(struct phylink *pl, struct ethtool_keee *eee)
{
+ bool mac_eee = pl->mac_supports_eee;
int ret = -EOPNOTSUPP;
ASSERT_RTNL();
- if (pl->phydev)
+ phylink_dbg(pl, "mac %s phylink EEE%s, adv %*pbl, LPI%s timer %uus\n",
+ mac_eee ? "supports" : "does not support",
+ eee->eee_enabled ? ", enabled" : "",
+ __ETHTOOL_LINK_MODE_MASK_NBITS, eee->advertised,
+ eee->tx_lpi_enabled ? " enabled" : "", eee->tx_lpi_timer);
+
+ if (pl->mac_supports_eee_ops && !mac_eee)
+ return ret;
+
+ if (pl->phydev) {
+ /* Restrict advertisement mask */
+ if (pl->mac_supports_eee_ops)
+ linkmode_and(eee->advertised, eee->advertised,
+ pl->supported_lpi);
ret = phy_ethtool_set_eee(pl->phydev, eee);
+ if (ret == 0)
+ eee_to_eeecfg(&pl->eee_cfg, eee);
+ }
return ret;
}
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 4b7a20620b49..8e06d2812516 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -5,6 +5,8 @@
#include <linux/spinlock.h>
#include <linux/workqueue.h>
+#include <net/eee.h>
+
struct device_node;
struct ethtool_cmd;
struct fwnode_handle;
@@ -143,11 +145,17 @@ enum phylink_op_type {
* possible and avoid stopping it during suspend events.
* @default_an_inband: if true, defaults to MLO_AN_INBAND rather than
* MLO_AN_PHY. A fixed-link specification will override.
+ * @eee_rx_clk_stop_enable: if true, PHY can stop the receive clock during LPI
* @get_fixed_state: callback to execute to determine the fixed link state,
* if MAC link is at %MLO_AN_FIXED mode.
* @supported_interfaces: bitmap describing which PHY_INTERFACE_MODE_xxx
* are supported by the MAC/PCS.
+ * @lpi_interfaces: bitmap describing which PHY interface modes can support
+ * LPI signalling.
* @mac_capabilities: MAC pause/speed/duplex capabilities.
+ * @lpi_capabilities: MAC speeds which can support LPI signalling
+ * @lpi_timer_default: Default EEE LPI timer setting.
+ * @eee_enabled_default: If set, EEE will be enabled by phylink at creation time
*/
struct phylink_config {
struct device *dev;
@@ -156,10 +164,15 @@ struct phylink_config {
bool mac_managed_pm;
bool mac_requires_rxc;
bool default_an_inband;
+ bool eee_rx_clk_stop_enable;
void (*get_fixed_state)(struct phylink_config *config,
struct phylink_link_state *state);
DECLARE_PHY_INTERFACE_MASK(supported_interfaces);
+ DECLARE_PHY_INTERFACE_MASK(lpi_interfaces);
unsigned long mac_capabilities;
+ unsigned long lpi_capabilities;
+ u32 lpi_timer_default;
+ bool eee_enabled_default;
};
void phylink_limit_mac_speed(struct phylink_config *config, u32 max_speed);
@@ -173,6 +186,8 @@ void phylink_limit_mac_speed(struct phylink_config *config, u32 max_speed);
* @mac_finish: finish a major reconfiguration of the interface.
* @mac_link_down: take the link down.
* @mac_link_up: allow the link to come up.
+ * @mac_disable_tx_lpi: disable LPI.
+ * @mac_enable_tx_lpi: enable and configure LPI.
*
* The individual methods are described more fully below.
*/
@@ -193,6 +208,9 @@ struct phylink_mac_ops {
struct phy_device *phy, unsigned int mode,
phy_interface_t interface, int speed, int duplex,
bool tx_pause, bool rx_pause);
+ void (*mac_disable_tx_lpi)(struct phylink_config *config);
+ int (*mac_enable_tx_lpi)(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop);
};
#if 0 /* For kernel-doc purposes only. */
@@ -387,6 +405,33 @@ void mac_link_down(struct phylink_config *config, unsigned int mode,
void mac_link_up(struct phylink_config *config, struct phy_device *phy,
unsigned int mode, phy_interface_t interface,
int speed, int duplex, bool tx_pause, bool rx_pause);
+
+/**
+ * mac_disable_tx_lpi() - disable LPI generation at the MAC
+ * @config: a pointer to a &struct phylink_config.
+ *
+ * Disable generation of LPI at the MAC, effectively preventing the MAC
+ * from indicating that it is idle.
+ */
+void mac_disable_tx_lpi(struct phylink_config *config);
+
+/**
+ * mac_enable_tx_lpi() - configure and enable LPI generation at the MAC
+ * @config: a pointer to a &struct phylink_config.
+ * @timer: LPI timeout in microseconds.
+ * @tx_clk_stop: allow xMII transmit clock to be stopped during LPI
+ *
+ * Configure the LPI timeout accordingly. This will only be called when
+ * the link is already up, to cater for situations where the hardware
+ * needs to be programmed according to the link speed.
+ *
+ * Enable LPI generation at the MAC, and configure whether the xMII transmit
+ * clock may be stopped.
+ *
+ * Returns: 0 on success. Please consult with rmk before returning an error.
+ */
+int mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop);
#endif
struct phylink_pcs_ops;
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 5/9] net: mvneta: convert to phylink EEE implementation
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (3 preceding siblings ...)
2025-01-15 20:42 ` [PATCH net-next 4/9] net: phylink: add EEE management Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 6/9] net: mvpp2: add " Russell King (Oracle)
` (6 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Convert mvneta to use phylink's EEE implementation by implementing the
two LPI control methods, and adding the initial configuration and
capabilities.
Although disabling LPI requires clearing a single bit, for safety we
clear the manual mode and force bits to ensure that auto mode will be
used.
Enabling LPI needs a full configuration of several values, as the timer
values are dependent on the MAC operating speed, as per the original
code.
As Armada 388 states that EEE is only supported in "SGMII" modes, mark
this in lpi_interfaces. Testing with RGMII on the Clearfog platform
indicates that the receive path fails to detect LPI over RGMII.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
--
v2: correct argument order to u32_replace_bits()
v3: split out validation and limitation of the LPI timer.
---
drivers/net/ethernet/marvell/mvneta.c | 107 ++++++++++++++++----------
1 file changed, 65 insertions(+), 42 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index fe6261b81540..5fc078a0c4d5 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -284,8 +284,12 @@
MVNETA_TXQ_BUCKET_REFILL_PERIOD))
#define MVNETA_LPI_CTRL_0 0x2cc0
+#define MVNETA_LPI_CTRL_0_TS (0xff << 8)
#define MVNETA_LPI_CTRL_1 0x2cc4
-#define MVNETA_LPI_REQUEST_ENABLE BIT(0)
+#define MVNETA_LPI_CTRL_1_REQUEST_ENABLE BIT(0)
+#define MVNETA_LPI_CTRL_1_REQUEST_FORCE BIT(1)
+#define MVNETA_LPI_CTRL_1_MANUAL_MODE BIT(2)
+#define MVNETA_LPI_CTRL_1_TW (0xfff << 4)
#define MVNETA_LPI_CTRL_2 0x2cc8
#define MVNETA_LPI_STATUS 0x2ccc
@@ -541,10 +545,6 @@ struct mvneta_port {
struct mvneta_bm_pool *pool_short;
int bm_win_id;
- bool eee_enabled;
- bool eee_active;
- bool tx_lpi_enabled;
-
u64 ethtool_stats[ARRAY_SIZE(mvneta_statistics)];
u32 indir[MVNETA_RSS_LU_TABLE_SIZE];
@@ -4213,18 +4213,6 @@ static int mvneta_mac_finish(struct phylink_config *config, unsigned int mode,
return 0;
}
-static void mvneta_set_eee(struct mvneta_port *pp, bool enable)
-{
- u32 lpi_ctl1;
-
- lpi_ctl1 = mvreg_read(pp, MVNETA_LPI_CTRL_1);
- if (enable)
- lpi_ctl1 |= MVNETA_LPI_REQUEST_ENABLE;
- else
- lpi_ctl1 &= ~MVNETA_LPI_REQUEST_ENABLE;
- mvreg_write(pp, MVNETA_LPI_CTRL_1, lpi_ctl1);
-}
-
static void mvneta_mac_link_down(struct phylink_config *config,
unsigned int mode, phy_interface_t interface)
{
@@ -4240,9 +4228,6 @@ static void mvneta_mac_link_down(struct phylink_config *config,
val |= MVNETA_GMAC_FORCE_LINK_DOWN;
mvreg_write(pp, MVNETA_GMAC_AUTONEG_CONFIG, val);
}
-
- pp->eee_active = false;
- mvneta_set_eee(pp, false);
}
static void mvneta_mac_link_up(struct phylink_config *config,
@@ -4291,11 +4276,56 @@ static void mvneta_mac_link_up(struct phylink_config *config,
}
mvneta_port_up(pp);
+}
- if (phy && pp->eee_enabled) {
- pp->eee_active = phy_init_eee(phy, false) >= 0;
- mvneta_set_eee(pp, pp->eee_active && pp->tx_lpi_enabled);
+static void mvneta_mac_disable_tx_lpi(struct phylink_config *config)
+{
+ struct mvneta_port *pp = netdev_priv(to_net_dev(config->dev));
+ u32 lpi1;
+
+ lpi1 = mvreg_read(pp, MVNETA_LPI_CTRL_1);
+ lpi1 &= ~(MVNETA_LPI_CTRL_1_REQUEST_ENABLE |
+ MVNETA_LPI_CTRL_1_REQUEST_FORCE |
+ MVNETA_LPI_CTRL_1_MANUAL_MODE);
+ mvreg_write(pp, MVNETA_LPI_CTRL_1, lpi1);
+}
+
+static int mvneta_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop)
+{
+ struct mvneta_port *pp = netdev_priv(to_net_dev(config->dev));
+ u32 ts, tw, lpi0, lpi1, status;
+
+ status = mvreg_read(pp, MVNETA_GMAC_STATUS);
+ if (status & MVNETA_GMAC_SPEED_1000) {
+ /* At 1G speeds, the timer resolution are 1us, and
+ * 802.3 says tw is 16.5us. Round up to 17us.
+ */
+ tw = 17;
+ ts = timer;
+ } else {
+ /* At 100M speeds, the timer resolutions are 10us, and
+ * 802.3 says tw is 30us.
+ */
+ tw = 3;
+ ts = DIV_ROUND_UP(timer, 10);
}
+
+ if (ts > 255)
+ ts = 255;
+
+ /* Configure ts */
+ lpi0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
+ lpi0 = u32_replace_bits(lpi0, ts, MVNETA_LPI_CTRL_0_TS);
+ mvreg_write(pp, MVNETA_LPI_CTRL_0, lpi0);
+
+ /* Configure tw and enable LPI generation */
+ lpi1 = mvreg_read(pp, MVNETA_LPI_CTRL_1);
+ lpi1 = u32_replace_bits(lpi1, tw, MVNETA_LPI_CTRL_1_TW);
+ lpi1 |= MVNETA_LPI_CTRL_1_REQUEST_ENABLE;
+ mvreg_write(pp, MVNETA_LPI_CTRL_1, lpi1);
+
+ return 0;
}
static const struct phylink_mac_ops mvneta_phylink_ops = {
@@ -4305,6 +4335,8 @@ static const struct phylink_mac_ops mvneta_phylink_ops = {
.mac_finish = mvneta_mac_finish,
.mac_link_down = mvneta_mac_link_down,
.mac_link_up = mvneta_mac_link_up,
+ .mac_disable_tx_lpi = mvneta_mac_disable_tx_lpi,
+ .mac_enable_tx_lpi = mvneta_mac_enable_tx_lpi,
};
static int mvneta_mdio_probe(struct mvneta_port *pp)
@@ -5106,14 +5138,6 @@ static int mvneta_ethtool_get_eee(struct net_device *dev,
struct ethtool_keee *eee)
{
struct mvneta_port *pp = netdev_priv(dev);
- u32 lpi_ctl0;
-
- lpi_ctl0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
-
- eee->eee_enabled = pp->eee_enabled;
- eee->eee_active = pp->eee_active;
- eee->tx_lpi_enabled = pp->tx_lpi_enabled;
- eee->tx_lpi_timer = (lpi_ctl0) >> 8; // * scale;
return phylink_ethtool_get_eee(pp->phylink, eee);
}
@@ -5122,7 +5146,6 @@ static int mvneta_ethtool_set_eee(struct net_device *dev,
struct ethtool_keee *eee)
{
struct mvneta_port *pp = netdev_priv(dev);
- u32 lpi_ctl0;
/* The Armada 37x documents do not give limits for this other than
* it being an 8-bit register.
@@ -5130,16 +5153,6 @@ static int mvneta_ethtool_set_eee(struct net_device *dev,
if (eee->tx_lpi_enabled && eee->tx_lpi_timer > 255)
return -EINVAL;
- lpi_ctl0 = mvreg_read(pp, MVNETA_LPI_CTRL_0);
- lpi_ctl0 &= ~(0xff << 8);
- lpi_ctl0 |= eee->tx_lpi_timer << 8;
- mvreg_write(pp, MVNETA_LPI_CTRL_0, lpi_ctl0);
-
- pp->eee_enabled = eee->eee_enabled;
- pp->tx_lpi_enabled = eee->tx_lpi_enabled;
-
- mvneta_set_eee(pp, eee->tx_lpi_enabled && eee->eee_enabled);
-
return phylink_ethtool_set_eee(pp->phylink, eee);
}
@@ -5453,6 +5466,9 @@ static int mvneta_port_power_up(struct mvneta_port *pp, int phy_mode)
!phy_interface_mode_is_rgmii(phy_mode))
return -EINVAL;
+ /* Ensure LPI is disabled */
+ mvneta_mac_disable_tx_lpi(&pp->phylink_config);
+
return 0;
}
@@ -5544,6 +5560,13 @@ static int mvneta_probe(struct platform_device *pdev)
pp->phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_10 |
MAC_100 | MAC_1000FD | MAC_2500FD;
+ /* Setup EEE. Choose 250us idle. Only supported in SGMII modes. */
+ __set_bit(PHY_INTERFACE_MODE_QSGMII, pp->phylink_config.lpi_interfaces);
+ __set_bit(PHY_INTERFACE_MODE_SGMII, pp->phylink_config.lpi_interfaces);
+ pp->phylink_config.lpi_capabilities = MAC_100FD | MAC_1000FD;
+ pp->phylink_config.lpi_timer_default = 250;
+ pp->phylink_config.eee_enabled_default = true;
+
phy_interface_set_rgmii(pp->phylink_config.supported_interfaces);
__set_bit(PHY_INTERFACE_MODE_QSGMII,
pp->phylink_config.supported_interfaces);
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 6/9] net: mvpp2: add EEE implementation
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (4 preceding siblings ...)
2025-01-15 20:42 ` [PATCH net-next 5/9] net: mvneta: convert to phylink EEE implementation Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-16 8:27 ` Maxime Chevallier
2025-01-15 20:42 ` [PATCH net-next 7/9] net: lan743x: use netdev in lan743x_phylink_mac_link_down() Russell King (Oracle)
` (5 subsequent siblings)
11 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Add EEE support for mvpp2, using phylink's EEE implementation, which
means we just need to implement the two methods for LPI control, and
with the initial configuration. Only SGMII mode is supported, so only
100M and 1G speeds.
Disabling LPI requires clearing a single bit. Enabling LPI needs a full
configuration of several values, as the timer values are dependent on
the MAC operating speed.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
--
v3: split LPI timer limit and validation into separate patches
---
drivers/net/ethernet/marvell/mvpp2/mvpp2.h | 5 ++
.../net/ethernet/marvell/mvpp2/mvpp2_main.c | 86 +++++++++++++++++++
2 files changed, 91 insertions(+)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
index 9e02e4367bec..44fe9b68d1c2 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2.h
@@ -481,6 +481,11 @@
#define MVPP22_GMAC_INT_SUM_MASK 0xa4
#define MVPP22_GMAC_INT_SUM_MASK_LINK_STAT BIT(1)
#define MVPP22_GMAC_INT_SUM_MASK_PTP BIT(2)
+#define MVPP2_GMAC_LPI_CTRL0 0xc0
+#define MVPP2_GMAC_LPI_CTRL0_TS_MASK GENMASK(15, 8)
+#define MVPP2_GMAC_LPI_CTRL1 0xc4
+#define MVPP2_GMAC_LPI_CTRL1_REQ_EN BIT(0)
+#define MVPP2_GMAC_LPI_CTRL1_TW_MASK GENMASK(15, 4)
/* Per-port XGMAC registers. PPv2.2 and PPv2.3, only for GOP port 0,
* relative to port->base.
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
index f85229a30844..a8c33417bb3e 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c
@@ -5757,6 +5757,28 @@ static int mvpp2_ethtool_set_rxfh(struct net_device *dev,
return mvpp2_modify_rxfh_context(dev, NULL, rxfh, extack);
}
+static int mvpp2_ethtool_get_eee(struct net_device *dev,
+ struct ethtool_keee *eee)
+{
+ struct mvpp2_port *port = netdev_priv(dev);
+
+ if (!port->phylink)
+ return -EOPNOTSUPP;
+
+ return phylink_ethtool_get_eee(port->phylink, eee);
+}
+
+static int mvpp2_ethtool_set_eee(struct net_device *dev,
+ struct ethtool_keee *eee)
+{
+ struct mvpp2_port *port = netdev_priv(dev);
+
+ if (!port->phylink)
+ return -EOPNOTSUPP;
+
+ return phylink_ethtool_set_eee(port->phylink, eee);
+}
+
/* Device ops */
static const struct net_device_ops mvpp2_netdev_ops = {
@@ -5802,6 +5824,8 @@ static const struct ethtool_ops mvpp2_eth_tool_ops = {
.create_rxfh_context = mvpp2_create_rxfh_context,
.modify_rxfh_context = mvpp2_modify_rxfh_context,
.remove_rxfh_context = mvpp2_remove_rxfh_context,
+ .get_eee = mvpp2_ethtool_get_eee,
+ .set_eee = mvpp2_ethtool_set_eee,
};
/* Used for PPv2.1, or PPv2.2 with the old Device Tree binding that
@@ -6672,6 +6696,55 @@ static void mvpp2_mac_link_down(struct phylink_config *config,
mvpp2_port_disable(port);
}
+static void mvpp2_mac_disable_tx_lpi(struct phylink_config *config)
+{
+ struct mvpp2_port *port = mvpp2_phylink_to_port(config);
+
+ mvpp2_modify(port->base + MVPP2_GMAC_LPI_CTRL1,
+ MVPP2_GMAC_LPI_CTRL1_REQ_EN, 0);
+}
+
+static int mvpp2_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop)
+{
+ struct mvpp2_port *port = mvpp2_phylink_to_port(config);
+ u32 ts, tw, lpi1, status;
+
+ status = readl(port->base + MVPP2_GMAC_STATUS0);
+ if (status & MVPP2_GMAC_STATUS0_GMII_SPEED) {
+ /* At 1G speeds, the timer resolution are 1us, and
+ * 802.3 says tw is 16.5us. Round up to 17us.
+ */
+ tw = 17;
+ ts = timer;
+ } else {
+ /* At 100M speeds, the timer resolutions are 10us, and
+ * 802.3 says tw is 30us.
+ */
+ tw = 3;
+ ts = DIV_ROUND_UP(timer, 10);
+ }
+
+ if (ts > 255)
+ ts = 255;
+
+ /* Configure ts */
+ mvpp2_modify(port->base + MVPP2_GMAC_LPI_CTRL0,
+ MVPP2_GMAC_LPI_CTRL0_TS_MASK,
+ FIELD_PREP(MVPP2_GMAC_LPI_CTRL0_TS_MASK, ts));
+
+ lpi1 = readl(port->base + MVPP2_GMAC_LPI_CTRL1);
+
+ /* Configure tw */
+ lpi1 = u32_replace_bits(lpi1, tw, MVPP2_GMAC_LPI_CTRL1_TW_MASK);
+
+ /* Enable LPI generation */
+ writel(lpi1 | MVPP2_GMAC_LPI_CTRL1_REQ_EN,
+ port->base + MVPP2_GMAC_LPI_CTRL1);
+
+ return 0;
+}
+
static const struct phylink_mac_ops mvpp2_phylink_ops = {
.mac_select_pcs = mvpp2_select_pcs,
.mac_prepare = mvpp2_mac_prepare,
@@ -6679,6 +6752,8 @@ static const struct phylink_mac_ops mvpp2_phylink_ops = {
.mac_finish = mvpp2_mac_finish,
.mac_link_up = mvpp2_mac_link_up,
.mac_link_down = mvpp2_mac_link_down,
+ .mac_enable_tx_lpi = mvpp2_mac_enable_tx_lpi,
+ .mac_disable_tx_lpi = mvpp2_mac_disable_tx_lpi,
};
/* Work-around for ACPI */
@@ -6957,6 +7032,15 @@ static int mvpp2_port_probe(struct platform_device *pdev,
port->phylink_config.mac_capabilities =
MAC_2500FD | MAC_1000FD | MAC_100 | MAC_10;
+ __set_bit(PHY_INTERFACE_MODE_SGMII,
+ port->phylink_config.lpi_interfaces);
+
+ port->phylink_config.lpi_capabilities = MAC_1000FD | MAC_100FD;
+
+ /* Setup EEE. Choose 250us idle. */
+ port->phylink_config.lpi_timer_default = 250;
+ port->phylink_config.eee_enabled_default = true;
+
if (port->priv->global_tx_fc)
port->phylink_config.mac_capabilities |=
MAC_SYM_PAUSE | MAC_ASYM_PAUSE;
@@ -7031,6 +7115,8 @@ static int mvpp2_port_probe(struct platform_device *pdev,
goto err_free_port_pcpu;
}
port->phylink = phylink;
+
+ mvpp2_mac_disable_tx_lpi(&port->phylink_config);
} else {
dev_warn(&pdev->dev, "Use link irqs for port#%d. FW update required\n", port->id);
port->phylink = NULL;
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 7/9] net: lan743x: use netdev in lan743x_phylink_mac_link_down()
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (5 preceding siblings ...)
2025-01-15 20:42 ` [PATCH net-next 6/9] net: mvpp2: add " Russell King (Oracle)
@ 2025-01-15 20:42 ` Russell King (Oracle)
2025-01-15 20:43 ` [PATCH net-next 8/9] net: lan743x: convert to phylink managed EEE Russell King (Oracle)
` (4 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Use the netdev that we already have in lan743x_phylink_mac_link_down().
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/ethernet/microchip/lan743x_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 4dc5adcda6a3..8d7ad021ac70 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -3029,7 +3029,7 @@ static void lan743x_phylink_mac_link_down(struct phylink_config *config,
struct net_device *netdev = to_net_dev(config->dev);
struct lan743x_adapter *adapter = netdev_priv(netdev);
- netif_tx_stop_all_queues(to_net_dev(config->dev));
+ netif_tx_stop_all_queues(netdev);
lan743x_mac_eee_enable(adapter, false);
}
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 8/9] net: lan743x: convert to phylink managed EEE
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (6 preceding siblings ...)
2025-01-15 20:42 ` [PATCH net-next 7/9] net: lan743x: use netdev in lan743x_phylink_mac_link_down() Russell King (Oracle)
@ 2025-01-15 20:43 ` Russell King (Oracle)
2025-01-15 20:43 ` [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support Russell King (Oracle)
` (3 subsequent siblings)
11 siblings, 0 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:43 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Convert lan743x to phylink managed EEE:
- Set the lpi_capabilties.
- Move the call to lan743x_mac_eee_enable() into the enable/disable
tx_lpi functions.
- Ensure that EEEEN is clear during probe.
- Move the setting of the LPI timer into mac_enable_tx_lpi().
- Move reading of LPI timer to phylink initialisation to set the
default timer value.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
.../net/ethernet/microchip/lan743x_ethtool.c | 21 ---------
drivers/net/ethernet/microchip/lan743x_main.c | 44 ++++++++++++++++---
drivers/net/ethernet/microchip/lan743x_main.h | 1 -
3 files changed, 38 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_ethtool.c b/drivers/net/ethernet/microchip/lan743x_ethtool.c
index 1a1cbd034eda..1459acfb1e61 100644
--- a/drivers/net/ethernet/microchip/lan743x_ethtool.c
+++ b/drivers/net/ethernet/microchip/lan743x_ethtool.c
@@ -1055,9 +1055,6 @@ static int lan743x_ethtool_get_eee(struct net_device *netdev,
{
struct lan743x_adapter *adapter = netdev_priv(netdev);
- eee->tx_lpi_timer = lan743x_csr_read(adapter,
- MAC_EEE_TX_LPI_REQ_DLY_CNT);
-
return phylink_ethtool_get_eee(adapter->phylink, eee);
}
@@ -1065,24 +1062,6 @@ static int lan743x_ethtool_set_eee(struct net_device *netdev,
struct ethtool_keee *eee)
{
struct lan743x_adapter *adapter = netdev_priv(netdev);
- u32 tx_lpi_timer;
-
- tx_lpi_timer = lan743x_csr_read(adapter, MAC_EEE_TX_LPI_REQ_DLY_CNT);
- if (tx_lpi_timer != eee->tx_lpi_timer) {
- u32 mac_cr = lan743x_csr_read(adapter, MAC_CR);
-
- /* Software should only change this field when Energy Efficient
- * Ethernet Enable (EEEEN) is cleared.
- * This function will trigger an autonegotiation restart and
- * eee will be reenabled during link up if eee was negotiated.
- */
- lan743x_mac_eee_enable(adapter, false);
- lan743x_csr_write(adapter, MAC_EEE_TX_LPI_REQ_DLY_CNT,
- eee->tx_lpi_timer);
-
- if (mac_cr & MAC_CR_EEE_EN_)
- lan743x_mac_eee_enable(adapter, true);
- }
return phylink_ethtool_set_eee(adapter->phylink, eee);
}
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c
index 8d7ad021ac70..23760b613d3e 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.c
+++ b/drivers/net/ethernet/microchip/lan743x_main.c
@@ -2966,7 +2966,7 @@ static int lan743x_phylink_2500basex_config(struct lan743x_adapter *adapter)
return lan743x_pcs_power_reset(adapter);
}
-void lan743x_mac_eee_enable(struct lan743x_adapter *adapter, bool enable)
+static void lan743x_mac_eee_enable(struct lan743x_adapter *adapter, bool enable)
{
u32 mac_cr;
@@ -3027,10 +3027,8 @@ static void lan743x_phylink_mac_link_down(struct phylink_config *config,
phy_interface_t interface)
{
struct net_device *netdev = to_net_dev(config->dev);
- struct lan743x_adapter *adapter = netdev_priv(netdev);
netif_tx_stop_all_queues(netdev);
- lan743x_mac_eee_enable(adapter, false);
}
static void lan743x_phylink_mac_link_up(struct phylink_config *config,
@@ -3072,16 +3070,40 @@ static void lan743x_phylink_mac_link_up(struct phylink_config *config,
cap & FLOW_CTRL_TX,
cap & FLOW_CTRL_RX);
- if (phydev)
- lan743x_mac_eee_enable(adapter, phydev->enable_tx_lpi);
-
netif_tx_wake_all_queues(netdev);
}
+static void lan743x_mac_disable_tx_lpi(struct phylink_config *config)
+{
+ struct net_device *netdev = to_net_dev(config->dev);
+ struct lan743x_adapter *adapter = netdev_priv(netdev);
+
+ lan743x_mac_eee_enable(adapter, false);
+}
+
+static int lan743x_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop)
+{
+ struct net_device *netdev = to_net_dev(config->dev);
+ struct lan743x_adapter *adapter = netdev_priv(netdev);
+
+ /* Software should only change this field when Energy Efficient
+ * Ethernet Enable (EEEEN) is cleared. We ensure that by clearing
+ * EEEEN during probe, and phylink itself guarantees that
+ * mac_disable_tx_lpi() will have been previously called.
+ */
+ lan743x_csr_write(adapter, MAC_EEE_TX_LPI_REQ_DLY_CNT, timer);
+ lan743x_mac_eee_enable(adapter, true);
+
+ return 0;
+}
+
static const struct phylink_mac_ops lan743x_phylink_mac_ops = {
.mac_config = lan743x_phylink_mac_config,
.mac_link_down = lan743x_phylink_mac_link_down,
.mac_link_up = lan743x_phylink_mac_link_up,
+ .mac_disable_tx_lpi = lan743x_mac_disable_tx_lpi,
+ .mac_enable_tx_lpi = lan743x_mac_enable_tx_lpi,
};
static int lan743x_phylink_create(struct lan743x_adapter *adapter)
@@ -3095,6 +3117,9 @@ static int lan743x_phylink_create(struct lan743x_adapter *adapter)
adapter->phylink_config.mac_capabilities = MAC_ASYM_PAUSE |
MAC_SYM_PAUSE | MAC_10 | MAC_100 | MAC_1000FD;
+ adapter->phylink_config.lpi_capabilities = MAC_100FD | MAC_1000FD;
+ adapter->phylink_config.lpi_timer_default =
+ lan743x_csr_read(adapter, MAC_EEE_TX_LPI_REQ_DLY_CNT);
lan743x_phy_interface_select(adapter);
@@ -3120,6 +3145,10 @@ static int lan743x_phylink_create(struct lan743x_adapter *adapter)
phy_interface_set_rgmii(adapter->phylink_config.supported_interfaces);
}
+ memcpy(adapter->phylink_config.lpi_interfaces,
+ adapter->phylink_config.supported_interfaces,
+ sizeof(adapter->phylink_config.lpi_interfaces));
+
pl = phylink_create(&adapter->phylink_config, NULL,
adapter->phy_interface, &lan743x_phylink_mac_ops);
@@ -3517,6 +3546,9 @@ static int lan743x_hardware_init(struct lan743x_adapter *adapter,
spin_lock_init(&tx->ring_lock);
}
+ /* Ensure EEEEN is clear */
+ lan743x_mac_eee_enable(adapter, false);
+
return 0;
}
diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h
index 8ef897c114d3..7f73d66854be 100644
--- a/drivers/net/ethernet/microchip/lan743x_main.h
+++ b/drivers/net/ethernet/microchip/lan743x_main.h
@@ -1206,6 +1206,5 @@ void lan743x_hs_syslock_release(struct lan743x_adapter *adapter);
void lan743x_mac_flow_ctrl_set_enables(struct lan743x_adapter *adapter,
bool tx_enable, bool rx_enable);
int lan743x_sgmii_read(struct lan743x_adapter *adapter, u8 mmd, u16 addr);
-void lan743x_mac_eee_enable(struct lan743x_adapter *adapter, bool enable);
#endif /* _LAN743X_H */
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (7 preceding siblings ...)
2025-01-15 20:43 ` [PATCH net-next 8/9] net: lan743x: convert to phylink managed EEE Russell King (Oracle)
@ 2025-01-15 20:43 ` Russell King (Oracle)
2025-02-13 11:05 ` Jon Hunter
2025-01-16 0:40 ` [PATCH net-next 0/9] net: add " Jacob Keller
` (2 subsequent siblings)
11 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-15 20:43 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
Convert stmmac to use phylink managed EEE support rather than delving
into phylib:
1. Move the stmmac_eee_init() calls out of mac_link_down() and
mac_link_up() methods into the new mac_{enable,disable}_lpi()
methods. We leave the calls to stmmac_set_eee_pls() in place as
these change bits which tell the EEE hardware when the link came
up or down, and is used for a separate hardware timer. However,
symmetrically conditionalise this with priv->dma_cap.eee.
2. Update the current LPI timer each time LPI is enabled - which we
need for software-timed LPI.
3. With phylink managed EEE, phylink manages the receive clock stop
configuration via phylink_config.eee_rx_clk_stop_enable. Set this
appropriately which makes the call to phy_eee_rx_clock_stop()
redundant.
4. From what I can work out, all supported interfaces support LPI
signalling on stmmac (there's no restriction implemented.) It
also appears to support LPI at all full duplex speeds at or over
100M. Set these capabilities.
5. The default timer appears to be derived from a module parameter.
Set this the same, although we keep code that reconfigures the
timer in stmmac_init_phy().
6. Remove the direct call to phy_support_eee(), which phylink will do
on the drivers behalf if phylink_config.eee_enabled_default is set.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++++++++++----
1 file changed, 45 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index acd6994c1764..c5d293be8ab9 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -988,8 +988,8 @@ static void stmmac_mac_link_down(struct phylink_config *config,
struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
stmmac_mac_set(priv, priv->ioaddr, false);
- stmmac_eee_init(priv, false);
- stmmac_set_eee_pls(priv, priv->hw, false);
+ if (priv->dma_cap.eee)
+ stmmac_set_eee_pls(priv, priv->hw, false);
if (stmmac_fpe_supported(priv))
stmmac_fpe_link_state_handle(priv, false);
@@ -1096,13 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
stmmac_mac_set(priv, priv->ioaddr, true);
- if (phy && priv->dma_cap.eee) {
- phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
- STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
- priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;
- stmmac_eee_init(priv, phy->enable_tx_lpi);
+ if (priv->dma_cap.eee)
stmmac_set_eee_pls(priv, priv->hw, true);
- }
if (stmmac_fpe_supported(priv))
stmmac_fpe_link_state_handle(priv, true);
@@ -1111,12 +1106,32 @@ static void stmmac_mac_link_up(struct phylink_config *config,
stmmac_hwtstamp_correct_latency(priv, priv);
}
+static void stmmac_mac_disable_tx_lpi(struct phylink_config *config)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
+
+ stmmac_eee_init(priv, false);
+}
+
+static int stmmac_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
+ bool tx_clk_stop)
+{
+ struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
+
+ priv->tx_lpi_timer = timer;
+ stmmac_eee_init(priv, true);
+
+ return 0;
+}
+
static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
.mac_get_caps = stmmac_mac_get_caps,
.mac_select_pcs = stmmac_mac_select_pcs,
.mac_config = stmmac_mac_config,
.mac_link_down = stmmac_mac_link_down,
.mac_link_up = stmmac_mac_link_up,
+ .mac_disable_tx_lpi = stmmac_mac_disable_tx_lpi,
+ .mac_enable_tx_lpi = stmmac_mac_enable_tx_lpi,
};
/**
@@ -1189,9 +1204,6 @@ static int stmmac_init_phy(struct net_device *dev)
return -ENODEV;
}
- if (priv->dma_cap.eee)
- phy_support_eee(phydev);
-
ret = phylink_connect_phy(priv->phylink, phydev);
} else {
fwnode_handle_put(phy_fwnode);
@@ -1201,7 +1213,12 @@ static int stmmac_init_phy(struct net_device *dev)
if (ret == 0) {
struct ethtool_keee eee;
- /* Configure phylib's copy of the LPI timer */
+ /* Configure phylib's copy of the LPI timer. Normally,
+ * phylink_config.lpi_timer_default would do this, but there is
+ * a chance that userspace could change the eee_timer setting
+ * via sysfs before the first open. Thus, preserve existing
+ * behaviour.
+ */
if (!phylink_ethtool_get_eee(priv->phylink, &eee)) {
eee.tx_lpi_timer = priv->tx_lpi_timer;
phylink_ethtool_set_eee(priv->phylink, &eee);
@@ -1234,6 +1251,9 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
/* Stmmac always requires an RX clock for hardware initialization */
priv->phylink_config.mac_requires_rxc = true;
+ if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
+ priv->phylink_config.eee_rx_clk_stop_enable = true;
+
mdio_bus_data = priv->plat->mdio_bus_data;
if (mdio_bus_data)
priv->phylink_config.default_an_inband =
@@ -1255,6 +1275,19 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
priv->phylink_config.supported_interfaces,
pcs->supported_interfaces);
+ if (priv->dma_cap.eee) {
+ /* Assume all supported interfaces also support LPI */
+ memcpy(priv->phylink_config.lpi_interfaces,
+ priv->phylink_config.supported_interfaces,
+ sizeof(priv->phylink_config.lpi_interfaces));
+
+ /* All full duplex speeds above 100Mbps are supported */
+ priv->phylink_config.lpi_capabilities = ~(MAC_1000FD - 1) |
+ MAC_100FD;
+ priv->phylink_config.lpi_timer_default = eee_timer * 1000;
+ priv->phylink_config.eee_enabled_default = true;
+ }
+
fwnode = priv->plat->port_node;
if (!fwnode)
fwnode = dev_fwnode(priv->device);
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (8 preceding siblings ...)
2025-01-15 20:43 ` [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support Russell King (Oracle)
@ 2025-01-16 0:40 ` Jacob Keller
2025-01-17 1:40 ` patchwork-bot+netdevbpf
2025-01-17 8:56 ` Jiawen Wu
11 siblings, 0 replies; 41+ messages in thread
From: Jacob Keller @ 2025-01-16 0:40 UTC (permalink / raw)
To: Russell King (Oracle), Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver
On 1/15/2025 12:42 PM, Russell King (Oracle) wrote:
> Hi,
>
> Adding managed EEE support to phylink has been on the cards ever since
> the idea in phylib was mooted. This overly large series attempts to do
> so. I've included all the patches as it's important to get the driver
> patches out there.
>
> Patch 1 adds a definition for the clock stop capable bit in the PCS
> MMD status register.
>
> Patch 2 adds a phylib API to query whether the PHY allows the transmit
> xMII clock to be stopped while in LPI mode. This capability is for MAC
> drivers to save power when LPI is active, to allow them to stop their
> transmit clock.
>
> Patch 3 extracts a phylink internal helper for determining whether the
> link is up.
>
> Patch 4 adds basic phylink managed EEE support. Two new MAC APIs are
> added, to enable and disable LPI. The enable method is passed the LPI
> timer setting which it is expected to program into the hardware, and
> also a flag ehther the transmit clock should be stopped.
>
> I have taken the decision to make enable_tx_lpi() to return an error
> code, but not do much with it other than report it - the intention
> being that we can later use it to extend functionality if needed
> without reworking loads of drivers.
>
> I have also dropped the validation/limitation of the LPI timer, and
> left that in the driver code prior to calling phylink_ethtool_set_eee().
>
> The remainder of the patches convert mvneta, lan743x and stmmac, and
> add support for mvneta.
>
> Since yesterday's RFC:
> - fixed the mvpp2 GENMASK()
> - dropped the DSA patch
> - changed how phylink restricts EEE advertisement, and the EEE support
> reported to userspace which fixes a bug.
>
Everything in this series looks good to me.
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 6/9] net: mvpp2: add EEE implementation
2025-01-15 20:42 ` [PATCH net-next 6/9] net: mvpp2: add " Russell King (Oracle)
@ 2025-01-16 8:27 ` Maxime Chevallier
0 siblings, 0 replies; 41+ messages in thread
From: Maxime Chevallier @ 2025-01-16 8:27 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver
Hi Russell,
On Wed, 15 Jan 2025 20:42:52 +0000
"Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> wrote:
> Add EEE support for mvpp2, using phylink's EEE implementation, which
> means we just need to implement the two methods for LPI control, and
> with the initial configuration. Only SGMII mode is supported, so only
> 100M and 1G speeds.
>
> Disabling LPI requires clearing a single bit. Enabling LPI needs a full
> configuration of several values, as the timer values are dependent on
> the MAC operating speed.
>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Thank you,
Maxime
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (9 preceding siblings ...)
2025-01-16 0:40 ` [PATCH net-next 0/9] net: add " Jacob Keller
@ 2025-01-17 1:40 ` patchwork-bot+netdevbpf
2025-01-17 8:56 ` Jiawen Wu
11 siblings, 0 replies; 41+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-17 1:40 UTC (permalink / raw)
To: Russell King
Cc: andrew, hkallweit1, alexandre.torgue, andrew+netdev,
bryan.whitehead, davem, edumazet, kuba, linux-arm-kernel,
linux-stm32, marcin.s.wojtas, mcoquelin.stm32, netdev, pabeni,
UNGLinuxDriver
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Wed, 15 Jan 2025 20:42:28 +0000 you wrote:
> Hi,
>
> Adding managed EEE support to phylink has been on the cards ever since
> the idea in phylib was mooted. This overly large series attempts to do
> so. I've included all the patches as it's important to get the driver
> patches out there.
>
> [...]
Here is the summary with links:
- [net-next,1/9] net: mdio: add definition for clock stop capable bit
https://git.kernel.org/netdev/net-next/c/3ba0262a8fed
- [net-next,2/9] net: phy: add support for querying PHY clock stop capability
https://git.kernel.org/netdev/net-next/c/a00e0d34c036
- [net-next,3/9] net: phylink: add phylink_link_is_up() helper
https://git.kernel.org/netdev/net-next/c/a17ceec62f81
- [net-next,4/9] net: phylink: add EEE management
https://git.kernel.org/netdev/net-next/c/03abf2a7c654
- [net-next,5/9] net: mvneta: convert to phylink EEE implementation
https://git.kernel.org/netdev/net-next/c/ac79927dc84f
- [net-next,6/9] net: mvpp2: add EEE implementation
https://git.kernel.org/netdev/net-next/c/b53b14786ed8
- [net-next,7/9] net: lan743x: use netdev in lan743x_phylink_mac_link_down()
https://git.kernel.org/netdev/net-next/c/a66447966f03
- [net-next,8/9] net: lan743x: convert to phylink managed EEE
https://git.kernel.org/netdev/net-next/c/bd691d5ca918
- [net-next,9/9] net: stmmac: convert to phylink managed EEE support
https://git.kernel.org/netdev/net-next/c/4218647d4556
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
` (10 preceding siblings ...)
2025-01-17 1:40 ` patchwork-bot+netdevbpf
@ 2025-01-17 8:56 ` Jiawen Wu
2025-01-17 9:05 ` Russell King (Oracle)
11 siblings, 1 reply; 41+ messages in thread
From: Jiawen Wu @ 2025-01-17 8:56 UTC (permalink / raw)
To: 'Russell King (Oracle)', 'Andrew Lunn',
'Heiner Kallweit', mengyuanlou
Cc: 'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, linux-stm32, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
> Hi,
>
> Adding managed EEE support to phylink has been on the cards ever since
> the idea in phylib was mooted. This overly large series attempts to do
> so. I've included all the patches as it's important to get the driver
> patches out there.
>
> Patch 1 adds a definition for the clock stop capable bit in the PCS
> MMD status register.
>
> Patch 2 adds a phylib API to query whether the PHY allows the transmit
> xMII clock to be stopped while in LPI mode. This capability is for MAC
> drivers to save power when LPI is active, to allow them to stop their
> transmit clock.
>
> Patch 3 extracts a phylink internal helper for determining whether the
> link is up.
>
> Patch 4 adds basic phylink managed EEE support. Two new MAC APIs are
> added, to enable and disable LPI. The enable method is passed the LPI
> timer setting which it is expected to program into the hardware, and
> also a flag ehther the transmit clock should be stopped.
>
> I have taken the decision to make enable_tx_lpi() to return an error
> code, but not do much with it other than report it - the intention
> being that we can later use it to extend functionality if needed
> without reworking loads of drivers.
>
> I have also dropped the validation/limitation of the LPI timer, and
> left that in the driver code prior to calling phylink_ethtool_set_eee().
>
> The remainder of the patches convert mvneta, lan743x and stmmac, and
> add support for mvneta.
>
> Since yesterday's RFC:
> - fixed the mvpp2 GENMASK()
> - dropped the DSA patch
> - changed how phylink restricts EEE advertisement, and the EEE support
> reported to userspace which fixes a bug.
>
> drivers/net/ethernet/marvell/mvneta.c | 107 ++++++++++------
> drivers/net/ethernet/marvell/mvpp2/mvpp2.h | 5 +
> drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 86 +++++++++++++
> drivers/net/ethernet/microchip/lan743x_ethtool.c | 21 ---
> drivers/net/ethernet/microchip/lan743x_main.c | 46 ++++++-
> drivers/net/ethernet/microchip/lan743x_main.h | 1 -
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++--
> drivers/net/phy/phy.c | 20 +++
> drivers/net/phy/phylink.c | 149 ++++++++++++++++++++--
> include/linux/phy.h | 1 +
> include/linux/phylink.h | 45 +++++++
> include/uapi/linux/mdio.h | 1 +
> 12 files changed, 446 insertions(+), 93 deletions(-)
Hi Russell,
Since merging these patches, phylink_connect_phy() can no longer be
invoked correctly in ngbe_open(). The error is returned from the function
phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
How should I modify the ngbe driver to meet this change?
Thanks.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-17 8:56 ` Jiawen Wu
@ 2025-01-17 9:05 ` Russell King (Oracle)
2025-01-17 10:17 ` Jiawen Wu
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-17 9:05 UTC (permalink / raw)
To: Jiawen Wu
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, linux-stm32, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
On Fri, Jan 17, 2025 at 04:56:34PM +0800, Jiawen Wu wrote:
> > Hi,
> >
> > Adding managed EEE support to phylink has been on the cards ever since
> > the idea in phylib was mooted. This overly large series attempts to do
> > so. I've included all the patches as it's important to get the driver
> > patches out there.
> >
> > Patch 1 adds a definition for the clock stop capable bit in the PCS
> > MMD status register.
> >
> > Patch 2 adds a phylib API to query whether the PHY allows the transmit
> > xMII clock to be stopped while in LPI mode. This capability is for MAC
> > drivers to save power when LPI is active, to allow them to stop their
> > transmit clock.
> >
> > Patch 3 extracts a phylink internal helper for determining whether the
> > link is up.
> >
> > Patch 4 adds basic phylink managed EEE support. Two new MAC APIs are
> > added, to enable and disable LPI. The enable method is passed the LPI
> > timer setting which it is expected to program into the hardware, and
> > also a flag ehther the transmit clock should be stopped.
> >
> > I have taken the decision to make enable_tx_lpi() to return an error
> > code, but not do much with it other than report it - the intention
> > being that we can later use it to extend functionality if needed
> > without reworking loads of drivers.
> >
> > I have also dropped the validation/limitation of the LPI timer, and
> > left that in the driver code prior to calling phylink_ethtool_set_eee().
> >
> > The remainder of the patches convert mvneta, lan743x and stmmac, and
> > add support for mvneta.
> >
> > Since yesterday's RFC:
> > - fixed the mvpp2 GENMASK()
> > - dropped the DSA patch
> > - changed how phylink restricts EEE advertisement, and the EEE support
> > reported to userspace which fixes a bug.
> >
> > drivers/net/ethernet/marvell/mvneta.c | 107 ++++++++++------
> > drivers/net/ethernet/marvell/mvpp2/mvpp2.h | 5 +
> > drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 86 +++++++++++++
> > drivers/net/ethernet/microchip/lan743x_ethtool.c | 21 ---
> > drivers/net/ethernet/microchip/lan743x_main.c | 46 ++++++-
> > drivers/net/ethernet/microchip/lan743x_main.h | 1 -
> > drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++--
> > drivers/net/phy/phy.c | 20 +++
> > drivers/net/phy/phylink.c | 149 ++++++++++++++++++++--
> > include/linux/phy.h | 1 +
> > include/linux/phylink.h | 45 +++++++
> > include/uapi/linux/mdio.h | 1 +
> > 12 files changed, 446 insertions(+), 93 deletions(-)
>
> Since merging these patches, phylink_connect_phy() can no longer be
> invoked correctly in ngbe_open(). The error is returned from the function
> phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
That would mean phy_modify_mmd() is failing, but the question is why
that is. Please investigate. Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-17 9:05 ` Russell King (Oracle)
@ 2025-01-17 10:17 ` Jiawen Wu
2025-01-17 12:23 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jiawen Wu @ 2025-01-17 10:17 UTC (permalink / raw)
To: 'Russell King (Oracle)'
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
> > Since merging these patches, phylink_connect_phy() can no longer be
> > invoked correctly in ngbe_open(). The error is returned from the function
> > phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
>
> That would mean phy_modify_mmd() is failing, but the question is why
> that is. Please investigate. Thanks.
Yes, phy_modify_mmd() returns -EOPNOTSUPP. Since .read/write_mmd are
implemented in the PHY driver, but it's not supported to read/write the
register field (devnum=MDIO_MMD_PCS, regnum= MDIO_CTRL1).
So the error occurs on __phy_read_mmd():
if (phydev->drv && phydev->drv->read_mmd)
return phydev->drv->read_mmd(phydev, devad, regnum);
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-17 10:17 ` Jiawen Wu
@ 2025-01-17 12:23 ` Russell King (Oracle)
2025-01-20 1:51 ` Jiawen Wu
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-17 12:23 UTC (permalink / raw)
To: Jiawen Wu
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
On Fri, Jan 17, 2025 at 06:17:05PM +0800, Jiawen Wu wrote:
> > > Since merging these patches, phylink_connect_phy() can no longer be
> > > invoked correctly in ngbe_open(). The error is returned from the function
> > > phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
> >
> > That would mean phy_modify_mmd() is failing, but the question is why
> > that is. Please investigate. Thanks.
>
> Yes, phy_modify_mmd() returns -EOPNOTSUPP. Since .read/write_mmd are
> implemented in the PHY driver, but it's not supported to read/write the
> register field (devnum=MDIO_MMD_PCS, regnum= MDIO_CTRL1).
>
> So the error occurs on __phy_read_mmd():
> if (phydev->drv && phydev->drv->read_mmd)
> return phydev->drv->read_mmd(phydev, devad, regnum);
Thanks. The patch below should fix it. Please test, meanwhile I'll
prepare a proper patch.
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 66eea3f963d3..56d411bb2547 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2268,7 +2268,11 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
/* Explicitly configure whether the PHY is allowed to stop it's
* receive clock.
*/
- return phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
+ ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
+ if (ret == -EOPNOTSUPP)
+ ret = 0;
+
+ return ret;
}
static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy,
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply related [flat|nested] 41+ messages in thread
* RE: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-17 12:23 ` Russell King (Oracle)
@ 2025-01-20 1:51 ` Jiawen Wu
2025-01-20 9:54 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jiawen Wu @ 2025-01-20 1:51 UTC (permalink / raw)
To: 'Russell King (Oracle)'
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
On Fri, Jan 17, 2025 8:24 PM, Russell King (Oracle) wrote:
> On Fri, Jan 17, 2025 at 06:17:05PM +0800, Jiawen Wu wrote:
> > > > Since merging these patches, phylink_connect_phy() can no longer be
> > > > invoked correctly in ngbe_open(). The error is returned from the function
> > > > phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
> > >
> > > That would mean phy_modify_mmd() is failing, but the question is why
> > > that is. Please investigate. Thanks.
> >
> > Yes, phy_modify_mmd() returns -EOPNOTSUPP. Since .read/write_mmd are
> > implemented in the PHY driver, but it's not supported to read/write the
> > register field (devnum=MDIO_MMD_PCS, regnum= MDIO_CTRL1).
> >
> > So the error occurs on __phy_read_mmd():
> > if (phydev->drv && phydev->drv->read_mmd)
> > return phydev->drv->read_mmd(phydev, devad, regnum);
>
> Thanks. The patch below should fix it. Please test, meanwhile I'll
> prepare a proper patch.
>
> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> index 66eea3f963d3..56d411bb2547 100644
> --- a/drivers/net/phy/phylink.c
> +++ b/drivers/net/phy/phylink.c
> @@ -2268,7 +2268,11 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
> /* Explicitly configure whether the PHY is allowed to stop it's
> * receive clock.
> */
> - return phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> + ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> + if (ret == -EOPNOTSUPP)
> + ret = 0;
> +
> + return ret;
> }
>
> static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy,
Test pass.
Thanks.
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-20 1:51 ` Jiawen Wu
@ 2025-01-20 9:54 ` Russell King (Oracle)
2025-01-20 9:59 ` Jiawen Wu
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-01-20 9:54 UTC (permalink / raw)
To: Jiawen Wu
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
On Mon, Jan 20, 2025 at 09:51:29AM +0800, Jiawen Wu wrote:
> On Fri, Jan 17, 2025 8:24 PM, Russell King (Oracle) wrote:
> > On Fri, Jan 17, 2025 at 06:17:05PM +0800, Jiawen Wu wrote:
> > > > > Since merging these patches, phylink_connect_phy() can no longer be
> > > > > invoked correctly in ngbe_open(). The error is returned from the function
> > > > > phy_eee_rx_clock_stop(). Since EEE is not supported on our NGBE hardware.
> > > >
> > > > That would mean phy_modify_mmd() is failing, but the question is why
> > > > that is. Please investigate. Thanks.
> > >
> > > Yes, phy_modify_mmd() returns -EOPNOTSUPP. Since .read/write_mmd are
> > > implemented in the PHY driver, but it's not supported to read/write the
> > > register field (devnum=MDIO_MMD_PCS, regnum= MDIO_CTRL1).
> > >
> > > So the error occurs on __phy_read_mmd():
> > > if (phydev->drv && phydev->drv->read_mmd)
> > > return phydev->drv->read_mmd(phydev, devad, regnum);
> >
> > Thanks. The patch below should fix it. Please test, meanwhile I'll
> > prepare a proper patch.
> >
> > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> > index 66eea3f963d3..56d411bb2547 100644
> > --- a/drivers/net/phy/phylink.c
> > +++ b/drivers/net/phy/phylink.c
> > @@ -2268,7 +2268,11 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
> > /* Explicitly configure whether the PHY is allowed to stop it's
> > * receive clock.
> > */
> > - return phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> > + ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> > + if (ret == -EOPNOTSUPP)
> > + ret = 0;
> > +
> > + return ret;
> > }
> >
> > static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy,
>
> Test pass.
> Thanks.
Thanks, I guess that's a tested-by then?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* RE: [PATCH net-next 0/9] net: add phylink managed EEE support
2025-01-20 9:54 ` Russell King (Oracle)
@ 2025-01-20 9:59 ` Jiawen Wu
0 siblings, 0 replies; 41+ messages in thread
From: Jiawen Wu @ 2025-01-20 9:59 UTC (permalink / raw)
To: 'Russell King (Oracle)'
Cc: 'Andrew Lunn', 'Heiner Kallweit', mengyuanlou,
'Alexandre Torgue', 'Andrew Lunn',
'Bryan Whitehead', 'David S. Miller',
'Eric Dumazet', 'Jakub Kicinski',
linux-arm-kernel, 'Marcin Wojtas',
'Maxime Coquelin', netdev, 'Paolo Abeni',
UNGLinuxDriver
> > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
> > > index 66eea3f963d3..56d411bb2547 100644
> > > --- a/drivers/net/phy/phylink.c
> > > +++ b/drivers/net/phy/phylink.c
> > > @@ -2268,7 +2268,11 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
> > > /* Explicitly configure whether the PHY is allowed to stop it's
> > > * receive clock.
> > > */
> > > - return phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> > > + ret = phy_eee_rx_clock_stop(phy, pl->config->eee_rx_clk_stop_enable);
> > > + if (ret == -EOPNOTSUPP)
> > > + ret = 0;
> > > +
> > > + return ret;
> > > }
> > >
> > > static int phylink_attach_phy(struct phylink *pl, struct phy_device *phy,
> >
> > Test pass.
> > Thanks.
>
> Thanks, I guess that's a tested-by then?
Yes, for this patch,
Tested-by: Jiawen Wu <jiawenwu@trustnetic.com>
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-01-15 20:43 ` [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support Russell King (Oracle)
@ 2025-02-13 11:05 ` Jon Hunter
2025-02-13 11:37 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-13 11:05 UTC (permalink / raw)
To: Russell King (Oracle), Andrew Lunn, Heiner Kallweit
Cc: Alexandre Torgue, Andrew Lunn, Bryan Whitehead, David S. Miller,
Eric Dumazet, Jakub Kicinski, linux-arm-kernel, linux-stm32,
Marcin Wojtas, Maxime Coquelin, netdev, Paolo Abeni,
UNGLinuxDriver, linux-tegra@vger.kernel.org
Hi Russell,
On 15/01/2025 20:43, Russell King (Oracle) wrote:
> Convert stmmac to use phylink managed EEE support rather than delving
> into phylib:
>
> 1. Move the stmmac_eee_init() calls out of mac_link_down() and
> mac_link_up() methods into the new mac_{enable,disable}_lpi()
> methods. We leave the calls to stmmac_set_eee_pls() in place as
> these change bits which tell the EEE hardware when the link came
> up or down, and is used for a separate hardware timer. However,
> symmetrically conditionalise this with priv->dma_cap.eee.
>
> 2. Update the current LPI timer each time LPI is enabled - which we
> need for software-timed LPI.
>
> 3. With phylink managed EEE, phylink manages the receive clock stop
> configuration via phylink_config.eee_rx_clk_stop_enable. Set this
> appropriately which makes the call to phy_eee_rx_clock_stop()
> redundant.
>
> 4. From what I can work out, all supported interfaces support LPI
> signalling on stmmac (there's no restriction implemented.) It
> also appears to support LPI at all full duplex speeds at or over
> 100M. Set these capabilities.
>
> 5. The default timer appears to be derived from a module parameter.
> Set this the same, although we keep code that reconfigures the
> timer in stmmac_init_phy().
>
> 6. Remove the direct call to phy_support_eee(), which phylink will do
> on the drivers behalf if phylink_config.eee_enabled_default is set.
>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
> .../net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++++++++++----
> 1 file changed, 45 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index acd6994c1764..c5d293be8ab9 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -988,8 +988,8 @@ static void stmmac_mac_link_down(struct phylink_config *config,
> struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
>
> stmmac_mac_set(priv, priv->ioaddr, false);
> - stmmac_eee_init(priv, false);
> - stmmac_set_eee_pls(priv, priv->hw, false);
> + if (priv->dma_cap.eee)
> + stmmac_set_eee_pls(priv, priv->hw, false);
>
> if (stmmac_fpe_supported(priv))
> stmmac_fpe_link_state_handle(priv, false);
> @@ -1096,13 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
>
> stmmac_mac_set(priv, priv->ioaddr, true);
> - if (phy && priv->dma_cap.eee) {
> - phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
> - STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
> - priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;
> - stmmac_eee_init(priv, phy->enable_tx_lpi);
> + if (priv->dma_cap.eee)
> stmmac_set_eee_pls(priv, priv->hw, true);
> - }
>
> if (stmmac_fpe_supported(priv))
> stmmac_fpe_link_state_handle(priv, true);
> @@ -1111,12 +1106,32 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> stmmac_hwtstamp_correct_latency(priv, priv);
> }
>
> +static void stmmac_mac_disable_tx_lpi(struct phylink_config *config)
> +{
> + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> +
> + stmmac_eee_init(priv, false);
> +}
> +
> +static int stmmac_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
> + bool tx_clk_stop)
> +{
> + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> +
> + priv->tx_lpi_timer = timer;
> + stmmac_eee_init(priv, true);
> +
> + return 0;
> +}
> +
> static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
> .mac_get_caps = stmmac_mac_get_caps,
> .mac_select_pcs = stmmac_mac_select_pcs,
> .mac_config = stmmac_mac_config,
> .mac_link_down = stmmac_mac_link_down,
> .mac_link_up = stmmac_mac_link_up,
> + .mac_disable_tx_lpi = stmmac_mac_disable_tx_lpi,
> + .mac_enable_tx_lpi = stmmac_mac_enable_tx_lpi,
> };
>
> /**
> @@ -1189,9 +1204,6 @@ static int stmmac_init_phy(struct net_device *dev)
> return -ENODEV;
> }
>
> - if (priv->dma_cap.eee)
> - phy_support_eee(phydev);
> -
> ret = phylink_connect_phy(priv->phylink, phydev);
> } else {
> fwnode_handle_put(phy_fwnode);
> @@ -1201,7 +1213,12 @@ static int stmmac_init_phy(struct net_device *dev)
> if (ret == 0) {
> struct ethtool_keee eee;
>
> - /* Configure phylib's copy of the LPI timer */
> + /* Configure phylib's copy of the LPI timer. Normally,
> + * phylink_config.lpi_timer_default would do this, but there is
> + * a chance that userspace could change the eee_timer setting
> + * via sysfs before the first open. Thus, preserve existing
> + * behaviour.
> + */
> if (!phylink_ethtool_get_eee(priv->phylink, &eee)) {
> eee.tx_lpi_timer = priv->tx_lpi_timer;
> phylink_ethtool_set_eee(priv->phylink, &eee);
> @@ -1234,6 +1251,9 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> /* Stmmac always requires an RX clock for hardware initialization */
> priv->phylink_config.mac_requires_rxc = true;
>
> + if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
> + priv->phylink_config.eee_rx_clk_stop_enable = true;
> +
> mdio_bus_data = priv->plat->mdio_bus_data;
> if (mdio_bus_data)
> priv->phylink_config.default_an_inband =
> @@ -1255,6 +1275,19 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> priv->phylink_config.supported_interfaces,
> pcs->supported_interfaces);
>
> + if (priv->dma_cap.eee) {
> + /* Assume all supported interfaces also support LPI */
> + memcpy(priv->phylink_config.lpi_interfaces,
> + priv->phylink_config.supported_interfaces,
> + sizeof(priv->phylink_config.lpi_interfaces));
> +
> + /* All full duplex speeds above 100Mbps are supported */
> + priv->phylink_config.lpi_capabilities = ~(MAC_1000FD - 1) |
> + MAC_100FD;
> + priv->phylink_config.lpi_timer_default = eee_timer * 1000;
> + priv->phylink_config.eee_enabled_default = true;
> + }
> +
> fwnode = priv->plat->port_node;
> if (!fwnode)
> fwnode = dev_fwnode(priv->device);
I have been tracking down a suspend regression on Tegra186 and bisect is
pointing to this change. If I revert this on top of v6.14-rc2 then
suspend is working again. This is observed on the Jetson TX2 board
(specifically tegra186-p2771-0000.dts).
This device is using NFS for testing. So it appears that for this board
networking does not restart and the board hangs. Looking at the logs I
do see this on resume ...
[ 64.129079] dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
[ 64.133125] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
My first thought was if 'dma_cap.eee' is not supported for this device,
but from what I can see it is and 'dma_cap.eee' is true. Here are some
more details on this device regarding the ethernet controller.
[ 4.221837] dwc-eth-dwmac 2490000.ethernet: Adding to iommu group 3
[ 4.239289] dwc-eth-dwmac 2490000.ethernet: User ID: 0x10, Synopsys ID: 0x41
[ 4.244020] dwc-eth-dwmac 2490000.ethernet: DWMAC4/5
[ 4.249042] dwc-eth-dwmac 2490000.ethernet: DMA HW capability register supported
[ 4.256406] dwc-eth-dwmac 2490000.ethernet: RX Checksum Offload Engine supported
[ 4.263768] dwc-eth-dwmac 2490000.ethernet: TX Checksum insertion supported
[ 4.270700] dwc-eth-dwmac 2490000.ethernet: Wake-Up On Lan supported
[ 4.277063] dwc-eth-dwmac 2490000.ethernet: TSO supported
[ 4.282401] dwc-eth-dwmac 2490000.ethernet: Enable RX Mitigation via HW Watchdog Timer
[ 4.290293] dwc-eth-dwmac 2490000.ethernet: Enabled L3L4 Flow TC (entries=8)
[ 4.297309] dwc-eth-dwmac 2490000.ethernet: Enabled RFS Flow TC (entries=10)
[ 4.304327] dwc-eth-dwmac 2490000.ethernet: TSO feature enabled
[ 4.310220] dwc-eth-dwmac 2490000.ethernet: Using 40/40 bits DMA host/device width
Let me know if you have any thoughts.
Thanks!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-13 11:05 ` Jon Hunter
@ 2025-02-13 11:37 ` Russell King (Oracle)
2025-02-13 12:00 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-13 11:37 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Thu, Feb 13, 2025 at 11:05:01AM +0000, Jon Hunter wrote:
> Hi Russell,
>
> On 15/01/2025 20:43, Russell King (Oracle) wrote:
> > Convert stmmac to use phylink managed EEE support rather than delving
> > into phylib:
> >
> > 1. Move the stmmac_eee_init() calls out of mac_link_down() and
> > mac_link_up() methods into the new mac_{enable,disable}_lpi()
> > methods. We leave the calls to stmmac_set_eee_pls() in place as
> > these change bits which tell the EEE hardware when the link came
> > up or down, and is used for a separate hardware timer. However,
> > symmetrically conditionalise this with priv->dma_cap.eee.
> >
> > 2. Update the current LPI timer each time LPI is enabled - which we
> > need for software-timed LPI.
> >
> > 3. With phylink managed EEE, phylink manages the receive clock stop
> > configuration via phylink_config.eee_rx_clk_stop_enable. Set this
> > appropriately which makes the call to phy_eee_rx_clock_stop()
> > redundant.
> >
> > 4. From what I can work out, all supported interfaces support LPI
> > signalling on stmmac (there's no restriction implemented.) It
> > also appears to support LPI at all full duplex speeds at or over
> > 100M. Set these capabilities.
> >
> > 5. The default timer appears to be derived from a module parameter.
> > Set this the same, although we keep code that reconfigures the
> > timer in stmmac_init_phy().
> >
> > 6. Remove the direct call to phy_support_eee(), which phylink will do
> > on the drivers behalf if phylink_config.eee_enabled_default is set.
> >
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> > .../net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++++++++++----
> > 1 file changed, 45 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index acd6994c1764..c5d293be8ab9 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -988,8 +988,8 @@ static void stmmac_mac_link_down(struct phylink_config *config,
> > struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > stmmac_mac_set(priv, priv->ioaddr, false);
> > - stmmac_eee_init(priv, false);
> > - stmmac_set_eee_pls(priv, priv->hw, false);
> > + if (priv->dma_cap.eee)
> > + stmmac_set_eee_pls(priv, priv->hw, false);
> > if (stmmac_fpe_supported(priv))
> > stmmac_fpe_link_state_handle(priv, false);
> > @@ -1096,13 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> > writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> > stmmac_mac_set(priv, priv->ioaddr, true);
> > - if (phy && priv->dma_cap.eee) {
> > - phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
> > - STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
> > - priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;
> > - stmmac_eee_init(priv, phy->enable_tx_lpi);
> > + if (priv->dma_cap.eee)
> > stmmac_set_eee_pls(priv, priv->hw, true);
> > - }
> > if (stmmac_fpe_supported(priv))
> > stmmac_fpe_link_state_handle(priv, true);
> > @@ -1111,12 +1106,32 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> > stmmac_hwtstamp_correct_latency(priv, priv);
> > }
> > +static void stmmac_mac_disable_tx_lpi(struct phylink_config *config)
> > +{
> > + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > +
> > + stmmac_eee_init(priv, false);
> > +}
> > +
> > +static int stmmac_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
> > + bool tx_clk_stop)
> > +{
> > + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > +
> > + priv->tx_lpi_timer = timer;
> > + stmmac_eee_init(priv, true);
> > +
> > + return 0;
> > +}
> > +
> > static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
> > .mac_get_caps = stmmac_mac_get_caps,
> > .mac_select_pcs = stmmac_mac_select_pcs,
> > .mac_config = stmmac_mac_config,
> > .mac_link_down = stmmac_mac_link_down,
> > .mac_link_up = stmmac_mac_link_up,
> > + .mac_disable_tx_lpi = stmmac_mac_disable_tx_lpi,
> > + .mac_enable_tx_lpi = stmmac_mac_enable_tx_lpi,
> > };
> > /**
> > @@ -1189,9 +1204,6 @@ static int stmmac_init_phy(struct net_device *dev)
> > return -ENODEV;
> > }
> > - if (priv->dma_cap.eee)
> > - phy_support_eee(phydev);
> > -
> > ret = phylink_connect_phy(priv->phylink, phydev);
> > } else {
> > fwnode_handle_put(phy_fwnode);
> > @@ -1201,7 +1213,12 @@ static int stmmac_init_phy(struct net_device *dev)
> > if (ret == 0) {
> > struct ethtool_keee eee;
> > - /* Configure phylib's copy of the LPI timer */
> > + /* Configure phylib's copy of the LPI timer. Normally,
> > + * phylink_config.lpi_timer_default would do this, but there is
> > + * a chance that userspace could change the eee_timer setting
> > + * via sysfs before the first open. Thus, preserve existing
> > + * behaviour.
> > + */
> > if (!phylink_ethtool_get_eee(priv->phylink, &eee)) {
> > eee.tx_lpi_timer = priv->tx_lpi_timer;
> > phylink_ethtool_set_eee(priv->phylink, &eee);
> > @@ -1234,6 +1251,9 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> > /* Stmmac always requires an RX clock for hardware initialization */
> > priv->phylink_config.mac_requires_rxc = true;
> > + if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
> > + priv->phylink_config.eee_rx_clk_stop_enable = true;
> > +
> > mdio_bus_data = priv->plat->mdio_bus_data;
> > if (mdio_bus_data)
> > priv->phylink_config.default_an_inband =
> > @@ -1255,6 +1275,19 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> > priv->phylink_config.supported_interfaces,
> > pcs->supported_interfaces);
> > + if (priv->dma_cap.eee) {
> > + /* Assume all supported interfaces also support LPI */
> > + memcpy(priv->phylink_config.lpi_interfaces,
> > + priv->phylink_config.supported_interfaces,
> > + sizeof(priv->phylink_config.lpi_interfaces));
> > +
> > + /* All full duplex speeds above 100Mbps are supported */
> > + priv->phylink_config.lpi_capabilities = ~(MAC_1000FD - 1) |
> > + MAC_100FD;
> > + priv->phylink_config.lpi_timer_default = eee_timer * 1000;
> > + priv->phylink_config.eee_enabled_default = true;
> > + }
> > +
> > fwnode = priv->plat->port_node;
> > if (!fwnode)
> > fwnode = dev_fwnode(priv->device);
>
>
> I have been tracking down a suspend regression on Tegra186 and bisect is
> pointing to this change. If I revert this on top of v6.14-rc2 then
> suspend is working again. This is observed on the Jetson TX2 board
> (specifically tegra186-p2771-0000.dts).
Thanks for the report.
> This device is using NFS for testing. So it appears that for this board
> networking does not restart and the board hangs. Looking at the logs I
> do see this on resume ...
>
> [ 64.129079] dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
> [ 64.133125] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
>
> My first thought was if 'dma_cap.eee' is not supported for this device,
> but from what I can see it is and 'dma_cap.eee' is true. Here are some
> more details on this device regarding the ethernet controller.
Could you see whether disabling EEE through ethtool (maybe first try
turning tx-lpi off before using the "eee off") to see whether that
makes any difference please?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-13 11:37 ` Russell King (Oracle)
@ 2025-02-13 12:00 ` Russell King (Oracle)
2025-02-14 10:58 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-13 12:00 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Thu, Feb 13, 2025 at 11:37:17AM +0000, Russell King (Oracle) wrote:
> On Thu, Feb 13, 2025 at 11:05:01AM +0000, Jon Hunter wrote:
> > Hi Russell,
> >
> > On 15/01/2025 20:43, Russell King (Oracle) wrote:
> > > Convert stmmac to use phylink managed EEE support rather than delving
> > > into phylib:
> > >
> > > 1. Move the stmmac_eee_init() calls out of mac_link_down() and
> > > mac_link_up() methods into the new mac_{enable,disable}_lpi()
> > > methods. We leave the calls to stmmac_set_eee_pls() in place as
> > > these change bits which tell the EEE hardware when the link came
> > > up or down, and is used for a separate hardware timer. However,
> > > symmetrically conditionalise this with priv->dma_cap.eee.
> > >
> > > 2. Update the current LPI timer each time LPI is enabled - which we
> > > need for software-timed LPI.
> > >
> > > 3. With phylink managed EEE, phylink manages the receive clock stop
> > > configuration via phylink_config.eee_rx_clk_stop_enable. Set this
> > > appropriately which makes the call to phy_eee_rx_clock_stop()
> > > redundant.
> > >
> > > 4. From what I can work out, all supported interfaces support LPI
> > > signalling on stmmac (there's no restriction implemented.) It
> > > also appears to support LPI at all full duplex speeds at or over
> > > 100M. Set these capabilities.
> > >
> > > 5. The default timer appears to be derived from a module parameter.
> > > Set this the same, although we keep code that reconfigures the
> > > timer in stmmac_init_phy().
> > >
> > > 6. Remove the direct call to phy_support_eee(), which phylink will do
> > > on the drivers behalf if phylink_config.eee_enabled_default is set.
> > >
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > ---
> > > .../net/ethernet/stmicro/stmmac/stmmac_main.c | 57 +++++++++++++++----
> > > 1 file changed, 45 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > index acd6994c1764..c5d293be8ab9 100644
> > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > @@ -988,8 +988,8 @@ static void stmmac_mac_link_down(struct phylink_config *config,
> > > struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > > stmmac_mac_set(priv, priv->ioaddr, false);
> > > - stmmac_eee_init(priv, false);
> > > - stmmac_set_eee_pls(priv, priv->hw, false);
> > > + if (priv->dma_cap.eee)
> > > + stmmac_set_eee_pls(priv, priv->hw, false);
> > > if (stmmac_fpe_supported(priv))
> > > stmmac_fpe_link_state_handle(priv, false);
> > > @@ -1096,13 +1096,8 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> > > writel(ctrl, priv->ioaddr + MAC_CTRL_REG);
> > > stmmac_mac_set(priv, priv->ioaddr, true);
> > > - if (phy && priv->dma_cap.eee) {
> > > - phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
> > > - STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
> > > - priv->tx_lpi_timer = phy->eee_cfg.tx_lpi_timer;
> > > - stmmac_eee_init(priv, phy->enable_tx_lpi);
> > > + if (priv->dma_cap.eee)
> > > stmmac_set_eee_pls(priv, priv->hw, true);
> > > - }
> > > if (stmmac_fpe_supported(priv))
> > > stmmac_fpe_link_state_handle(priv, true);
> > > @@ -1111,12 +1106,32 @@ static void stmmac_mac_link_up(struct phylink_config *config,
> > > stmmac_hwtstamp_correct_latency(priv, priv);
> > > }
> > > +static void stmmac_mac_disable_tx_lpi(struct phylink_config *config)
> > > +{
> > > + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > > +
> > > + stmmac_eee_init(priv, false);
> > > +}
> > > +
> > > +static int stmmac_mac_enable_tx_lpi(struct phylink_config *config, u32 timer,
> > > + bool tx_clk_stop)
> > > +{
> > > + struct stmmac_priv *priv = netdev_priv(to_net_dev(config->dev));
> > > +
> > > + priv->tx_lpi_timer = timer;
> > > + stmmac_eee_init(priv, true);
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static const struct phylink_mac_ops stmmac_phylink_mac_ops = {
> > > .mac_get_caps = stmmac_mac_get_caps,
> > > .mac_select_pcs = stmmac_mac_select_pcs,
> > > .mac_config = stmmac_mac_config,
> > > .mac_link_down = stmmac_mac_link_down,
> > > .mac_link_up = stmmac_mac_link_up,
> > > + .mac_disable_tx_lpi = stmmac_mac_disable_tx_lpi,
> > > + .mac_enable_tx_lpi = stmmac_mac_enable_tx_lpi,
> > > };
> > > /**
> > > @@ -1189,9 +1204,6 @@ static int stmmac_init_phy(struct net_device *dev)
> > > return -ENODEV;
> > > }
> > > - if (priv->dma_cap.eee)
> > > - phy_support_eee(phydev);
> > > -
> > > ret = phylink_connect_phy(priv->phylink, phydev);
> > > } else {
> > > fwnode_handle_put(phy_fwnode);
> > > @@ -1201,7 +1213,12 @@ static int stmmac_init_phy(struct net_device *dev)
> > > if (ret == 0) {
> > > struct ethtool_keee eee;
> > > - /* Configure phylib's copy of the LPI timer */
> > > + /* Configure phylib's copy of the LPI timer. Normally,
> > > + * phylink_config.lpi_timer_default would do this, but there is
> > > + * a chance that userspace could change the eee_timer setting
> > > + * via sysfs before the first open. Thus, preserve existing
> > > + * behaviour.
> > > + */
> > > if (!phylink_ethtool_get_eee(priv->phylink, &eee)) {
> > > eee.tx_lpi_timer = priv->tx_lpi_timer;
> > > phylink_ethtool_set_eee(priv->phylink, &eee);
> > > @@ -1234,6 +1251,9 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> > > /* Stmmac always requires an RX clock for hardware initialization */
> > > priv->phylink_config.mac_requires_rxc = true;
> > > + if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
> > > + priv->phylink_config.eee_rx_clk_stop_enable = true;
> > > +
> > > mdio_bus_data = priv->plat->mdio_bus_data;
> > > if (mdio_bus_data)
> > > priv->phylink_config.default_an_inband =
> > > @@ -1255,6 +1275,19 @@ static int stmmac_phy_setup(struct stmmac_priv *priv)
> > > priv->phylink_config.supported_interfaces,
> > > pcs->supported_interfaces);
> > > + if (priv->dma_cap.eee) {
> > > + /* Assume all supported interfaces also support LPI */
> > > + memcpy(priv->phylink_config.lpi_interfaces,
> > > + priv->phylink_config.supported_interfaces,
> > > + sizeof(priv->phylink_config.lpi_interfaces));
> > > +
> > > + /* All full duplex speeds above 100Mbps are supported */
> > > + priv->phylink_config.lpi_capabilities = ~(MAC_1000FD - 1) |
> > > + MAC_100FD;
> > > + priv->phylink_config.lpi_timer_default = eee_timer * 1000;
> > > + priv->phylink_config.eee_enabled_default = true;
> > > + }
> > > +
> > > fwnode = priv->plat->port_node;
> > > if (!fwnode)
> > > fwnode = dev_fwnode(priv->device);
> >
> >
> > I have been tracking down a suspend regression on Tegra186 and bisect is
> > pointing to this change. If I revert this on top of v6.14-rc2 then
> > suspend is working again. This is observed on the Jetson TX2 board
> > (specifically tegra186-p2771-0000.dts).
>
> Thanks for the report.
>
> > This device is using NFS for testing. So it appears that for this board
> > networking does not restart and the board hangs. Looking at the logs I
> > do see this on resume ...
> >
> > [ 64.129079] dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
> > [ 64.133125] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
> >
> > My first thought was if 'dma_cap.eee' is not supported for this device,
> > but from what I can see it is and 'dma_cap.eee' is true. Here are some
> > more details on this device regarding the ethernet controller.
>
> Could you see whether disabling EEE through ethtool (maybe first try
> turning tx-lpi off before using the "eee off") to see whether that
> makes any difference please?
One thing that I'm wondering is - old code used to do:
- phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
- STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
The new code sets:
+ if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
+ priv->phylink_config.eee_rx_clk_stop_enable = true;
which does the same thing in phylink - phylink_bringup_phy() will call
phy_eee_rx_clock_stop() when the PHY is attahed. So this happens at a
different time.
We know that stmmac_reset() can fail when the PHY receive clock is
stopped - at least with some cores.
So, I'm wondering whether I've inadvertently fixed another bug in stmmac
which has uncovered a different bug - maybe the PHY clock must never be
stopped even in LPI - or maybe we need to have a way of temporarily
disabling the PHY's clock-stop ability during stmmac_reset().
In addition to what I asked previously, could you also intrument
phy_eee_rx_clock_stop() and test before/after this patch to see
(a) whether it gets called at all before this patch and (b) confirm
the enable/disable state before and after.
Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-13 12:00 ` Russell King (Oracle)
@ 2025-02-14 10:58 ` Jon Hunter
2025-02-14 11:21 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-14 10:58 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 13/02/2025 12:00, Russell King (Oracle) wrote:
...
>>> I have been tracking down a suspend regression on Tegra186 and bisect is
>>> pointing to this change. If I revert this on top of v6.14-rc2 then
>>> suspend is working again. This is observed on the Jetson TX2 board
>>> (specifically tegra186-p2771-0000.dts).
>>
>> Thanks for the report.
>>
>>> This device is using NFS for testing. So it appears that for this board
>>> networking does not restart and the board hangs. Looking at the logs I
>>> do see this on resume ...
>>>
>>> [ 64.129079] dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
>>> [ 64.133125] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine initialization failed
>>>
>>> My first thought was if 'dma_cap.eee' is not supported for this device,
>>> but from what I can see it is and 'dma_cap.eee' is true. Here are some
>>> more details on this device regarding the ethernet controller.
>>
>> Could you see whether disabling EEE through ethtool (maybe first try
>> turning tx-lpi off before using the "eee off") to see whether that
>> makes any difference please?
>
> One thing that I'm wondering is - old code used to do:
>
> - phy_eee_rx_clock_stop(phy, !(priv->plat->flags &
> - STMMAC_FLAG_RX_CLK_RUNS_IN_LPI));
>
> The new code sets:
>
> + if (!(priv->plat->flags & STMMAC_FLAG_RX_CLK_RUNS_IN_LPI))
> + priv->phylink_config.eee_rx_clk_stop_enable = true;
>
> which does the same thing in phylink - phylink_bringup_phy() will call
> phy_eee_rx_clock_stop() when the PHY is attahed. So this happens at a
> different time.
>
> We know that stmmac_reset() can fail when the PHY receive clock is
> stopped - at least with some cores.
>
> So, I'm wondering whether I've inadvertently fixed another bug in stmmac
> which has uncovered a different bug - maybe the PHY clock must never be
> stopped even in LPI - or maybe we need to have a way of temporarily
> disabling the PHY's clock-stop ability during stmmac_reset().
>
> In addition to what I asked previously, could you also intrument
> phy_eee_rx_clock_stop() and test before/after this patch to see
> (a) whether it gets called at all before this patch and (b) confirm
> the enable/disable state before and after.
Thanks for the feedback. So ...
1. I can confirm that suspend works if I disable EEE via ethtool
2. Prior to this change I do see phy_eee_rx_clock_stop being called
to enable the clock resuming from suspend, but after this change
it is not.
Prior to this change I see (note the prints around 389-392 are when
we resume from suspend) ...
[ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
[ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
[ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
After this change I see ...
[ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
[ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
So yes definitely related to the PHY clock.
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-14 10:58 ` Jon Hunter
@ 2025-02-14 11:21 ` Russell King (Oracle)
2025-02-14 17:03 ` Jon Hunter
2025-02-19 14:01 ` Jon Hunter
0 siblings, 2 replies; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-14 11:21 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
> Thanks for the feedback. So ...
>
> 1. I can confirm that suspend works if I disable EEE via ethtool
> 2. Prior to this change I do see phy_eee_rx_clock_stop being called
> to enable the clock resuming from suspend, but after this change
> it is not.
>
> Prior to this change I see (note the prints around 389-392 are when
> we resume from suspend) ...
>
> [ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
This is a bug in phylink - it shouldn't have been calling
phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.
> [ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> [ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
Presumably, this is when the link comes up before suspend, so the PHY
has been configured to allow the RX clock to be stopped prior to suspend
> [ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> [ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
Presumably, as this is after resume, this is again when the link comes
up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)
> After this change I see ...
>
> [ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
> [ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> [ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
To me, this looks no different - the PHY was configured for clock stop
before suspending in both cases.
However, something else to verify with the old code - after boot and the
link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
try unplugging the link and re-plugging it. Then try suspending.
The point of this test is to verify whether the PHY ignores changes to
the RX clock stop configuration while the link is up.
The next stage is to instrument dwmac4_set_eee_mode(),
dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
the final register values in each function vs dwmac4_set_lpi_mode() in
the new code. Also, I think instrumenting stmmac_common_interrupt() to
print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
be a good idea.
I'd like to see how this all ties up with suspend, resume, link up
and down events, so please don't trim the log so much.
Thanks.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-14 11:21 ` Russell King (Oracle)
@ 2025-02-14 17:03 ` Jon Hunter
2025-02-19 14:01 ` Jon Hunter
1 sibling, 0 replies; 41+ messages in thread
From: Jon Hunter @ 2025-02-14 17:03 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 14/02/2025 11:21, Russell King (Oracle) wrote:
> On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
>> Thanks for the feedback. So ...
>>
>> 1. I can confirm that suspend works if I disable EEE via ethtool
>> 2. Prior to this change I do see phy_eee_rx_clock_stop being called
>> to enable the clock resuming from suspend, but after this change
>> it is not.
>>
>> Prior to this change I see (note the prints around 389-392 are when
>> we resume from suspend) ...
>>
>> [ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
>
> This is a bug in phylink - it shouldn't have been calling
> phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.
>
>> [ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>
> Presumably, this is when the link comes up before suspend, so the PHY
> has been configured to allow the RX clock to be stopped prior to suspend
>
>> [ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>
> Presumably, as this is after resume, this is again when the link comes
> up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)
>
>> After this change I see ...
>>
>> [ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>> [ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>
> To me, this looks no different - the PHY was configured for clock stop
> before suspending in both cases.
>
> However, something else to verify with the old code - after boot and the
> link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
> try unplugging the link and re-plugging it. Then try suspending.
OK will do. I will have to try that when I am back in the office next week.
> The point of this test is to verify whether the PHY ignores changes to
> the RX clock stop configuration while the link is up.
>
>
>
> The next stage is to instrument dwmac4_set_eee_mode(),
> dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
> the final register values in each function vs dwmac4_set_lpi_mode() in
> the new code. Also, I think instrumenting stmmac_common_interrupt() to
> print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
> CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
> be a good idea.
>
> I'd like to see how this all ties up with suspend, resume, link up
> and down events, so please don't trim the log so much.
OK thanks. I will instrument these too and get some more logs.
Cheers!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-14 11:21 ` Russell King (Oracle)
2025-02-14 17:03 ` Jon Hunter
@ 2025-02-19 14:01 ` Jon Hunter
2025-02-19 15:36 ` Russell King (Oracle)
1 sibling, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-19 14:01 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 14/02/2025 11:21, Russell King (Oracle) wrote:
> On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
>> Thanks for the feedback. So ...
>>
>> 1. I can confirm that suspend works if I disable EEE via ethtool
>> 2. Prior to this change I do see phy_eee_rx_clock_stop being called
>> to enable the clock resuming from suspend, but after this change
>> it is not.
>>
>> Prior to this change I see (note the prints around 389-392 are when
>> we resume from suspend) ...
>>
>> [ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
>
> This is a bug in phylink - it shouldn't have been calling
> phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.
>
>> [ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>
> Presumably, this is when the link comes up before suspend, so the PHY
> has been configured to allow the RX clock to be stopped prior to suspend
>
>> [ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>
> Presumably, as this is after resume, this is again when the link comes
> up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)
>
>> After this change I see ...
>>
>> [ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>> [ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>
> To me, this looks no different - the PHY was configured for clock stop
> before suspending in both cases.
>
> However, something else to verify with the old code - after boot and the
> link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
> try unplugging the link and re-plugging it. Then try suspending.
I still need to try this but I am still not back to the office to get to this.
> The point of this test is to verify whether the PHY ignores changes to
> the RX clock stop configuration while the link is up.
>
>
>
> The next stage is to instrument dwmac4_set_eee_mode(),
> dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
> the final register values in each function vs dwmac4_set_lpi_mode() in
> the new code. Also, I think instrumenting stmmac_common_interrupt() to
> print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
> CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
> be a good idea.
>
> I'd like to see how this all ties up with suspend, resume, link up
> and down events, so please don't trim the log so much.
I have been testing on top of v6.14-rc2 which does not have
dwmac4_set_lpi_mode(). However, instrumenting the other functions,
for a bad case I see ...
[ 477.494226] PM: suspend entry (deep)
[ 477.501869] Filesystems sync: 0.006 seconds
[ 477.504518] Freezing user space processes
[ 477.509067] Freezing user space processes completed (elapsed 0.001 seconds)
[ 477.514770] OOM killer disabled.
[ 477.517940] Freezing remaining freezable tasks
[ 477.523449] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 477.566870] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[ 477.586423] dwc-eth-dwmac 2490000.ethernet eth0: disable EEE
[ 477.592052] dwmac4_set_eee_lpi_entry_timer: entered
[ 477.596997] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0x0
[ 477.680193] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
[ 477.723771] Disabling non-boot CPUs ...
[ 477.726898] psci: CPU5 killed (polled 0 ms)
[ 477.731364] psci: CPU4 killed (polled 0 ms)
[ 477.735439] psci: CPU3 killed (polled 0 ms)
[ 477.740198] psci: CPU2 killed (polled 0 ms)
[ 477.744220] psci: CPU1 killed (polled 0 ms)
[ 477.748546] Enabling non-boot CPUs ...
[ 477.751996] Detected PIPT I-cache on CPU1
[ 477.754800] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU1: 0x0000009444c004
[ 477.766211] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU1: 0x00000010305116
[ 477.778379] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU1: 0x00000003001066
[ 477.790231] CPU1: Booted secondary processor 0x0000000000 [0x4e0f0030]
[ 477.797726] CPU1 is up
[ 477.799388] Detected PIPT I-cache on CPU2
[ 477.802952] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU2: 0x0000009444c004
[ 477.814415] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU2: 0x00000010305116
[ 477.826537] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU2: 0x00000003001066
[ 477.838440] CPU2: Booted secondary processor 0x0000000001 [0x4e0f0030]
[ 477.845865] CPU2 is up
[ 477.847325] Detected PIPT I-cache on CPU3
[ 477.851136] CPU3: Booted secondary processor 0x0000000101 [0x411fd073]
[ 477.857958] CPU3 is up
[ 477.860108] Detected PIPT I-cache on CPU4
[ 477.863949] CPU4: Booted secondary processor 0x0000000102 [0x411fd073]
[ 477.870714] CPU4 is up
[ 477.872933] Detected PIPT I-cache on CPU5
[ 477.876778] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
[ 477.883556] CPU5 is up
[ 477.985628] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 477.993771] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
[ 478.171396] dwmac4: Master AXI performs any burst length
[ 478.174480] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[ 478.181934] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[ 478.202977] dwmac4_set_eee_lpi_entry_timer: entered
[ 478.207918] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
[ 478.287646] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
[ 478.295538] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 478.372819] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[ 478.382118] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[ 478.410458] OOM killer enabled.
[ 478.411350] Restarting tasks ... done.
[ 478.415459] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 478.417763] random: crng reseeded on system resumption
[ 478.425747] PM: suspend exit
[ 478.474698] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 478.533428] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 478.600368] VDDIO_SDMMC3_AP: voltage operation not allowed
For a good case I see ...
[ 28.548472] PM: suspend entry (deep)
[ 28.560503] Filesystems sync: 0.010 seconds
[ 28.563622] Freezing user space processes
[ 28.567838] Freezing user space processes completed (elapsed 0.001 seconds)
[ 28.573380] OOM killer disabled.
[ 28.576563] Freezing remaining freezable tasks
[ 28.582100] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 28.627180] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[ 28.646770] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
[ 28.690432] Disabling non-boot CPUs ...
[ 28.693397] psci: CPU5 killed (polled 4 ms)
[ 28.697401] psci: CPU4 killed (polled 4 ms)
[ 28.702649] psci: CPU3 killed (polled 0 ms)
[ 28.707369] psci: CPU2 killed (polled 0 ms)
[ 28.711482] psci: CPU1 killed (polled 0 ms)
[ 28.970011] Enabling non-boot CPUs ...
[ 28.974751] Detected PIPT I-cache on CPU1
[ 28.977635] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU1: 0x0000009444c004
[ 28.989014] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU1: 0x00000010305116
[ 29.001163] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU1: 0x00000003001066
[ 29.013015] CPU1: Booted secondary processor 0x0000000000 [0x4e0f0030]
[ 29.020526] CPU1 is up
[ 29.022168] Detected PIPT I-cache on CPU2
[ 29.025747] CPU features: SANITY CHECK: Unexpected variation in SYS_CTR_EL0. Boot CPU: 0x0000008444c004, CPU2: 0x0000009444c004
[ 29.037182] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_AA64DFR0_EL1. Boot CPU: 0x00000010305106, CPU2: 0x00000010305116
[ 29.049328] CPU features: SANITY CHECK: Unexpected variation in SYS_ID_DFR0_EL1. Boot CPU: 0x00000003010066, CPU2: 0x00000003001066
[ 29.061196] CPU2: Booted secondary processor 0x0000000001 [0x4e0f0030]
[ 29.068779] CPU2 is up
[ 29.070095] Detected PIPT I-cache on CPU3
[ 29.073916] CPU3: Booted secondary processor 0x0000000101 [0x411fd073]
[ 29.080749] CPU3 is up
[ 29.082898] Detected PIPT I-cache on CPU4
[ 29.086729] CPU4: Booted secondary processor 0x0000000102 [0x411fd073]
[ 29.093496] CPU4 is up
[ 29.095715] Detected PIPT I-cache on CPU5
[ 29.099556] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
[ 29.106351] CPU5 is up
[ 29.218549] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 29.234190] dwmac4: Master AXI performs any burst length
[ 29.237263] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[ 29.244732] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[ 29.267679] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[ 29.270504] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[ 29.306091] OOM killer enabled.
[ 29.306981] Restarting tasks ... done.
[ 29.311423] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 29.314095] random: crng reseeded on system resumption
[ 29.321404] PM: suspend exit
[ 29.370286] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 29.429655] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 29.496567] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 32.968855] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
[ 32.974779] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_link_up: tx_lpi_timer 1000000
[ 32.988755] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
The more I have been testing, the more I feel that this is timing
related. In good cases, I see the MAC link coming up well after the
PHY. Even with your change I did see suspend work on occassion and
when it does I see ...
[ 79.775977] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
[ 79.784196] dwmac4: Master AXI performs any burst length
[ 79.787280] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
[ 79.794736] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
[ 79.816642] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
[ 79.820437] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
[ 79.854481] OOM killer enabled.
[ 79.855372] Restarting tasks ... done.
[ 79.859460] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 79.861297] random: crng reseeded on system resumption
[ 79.869773] PM: suspend exit
[ 79.914909] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 79.974322] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 80.041236] VDDIO_SDMMC3_AP: voltage operation not allowed
[ 83.547730] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
[ 83.566859] dwmac4_set_eee_lpi_entry_timer: entered
[ 83.571782] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
[ 83.651520] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
[ 83.659425] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
On a good case, the stmmac_mac_enable_tx_lpi call always happens
much later. It seems that after this change it is more often
that the link is coming up sooner and I guess probably too soon.
May be we were getting lucky before?
Anyway, I made the following change for testing and this is
working ...
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b34ebb916b89..44187e230a1e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7906,16 +7906,6 @@ int stmmac_resume(struct device *dev)
return ret;
}
- rtnl_lock();
- if (device_may_wakeup(priv->device) && priv->plat->pmt) {
- phylink_resume(priv->phylink);
- } else {
- phylink_resume(priv->phylink);
- if (device_may_wakeup(priv->device))
- phylink_speed_up(priv->phylink);
- }
- rtnl_unlock();
-
rtnl_lock();
mutex_lock(&priv->lock);
@@ -7930,6 +7920,13 @@ int stmmac_resume(struct device *dev)
stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
+ if (device_may_wakeup(priv->device) && priv->plat->pmt) {
+ phylink_resume(priv->phylink);
+ } else {
+ phylink_resume(priv->phylink);
+ if (device_may_wakeup(priv->device))
+ phylink_speed_up(priv->phylink);
+ }
stmmac_enable_all_queues(priv);
stmmac_enable_all_dma_irq(priv);
I noticed that in __stmmac_open() the phylink_start() is
called after stmmac_hw_setup and stmmac_init_coalesce, where
as in stmmac_resume, phylink_resume() is called before these.
I am not saying that this is correct in any way, but seems
to indicate that the PHY is coming up too soon (at least for
this device). I have ran 100 suspend iterations with the above
and I have not seen any failures.
Let me know if you have any thoughts on this.
Thanks!
Jon
--
nvpublic
^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 14:01 ` Jon Hunter
@ 2025-02-19 15:36 ` Russell King (Oracle)
2025-02-19 17:52 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-19 15:36 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Wed, Feb 19, 2025 at 02:01:34PM +0000, Jon Hunter wrote:
>
> On 14/02/2025 11:21, Russell King (Oracle) wrote:
> > On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
> > > Thanks for the feedback. So ...
> > >
> > > 1. I can confirm that suspend works if I disable EEE via ethtool
> > > 2. Prior to this change I do see phy_eee_rx_clock_stop being called
> > > to enable the clock resuming from suspend, but after this change
> > > it is not.
> > >
> > > Prior to this change I see (note the prints around 389-392 are when
> > > we resume from suspend) ...
> > >
> > > [ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
> >
> > This is a bug in phylink - it shouldn't have been calling
> > phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.
> >
> > > [ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> > > [ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
> >
> > Presumably, this is when the link comes up before suspend, so the PHY
> > has been configured to allow the RX clock to be stopped prior to suspend
> >
> > > [ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> > > [ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
> >
> > Presumably, as this is after resume, this is again when the link comes
> > up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)
> >
> > > After this change I see ...
> > >
> > > [ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
> > > [ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> > > [ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> >
> > To me, this looks no different - the PHY was configured for clock stop
> > before suspending in both cases.
> >
> > However, something else to verify with the old code - after boot and the
> > link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
> > try unplugging the link and re-plugging it. Then try suspending.
>
> I still need to try this but I am still not back to the office to get to this.
> > The point of this test is to verify whether the PHY ignores changes to
> > the RX clock stop configuration while the link is up.
> >
> >
> >
> > The next stage is to instrument dwmac4_set_eee_mode(),
> > dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
> > the final register values in each function vs dwmac4_set_lpi_mode() in
> > the new code. Also, I think instrumenting stmmac_common_interrupt() to
> > print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
> > CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
> > be a good idea.
> >
> > I'd like to see how this all ties up with suspend, resume, link up
> > and down events, so please don't trim the log so much.
>
> I have been testing on top of v6.14-rc2 which does not have
> dwmac4_set_lpi_mode(). However, instrumenting the other functions,
> for a bad case I see ...
>
> [ 477.494226] PM: suspend entry (deep)
> [ 477.501869] Filesystems sync: 0.006 seconds
> [ 477.504518] Freezing user space processes
> [ 477.509067] Freezing user space processes completed (elapsed 0.001 seconds)
> [ 477.514770] OOM killer disabled.
> [ 477.517940] Freezing remaining freezable tasks
> [ 477.523449] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [ 477.566870] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
> [ 477.586423] dwc-eth-dwmac 2490000.ethernet eth0: disable EEE
> [ 477.592052] dwmac4_set_eee_lpi_entry_timer: entered
> [ 477.596997] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0x0
> [ 477.680193] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
This tells me WoL is not enabled, and thus phylink_suspend() did a
phylink_stop() which took the link administratively down and disabled
LPI at the MAC. The actual physical link on the media may still be up
at this point, and the remote end may still signal LPI to the local
PHY.
...system suspends and resumes...
> [ 477.876778] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
> [ 477.883556] CPU5 is up
stmmac_resume() gets called here, which will call into phylink_resume()
and, because WoL wasn't used at suspend time, will call phylink_start()
which immediately prints:
> [ 477.985628] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
and then it allows the phylink resolver to run in a separate workqueue.
The output from the phylink resolver thread, I'll label as "^WQ".
Messages from the thread that called stmmac_resume() I'll labell with
"^RES".
> [ 477.993771] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
^WQ
At this point, the workqueue has called mac_link_up() and this indicates
that that method has completed and it's now calling mac_enable_tx_lpi().
In other words, the transmitter and receiver have been enabled here!
This is key...
> [ 478.171396] dwmac4: Master AXI performs any burst length
^RES
dwmac4_dma_axi(), which is called from stmmac_init_dma_engine() which
then goes on to call stmmac_reset(). As noted above, however, the
MAC has had its transmitter and receiver enabled at this point, so
hitting the hardware with a reset probably undoes all that.
stmmac_init_dma_engine() is called from stmmac_hw_setup() and
stmmac_resume() _after_ calling phylink_resume().
> [ 478.174480] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
^RES
Printed by stmmac_safety_feat_configuration() which is called from
stmmac_hw_setup().
> [ 478.181934] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
^RES
Printed by stmmac_init_ptp() called from stmmac_hw_setup().
> [ 478.202977] dwmac4_set_eee_lpi_entry_timer: entered
^WQ
> [ 478.207918] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
^WQ
> [ 478.287646] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
^WQ
> [ 478.295538] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
^WQ
So clearly the phylink resolver is racing with the rest of the stmmac
resume path - which doesn't surprise me in the least. I believe I raised
the fact that calling phylink_resume() before the hardware was ready to
handle link-up is a bad idea precisely because of races like this.
The reason stmmac does this is because of it's quirk that it needs the
receive clock from the PHY in order for stmmac_reset() to work.
> For a good case I see ...
>
> [ 28.548472] PM: suspend entry (deep)
> [ 28.560503] Filesystems sync: 0.010 seconds
> [ 28.563622] Freezing user space processes
> [ 28.567838] Freezing user space processes completed (elapsed 0.001 seconds)
> [ 28.573380] OOM killer disabled.
> [ 28.576563] Freezing remaining freezable tasks
> [ 28.582100] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> [ 28.627180] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
> [ 28.646770] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
Same as above...
...system suspends and resumes...
> [ 29.099556] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
> [ 29.106351] CPU5 is up
> [ 29.218549] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
^RES
> [ 29.234190] dwmac4: Master AXI performs any burst length
^RES
> [ 29.237263] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
^RES
> [ 29.244732] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
^RES
> [ 29.306981] Restarting tasks ... done.
> [ 29.311423] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 29.314095] random: crng reseeded on system resumption
> [ 29.321404] PM: suspend exit
> [ 29.370286] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 29.429655] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 29.496567] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 32.968855] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
^WQ
> [ 32.974779] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_link_up: tx_lpi_timer 1000000
^WQ
> [ 32.988755] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
^WQ
So here, phylink_resolve() runs later.
I think if you run this same test with an earlier kernel, you'll get
much the same random behaviour, maybe with different weightings on
"success" and "failure" because of course the code has changed - but
only because that's caused a change in timings of the already present
race.
> The more I have been testing, the more I feel that this is timing
> related. In good cases, I see the MAC link coming up well after the
> PHY. Even with your change I did see suspend work on occassion and
> when it does I see ...
>
> [ 79.775977] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> [ 79.784196] dwmac4: Master AXI performs any burst length
> [ 79.787280] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
> [ 79.794736] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
> [ 79.816642] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
> [ 79.820437] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
> [ 79.854481] OOM killer enabled.
> [ 79.855372] Restarting tasks ... done.
> [ 79.859460] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 79.861297] random: crng reseeded on system resumption
> [ 79.869773] PM: suspend exit
> [ 79.914909] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 79.974322] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 80.041236] VDDIO_SDMMC3_AP: voltage operation not allowed
> [ 83.547730] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
> [ 83.566859] dwmac4_set_eee_lpi_entry_timer: entered
> [ 83.571782] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
> [ 83.651520] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
> [ 83.659425] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
>
> On a good case, the stmmac_mac_enable_tx_lpi call always happens
> much later. It seems that after this change it is more often
> that the link is coming up sooner and I guess probably too soon.
> May be we were getting lucky before?
I think this is pure lottery.
> Anyway, I made the following change for testing and this is
> working ...
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index b34ebb916b89..44187e230a1e 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -7906,16 +7906,6 @@ int stmmac_resume(struct device *dev)
> return ret;
> }
> - rtnl_lock();
> - if (device_may_wakeup(priv->device) && priv->plat->pmt) {
> - phylink_resume(priv->phylink);
> - } else {
> - phylink_resume(priv->phylink);
> - if (device_may_wakeup(priv->device))
> - phylink_speed_up(priv->phylink);
> - }
> - rtnl_unlock();
> -
> rtnl_lock();
> mutex_lock(&priv->lock);
> @@ -7930,6 +7920,13 @@ int stmmac_resume(struct device *dev)
> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
> + if (device_may_wakeup(priv->device) && priv->plat->pmt) {
> + phylink_resume(priv->phylink);
> + } else {
> + phylink_resume(priv->phylink);
> + if (device_may_wakeup(priv->device))
> + phylink_speed_up(priv->phylink);
> + }
> stmmac_enable_all_queues(priv);
> stmmac_enable_all_dma_irq(priv);
>
> I noticed that in __stmmac_open() the phylink_start() is
> called after stmmac_hw_setup and stmmac_init_coalesce, where
> as in stmmac_resume, phylink_resume() is called before these.
> I am not saying that this is correct in any way, but seems
> to indicate that the PHY is coming up too soon (at least for
> this device). I have ran 100 suspend iterations with the above
> and I have not seen any failures.
>
> Let me know if you have any thoughts on this.
With my phylink-maintainer hat on, this is definitely the correct
solution - maybe even moving the phylink_resume() call later.
phylink_resume() should only be called when the driver is prepared
to handle and cope with an immediate call to the mac_link_up()
method, and it's clear that its current placement is such that the
driver isn't prepared for that.
However... see:
36d18b5664ef ("net: stmmac: start phylink instance before stmmac_hw_setup()")
but I also questioned this in:
https://lore.kernel.org/netdev/20210903080147.GS22278@shell.armlinux.org.uk/
see the bottom of that email starting "While reading stmmac_resume(), I
have to question the placement of this code block:". The response was:
"There is a story here, SNPS EQOS IP need PHY provides RXC clock for
MAC's receive logic, so we need phylink_start() to bring PHY link up,
that make PHY resume back, PHY could stop RXC clock when in suspended
state. This is the reason why calling phylink_start() before re-config
MAC."
However, in 21d9ba5bc551 ("net: phylink: add PHY_F_RXC_ALWAYS_ON to PHY
dev flags") and associated patches, I added a way that phylib can be
told that the MAC requires the RXC at all times.
Romain Gantois arranged for this flag to always be set for stmmac in
commit 58329b03a595 ("net: stmmac: Signal to PHY/PCS drivers to keep RX
clock on"), which will pass PHY_F_RXC_ALWAYS_ON to the PHY driver.
Whether the PHY driver honours this flag or not depends on which
driver is used.
So, my preference would be to move phylink_resume() later, removing
the race condition. If there's any regressions, then we need to
_properly_ solve them by ensuring that the PHY keeps the RX clock
running by honouring PHY_F_RXC_ALWAYS_ON. That's going to need
everyone to test their stmmac platforms to find all the cases that
need fixing...
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 15:36 ` Russell King (Oracle)
@ 2025-02-19 17:52 ` Jon Hunter
2025-02-19 19:13 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-19 17:52 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 19/02/2025 15:36, Russell King (Oracle) wrote:
> On Wed, Feb 19, 2025 at 02:01:34PM +0000, Jon Hunter wrote:
>>
>> On 14/02/2025 11:21, Russell King (Oracle) wrote:
>>> On Fri, Feb 14, 2025 at 10:58:55AM +0000, Jon Hunter wrote:
>>>> Thanks for the feedback. So ...
>>>>
>>>> 1. I can confirm that suspend works if I disable EEE via ethtool
>>>> 2. Prior to this change I do see phy_eee_rx_clock_stop being called
>>>> to enable the clock resuming from suspend, but after this change
>>>> it is not.
>>>>
>>>> Prior to this change I see (note the prints around 389-392 are when
>>>> we resume from suspend) ...
>>>>
>>>> [ 4.654454] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 0
>>>
>>> This is a bug in phylink - it shouldn't have been calling
>>> phy_eee_rx_clock_stop() where a MAC doesn't support phylink managed EEE.
>>>
>>>> [ 4.723123] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>>>> [ 7.629652] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>>>
>>> Presumably, this is when the link comes up before suspend, so the PHY
>>> has been configured to allow the RX clock to be stopped prior to suspend
>>>
>>>> [ 389.086185] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>>>> [ 392.863744] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>>>
>>> Presumably, as this is after resume, this is again when the link comes
>>> up (that's the only time that stmmac calls phy_eee_rx_clock_stop().)
>>>
>>>> After this change I see ...
>>>>
>>>> [ 4.644614] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>>>> [ 4.679224] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>>>> [ 191.219828] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>>>
>>> To me, this looks no different - the PHY was configured for clock stop
>>> before suspending in both cases.
>>>
>>> However, something else to verify with the old code - after boot and the
>>> link comes up (so you get the second phy_eee_rx_clock_stop() at 7s),
>>> try unplugging the link and re-plugging it. Then try suspending.
>>
>> I still need to try this but I am still not back to the office to get to this.
>> > The point of this test is to verify whether the PHY ignores changes to
>>> the RX clock stop configuration while the link is up.
>>>
>>>
>>>
>>> The next stage is to instrument dwmac4_set_eee_mode(),
>>> dwmac4_reset_eee_mode() and dwmac4_set_eee_lpi_entry_timer() to print
>>> the final register values in each function vs dwmac4_set_lpi_mode() in
>>> the new code. Also, I think instrumenting stmmac_common_interrupt() to
>>> print a message when we get either CORE_IRQ_TX_PATH_IN_LPI_MODE or
>>> CORE_IRQ_TX_PATH_EXIT_LPI_MODE indicating a change in LPI state would
>>> be a good idea.
>>>
>>> I'd like to see how this all ties up with suspend, resume, link up
>>> and down events, so please don't trim the log so much.
>>
>> I have been testing on top of v6.14-rc2 which does not have
>> dwmac4_set_lpi_mode(). However, instrumenting the other functions,
>> for a bad case I see ...
>>
>> [ 477.494226] PM: suspend entry (deep)
>> [ 477.501869] Filesystems sync: 0.006 seconds
>> [ 477.504518] Freezing user space processes
>> [ 477.509067] Freezing user space processes completed (elapsed 0.001 seconds)
>> [ 477.514770] OOM killer disabled.
>> [ 477.517940] Freezing remaining freezable tasks
>> [ 477.523449] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>> [ 477.566870] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
>> [ 477.586423] dwc-eth-dwmac 2490000.ethernet eth0: disable EEE
>> [ 477.592052] dwmac4_set_eee_lpi_entry_timer: entered
>> [ 477.596997] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0x0
>> [ 477.680193] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
>
> This tells me WoL is not enabled, and thus phylink_suspend() did a
> phylink_stop() which took the link administratively down and disabled
> LPI at the MAC. The actual physical link on the media may still be up
> at this point, and the remote end may still signal LPI to the local
> PHY.
>
> ...system suspends and resumes...
>> [ 477.876778] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
>> [ 477.883556] CPU5 is up
>
> stmmac_resume() gets called here, which will call into phylink_resume()
> and, because WoL wasn't used at suspend time, will call phylink_start()
> which immediately prints:
>
>> [ 477.985628] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>
> and then it allows the phylink resolver to run in a separate workqueue.
> The output from the phylink resolver thread, I'll label as "^WQ".
> Messages from the thread that called stmmac_resume() I'll labell with
> "^RES".
>
>> [ 477.993771] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
> ^WQ
>
> At this point, the workqueue has called mac_link_up() and this indicates
> that that method has completed and it's now calling mac_enable_tx_lpi().
> In other words, the transmitter and receiver have been enabled here!
> This is key...
>
>> [ 478.171396] dwmac4: Master AXI performs any burst length
> ^RES
>
> dwmac4_dma_axi(), which is called from stmmac_init_dma_engine() which
> then goes on to call stmmac_reset(). As noted above, however, the
> MAC has had its transmitter and receiver enabled at this point, so
> hitting the hardware with a reset probably undoes all that.
> stmmac_init_dma_engine() is called from stmmac_hw_setup() and
> stmmac_resume() _after_ calling phylink_resume().
>
>> [ 478.174480] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
> ^RES
>
> Printed by stmmac_safety_feat_configuration() which is called from
> stmmac_hw_setup().
>
>> [ 478.181934] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
> ^RES
>
> Printed by stmmac_init_ptp() called from stmmac_hw_setup().
>
>> [ 478.202977] dwmac4_set_eee_lpi_entry_timer: entered
> ^WQ
>> [ 478.207918] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
> ^WQ
>> [ 478.287646] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
> ^WQ
>> [ 478.295538] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> ^WQ
>
> So clearly the phylink resolver is racing with the rest of the stmmac
> resume path - which doesn't surprise me in the least. I believe I raised
> the fact that calling phylink_resume() before the hardware was ready to
> handle link-up is a bad idea precisely because of races like this.
>
> The reason stmmac does this is because of it's quirk that it needs the
> receive clock from the PHY in order for stmmac_reset() to work.
I do see the reset fail infrequently on previous kernels with this
device and when it does I see these messages ...
dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine
initialization failed
>> For a good case I see ...
>>
>> [ 28.548472] PM: suspend entry (deep)
>> [ 28.560503] Filesystems sync: 0.010 seconds
>> [ 28.563622] Freezing user space processes
>> [ 28.567838] Freezing user space processes completed (elapsed 0.001 seconds)
>> [ 28.573380] OOM killer disabled.
>> [ 28.576563] Freezing remaining freezable tasks
>> [ 28.582100] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
>> [ 28.627180] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
>> [ 28.646770] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
>
> Same as above...
>
> ...system suspends and resumes...
>> [ 29.099556] CPU5: Booted secondary processor 0x0000000103 [0x411fd073]
>> [ 29.106351] CPU5 is up
>> [ 29.218549] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
> ^RES
>> [ 29.234190] dwmac4: Master AXI performs any burst length
> ^RES
>> [ 29.237263] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
> ^RES
>> [ 29.244732] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
> ^RES
>> [ 29.306981] Restarting tasks ... done.
>> [ 29.311423] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 29.314095] random: crng reseeded on system resumption
>> [ 29.321404] PM: suspend exit
>> [ 29.370286] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 29.429655] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 29.496567] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 32.968855] Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
> ^WQ
>> [ 32.974779] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_link_up: tx_lpi_timer 1000000
> ^WQ
>> [ 32.988755] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
> ^WQ
>
> So here, phylink_resolve() runs later.
>
> I think if you run this same test with an earlier kernel, you'll get
> much the same random behaviour, maybe with different weightings on
> "success" and "failure" because of course the code has changed - but
> only because that's caused a change in timings of the already present
> race.
>
>> The more I have been testing, the more I feel that this is timing
>> related. In good cases, I see the MAC link coming up well after the
>> PHY. Even with your change I did see suspend work on occassion and
>> when it does I see ...
>>
>> [ 79.775977] dwc-eth-dwmac 2490000.ethernet eth0: configuring for phy/rgmii link mode
>> [ 79.784196] dwmac4: Master AXI performs any burst length
>> [ 79.787280] dwc-eth-dwmac 2490000.ethernet eth0: No Safety Features support found
>> [ 79.794736] dwc-eth-dwmac 2490000.ethernet eth0: IEEE 1588-2008 Advanced Timestamp supported
>> [ 79.816642] usb-conn-gpio 3520000.padctl:ports:usb2-0:connector: repeated role: device
>> [ 79.820437] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
>> [ 79.854481] OOM killer enabled.
>> [ 79.855372] Restarting tasks ... done.
>> [ 79.859460] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 79.861297] random: crng reseeded on system resumption
>> [ 79.869773] PM: suspend exit
>> [ 79.914909] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 79.974322] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 80.041236] VDDIO_SDMMC3_AP: voltage operation not allowed
>> [ 83.547730] dwc-eth-dwmac 2490000.ethernet eth0: stmmac_mac_enable_tx_lpi: tx_lpi_timer 1000000
>> [ 83.566859] dwmac4_set_eee_lpi_entry_timer: entered
>> [ 83.571782] dwmac4_set_eee_lpi_entry_timer: GMAC4_LPI_CTRL_STATUS 0xf4240
>> [ 83.651520] dwc-eth-dwmac 2490000.ethernet eth0: Energy-Efficient Ethernet initialized
>> [ 83.659425] dwc-eth-dwmac 2490000.ethernet eth0: Link is Up - 1Gbps/Full - flow control rx/tx
>>
>> On a good case, the stmmac_mac_enable_tx_lpi call always happens
>> much later. It seems that after this change it is more often
>> that the link is coming up sooner and I guess probably too soon.
>> May be we were getting lucky before?
>
> I think this is pure lottery.
Yes it does appear to be.
>> Anyway, I made the following change for testing and this is
>> working ...
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index b34ebb916b89..44187e230a1e 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -7906,16 +7906,6 @@ int stmmac_resume(struct device *dev)
>> return ret;
>> }
>> - rtnl_lock();
>> - if (device_may_wakeup(priv->device) && priv->plat->pmt) {
>> - phylink_resume(priv->phylink);
>> - } else {
>> - phylink_resume(priv->phylink);
>> - if (device_may_wakeup(priv->device))
>> - phylink_speed_up(priv->phylink);
>> - }
>> - rtnl_unlock();
>> -
>> rtnl_lock();
>> mutex_lock(&priv->lock);
>> @@ -7930,6 +7920,13 @@ int stmmac_resume(struct device *dev)
>> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
>> + if (device_may_wakeup(priv->device) && priv->plat->pmt) {
>> + phylink_resume(priv->phylink);
>> + } else {
>> + phylink_resume(priv->phylink);
>> + if (device_may_wakeup(priv->device))
>> + phylink_speed_up(priv->phylink);
>> + }
>> stmmac_enable_all_queues(priv);
>> stmmac_enable_all_dma_irq(priv);
>>
>> I noticed that in __stmmac_open() the phylink_start() is
>> called after stmmac_hw_setup and stmmac_init_coalesce, where
>> as in stmmac_resume, phylink_resume() is called before these.
>> I am not saying that this is correct in any way, but seems
>> to indicate that the PHY is coming up too soon (at least for
>> this device). I have ran 100 suspend iterations with the above
>> and I have not seen any failures.
>>
>> Let me know if you have any thoughts on this.
>
> With my phylink-maintainer hat on, this is definitely the correct
> solution - maybe even moving the phylink_resume() call later.
> phylink_resume() should only be called when the driver is prepared
> to handle and cope with an immediate call to the mac_link_up()
> method, and it's clear that its current placement is such that the
> driver isn't prepared for that.
>
> However... see:
>
> 36d18b5664ef ("net: stmmac: start phylink instance before stmmac_hw_setup()")
>
> but I also questioned this in:
>
> https://lore.kernel.org/netdev/20210903080147.GS22278@shell.armlinux.org.uk/
>
> see the bottom of that email starting "While reading stmmac_resume(), I
> have to question the placement of this code block:". The response was:
>
> "There is a story here, SNPS EQOS IP need PHY provides RXC clock for
> MAC's receive logic, so we need phylink_start() to bring PHY link up,
> that make PHY resume back, PHY could stop RXC clock when in suspended
> state. This is the reason why calling phylink_start() before re-config
> MAC."
>
> However, in 21d9ba5bc551 ("net: phylink: add PHY_F_RXC_ALWAYS_ON to PHY
> dev flags") and associated patches, I added a way that phylib can be
> told that the MAC requires the RXC at all times.
>
> Romain Gantois arranged for this flag to always be set for stmmac in
> commit 58329b03a595 ("net: stmmac: Signal to PHY/PCS drivers to keep RX
> clock on"), which will pass PHY_F_RXC_ALWAYS_ON to the PHY driver.
> Whether the PHY driver honours this flag or not depends on which
> driver is used.
>
> So, my preference would be to move phylink_resume() later, removing
> the race condition. If there's any regressions, then we need to
> _properly_ solve them by ensuring that the PHY keeps the RX clock
> running by honouring PHY_F_RXC_ALWAYS_ON. That's going to need
> everyone to test their stmmac platforms to find all the cases that
> need fixing...
Thanks for the in-depth analysis and feedback. We have 3 SoCs that use
this driver and so I will do some testing with this change on all of them.
Thanks again.
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 17:52 ` Jon Hunter
@ 2025-02-19 19:13 ` Russell King (Oracle)
2025-02-19 20:05 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-19 19:13 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Wed, Feb 19, 2025 at 05:52:34PM +0000, Jon Hunter wrote:
> On 19/02/2025 15:36, Russell King (Oracle) wrote:
> > So clearly the phylink resolver is racing with the rest of the stmmac
> > resume path - which doesn't surprise me in the least. I believe I raised
> > the fact that calling phylink_resume() before the hardware was ready to
> > handle link-up is a bad idea precisely because of races like this.
> >
> > The reason stmmac does this is because of it's quirk that it needs the
> > receive clock from the PHY in order for stmmac_reset() to work.
>
> I do see the reset fail infrequently on previous kernels with this device
> and when it does I see these messages ...
>
> dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
> dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine
> initialization failed
I wonder whether it's also racing with phylib, but phylink_resume()
calling phylink_start() going in to call phy_start() is all synchronous.
That causes __phy_resume() to be called.
Which PHY device/driver is being used?
> > So, my preference would be to move phylink_resume() later, removing
> > the race condition. If there's any regressions, then we need to
> > _properly_ solve them by ensuring that the PHY keeps the RX clock
> > running by honouring PHY_F_RXC_ALWAYS_ON. That's going to need
> > everyone to test their stmmac platforms to find all the cases that
> > need fixing...
>
> Thanks for the in-depth analysis and feedback. We have 3 SoCs that use this
> driver and so I will do some testing with this change on all of them.
Thanks!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 19:13 ` Russell King (Oracle)
@ 2025-02-19 20:05 ` Jon Hunter
2025-02-19 20:57 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-19 20:05 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 19/02/2025 19:13, Russell King (Oracle) wrote:
> On Wed, Feb 19, 2025 at 05:52:34PM +0000, Jon Hunter wrote:
>> On 19/02/2025 15:36, Russell King (Oracle) wrote:
>>> So clearly the phylink resolver is racing with the rest of the stmmac
>>> resume path - which doesn't surprise me in the least. I believe I raised
>>> the fact that calling phylink_resume() before the hardware was ready to
>>> handle link-up is a bad idea precisely because of races like this.
>>>
>>> The reason stmmac does this is because of it's quirk that it needs the
>>> receive clock from the PHY in order for stmmac_reset() to work.
>>
>> I do see the reset fail infrequently on previous kernels with this device
>> and when it does I see these messages ...
>>
>> dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
>> dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine
>> initialization failed
>
> I wonder whether it's also racing with phylib, but phylink_resume()
> calling phylink_start() going in to call phy_start() is all synchronous.
> That causes __phy_resume() to be called.
>
> Which PHY device/driver is being used?
Looks like it is this Broadcom driver ...
Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
Thanks
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 20:05 ` Jon Hunter
@ 2025-02-19 20:57 ` Russell King (Oracle)
2025-02-25 14:21 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-19 20:57 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Wed, Feb 19, 2025 at 08:05:57PM +0000, Jon Hunter wrote:
> On 19/02/2025 19:13, Russell King (Oracle) wrote:
> > On Wed, Feb 19, 2025 at 05:52:34PM +0000, Jon Hunter wrote:
> > > On 19/02/2025 15:36, Russell King (Oracle) wrote:
> > > > So clearly the phylink resolver is racing with the rest of the stmmac
> > > > resume path - which doesn't surprise me in the least. I believe I raised
> > > > the fact that calling phylink_resume() before the hardware was ready to
> > > > handle link-up is a bad idea precisely because of races like this.
> > > >
> > > > The reason stmmac does this is because of it's quirk that it needs the
> > > > receive clock from the PHY in order for stmmac_reset() to work.
> > >
> > > I do see the reset fail infrequently on previous kernels with this device
> > > and when it does I see these messages ...
> > >
> > > dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
> > > dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine
> > > initialization failed
> >
> > I wonder whether it's also racing with phylib, but phylink_resume()
> > calling phylink_start() going in to call phy_start() is all synchronous.
> > That causes __phy_resume() to be called.
> >
> > Which PHY device/driver is being used?
>
>
> Looks like it is this Broadcom driver ...
>
> Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
I don't see anything special happening in the PHY driver - it doesn't
implement suspend/resume/config_aneg methods, so there's nothing going
on with clocks in that driver beyond generic stuff.
So, let's try something (I haven't tested this, and its likely you
will need to work it in to your other change.)
Essentially, this disables the receive clock stop around the reset,
something the stmmac driver has never done in the past.
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 1cbea627b216..8e975863a2e3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
rtnl_lock();
mutex_lock(&priv->lock);
+ phy_eee_rx_clock_stop(priv->dev->phydev, false);
+
stmmac_reset_queues_param(priv);
stmmac_free_tx_skbufs(priv);
@@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
+ phy_eee_rx_clock_stop(priv->dev->phydev,
+ priv->phylink_config.eee_rx_clk_stop_enable);
+
stmmac_enable_all_queues(priv);
stmmac_enable_all_dma_irq(priv);
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-19 20:57 ` Russell King (Oracle)
@ 2025-02-25 14:21 ` Jon Hunter
2025-02-26 10:02 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-25 14:21 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
Hi Russell,
On 19/02/2025 20:57, Russell King (Oracle) wrote:
> On Wed, Feb 19, 2025 at 08:05:57PM +0000, Jon Hunter wrote:
>> On 19/02/2025 19:13, Russell King (Oracle) wrote:
>>> On Wed, Feb 19, 2025 at 05:52:34PM +0000, Jon Hunter wrote:
>>>> On 19/02/2025 15:36, Russell King (Oracle) wrote:
>>>>> So clearly the phylink resolver is racing with the rest of the stmmac
>>>>> resume path - which doesn't surprise me in the least. I believe I raised
>>>>> the fact that calling phylink_resume() before the hardware was ready to
>>>>> handle link-up is a bad idea precisely because of races like this.
>>>>>
>>>>> The reason stmmac does this is because of it's quirk that it needs the
>>>>> receive clock from the PHY in order for stmmac_reset() to work.
>>>>
>>>> I do see the reset fail infrequently on previous kernels with this device
>>>> and when it does I see these messages ...
>>>>
>>>> dwc-eth-dwmac 2490000.ethernet: Failed to reset the dma
>>>> dwc-eth-dwmac 2490000.ethernet eth0: stmmac_hw_setup: DMA engine
>>>> initialization failed
>>>
>>> I wonder whether it's also racing with phylib, but phylink_resume()
>>> calling phylink_start() going in to call phy_start() is all synchronous.
>>> That causes __phy_resume() to be called.
>>>
>>> Which PHY device/driver is being used?
>>
>>
>> Looks like it is this Broadcom driver ...
>>
>> Broadcom BCM89610 stmmac-0:00: phy_eee_rx_clock_stop: clk_stop_enable 1
>
> I don't see anything special happening in the PHY driver - it doesn't
> implement suspend/resume/config_aneg methods, so there's nothing going
> on with clocks in that driver beyond generic stuff.
>
> So, let's try something (I haven't tested this, and its likely you
> will need to work it in to your other change.)
>
> Essentially, this disables the receive clock stop around the reset,
> something the stmmac driver has never done in the past.
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 1cbea627b216..8e975863a2e3 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
> rtnl_lock();
> mutex_lock(&priv->lock);
>
> + phy_eee_rx_clock_stop(priv->dev->phydev, false);
> +
> stmmac_reset_queues_param(priv);
>
> stmmac_free_tx_skbufs(priv);
> @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
>
> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
>
> + phy_eee_rx_clock_stop(priv->dev->phydev,
> + priv->phylink_config.eee_rx_clk_stop_enable);
> +
> stmmac_enable_all_queues(priv);
> stmmac_enable_all_dma_irq(priv);
>
Sorry for the delay, I have been testing various issues recently and
needed a bit more time to test this.
It turns out that what I had proposed last week does not work. I believe
that with all the various debug/instrumentation I had added, I was again
getting lucky. So when I tested again this week on top of vanilla
v6.14-rc2, it did not work :-(
However, what you are suggesting above, all by itself, is working. I
have tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is
working reliably. I have also tested on some other boards that use the
same stmmac driver (but use the Aquantia PHY) and I have not seen any
issues. So this does fix the issue I am seeing.
I know we are getting quite late in the rc for v6.14, but not sure if we
could add this as a fix?
Thanks!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-25 14:21 ` Jon Hunter
@ 2025-02-26 10:02 ` Russell King (Oracle)
2025-02-26 10:11 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-26 10:02 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
> Hi Russell,
>
> On 19/02/2025 20:57, Russell King (Oracle) wrote:
> > So, let's try something (I haven't tested this, and its likely you
> > will need to work it in to your other change.)
> >
> > Essentially, this disables the receive clock stop around the reset,
> > something the stmmac driver has never done in the past.
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index 1cbea627b216..8e975863a2e3 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
> > rtnl_lock();
> > mutex_lock(&priv->lock);
> > + phy_eee_rx_clock_stop(priv->dev->phydev, false);
> > +
> > stmmac_reset_queues_param(priv);
> > stmmac_free_tx_skbufs(priv);
> > @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
> > stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
> > + phy_eee_rx_clock_stop(priv->dev->phydev,
> > + priv->phylink_config.eee_rx_clk_stop_enable);
> > +
> > stmmac_enable_all_queues(priv);
> > stmmac_enable_all_dma_irq(priv);
>
>
> Sorry for the delay, I have been testing various issues recently and needed
> a bit more time to test this.
>
> It turns out that what I had proposed last week does not work. I believe
> that with all the various debug/instrumentation I had added, I was again
> getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
> it did not work :-(
>
> However, what you are suggesting above, all by itself, is working. I have
> tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
> reliably. I have also tested on some other boards that use the same stmmac
> driver (but use the Aquantia PHY) and I have not seen any issues. So this
> does fix the issue I am seeing.
>
> I know we are getting quite late in the rc for v6.14, but not sure if we
> could add this as a fix?
The patch above was something of a hack, bypassing the layering, so I
would like to consider how this should be done properly.
I'm still wondering whether the early call to phylink_resume() is
symptomatic of this same issue, or whether there is a PHY that needs
phy_start() to be called to output its clock even with link down that
we don't know about.
The phylink_resume() call is relevant to this because I'd like to put:
phy_eee_rx_clock_stop(priv->dev->phydev,
priv->phylink_config.eee_rx_clk_stop_enable);
in there to ensure that the PHY is correctly configured for clock-stop,
but given stmmac's placement that wouldn't work.
I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
at the PHY.
I think the only thing we could do is try solving this problem as per
above and see what the fall-out from it is. I don't get the impression
that stmmac users are particularly active at testing patches though, so
it may take months to get breakage reports.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 10:02 ` Russell King (Oracle)
@ 2025-02-26 10:11 ` Jon Hunter
2025-02-26 10:59 ` Russell King (Oracle)
2025-02-26 11:37 ` Russell King (Oracle)
0 siblings, 2 replies; 41+ messages in thread
From: Jon Hunter @ 2025-02-26 10:11 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 26/02/2025 10:02, Russell King (Oracle) wrote:
> On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
>> Hi Russell,
>>
>> On 19/02/2025 20:57, Russell King (Oracle) wrote:
>>> So, let's try something (I haven't tested this, and its likely you
>>> will need to work it in to your other change.)
>>>
>>> Essentially, this disables the receive clock stop around the reset,
>>> something the stmmac driver has never done in the past.
>>>
>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>> index 1cbea627b216..8e975863a2e3 100644
>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>> @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
>>> rtnl_lock();
>>> mutex_lock(&priv->lock);
>>> + phy_eee_rx_clock_stop(priv->dev->phydev, false);
>>> +
>>> stmmac_reset_queues_param(priv);
>>> stmmac_free_tx_skbufs(priv);
>>> @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
>>> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
>>> + phy_eee_rx_clock_stop(priv->dev->phydev,
>>> + priv->phylink_config.eee_rx_clk_stop_enable);
>>> +
>>> stmmac_enable_all_queues(priv);
>>> stmmac_enable_all_dma_irq(priv);
>>
>>
>> Sorry for the delay, I have been testing various issues recently and needed
>> a bit more time to test this.
>>
>> It turns out that what I had proposed last week does not work. I believe
>> that with all the various debug/instrumentation I had added, I was again
>> getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
>> it did not work :-(
>>
>> However, what you are suggesting above, all by itself, is working. I have
>> tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
>> reliably. I have also tested on some other boards that use the same stmmac
>> driver (but use the Aquantia PHY) and I have not seen any issues. So this
>> does fix the issue I am seeing.
>>
>> I know we are getting quite late in the rc for v6.14, but not sure if we
>> could add this as a fix?
>
> The patch above was something of a hack, bypassing the layering, so I
> would like to consider how this should be done properly.
>
> I'm still wondering whether the early call to phylink_resume() is
> symptomatic of this same issue, or whether there is a PHY that needs
> phy_start() to be called to output its clock even with link down that
> we don't know about.
>
> The phylink_resume() call is relevant to this because I'd like to put:
>
> phy_eee_rx_clock_stop(priv->dev->phydev,
> priv->phylink_config.eee_rx_clk_stop_enable);
>
> in there to ensure that the PHY is correctly configured for clock-stop,
> but given stmmac's placement that wouldn't work.
>
> I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
> at the PHY.
>
> I think the only thing we could do is try solving this problem as per
> above and see what the fall-out from it is. I don't get the impression
> that stmmac users are particularly active at testing patches though, so
> it may take months to get breakage reports.
We can ask Furong to test as he seems to active and making changes, but
otherwise I am not sure how well it is being tested across various
devices. On the other hand, it feels like there are still lingering
issues like this with the driver and so I would hope this is moving in
the right direction.
Let me know if you have a patch you want me to test and I will run in on
our Tegra186, Tegra194 and Tegra234 devices that all use this.
Thanks
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 10:11 ` Jon Hunter
@ 2025-02-26 10:59 ` Russell King (Oracle)
2025-02-26 15:55 ` Jon Hunter
2025-02-26 11:37 ` Russell King (Oracle)
1 sibling, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-26 10:59 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
>
> On 26/02/2025 10:02, Russell King (Oracle) wrote:
> > On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
> > > Hi Russell,
> > >
> > > On 19/02/2025 20:57, Russell King (Oracle) wrote:
> > > > So, let's try something (I haven't tested this, and its likely you
> > > > will need to work it in to your other change.)
> > > >
> > > > Essentially, this disables the receive clock stop around the reset,
> > > > something the stmmac driver has never done in the past.
> > > >
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > index 1cbea627b216..8e975863a2e3 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
> > > > rtnl_lock();
> > > > mutex_lock(&priv->lock);
> > > > + phy_eee_rx_clock_stop(priv->dev->phydev, false);
> > > > +
> > > > stmmac_reset_queues_param(priv);
> > > > stmmac_free_tx_skbufs(priv);
> > > > @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
> > > > stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
> > > > + phy_eee_rx_clock_stop(priv->dev->phydev,
> > > > + priv->phylink_config.eee_rx_clk_stop_enable);
> > > > +
> > > > stmmac_enable_all_queues(priv);
> > > > stmmac_enable_all_dma_irq(priv);
> > >
> > >
> > > Sorry for the delay, I have been testing various issues recently and needed
> > > a bit more time to test this.
> > >
> > > It turns out that what I had proposed last week does not work. I believe
> > > that with all the various debug/instrumentation I had added, I was again
> > > getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
> > > it did not work :-(
> > >
> > > However, what you are suggesting above, all by itself, is working. I have
> > > tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
> > > reliably. I have also tested on some other boards that use the same stmmac
> > > driver (but use the Aquantia PHY) and I have not seen any issues. So this
> > > does fix the issue I am seeing.
> > >
> > > I know we are getting quite late in the rc for v6.14, but not sure if we
> > > could add this as a fix?
> >
> > The patch above was something of a hack, bypassing the layering, so I
> > would like to consider how this should be done properly.
> >
> > I'm still wondering whether the early call to phylink_resume() is
> > symptomatic of this same issue, or whether there is a PHY that needs
> > phy_start() to be called to output its clock even with link down that
> > we don't know about.
> >
> > The phylink_resume() call is relevant to this because I'd like to put:
> >
> > phy_eee_rx_clock_stop(priv->dev->phydev,
> > priv->phylink_config.eee_rx_clk_stop_enable);
> >
> > in there to ensure that the PHY is correctly configured for clock-stop,
> > but given stmmac's placement that wouldn't work.
> >
> > I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
> > at the PHY.
> >
> > I think the only thing we could do is try solving this problem as per
> > above and see what the fall-out from it is. I don't get the impression
> > that stmmac users are particularly active at testing patches though, so
> > it may take months to get breakage reports.
>
>
> We can ask Furong to test as he seems to active and making changes, but
> otherwise I am not sure how well it is being tested across various devices.
> On the other hand, it feels like there are still lingering issues like this
> with the driver and so I would hope this is moving in the right direction.
>
> Let me know if you have a patch you want me to test and I will run in on our
> Tegra186, Tegra194 and Tegra234 devices that all use this.
Do we think this needs to be a patch for the net tree or the net-next
tree? I think we've established that it's been a long-standing bug,
so maybe if we target net-next to give it more time to be tested?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 10:11 ` Jon Hunter
2025-02-26 10:59 ` Russell King (Oracle)
@ 2025-02-26 11:37 ` Russell King (Oracle)
2025-02-26 17:24 ` Jon Hunter
1 sibling, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-26 11:37 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1894 bytes --]
On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
> On 26/02/2025 10:02, Russell King (Oracle) wrote:
> > The patch above was something of a hack, bypassing the layering, so I
> > would like to consider how this should be done properly.
> >
> > I'm still wondering whether the early call to phylink_resume() is
> > symptomatic of this same issue, or whether there is a PHY that needs
> > phy_start() to be called to output its clock even with link down that
> > we don't know about.
> >
> > The phylink_resume() call is relevant to this because I'd like to put:
> >
> > phy_eee_rx_clock_stop(priv->dev->phydev,
> > priv->phylink_config.eee_rx_clk_stop_enable);
> >
> > in there to ensure that the PHY is correctly configured for clock-stop,
> > but given stmmac's placement that wouldn't work.
> >
> > I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
> > at the PHY.
> >
> > I think the only thing we could do is try solving this problem as per
> > above and see what the fall-out from it is. I don't get the impression
> > that stmmac users are particularly active at testing patches though, so
> > it may take months to get breakage reports.
>
>
> We can ask Furong to test as he seems to active and making changes, but
> otherwise I am not sure how well it is being tested across various devices.
> On the other hand, it feels like there are still lingering issues like this
> with the driver and so I would hope this is moving in the right direction.
>
> Let me know if you have a patch you want me to test and I will run in on our
> Tegra186, Tegra194 and Tegra234 devices that all use this.
The attached patches shows what I'm thinking of - it's just been roughed
out, and only been build tested.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
[-- Attachment #2: 0001-net-phylink-add-config-of-PHY-receive-clock-stop-in-.patch --]
[-- Type: text/x-diff, Size: 1723 bytes --]
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 1/5] net: phylink: add config of PHY receive
clock-stop in phylink_resume()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/phy/phylink.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index a3b186ab3854..0aae0bb2a254 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2264,7 +2264,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy,
pl->mac_tx_clk_stop = phy_eee_tx_clock_stop_capable(phy) > 0;
if (pl->mac_supports_eee_ops) {
- /* Explicitly configure whether the PHY is allowed to stop it's
+ /* Explicitly configure whether the PHY is allowed to stop its
* receive clock.
*/
ret = phy_eee_rx_clock_stop(phy,
@@ -2645,8 +2645,22 @@ EXPORT_SYMBOL_GPL(phylink_suspend);
*/
void phylink_resume(struct phylink *pl)
{
+ int ret;
+
ASSERT_RTNL();
+ if (pl->mac_supports_eee_ops && pl->phydev) {
+ /* Explicitly configure whether the PHY is allowed to stop its
+ * receive clock on resume to ensure that it is correctly
+ * configured.
+ */
+ ret = phy_eee_rx_clock_stop(pl->phydev,
+ pl->config->eee_rx_clk_stop_enable);
+ if (ret == -EOPNOTSUPP)
+ phylink_warn(pl, "failed to set rx clock stop: %pe\n",
+ ERR_PTR(ret));
+ }
+
if (test_bit(PHYLINK_DISABLE_MAC_WOL, &pl->phylink_disable_state)) {
/* Wake-on-Lan enabled, MAC handling */
--
2.30.2
[-- Attachment #3: 0002-net-phylink-add-phylink_prepare_resume.patch --]
[-- Type: text/x-diff, Size: 2365 bytes --]
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 2/5] net: phylink: add phylink_prepare_resume()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
Add a resume preparation function, which will ensure that the receive
clock from the PHY is appropriately configured while resuming.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/phy/phylink.c | 24 ++++++++++++++++++++++++
include/linux/phylink.h | 1 +
2 files changed, 25 insertions(+)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 0aae0bb2a254..976e569feb70 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2636,6 +2636,30 @@ void phylink_suspend(struct phylink *pl, bool mac_wol)
}
EXPORT_SYMBOL_GPL(phylink_suspend);
+/**
+ * phylink_prepare_resume() - prepare to resume a network device
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Optional, thus must be called prior to phylink_resume().
+ *
+ * Prepare to resume a network device, preparing the PHY as necessary.
+ */
+void phylink_prepare_resume(struct phylink *pl)
+{
+ ASSERT_RTNL();
+
+ /* If the MAC requires the receive clock, but receive clock
+ * stop was enabled at the PHY, we need to ensure that the
+ * receive clock is running. Disable receive clock stop.
+ * phylink_resume() will re-enable it if necessary.
+ */
+ if (pl->mac_supports_eee_ops && pl->phydev &&
+ pl->config->mac_requires_rxc &&
+ pl->config->eee_rx_clk_stop_enable)
+ phy_eee_rx_clock_stop(pl->phydev, false);
+}
+EXPORT_SYMBOL_GPL(phylink_prepare_resume);
+
/**
* phylink_resume() - handle a network device resume event
* @pl: a pointer to a &struct phylink returned from phylink_create()
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 08df65f6867a..071ed4683c8c 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -699,6 +699,7 @@ void phylink_start(struct phylink *);
void phylink_stop(struct phylink *);
void phylink_suspend(struct phylink *pl, bool mac_wol);
+void phylink_prepare_resume(struct phylink *pl);
void phylink_resume(struct phylink *pl);
void phylink_ethtool_get_wol(struct phylink *, struct ethtool_wolinfo *);
--
2.30.2
[-- Attachment #4: 0003-net-stmmac-move-phylink_resume-after-resume-setup-is.patch --]
[-- Type: text/x-diff, Size: 1800 bytes --]
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 3/5] net: stmmac: move phylink_resume() after resume
setup is complete
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
Move phylink_resume() to be after the setup in stmmac_resume() has
completed, as phylink_resume() may result in an immediate call to the
.mac_link_up method, which will enable the transmitter and receiver,
and enable the transmit queues.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index d552e64eaa90..b9f651d77c4f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7926,16 +7926,6 @@ int stmmac_resume(struct device *dev)
return ret;
}
- rtnl_lock();
- if (device_may_wakeup(priv->device) && priv->plat->pmt) {
- phylink_resume(priv->phylink);
- } else {
- phylink_resume(priv->phylink);
- if (device_may_wakeup(priv->device))
- phylink_speed_up(priv->phylink);
- }
- rtnl_unlock();
-
rtnl_lock();
mutex_lock(&priv->lock);
@@ -7954,6 +7944,15 @@ int stmmac_resume(struct device *dev)
stmmac_enable_all_dma_irq(priv);
mutex_unlock(&priv->lock);
+
+ if (device_may_wakeup(priv->device) && priv->plat->pmt) {
+ phylink_resume(priv->phylink);
+ } else {
+ phylink_resume(priv->phylink);
+ if (device_may_wakeup(priv->device))
+ phylink_speed_up(priv->phylink);
+ }
+
rtnl_unlock();
netif_device_attach(ndev);
--
2.30.2
[-- Attachment #5: 0004-net-stmmac-simplify-calls-to-phylink_suspend-and-phy.patch --]
[-- Type: text/x-diff, Size: 2370 bytes --]
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 4/5] net: stmmac: simplify calls to phylink_suspend()
and phylink_resume()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
Currently, the calls to phylink's suspend and resume functions are
inside overly complex tests, and boil down to:
if (device_may_wakeup(priv->device) && priv->plat->pmt) {
call phylink
} else {
call phylink and
if (device_may_wakeup(priv->device))
do something else
}
This results in phylink always being called, possibly with differing
arguments for phylink_suspend().
Simplify this code, noting that each site is slightly different due to
the order in which phylink is called and the "something else".
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 22 +++++++------------
1 file changed, 8 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index b9f651d77c4f..262718e5c4f3 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7831,13 +7831,11 @@ int stmmac_suspend(struct device *dev)
mutex_unlock(&priv->lock);
rtnl_lock();
- if (device_may_wakeup(priv->device) && priv->plat->pmt) {
- phylink_suspend(priv->phylink, true);
- } else {
- if (device_may_wakeup(priv->device))
- phylink_speed_down(priv->phylink, false);
- phylink_suspend(priv->phylink, false);
- }
+ if (device_may_wakeup(priv->device) && !priv->plat->pmt)
+ phylink_speed_down(priv->phylink, false);
+
+ phylink_suspend(priv->phylink,
+ device_may_wakeup(priv->device) && priv->plat->pmt);
rtnl_unlock();
if (stmmac_fpe_supported(priv))
@@ -7945,13 +7943,9 @@ int stmmac_resume(struct device *dev)
mutex_unlock(&priv->lock);
- if (device_may_wakeup(priv->device) && priv->plat->pmt) {
- phylink_resume(priv->phylink);
- } else {
- phylink_resume(priv->phylink);
- if (device_may_wakeup(priv->device))
- phylink_speed_up(priv->phylink);
- }
+ phylink_resume(priv->phylink);
+ if (device_may_wakeup(priv->device) && !priv->plat->pmt)
+ phylink_speed_up(priv->phylink);
rtnl_unlock();
--
2.30.2
[-- Attachment #6: 0005-net-stmmac-call-phylink_prepare_resume.patch --]
[-- Type: text/x-diff, Size: 1473 bytes --]
From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 5/5] net: stmmac: call phylink_prepare_resume()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"
The stmmac core needs the receive clock to be running in order to
complete its software triggered reset. However, the media link may
be in EEE low-power mode, and as the driver allows the PHY receive
clock to be disabled, the receive clock may not be present while
resuming. This has been observed with Tegra 186.
Fix this by using the newly provided phylink_prepare_resume() to
temporarily disable receive clock stop while resuming. phylink_resume()
will restore the receive clock stop setting according to the
configuration passed from the netdev driver.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 262718e5c4f3..31ec1818211d 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -7925,6 +7925,8 @@ int stmmac_resume(struct device *dev)
}
rtnl_lock();
+ phylink_prepare_resume(priv->phylink);
+
mutex_lock(&priv->lock);
stmmac_reset_queues_param(priv);
--
2.30.2
^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 10:59 ` Russell King (Oracle)
@ 2025-02-26 15:55 ` Jon Hunter
2025-02-26 16:00 ` Russell King (Oracle)
0 siblings, 1 reply; 41+ messages in thread
From: Jon Hunter @ 2025-02-26 15:55 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 26/02/2025 10:59, Russell King (Oracle) wrote:
> On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
>>
>> On 26/02/2025 10:02, Russell King (Oracle) wrote:
>>> On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
>>>> Hi Russell,
>>>>
>>>> On 19/02/2025 20:57, Russell King (Oracle) wrote:
>>>>> So, let's try something (I haven't tested this, and its likely you
>>>>> will need to work it in to your other change.)
>>>>>
>>>>> Essentially, this disables the receive clock stop around the reset,
>>>>> something the stmmac driver has never done in the past.
>>>>>
>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> index 1cbea627b216..8e975863a2e3 100644
>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>> @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
>>>>> rtnl_lock();
>>>>> mutex_lock(&priv->lock);
>>>>> + phy_eee_rx_clock_stop(priv->dev->phydev, false);
>>>>> +
>>>>> stmmac_reset_queues_param(priv);
>>>>> stmmac_free_tx_skbufs(priv);
>>>>> @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
>>>>> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
>>>>> + phy_eee_rx_clock_stop(priv->dev->phydev,
>>>>> + priv->phylink_config.eee_rx_clk_stop_enable);
>>>>> +
>>>>> stmmac_enable_all_queues(priv);
>>>>> stmmac_enable_all_dma_irq(priv);
>>>>
>>>>
>>>> Sorry for the delay, I have been testing various issues recently and needed
>>>> a bit more time to test this.
>>>>
>>>> It turns out that what I had proposed last week does not work. I believe
>>>> that with all the various debug/instrumentation I had added, I was again
>>>> getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
>>>> it did not work :-(
>>>>
>>>> However, what you are suggesting above, all by itself, is working. I have
>>>> tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
>>>> reliably. I have also tested on some other boards that use the same stmmac
>>>> driver (but use the Aquantia PHY) and I have not seen any issues. So this
>>>> does fix the issue I am seeing.
>>>>
>>>> I know we are getting quite late in the rc for v6.14, but not sure if we
>>>> could add this as a fix?
>>>
>>> The patch above was something of a hack, bypassing the layering, so I
>>> would like to consider how this should be done properly.
>>>
>>> I'm still wondering whether the early call to phylink_resume() is
>>> symptomatic of this same issue, or whether there is a PHY that needs
>>> phy_start() to be called to output its clock even with link down that
>>> we don't know about.
>>>
>>> The phylink_resume() call is relevant to this because I'd like to put:
>>>
>>> phy_eee_rx_clock_stop(priv->dev->phydev,
>>> priv->phylink_config.eee_rx_clk_stop_enable);
>>>
>>> in there to ensure that the PHY is correctly configured for clock-stop,
>>> but given stmmac's placement that wouldn't work.
>>>
>>> I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
>>> at the PHY.
>>>
>>> I think the only thing we could do is try solving this problem as per
>>> above and see what the fall-out from it is. I don't get the impression
>>> that stmmac users are particularly active at testing patches though, so
>>> it may take months to get breakage reports.
>>
>>
>> We can ask Furong to test as he seems to active and making changes, but
>> otherwise I am not sure how well it is being tested across various devices.
>> On the other hand, it feels like there are still lingering issues like this
>> with the driver and so I would hope this is moving in the right direction.
>>
>> Let me know if you have a patch you want me to test and I will run in on our
>> Tegra186, Tegra194 and Tegra234 devices that all use this.
>
> Do we think this needs to be a patch for the net tree or the net-next
> tree? I think we've established that it's been a long-standing bug,
> so maybe if we target net-next to give it more time to be tested?
>
Yes I agree there is a long-standing issue here. What is unfortunate for
Linux v6.14 is that failure rate is much higher. However, I don't see
what I can really do about that. I can mark suspend as broken for Linux
v6.14 for this device and then hopefully we will get this resolved
properly.
Thanks!
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 15:55 ` Jon Hunter
@ 2025-02-26 16:00 ` Russell King (Oracle)
2025-02-26 16:06 ` Jon Hunter
0 siblings, 1 reply; 41+ messages in thread
From: Russell King (Oracle) @ 2025-02-26 16:00 UTC (permalink / raw)
To: Jon Hunter
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On Wed, Feb 26, 2025 at 03:55:47PM +0000, Jon Hunter wrote:
>
> On 26/02/2025 10:59, Russell King (Oracle) wrote:
> > On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
> > >
> > > On 26/02/2025 10:02, Russell King (Oracle) wrote:
> > > > On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
> > > > > Hi Russell,
> > > > >
> > > > > On 19/02/2025 20:57, Russell King (Oracle) wrote:
> > > > > > So, let's try something (I haven't tested this, and its likely you
> > > > > > will need to work it in to your other change.)
> > > > > >
> > > > > > Essentially, this disables the receive clock stop around the reset,
> > > > > > something the stmmac driver has never done in the past.
> > > > > >
> > > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > > > index 1cbea627b216..8e975863a2e3 100644
> > > > > > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > > > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > > > > > @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
> > > > > > rtnl_lock();
> > > > > > mutex_lock(&priv->lock);
> > > > > > + phy_eee_rx_clock_stop(priv->dev->phydev, false);
> > > > > > +
> > > > > > stmmac_reset_queues_param(priv);
> > > > > > stmmac_free_tx_skbufs(priv);
> > > > > > @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
> > > > > > stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
> > > > > > + phy_eee_rx_clock_stop(priv->dev->phydev,
> > > > > > + priv->phylink_config.eee_rx_clk_stop_enable);
> > > > > > +
> > > > > > stmmac_enable_all_queues(priv);
> > > > > > stmmac_enable_all_dma_irq(priv);
> > > > >
> > > > >
> > > > > Sorry for the delay, I have been testing various issues recently and needed
> > > > > a bit more time to test this.
> > > > >
> > > > > It turns out that what I had proposed last week does not work. I believe
> > > > > that with all the various debug/instrumentation I had added, I was again
> > > > > getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
> > > > > it did not work :-(
> > > > >
> > > > > However, what you are suggesting above, all by itself, is working. I have
> > > > > tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
> > > > > reliably. I have also tested on some other boards that use the same stmmac
> > > > > driver (but use the Aquantia PHY) and I have not seen any issues. So this
> > > > > does fix the issue I am seeing.
> > > > >
> > > > > I know we are getting quite late in the rc for v6.14, but not sure if we
> > > > > could add this as a fix?
> > > >
> > > > The patch above was something of a hack, bypassing the layering, so I
> > > > would like to consider how this should be done properly.
> > > >
> > > > I'm still wondering whether the early call to phylink_resume() is
> > > > symptomatic of this same issue, or whether there is a PHY that needs
> > > > phy_start() to be called to output its clock even with link down that
> > > > we don't know about.
> > > >
> > > > The phylink_resume() call is relevant to this because I'd like to put:
> > > >
> > > > phy_eee_rx_clock_stop(priv->dev->phydev,
> > > > priv->phylink_config.eee_rx_clk_stop_enable);
> > > >
> > > > in there to ensure that the PHY is correctly configured for clock-stop,
> > > > but given stmmac's placement that wouldn't work.
> > > >
> > > > I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
> > > > at the PHY.
> > > >
> > > > I think the only thing we could do is try solving this problem as per
> > > > above and see what the fall-out from it is. I don't get the impression
> > > > that stmmac users are particularly active at testing patches though, so
> > > > it may take months to get breakage reports.
> > >
> > >
> > > We can ask Furong to test as he seems to active and making changes, but
> > > otherwise I am not sure how well it is being tested across various devices.
> > > On the other hand, it feels like there are still lingering issues like this
> > > with the driver and so I would hope this is moving in the right direction.
> > >
> > > Let me know if you have a patch you want me to test and I will run in on our
> > > Tegra186, Tegra194 and Tegra234 devices that all use this.
> >
> > Do we think this needs to be a patch for the net tree or the net-next
> > tree? I think we've established that it's been a long-standing bug,
> > so maybe if we target net-next to give it more time to be tested?
> >
>
> Yes I agree there is a long-standing issue here. What is unfortunate for
> Linux v6.14 is that failure rate is much higher. However, I don't see what I
> can really do about that. I can mark suspend as broken for Linux v6.14 for
> this device and then hopefully we will get this resolved properly.
If we put the patches in net-next, it can have longer to be tested - it
won't go straight into 6.14, but will wait until after net-next gets
merged, and it'll then be backported to 6.14 stable trees.
I think the fix that I've outlined is too big and too risky to go
straight into 6.14, but the smaller fix may be better, but would then
need to be rewritten into the larger fix.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 16:00 ` Russell King (Oracle)
@ 2025-02-26 16:06 ` Jon Hunter
0 siblings, 0 replies; 41+ messages in thread
From: Jon Hunter @ 2025-02-26 16:06 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel, linux-stm32, Marcin Wojtas, Maxime Coquelin,
netdev, Paolo Abeni, UNGLinuxDriver, linux-tegra@vger.kernel.org
On 26/02/2025 16:00, Russell King (Oracle) wrote:
> On Wed, Feb 26, 2025 at 03:55:47PM +0000, Jon Hunter wrote:
>>
>> On 26/02/2025 10:59, Russell King (Oracle) wrote:
>>> On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
>>>>
>>>> On 26/02/2025 10:02, Russell King (Oracle) wrote:
>>>>> On Tue, Feb 25, 2025 at 02:21:01PM +0000, Jon Hunter wrote:
>>>>>> Hi Russell,
>>>>>>
>>>>>> On 19/02/2025 20:57, Russell King (Oracle) wrote:
>>>>>>> So, let's try something (I haven't tested this, and its likely you
>>>>>>> will need to work it in to your other change.)
>>>>>>>
>>>>>>> Essentially, this disables the receive clock stop around the reset,
>>>>>>> something the stmmac driver has never done in the past.
>>>>>>>
>>>>>>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> index 1cbea627b216..8e975863a2e3 100644
>>>>>>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>>>>>>> @@ -7926,6 +7926,8 @@ int stmmac_resume(struct device *dev)
>>>>>>> rtnl_lock();
>>>>>>> mutex_lock(&priv->lock);
>>>>>>> + phy_eee_rx_clock_stop(priv->dev->phydev, false);
>>>>>>> +
>>>>>>> stmmac_reset_queues_param(priv);
>>>>>>> stmmac_free_tx_skbufs(priv);
>>>>>>> @@ -7937,6 +7939,9 @@ int stmmac_resume(struct device *dev)
>>>>>>> stmmac_restore_hw_vlan_rx_fltr(priv, ndev, priv->hw);
>>>>>>> + phy_eee_rx_clock_stop(priv->dev->phydev,
>>>>>>> + priv->phylink_config.eee_rx_clk_stop_enable);
>>>>>>> +
>>>>>>> stmmac_enable_all_queues(priv);
>>>>>>> stmmac_enable_all_dma_irq(priv);
>>>>>>
>>>>>>
>>>>>> Sorry for the delay, I have been testing various issues recently and needed
>>>>>> a bit more time to test this.
>>>>>>
>>>>>> It turns out that what I had proposed last week does not work. I believe
>>>>>> that with all the various debug/instrumentation I had added, I was again
>>>>>> getting lucky. So when I tested again this week on top of vanilla v6.14-rc2,
>>>>>> it did not work :-(
>>>>>>
>>>>>> However, what you are suggesting above, all by itself, is working. I have
>>>>>> tested this on top of vanilla v6.14-rc2 and v6.14-rc4 and it is working
>>>>>> reliably. I have also tested on some other boards that use the same stmmac
>>>>>> driver (but use the Aquantia PHY) and I have not seen any issues. So this
>>>>>> does fix the issue I am seeing.
>>>>>>
>>>>>> I know we are getting quite late in the rc for v6.14, but not sure if we
>>>>>> could add this as a fix?
>>>>>
>>>>> The patch above was something of a hack, bypassing the layering, so I
>>>>> would like to consider how this should be done properly.
>>>>>
>>>>> I'm still wondering whether the early call to phylink_resume() is
>>>>> symptomatic of this same issue, or whether there is a PHY that needs
>>>>> phy_start() to be called to output its clock even with link down that
>>>>> we don't know about.
>>>>>
>>>>> The phylink_resume() call is relevant to this because I'd like to put:
>>>>>
>>>>> phy_eee_rx_clock_stop(priv->dev->phydev,
>>>>> priv->phylink_config.eee_rx_clk_stop_enable);
>>>>>
>>>>> in there to ensure that the PHY is correctly configured for clock-stop,
>>>>> but given stmmac's placement that wouldn't work.
>>>>>
>>>>> I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
>>>>> at the PHY.
>>>>>
>>>>> I think the only thing we could do is try solving this problem as per
>>>>> above and see what the fall-out from it is. I don't get the impression
>>>>> that stmmac users are particularly active at testing patches though, so
>>>>> it may take months to get breakage reports.
>>>>
>>>>
>>>> We can ask Furong to test as he seems to active and making changes, but
>>>> otherwise I am not sure how well it is being tested across various devices.
>>>> On the other hand, it feels like there are still lingering issues like this
>>>> with the driver and so I would hope this is moving in the right direction.
>>>>
>>>> Let me know if you have a patch you want me to test and I will run in on our
>>>> Tegra186, Tegra194 and Tegra234 devices that all use this.
>>>
>>> Do we think this needs to be a patch for the net tree or the net-next
>>> tree? I think we've established that it's been a long-standing bug,
>>> so maybe if we target net-next to give it more time to be tested?
>>>
>>
>> Yes I agree there is a long-standing issue here. What is unfortunate for
>> Linux v6.14 is that failure rate is much higher. However, I don't see what I
>> can really do about that. I can mark suspend as broken for Linux v6.14 for
>> this device and then hopefully we will get this resolved properly.
>
> If we put the patches in net-next, it can have longer to be tested - it
> won't go straight into 6.14, but will wait until after net-next gets
> merged, and it'll then be backported to 6.14 stable trees.
Yes that would be great.
> I think the fix that I've outlined is too big and too risky to go
> straight into 6.14, but the smaller fix may be better, but would then
> need to be rewritten into the larger fix.
I think it is fine and better to get it fixed for the long term.
Thanks
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support
2025-02-26 11:37 ` Russell King (Oracle)
@ 2025-02-26 17:24 ` Jon Hunter
0 siblings, 0 replies; 41+ messages in thread
From: Jon Hunter @ 2025-02-26 17:24 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Alexandre Torgue, Andrew Lunn,
Bryan Whitehead, David S. Miller, Eric Dumazet, Jakub Kicinski,
linux-arm-kernel@lists.infradead.org,
linux-stm32@st-md-mailman.stormreply.com, Marcin Wojtas,
Maxime Coquelin, netdev@vger.kernel.org, Paolo Abeni,
UNGLinuxDriver@microchip.com, linux-tegra@vger.kernel.org
On 26/02/2025 11:37, Russell King (Oracle) wrote:
> On Wed, Feb 26, 2025 at 10:11:58AM +0000, Jon Hunter wrote:
>> On 26/02/2025 10:02, Russell King (Oracle) wrote:
>>> The patch above was something of a hack, bypassing the layering, so I
>>> would like to consider how this should be done properly.
>>>
>>> I'm still wondering whether the early call to phylink_resume() is
>>> symptomatic of this same issue, or whether there is a PHY that needs
>>> phy_start() to be called to output its clock even with link down that
>>> we don't know about.
>>>
>>> The phylink_resume() call is relevant to this because I'd like to put:
>>>
>>> phy_eee_rx_clock_stop(priv->dev->phydev,
>>> priv->phylink_config.eee_rx_clk_stop_enable);
>>>
>>> in there to ensure that the PHY is correctly configured for clock-stop,
>>> but given stmmac's placement that wouldn't work.
>>>
>>> I'm then thinking of phylink_pre_resume() to disable the EEE clock-stop
>>> at the PHY.
>>>
>>> I think the only thing we could do is try solving this problem as per
>>> above and see what the fall-out from it is. I don't get the impression
>>> that stmmac users are particularly active at testing patches though, so
>>> it may take months to get breakage reports.
>>
>>
>> We can ask Furong to test as he seems to active and making changes, but
>> otherwise I am not sure how well it is being tested across various devices.
>> On the other hand, it feels like there are still lingering issues like this
>> with the driver and so I would hope this is moving in the right direction.
>>
>> Let me know if you have a patch you want me to test and I will run in on our
>> Tegra186, Tegra194 and Tegra234 devices that all use this.
>
> The attached patches shows what I'm thinking of - it's just been roughed
> out, and only been build tested.
I have tested these patches on Tegra186, Tegra194 and Tegra234 that they
are all working fine.
Thanks
Jon
--
nvpublic
^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2025-02-26 17:24 UTC | newest]
Thread overview: 41+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-15 20:42 [PATCH net-next 0/9] net: add phylink managed EEE support Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 1/9] net: mdio: add definition for clock stop capable bit Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 2/9] net: phy: add support for querying PHY clock stop capability Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 3/9] net: phylink: add phylink_link_is_up() helper Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 4/9] net: phylink: add EEE management Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 5/9] net: mvneta: convert to phylink EEE implementation Russell King (Oracle)
2025-01-15 20:42 ` [PATCH net-next 6/9] net: mvpp2: add " Russell King (Oracle)
2025-01-16 8:27 ` Maxime Chevallier
2025-01-15 20:42 ` [PATCH net-next 7/9] net: lan743x: use netdev in lan743x_phylink_mac_link_down() Russell King (Oracle)
2025-01-15 20:43 ` [PATCH net-next 8/9] net: lan743x: convert to phylink managed EEE Russell King (Oracle)
2025-01-15 20:43 ` [PATCH net-next 9/9] net: stmmac: convert to phylink managed EEE support Russell King (Oracle)
2025-02-13 11:05 ` Jon Hunter
2025-02-13 11:37 ` Russell King (Oracle)
2025-02-13 12:00 ` Russell King (Oracle)
2025-02-14 10:58 ` Jon Hunter
2025-02-14 11:21 ` Russell King (Oracle)
2025-02-14 17:03 ` Jon Hunter
2025-02-19 14:01 ` Jon Hunter
2025-02-19 15:36 ` Russell King (Oracle)
2025-02-19 17:52 ` Jon Hunter
2025-02-19 19:13 ` Russell King (Oracle)
2025-02-19 20:05 ` Jon Hunter
2025-02-19 20:57 ` Russell King (Oracle)
2025-02-25 14:21 ` Jon Hunter
2025-02-26 10:02 ` Russell King (Oracle)
2025-02-26 10:11 ` Jon Hunter
2025-02-26 10:59 ` Russell King (Oracle)
2025-02-26 15:55 ` Jon Hunter
2025-02-26 16:00 ` Russell King (Oracle)
2025-02-26 16:06 ` Jon Hunter
2025-02-26 11:37 ` Russell King (Oracle)
2025-02-26 17:24 ` Jon Hunter
2025-01-16 0:40 ` [PATCH net-next 0/9] net: add " Jacob Keller
2025-01-17 1:40 ` patchwork-bot+netdevbpf
2025-01-17 8:56 ` Jiawen Wu
2025-01-17 9:05 ` Russell King (Oracle)
2025-01-17 10:17 ` Jiawen Wu
2025-01-17 12:23 ` Russell King (Oracle)
2025-01-20 1:51 ` Jiawen Wu
2025-01-20 9:54 ` Russell King (Oracle)
2025-01-20 9:59 ` Jiawen Wu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).