* [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks
@ 2024-10-03 14:07 Vladimir Oltean
2024-12-12 14:21 ` Vladimir Oltean
2024-12-12 15:56 ` Russell King (Oracle)
0 siblings, 2 replies; 4+ messages in thread
From: Vladimir Oltean @ 2024-10-03 14:07 UTC (permalink / raw)
To: netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Russell King, Heiner Kallweit, Andrew Lunn, Florian Fainelli,
linux-kernel
sja1105_static_config_reload() changes major settings in the switch and
it requires a reset. A use case is to change things like Qdiscs (but see
sja1105_reset_reasons[] for full list) while PTP synchronization is
running, and the servo loop must not exit the locked state (s2).
Therefore, stopping and restarting the phylink instances of all ports is
not desirable, because that also stops the phylib state machine, and
retriggers a seconds-long auto-negotiation process that breaks PTP.
Thus, saving and restoring the link management settings is handled
privately by the driver.
The method got progressively more complex as SGMII support got added,
because this is handled through the xpcs phylink_pcs component, to which
we don't have unfettered access. Nonetheless, the switch reset line is
hardwired to also reset the XPCS, creating a situation where it loses
state and needs to be reprogrammed at a moment in time outside phylink's
control.
Although commits 907476c66d73 ("net: dsa: sja1105: call PCS
config/link_up via pcs_ops structure") and 41bf58314b17 ("net: dsa:
sja1105: use phylink_pcs internally") made the sja1105 <-> xpcs
interaction slightly prettier, we still depend heavily on the PCS being
"XPCS-like", because to back up its settings, we read the MII_BMCR
register, through a mdiobus_c45_read() operation, breaking all layering
separation.
But the phylink instance already has all that state, and more. It's just
that it's private. In this proposal, phylink offers 2 helpers for
walking the MAC and PCS drivers again through the callbacks required
during a destructive reset operation: mac_link_down() -> pcs_link_down()
-> mac_config() -> pcs_config() -> mac_link_up() -> pcs_link_up().
This creates the unique opportunity to simplify away even more code than
just the xpcs handling from sja1105_static_config_reload().
The sja1105_set_port_config() method is also invoked from
sja1105_mac_link_up(). And since that is now called directly by
phylink.. we can just remove it from sja1105_static_config_reload().
This makes it possible to re-merge sja1105_set_port_speed() and
sja1105_set_port_config() in a later change.
Note that my only setups with sja1105 where the xpcs is used is with the
xpcs on the CPU-facing port (fixed-link). Thus, I cannot test xpcs + PHY.
But the replay procedure walks through all ports, and I did test a
regular RGMII user port + a PHY.
ptp4l[54.552]: master offset 5 s2 freq -931 path delay 764
ptp4l[55.551]: master offset 22 s2 freq -913 path delay 764
ptp4l[56.551]: master offset 13 s2 freq -915 path delay 765
ptp4l[57.552]: master offset 5 s2 freq -919 path delay 765
ptp4l[58.553]: master offset 13 s2 freq -910 path delay 765
ptp4l[59.553]: master offset 13 s2 freq -906 path delay 765
ptp4l[60.553]: master offset 6 s2 freq -909 path delay 765
ptp4l[61.553]: master offset 6 s2 freq -907 path delay 765
ptp4l[62.553]: master offset 6 s2 freq -906 path delay 765
ptp4l[63.553]: master offset 14 s2 freq -896 path delay 765
$ ip link set br0 type bridge vlan_filtering 1
[ 63.983283] sja1105 spi2.0 sw0p0: Link is Down
[ 63.991913] sja1105 spi2.0: Link is Down
[ 64.009784] sja1105 spi2.0: Reset switch and programmed static config. Reason: VLAN filtering
[ 64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off
[ 64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
ptp4l[64.554]: master offset 7397 s2 freq +6491 path delay 765
ptp4l[65.554]: master offset 38 s2 freq +1352 path delay 765
ptp4l[66.554]: master offset -2225 s2 freq -900 path delay 764
ptp4l[67.555]: master offset -2226 s2 freq -1569 path delay 765
ptp4l[68.555]: master offset -1553 s2 freq -1563 path delay 765
ptp4l[69.555]: master offset -865 s2 freq -1341 path delay 765
ptp4l[70.555]: master offset -401 s2 freq -1137 path delay 765
ptp4l[71.556]: master offset -145 s2 freq -1001 path delay 765
ptp4l[72.558]: master offset -26 s2 freq -926 path delay 765
ptp4l[73.557]: master offset 30 s2 freq -877 path delay 765
ptp4l[74.557]: master offset 47 s2 freq -851 path delay 765
ptp4l[75.557]: master offset 29 s2 freq -855 path delay 765
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
This was not what was discussed in
https://lore.kernel.org/netdev/E1ssjcz-005Ns9-D5@rmk-PC.armlinux.org.uk/,
but I will approach that perhaps differently, depending on the feedback here.
drivers/net/dsa/sja1105/sja1105_main.c | 58 ++++----------------------
drivers/net/phy/phylink.c | 51 +++++++++++++++++++++-
include/linux/phylink.h | 5 +++
3 files changed, 63 insertions(+), 51 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index af38b8959d8d..864f697105d8 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2291,14 +2291,12 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
{
struct ptp_system_timestamp ptp_sts_before;
struct ptp_system_timestamp ptp_sts_after;
- u16 bmcr[SJA1105_MAX_NUM_PORTS] = {0};
- u64 mac_speed[SJA1105_MAX_NUM_PORTS];
struct sja1105_mac_config_entry *mac;
struct dsa_switch *ds = priv->ds;
+ struct dsa_port *dp;
s64 t1, t2, t3, t4;
- s64 t12, t34;
- int rc, i;
- s64 now;
+ s64 t12, t34, now;
+ int rc;
mutex_lock(&priv->fdb_lock);
mutex_lock(&priv->mgmt_lock);
@@ -2310,13 +2308,9 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
* switch wants to see in the static config in order to allow us to
* change it through the dynamic interface later.
*/
- for (i = 0; i < ds->num_ports; i++) {
- mac_speed[i] = mac[i].speed;
- mac[i].speed = priv->info->port_speed[SJA1105_SPEED_AUTO];
-
- if (priv->pcs[i])
- bmcr[i] = mdiobus_c45_read(priv->mdio_pcs, i,
- MDIO_MMD_VEND2, MDIO_CTRL1);
+ dsa_switch_for_each_available_port(dp, ds) {
+ phylink_replay_link_begin(dp->pl);
+ mac[dp->index].speed = priv->info->port_speed[SJA1105_SPEED_AUTO];
}
/* No PTP operations can run right now */
@@ -2370,44 +2364,8 @@ int sja1105_static_config_reload(struct sja1105_private *priv,
goto out;
}
- for (i = 0; i < ds->num_ports; i++) {
- struct phylink_pcs *pcs = priv->pcs[i];
- unsigned int neg_mode;
-
- mac[i].speed = mac_speed[i];
- rc = sja1105_set_port_config(priv, i);
- if (rc < 0)
- goto out;
-
- if (!pcs)
- continue;
-
- if (bmcr[i] & BMCR_ANENABLE)
- neg_mode = PHYLINK_PCS_NEG_INBAND_ENABLED;
- else
- neg_mode = PHYLINK_PCS_NEG_OUTBAND;
-
- rc = pcs->ops->pcs_config(pcs, neg_mode, priv->phy_mode[i],
- NULL, true);
- if (rc < 0)
- goto out;
-
- if (neg_mode == PHYLINK_PCS_NEG_OUTBAND) {
- int speed = SPEED_UNKNOWN;
-
- if (priv->phy_mode[i] == PHY_INTERFACE_MODE_2500BASEX)
- speed = SPEED_2500;
- else if (bmcr[i] & BMCR_SPEED1000)
- speed = SPEED_1000;
- else if (bmcr[i] & BMCR_SPEED100)
- speed = SPEED_100;
- else
- speed = SPEED_10;
-
- pcs->ops->pcs_link_up(pcs, neg_mode, priv->phy_mode[i],
- speed, DUPLEX_FULL);
- }
- }
+ dsa_switch_for_each_available_port(dp, ds)
+ phylink_replay_link_end(dp->pl);
rc = sja1105_reload_cbs(priv);
if (rc < 0)
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 4309317de3d1..ba13f80ed45f 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -80,6 +80,7 @@ struct phylink {
bool mac_link_dropped;
bool using_mac_select_pcs;
+ bool force_major_config;
struct sfp_bus *sfp_bus;
bool sfp_may_have_phy;
@@ -1551,7 +1552,8 @@ static void phylink_resolve(struct work_struct *w)
}
if (mac_config) {
- if (link_state.interface != pl->link_config.interface) {
+ if (link_state.interface != pl->link_config.interface ||
+ pl->force_major_config) {
/* The interface has changed, force the link down and
* then reconfigure.
*/
@@ -1561,6 +1563,7 @@ static void phylink_resolve(struct work_struct *w)
}
phylink_major_config(pl, false, &link_state);
pl->link_config.interface = link_state.interface;
+ pl->force_major_config = false;
}
}
@@ -3870,6 +3873,52 @@ void phylink_mii_c45_pcs_get_state(struct mdio_device *pcs,
}
EXPORT_SYMBOL_GPL(phylink_mii_c45_pcs_get_state);
+/**
+ * phylink_replay_link_begin() - begin replay of link callbacks for driver
+ * which loses state
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Helper for MAC drivers which may perform a destructive reset at runtime.
+ * Both the own driver's mac_link_down() method is called, as well as the
+ * pcs_link_down() method of the split PCS (if any).
+ *
+ * This is similar to phylink_stop(), except it does not alter the state of
+ * the phylib PHY (it is assumed that it is not affected by the MAC destructive
+ * reset).
+ */
+void phylink_replay_link_begin(struct phylink *pl)
+{
+ ASSERT_RTNL();
+
+ phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED);
+}
+EXPORT_SYMBOL_GPL(phylink_replay_link_begin);
+
+/**
+ * phylink_replay_link_end() - end replay of link callbacks for driver
+ * which lost state
+ * @pl: a pointer to a &struct phylink returned from phylink_create()
+ *
+ * Helper for MAC drivers which may perform a destructive reset at runtime.
+ * Both the own driver's mac_config() and mac_link_up() methods, as well as the
+ * pcs_config() and pcs_link_up() method of the split PCS (if any), are called.
+ *
+ * This is similar to phylink_start(), except it does not alter the state of
+ * the phylib PHY.
+ *
+ * One must call this method only within the same rtnl_lock() critical section
+ * as a previous phylink_replay_link_start().
+ */
+void phylink_replay_link_end(struct phylink *pl)
+{
+ ASSERT_RTNL();
+
+ pl->force_major_config = true;
+ phylink_enable_and_run_resolve(pl, PHYLINK_DISABLE_STOPPED);
+ flush_work(&pl->resolve);
+}
+EXPORT_SYMBOL_GPL(phylink_replay_link_end);
+
static int __init phylink_init(void)
{
for (int i = 0; i < ARRAY_SIZE(phylink_sfp_interface_preference); ++i)
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 5c01048860c4..d978daafd0e9 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -687,4 +687,9 @@ void phylink_mii_c45_pcs_get_state(struct mdio_device *pcs,
void phylink_decode_usxgmii_word(struct phylink_link_state *state,
uint16_t lpa);
+
+void phylink_replay_link_begin(struct phylink *pl);
+
+void phylink_replay_link_end(struct phylink *pl);
+
#endif
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks
2024-10-03 14:07 [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks Vladimir Oltean
@ 2024-12-12 14:21 ` Vladimir Oltean
2024-12-12 15:56 ` Russell King (Oracle)
1 sibling, 0 replies; 4+ messages in thread
From: Vladimir Oltean @ 2024-12-12 14:21 UTC (permalink / raw)
To: Russell King, netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Heiner Kallweit, Andrew Lunn, Florian Fainelli, linux-kernel
On Thu, Oct 03, 2024 at 05:07:53PM +0300, Vladimir Oltean wrote:
> sja1105_static_config_reload() changes major settings in the switch and
> it requires a reset. A use case is to change things like Qdiscs (but see
> sja1105_reset_reasons[] for full list) while PTP synchronization is
> running, and the servo loop must not exit the locked state (s2).
> Therefore, stopping and restarting the phylink instances of all ports is
> not desirable, because that also stops the phylib state machine, and
> retriggers a seconds-long auto-negotiation process that breaks PTP.
> Thus, saving and restoring the link management settings is handled
> privately by the driver.
>
> The method got progressively more complex as SGMII support got added,
> because this is handled through the xpcs phylink_pcs component, to which
> we don't have unfettered access. Nonetheless, the switch reset line is
> hardwired to also reset the XPCS, creating a situation where it loses
> state and needs to be reprogrammed at a moment in time outside phylink's
> control.
>
> Although commits 907476c66d73 ("net: dsa: sja1105: call PCS
> config/link_up via pcs_ops structure") and 41bf58314b17 ("net: dsa:
> sja1105: use phylink_pcs internally") made the sja1105 <-> xpcs
> interaction slightly prettier, we still depend heavily on the PCS being
> "XPCS-like", because to back up its settings, we read the MII_BMCR
> register, through a mdiobus_c45_read() operation, breaking all layering
> separation.
>
> But the phylink instance already has all that state, and more. It's just
> that it's private. In this proposal, phylink offers 2 helpers for
> walking the MAC and PCS drivers again through the callbacks required
> during a destructive reset operation: mac_link_down() -> pcs_link_down()
> -> mac_config() -> pcs_config() -> mac_link_up() -> pcs_link_up().
>
> This creates the unique opportunity to simplify away even more code than
> just the xpcs handling from sja1105_static_config_reload().
> The sja1105_set_port_config() method is also invoked from
> sja1105_mac_link_up(). And since that is now called directly by
> phylink.. we can just remove it from sja1105_static_config_reload().
> This makes it possible to re-merge sja1105_set_port_speed() and
> sja1105_set_port_config() in a later change.
>
> Note that my only setups with sja1105 where the xpcs is used is with the
> xpcs on the CPU-facing port (fixed-link). Thus, I cannot test xpcs + PHY.
> But the replay procedure walks through all ports, and I did test a
> regular RGMII user port + a PHY.
>
> ptp4l[54.552]: master offset 5 s2 freq -931 path delay 764
> ptp4l[55.551]: master offset 22 s2 freq -913 path delay 764
> ptp4l[56.551]: master offset 13 s2 freq -915 path delay 765
> ptp4l[57.552]: master offset 5 s2 freq -919 path delay 765
> ptp4l[58.553]: master offset 13 s2 freq -910 path delay 765
> ptp4l[59.553]: master offset 13 s2 freq -906 path delay 765
> ptp4l[60.553]: master offset 6 s2 freq -909 path delay 765
> ptp4l[61.553]: master offset 6 s2 freq -907 path delay 765
> ptp4l[62.553]: master offset 6 s2 freq -906 path delay 765
> ptp4l[63.553]: master offset 14 s2 freq -896 path delay 765
> $ ip link set br0 type bridge vlan_filtering 1
> [ 63.983283] sja1105 spi2.0 sw0p0: Link is Down
> [ 63.991913] sja1105 spi2.0: Link is Down
> [ 64.009784] sja1105 spi2.0: Reset switch and programmed static config. Reason: VLAN filtering
> [ 64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off
> [ 64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
> ptp4l[64.554]: master offset 7397 s2 freq +6491 path delay 765
> ptp4l[65.554]: master offset 38 s2 freq +1352 path delay 765
> ptp4l[66.554]: master offset -2225 s2 freq -900 path delay 764
> ptp4l[67.555]: master offset -2226 s2 freq -1569 path delay 765
> ptp4l[68.555]: master offset -1553 s2 freq -1563 path delay 765
> ptp4l[69.555]: master offset -865 s2 freq -1341 path delay 765
> ptp4l[70.555]: master offset -401 s2 freq -1137 path delay 765
> ptp4l[71.556]: master offset -145 s2 freq -1001 path delay 765
> ptp4l[72.558]: master offset -26 s2 freq -926 path delay 765
> ptp4l[73.557]: master offset 30 s2 freq -877 path delay 765
> ptp4l[74.557]: master offset 47 s2 freq -851 path delay 765
> ptp4l[75.557]: master offset 29 s2 freq -855 path delay 765
>
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
> This was not what was discussed in
> https://lore.kernel.org/netdev/E1ssjcz-005Ns9-D5@rmk-PC.armlinux.org.uk/,
> but I will approach that perhaps differently, depending on the feedback here.
I don't want to repost this because I have no updates to it, but could I
get some feedback please?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks
2024-10-03 14:07 [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks Vladimir Oltean
2024-12-12 14:21 ` Vladimir Oltean
@ 2024-12-12 15:56 ` Russell King (Oracle)
2024-12-12 17:20 ` Vladimir Oltean
1 sibling, 1 reply; 4+ messages in thread
From: Russell King (Oracle) @ 2024-12-12 15:56 UTC (permalink / raw)
To: Vladimir Oltean
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Heiner Kallweit, Andrew Lunn, Florian Fainelli,
linux-kernel
On Thu, Oct 03, 2024 at 05:07:53PM +0300, Vladimir Oltean wrote:
> sja1105_static_config_reload() changes major settings in the switch and
> it requires a reset. A use case is to change things like Qdiscs (but see
> sja1105_reset_reasons[] for full list) while PTP synchronization is
> running, and the servo loop must not exit the locked state (s2).
> Therefore, stopping and restarting the phylink instances of all ports is
> not desirable, because that also stops the phylib state machine, and
> retriggers a seconds-long auto-negotiation process that breaks PTP.
However:
> ptp4l[54.552]: master offset 5 s2 freq -931 path delay 764
> ptp4l[55.551]: master offset 22 s2 freq -913 path delay 764
> ptp4l[56.551]: master offset 13 s2 freq -915 path delay 765
> ptp4l[57.552]: master offset 5 s2 freq -919 path delay 765
> ptp4l[58.553]: master offset 13 s2 freq -910 path delay 765
> ptp4l[59.553]: master offset 13 s2 freq -906 path delay 765
> ptp4l[60.553]: master offset 6 s2 freq -909 path delay 765
> ptp4l[61.553]: master offset 6 s2 freq -907 path delay 765
> ptp4l[62.553]: master offset 6 s2 freq -906 path delay 765
> ptp4l[63.553]: master offset 14 s2 freq -896 path delay 765
> $ ip link set br0 type bridge vlan_filtering 1
> [ 63.983283] sja1105 spi2.0 sw0p0: Link is Down
> [ 63.991913] sja1105 spi2.0: Link is Down
> [ 64.009784] sja1105 spi2.0: Reset switch and programmed static config. Reason: VLAN filtering
> [ 64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off
> [ 64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
> ptp4l[64.554]: master offset 7397 s2 freq +6491 path delay 765
> ptp4l[65.554]: master offset 38 s2 freq +1352 path delay 765
> ptp4l[66.554]: master offset -2225 s2 freq -900 path delay 764
> ptp4l[67.555]: master offset -2226 s2 freq -1569 path delay 765
> ptp4l[68.555]: master offset -1553 s2 freq -1563 path delay 765
> ptp4l[69.555]: master offset -865 s2 freq -1341 path delay 765
> ptp4l[70.555]: master offset -401 s2 freq -1137 path delay 765
> ptp4l[71.556]: master offset -145 s2 freq -1001 path delay 765
doesn't this change in offset and frequency indicate that the PTP clock
was still disrupted, and needed to be re-synchronised? If it was
unaffected, then I would have expected the offset and frequency to
remain similar to before the reset happened.
Nevertheless...
> @@ -1551,7 +1552,8 @@ static void phylink_resolve(struct work_struct *w)
> }
>
> if (mac_config) {
> - if (link_state.interface != pl->link_config.interface) {
> + if (link_state.interface != pl->link_config.interface ||
> + pl->force_major_config) {
> /* The interface has changed, force the link down and
> * then reconfigure.
> */
> @@ -1561,6 +1563,7 @@ static void phylink_resolve(struct work_struct *w)
> }
> phylink_major_config(pl, false, &link_state);
> pl->link_config.interface = link_state.interface;
> + pl->force_major_config = false;
> }
> }
This will delay the major config until the link comes up, as mac_config
only gets set true for fixed-link and PHY when the link is up. For
inband mode, things get less certain, because mac_config will only be
true if there is a PHY present and the PHY link was up. Otherwise,
inband leaves mac_config false, and thus if force_major_config was
true, that would persist indefinitely.
> +/**
> + * phylink_replay_link_begin() - begin replay of link callbacks for driver
> + * which loses state
> + * @pl: a pointer to a &struct phylink returned from phylink_create()
> + *
> + * Helper for MAC drivers which may perform a destructive reset at runtime.
> + * Both the own driver's mac_link_down() method is called, as well as the
> + * pcs_link_down() method of the split PCS (if any).
> + *
> + * This is similar to phylink_stop(), except it does not alter the state of
> + * the phylib PHY (it is assumed that it is not affected by the MAC destructive
> + * reset).
> + */
> +void phylink_replay_link_begin(struct phylink *pl)
> +{
> + ASSERT_RTNL();
> +
> + phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED);
I would prefer this used a different disable flag, so that...
> +}
> +EXPORT_SYMBOL_GPL(phylink_replay_link_begin);
> +
> +/**
> + * phylink_replay_link_end() - end replay of link callbacks for driver
> + * which lost state
> + * @pl: a pointer to a &struct phylink returned from phylink_create()
> + *
> + * Helper for MAC drivers which may perform a destructive reset at runtime.
> + * Both the own driver's mac_config() and mac_link_up() methods, as well as the
> + * pcs_config() and pcs_link_up() method of the split PCS (if any), are called.
> + *
> + * This is similar to phylink_start(), except it does not alter the state of
> + * the phylib PHY.
> + *
> + * One must call this method only within the same rtnl_lock() critical section
> + * as a previous phylink_replay_link_start().
> + */
> +void phylink_replay_link_end(struct phylink *pl)
> +{
> + ASSERT_RTNL();
> +
> + pl->force_major_config = true;
> + phylink_enable_and_run_resolve(pl, PHYLINK_DISABLE_STOPPED);
this can check that phylink_replay_link_begin() was previously called
to catch programming errors. There shouldn't be any conflict with
phylink_start()/phylink_stop() since the RTNL is held, but I think
its still worth checking that phylink_replay_link_begin() was
indeed called previously.
Other than those points, I think for sja1105 this is a better approach,
and as it's lightweight in phylink, I don't think having this will add
much maintenance burden, so I'm happy with the approach.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks
2024-12-12 15:56 ` Russell King (Oracle)
@ 2024-12-12 17:20 ` Vladimir Oltean
0 siblings, 0 replies; 4+ messages in thread
From: Vladimir Oltean @ 2024-12-12 17:20 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Heiner Kallweit, Andrew Lunn, Florian Fainelli,
linux-kernel
On Thu, Dec 12, 2024 at 03:56:37PM +0000, Russell King (Oracle) wrote:
> On Thu, Oct 03, 2024 at 05:07:53PM +0300, Vladimir Oltean wrote:
> > sja1105_static_config_reload() changes major settings in the switch and
> > it requires a reset. A use case is to change things like Qdiscs (but see
> > sja1105_reset_reasons[] for full list) while PTP synchronization is
> > running, and the servo loop must not exit the locked state (s2).
> > Therefore, stopping and restarting the phylink instances of all ports is
> > not desirable, because that also stops the phylib state machine, and
> > retriggers a seconds-long auto-negotiation process that breaks PTP.
>
> However:
>
> > ptp4l[63.553]: master offset 14 s2 freq -896 path delay 765
> > $ ip link set br0 type bridge vlan_filtering 1
> > [ 63.983283] sja1105 spi2.0 sw0p0: Link is Down
> > [ 63.991913] sja1105 spi2.0: Link is Down
> > [ 64.009784] sja1105 spi2.0: Reset switch and programmed static config. Reason: VLAN filtering
> > [ 64.020217] sja1105 spi2.0 sw0p0: Link is Up - 1Gbps/Full - flow control off
> > [ 64.030683] sja1105 spi2.0: Link is Up - 1Gbps/Full - flow control off
> > ptp4l[64.554]: master offset 7397 s2 freq +6491 path delay 765
> > ptp4l[65.554]: master offset 38 s2 freq +1352 path delay 765
> > ptp4l[66.554]: master offset -2225 s2 freq -900 path delay 764
>
> doesn't this change in offset and frequency indicate that the PTP clock
> was still disrupted, and needed to be re-synchronised? If it was
> unaffected, then I would have expected the offset and frequency to
> remain similar to before the reset happened.
Regarding expectations: I cannot avoid having _some_ disruption, as the
hardware doesn't have proper assist for this operation. It's an automotive
switch intended to be used with a static configuration programmed once.
Offloading the dynamic Linux network stack is somewhat outside of its
design intentions.
The driver measures the time elapsed, in the CLOCK_REALTIME domain,
during the switch reset, and then reprograms the switch PTP clock with
the value pre-reset plus that elapsed time. It's still better than
leaving the PTP time in January 1970, which would require the user space
PTP stack to exit the "servo locked" state and perform a clock step.
I also haven't actually shown the post-reset output when it has reached
the steady state again, just a few seconds worth of ouput, that's why
the frequency adjustment is not yet equal to the previous value.
There are ways to further reduce the convergence time in real life systems,
which I didn't bother to apply here, like synchronize CLOCK_REALTIME to
the switch PHC (to better approximate the leap missed during switch reset),
or increasing the Sync packet interval (SJA1105 is frequently used with
gPTP which has a Sync interval of 125 ms, here I tested with 1 s).
It still really doesn't compare at all to the disruption caused by
alternatives such as dropping the link in the PHY, or not restoring the
PTP time at all.
> Nevertheless...
>
> > @@ -1551,7 +1552,8 @@ static void phylink_resolve(struct work_struct *w)
> > }
> >
> > if (mac_config) {
> > - if (link_state.interface != pl->link_config.interface) {
> > + if (link_state.interface != pl->link_config.interface ||
> > + pl->force_major_config) {
> > /* The interface has changed, force the link down and
> > * then reconfigure.
> > */
> > @@ -1561,6 +1563,7 @@ static void phylink_resolve(struct work_struct *w)
> > }
> > phylink_major_config(pl, false, &link_state);
> > pl->link_config.interface = link_state.interface;
> > + pl->force_major_config = false;
> > }
> > }
>
> This will delay the major config until the link comes up, as mac_config
> only gets set true for fixed-link and PHY when the link is up. For
> inband mode, things get less certain, because mac_config will only be
> true if there is a PHY present and the PHY link was up. Otherwise,
> inband leaves mac_config false, and thus if force_major_config was
> true, that would persist indefinitely.
Ok, I certainly wasn't careful enough when analyzing the existing code path.
If I understand you correctly, you're thinking something like this should be
sufficient to avoid depending on bool mac_config? The diff is
incremental over the change posted here.
-- >8 --
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 3dcd1f47093a..893acab0d9bd 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1712,20 +1712,18 @@ static void phylink_resolve(struct work_struct *w)
if (pl->act_link_an_mode != MLO_AN_FIXED)
phylink_apply_manual_flow(pl, &link_state);
- if (mac_config) {
- if (link_state.interface != pl->link_config.interface ||
- pl->force_major_config) {
- /* The interface has changed, force the link down and
- * then reconfigure.
- */
- if (cur_link_state) {
- phylink_link_down(pl);
- cur_link_state = false;
- }
- phylink_major_config(pl, false, &link_state);
- pl->link_config.interface = link_state.interface;
- pl->force_major_config = false;
+ if ((mac_config && link_state.interface != pl->link_config.interface) ||
+ pl->force_major_config) {
+ /* The interface has changed or a forced major configuration
+ * was requested, so force the link down and then reconfigure.
+ */
+ if (cur_link_state) {
+ phylink_link_down(pl);
+ cur_link_state = false;
}
+ phylink_major_config(pl, false, &link_state);
+ pl->link_config.interface = link_state.interface;
+ pl->force_major_config = false;
}
if (link_state.link != cur_link_state) {
-- >8 --
Not worth it, IMO, to complicate the logic with yet one more layer of
"ifs" for the pl->link_config.interface and pl->force_major_config
reassignment. It's ok if they are assigned their current values, when
the code block is entered on a transition of the other variable.
> > +void phylink_replay_link_begin(struct phylink *pl)
> > +{
> > + ASSERT_RTNL();
> > +
> > + phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED);
>
> I would prefer this used a different disable flag, so that...
>
> > +}
> > +
> > +void phylink_replay_link_end(struct phylink *pl)
> > +{
> > + ASSERT_RTNL();
> > +
> > + pl->force_major_config = true;
> > + phylink_enable_and_run_resolve(pl, PHYLINK_DISABLE_STOPPED);
>
> this can check that phylink_replay_link_begin() was previously called
> to catch programming errors. There shouldn't be any conflict with
> phylink_start()/phylink_stop() since the RTNL is held, but I think
> its still worth checking that phylink_replay_link_begin() was
> indeed called previously.
This should be fine, right?
-- >8 --
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 893acab0d9bd..c9fb0bb024d5 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -34,6 +34,7 @@ enum {
PHYLINK_DISABLE_STOPPED,
PHYLINK_DISABLE_LINK,
PHYLINK_DISABLE_MAC_WOL,
+ PHYLINK_DISABLE_REPLAY,
PCS_STATE_DOWN = 0,
PCS_STATE_STARTING,
@@ -4070,7 +4071,7 @@ void phylink_replay_link_begin(struct phylink *pl)
{
ASSERT_RTNL();
- phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED);
+ phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_REPLAY);
}
EXPORT_SYMBOL_GPL(phylink_replay_link_begin);
@@ -4093,8 +4094,13 @@ void phylink_replay_link_end(struct phylink *pl)
{
ASSERT_RTNL();
+ if (WARN(!test_bit(PHYLINK_DISABLE_REPLAY,
+ &pl->phylink_disable_state),
+ "phylink_replay_link_end() called without a prior phylink_replay_link_begin()\n"))
+ return;
+
pl->force_major_config = true;
- phylink_enable_and_run_resolve(pl, PHYLINK_DISABLE_STOPPED);
+ phylink_enable_and_run_resolve(pl, PHYLINK_DISABLE_REPLAY);
flush_work(&pl->resolve);
}
EXPORT_SYMBOL_GPL(phylink_replay_link_end);
-- >8 --
> Other than those points, I think for sja1105 this is a better approach,
> and as it's lightweight in phylink, I don't think having this will add
> much maintenance burden, so I'm happy with the approach.
Thanks, this is encouraging. I'll continue to work on it, and then clean
up what's left in sja1105, as mentioned in the commit message.
I've retested with both changes, and it still appears to work.
^ permalink raw reply related [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-12-12 17:20 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-03 14:07 [RFC PATCH net-next] net: dsa: sja1105: let phylink help with the replay of link callbacks Vladimir Oltean
2024-12-12 14:21 ` Vladimir Oltean
2024-12-12 15:56 ` Russell King (Oracle)
2024-12-12 17:20 ` Vladimir Oltean
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).