* [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
@ 2026-04-23 7:22 Corey Leavitt
2026-04-23 8:40 ` Corey Leavitt
0 siblings, 1 reply; 10+ messages in thread
From: Corey Leavitt @ 2026-04-23 7:22 UTC
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
On systems where a PSE controller driver loads as a module and a
device-tree PHY node carries a `pses = <&pse_pi>` reference,
fwnode_mdiobus_register_phy() tries to resolve the PSE handle before
the controller driver has probed. of_pse_control_get() returns
-EPROBE_DEFER, the enclosing MDIO/DSA probe fails, and driver-core
re-queues the work. The retry loop spins until the PSE driver module
loads and its controller registers.
Commit fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse
control") made each retry expensive. It reordered
fwnode_mdiobus_register_phy() so the PHY is registered before the
PSE lookup. Every deferral now performs a full
phy_device_register() / phy_device_remove() cycle. On a board with a
sufficiently tight watchdog the retry loop can starve the watchdog
kthread. On the reporting hardware (MT7621 + gpio-wdt, 1-second
margin) the retry loop converts a slow probe phase into a reset
before userspace loads.
The affected population today looks small. OpenWrt, where PSE
actually ships, is still on 6.12 (pre-regression), and most
environments with CONFIG_PSE_*=m do not have boards whose DT
references a PSE controller from a PHY. Still, the mechanism is
general. Any modular PSE driver combined with the documented
`pses = <&...>` binding reproduces the retry loop. Whether it
leaves the board unbootable or merely makes boot slow and flaky
depends on local watchdog timing. More exposure is expected as distribution and
embedded kernels move to 6.13 and later.
The narrow fix would be to partially revert the ordering in
fa2f0454174c so each defer is cheap again. That keeps the same
architecture (fwnode_mdio holding PSE knowledge, -EPROBE_DEFER
flowing across the subsystem boundary), and any future reorder
reintroduces the same class of bug. This series takes the larger
fix: decouple PSE controller lookup from MDIO registration entirely.
pse_core now publishes a BLOCKING_NOTIFIER chain with REGISTERED
and UNREGISTERED events. phy_device subscribes, owns phydev->psec
lifetime, and attaches PSE handles in response to controller
lifecycle rather than during probe. fwnode_mdio loses its PSE
awareness, and -EPROBE_DEFER no longer flows out of fwnode_mdio.
Patch breakdown:
1. Scope the pse_control regulator handle to kref lifetime
(Fixes: d83e13761d5b). A latent bug that patch 4 makes
reachable.
2. Add the notifier chain (enum, head, register/unregister
helpers). Pure infrastructure. No subscribers yet, no
observable change.
3. Fire REGISTERED and UNREGISTERED events from the controller
register/unregister paths. Still no subscribers, still no
observable change.
4. Subscribe from the PHY layer, take ownership of phydev->psec
via the notifier, and remove fwnode_find_pse_control() from
fwnode_mdio.
Patch 1 is bundled here per stable-kernel-rules section 4
reachability guidance. On mainline today, with no notifier
subscriber, no caller drives the dangling regulator-handle sequence.
Patches 2 and 3 are deliberately split to separate "add
infrastructure" from "wire it up". Happy to fold them if maintainers
prefer the combined form.
Validated on a Cudy C200P (MT7621 + IP804AR) running an OpenWrt
build of 6.18.21 with the series applied. A lockdep build
(CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP) shows no splats
from the series' code paths during boot, PHY attach, PHY detach, or
a full controller unbind/rebind cycle. ethtool --set-pse drives all
four PoE-capable LAN ports, and a Ruckus H510 class-4 PD plugged
into lan3 negotiates and receives 48 V.
The C200P has no SFP cage, so the SFP path change in sfp.c
(phy_device_register -> phy_device_register_locked) isn't exercised
on the bench. I verified it by call-graph audit instead: every path reaching
sfp_sm_probe_phy() holds rtnl at entry, via sfp_timeout,
sfp_check_state, sfp_probe, sfp_remove, or
sfp_bus_{add,del}_upstream.
Not addressed by this series: ethtool --show-pse returns "No data
available" on DSA netdevs in 6.18, because dev->phydev is NULL for
DSA-frontend netdevs and ethnl_req_get_phydev() therefore returns
NULL. That's a DSA / ethtool integration quirk that predates this
work.
Sending as RFC because this is my first net-next series. I'd
appreciate maintainer guidance on whether patch 1 should go to net
rather than net-next, and whether the patch 2/3 split is preferred
to the combined form.
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
Corey Leavitt (4):
net: pse-pd: scope pse_control regulator handle to kref lifetime
net: pse-pd: add notifier chain for controller lifecycle events
net: pse-pd: fire lifecycle events on controller register/unregister
net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
drivers/net/mdio/fwnode_mdio.c | 34 ----------
drivers/net/phy/phy_device.c | 144 ++++++++++++++++++++++++++++++++++++++---
drivers/net/phy/sfp.c | 2 +-
drivers/net/pse-pd/pse_core.c | 60 ++++++++++++++++-
include/linux/phy.h | 2 +
include/linux/pse-pd/pse.h | 41 ++++++++++++
6 files changed, 236 insertions(+), 47 deletions(-)
---
base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
change-id: 20260422-pse-notifier-decouple-efa80d77f4be
Best regards,
--
Corey Leavitt <corey@leavitt.info>
* [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
@ 2026-04-23 7:42 Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime Corey Leavitt via B4 Relay
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Corey Leavitt via B4 Relay @ 2026-04-23 7:42 UTC
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
* [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
@ 2026-04-23 7:42 ` Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events Corey Leavitt via B4 Relay
` (3 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Corey Leavitt via B4 Relay @ 2026-04-23 7:42 UTC
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
From: Corey Leavitt <corey@leavitt.info>
__pse_control_release() drops psec->ps via devm_regulator_put(), which
only succeeds if the devres entry added by the matching
devm_regulator_get_exclusive() is still present on pcdev->dev at the
time the pse_control's kref hits zero.
In practice that assumption does not hold when the controller is
unbound while any pse_control still has consumers: pcdev->dev's
devres list is released LIFO, so every per-attach devres entry added
by devm_regulator_get_exclusive() runs (and calls regulator_put() on
the underlying regulator) before pse_controller_unregister() itself
is invoked. Any later
pse_control_put() from that unbind path then reads psec->ps as a
dangling pointer inside devm_regulator_put() and WARNs at
drivers/regulator/devres.c:232 (devres_release() fails to find the
already-released match).
The pse_control's consumer handle is logically scoped to the
pse_control's refcount, not to pcdev->dev's devres lifetime. Switch
to the plain regulator_get_exclusive() / regulator_put() pair so
__pse_control_release() does the right put regardless of whether
the controller's devres has already been unwound.
No change to the regulator-framework-visible refcount or lifetime of
the underlying regulator: a single get paired with a single put. The
existing devm_regulator_register() for the per-PI rails is unchanged
(those ARE correctly scoped to the controller's lifetime).
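A minimal userspace sketch of the lifetime mismatch described above,
for illustration only. It assumes a toy devres list that is released
newest-first on unbind; every model_* name is hypothetical and none of
them is a kernel API. It shows why a devm-style put issued after unbind
cannot find its entry, while a plain get/put pair is unaffected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_DEVRES 8

struct model_regulator {
	int users;			/* live consumer handles */
};

struct model_device {
	struct model_regulator *devres[MODEL_MAX_DEVRES];
	size_t ndevres;
};

/* devm-style get: the device's devres list owns the handle. */
static struct model_regulator *
model_devm_get(struct model_device *dev, struct model_regulator *reg)
{
	assert(dev->ndevres < MODEL_MAX_DEVRES);
	reg->users++;
	dev->devres[dev->ndevres++] = reg;
	return reg;
}

/* devm-style put: succeeds only if the devres entry still exists. */
static bool model_devm_put(struct model_device *dev,
			   struct model_regulator *reg)
{
	for (size_t i = dev->ndevres; i-- > 0; ) {
		if (dev->devres[i] != reg)
			continue;
		for (size_t j = i; j + 1 < dev->ndevres; j++)
			dev->devres[j] = dev->devres[j + 1];
		dev->ndevres--;
		reg->users--;
		return true;
	}
	return false;			/* no match: the kernel WARNs here */
}

/* Driver unbind: release every devres entry, newest first (LIFO). */
static void model_unbind(struct model_device *dev)
{
	while (dev->ndevres > 0)
		dev->devres[--dev->ndevres]->users--;
}

/* Plain get/put: the handle belongs to whoever holds the refcount. */
static struct model_regulator *model_get(struct model_regulator *reg)
{
	reg->users++;
	return reg;
}

static void model_put(struct model_regulator *reg)
{
	reg->users--;
}

/* Returns true if a put issued after unbind still finds a valid handle. */
static bool model_late_put_ok(bool use_devm)
{
	struct model_regulator reg = { .users = 0 };
	struct model_device dev = { .ndevres = 0 };
	struct model_regulator *ps;

	ps = use_devm ? model_devm_get(&dev, &reg) : model_get(&reg);
	model_unbind(&dev);		/* controller is unbound first */

	if (use_devm)
		return model_devm_put(&dev, ps);	/* entry already gone */

	model_put(ps);			/* consumer drops its handle later */
	return reg.users == 0;
}
```

model_late_put_ok(true) models the old devm pairing and
model_late_put_ok(false) models the plain pairing this patch switches to.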
Fixes: d83e13761d5b ("net: pse-pd: Use regulator framework within PSE framework")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
drivers/net/pse-pd/pse_core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index f6b94ac7a68a..893ec2185947 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1362,7 +1362,7 @@ static void __pse_control_release(struct kref *kref)
if (psec->pcdev->pi[psec->id].admin_state_enabled)
regulator_disable(psec->ps);
- devm_regulator_put(psec->ps);
+ regulator_put(psec->ps);
module_put(psec->pcdev->owner);
@@ -1431,8 +1431,8 @@ pse_control_get_internal(struct pse_controller_dev *pcdev, unsigned int index,
goto free_psec;
pcdev->pi[index].admin_state_enabled = ret;
- psec->ps = devm_regulator_get_exclusive(pcdev->dev,
- rdev_get_name(pcdev->pi[index].rdev));
+ psec->ps = regulator_get_exclusive(pcdev->dev,
+ rdev_get_name(pcdev->pi[index].rdev));
if (IS_ERR(psec->ps)) {
ret = PTR_ERR(psec->ps);
goto put_module;
--
2.53.0
* [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime Corey Leavitt via B4 Relay
@ 2026-04-23 7:42 ` Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister Corey Leavitt via B4 Relay
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Corey Leavitt via B4 Relay @ 2026-04-23 7:42 UTC
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
From: Corey Leavitt <corey@leavitt.info>
Introduce a blocking notifier chain that allows other subsystems to be
informed when a PSE controller is registered or unregistered, and
provide pse_register_notifier() / pse_unregister_notifier() as the
subscriber interface.
Subsequent patches will use this to let the phy subsystem own the
phydev->psec lifecycle directly, decoupling PSE lookup from
fwnode_mdiobus_register_phy() and removing the probe-time
-EPROBE_DEFER coupling that currently exists between mdio, phy and
pse-pd when the PSE controller driver is modular.
A blocking chain (rather than atomic) is used because callbacks will
take rtnl_lock and call back into pse_core via of_pse_control_get().
The enum pse_controller_event is placed outside the
IS_ENABLED(CONFIG_PSE_CONTROLLER) guard so that subscribers compiled
into a kernel without PSE support can still reference the event
values in dead-code paths without breaking the build.
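The header layout described above can be sketched in plain C as
follows. This is an illustration, not the patch's header:
MODEL_PSE_ENABLED is a hypothetical stand-in for
IS_ENABLED(CONFIG_PSE_CONTROLLER), and all model_* names are invented
for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_PSE_CONTROLLER). */
#define MODEL_PSE_ENABLED 0

struct model_notifier_block;	/* opaque here, as in a forward declaration */

/* Declared unconditionally, so the event names always exist. */
enum model_pse_event {
	MODEL_PSE_REGISTERED,
	MODEL_PSE_UNREGISTERED,
};

#if MODEL_PSE_ENABLED
int model_pse_register_notifier(struct model_notifier_block *nb);
#else
/* Stub: registration "succeeds", but no event is ever delivered. */
static inline int model_pse_register_notifier(struct model_notifier_block *nb)
{
	(void)nb;
	return 0;
}
#endif

/* A subscriber callback that compiles the same way in both configurations. */
static int model_subscriber_event(unsigned long event)
{
	switch (event) {
	case MODEL_PSE_REGISTERED:
		return 1;	/* a real subscriber would retry its attach here */
	case MODEL_PSE_UNREGISTERED:
		return 2;	/* a real subscriber would drop references here */
	default:
		return 0;
	}
}
```

With the enum inside the guard, the switch above would fail to compile
whenever the guard evaluates to 0, even though the stub guarantees the
callback is never invoked.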
This patch is pure infrastructure: nothing fires events yet, and
nothing subscribes. No observable behavior change.
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
drivers/net/pse-pd/pse_core.c | 34 ++++++++++++++++++++++++++++++++++
include/linux/pse-pd/pse.h | 32 ++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+)
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 893ec2185947..80c5c6c1758c 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -8,6 +8,7 @@
#include <linux/device.h>
#include <linux/ethtool.h>
#include <linux/ethtool_netlink.h>
+#include <linux/notifier.h>
#include <linux/of.h>
#include <linux/phy.h>
#include <linux/pse-pd/pse.h>
@@ -23,6 +24,39 @@ static LIST_HEAD(pse_controller_list);
static DEFINE_XARRAY_ALLOC(pse_pw_d_map);
static DEFINE_MUTEX(pse_pw_d_mutex);
+static BLOCKING_NOTIFIER_HEAD(pse_controller_notifier);
+
+/**
+ * pse_register_notifier - register a callback for PSE controller events
+ * @nb: notifier block to register
+ *
+ * See enum pse_controller_event for events fired and their subscriber
+ * contract. Callbacks run in process context; they may sleep, take
+ * rtnl, and call of_pse_control_get(). The chain fires synchronously,
+ * so a PSE controller driver's probe/unbind path must not hold any
+ * such lock when calling pse_controller_register() or
+ * pse_controller_unregister().
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_register_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_register(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_register_notifier);
+
+/**
+ * pse_unregister_notifier - unregister a previously registered callback
+ * @nb: notifier block previously passed to pse_register_notifier()
+ *
+ * Return: 0 on success, negative error code otherwise.
+ */
+int pse_unregister_notifier(struct notifier_block *nb)
+{
+ return blocking_notifier_chain_unregister(&pse_controller_notifier, nb);
+}
+EXPORT_SYMBOL_GPL(pse_unregister_notifier);
+
/**
* struct pse_control - a PSE control
* @pcdev: a pointer to the PSE controller device
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 4e5696cfade7..78fe3a2b1ea8 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -21,6 +21,7 @@ struct net_device;
struct phy_device;
struct pse_controller_dev;
struct netlink_ext_ack;
+struct notifier_block;
/* C33 PSE extended state and substate. */
struct ethtool_c33_pse_ext_state_info {
@@ -337,6 +338,24 @@ enum pse_budget_eval_strategies {
PSE_BUDGET_EVAL_STRAT_DYNAMIC = 1 << 2,
};
+/**
+ * enum pse_controller_event - PSE controller lifecycle events
+ *
+ * Event data in callbacks is always a pointer to the struct
+ * pse_controller_dev firing the event.
+ *
+ * @PSE_REGISTERED: controller added to pse_controller_list and
+ * resolvable by of_pse_control_get().
+ * @PSE_UNREGISTERED: controller about to be removed from
+ * pse_controller_list. Subscribers holding pse_control references
+ * targeting it must drop them before returning and must not
+ * acquire new references for it.
+ */
+enum pse_controller_event {
+ PSE_REGISTERED,
+ PSE_UNREGISTERED,
+};
+
#if IS_ENABLED(CONFIG_PSE_CONTROLLER)
int pse_controller_register(struct pse_controller_dev *pcdev);
void pse_controller_unregister(struct pse_controller_dev *pcdev);
@@ -366,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
bool pse_has_podl(struct pse_control *psec);
bool pse_has_c33(struct pse_control *psec);
+int pse_register_notifier(struct notifier_block *nb);
+int pse_unregister_notifier(struct notifier_block *nb);
+
#else
static inline struct pse_control *of_pse_control_get(struct device_node *node,
@@ -416,6 +438,16 @@ static inline bool pse_has_c33(struct pse_control *psec)
return false;
}
+static inline int pse_register_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
+
+static inline int pse_unregister_notifier(struct notifier_block *nb)
+{
+ return 0;
+}
+
#endif
#endif
--
2.53.0
* [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events Corey Leavitt via B4 Relay
@ 2026-04-23 7:42 ` Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook Corey Leavitt via B4 Relay
2026-04-23 9:05 ` [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Kory Maincent
4 siblings, 0 replies; 10+ messages in thread
From: Corey Leavitt via B4 Relay @ 2026-04-23 7:42 UTC
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
From: Corey Leavitt <corey@leavitt.info>
Fire events on the newly introduced pse_controller_notifier chain:
pse_controller_register() fires PSE_REGISTERED after the controller
has been added to pse_controller_list (i.e. once it is resolvable by
of_pse_control_get()), and pse_controller_unregister() fires
PSE_UNREGISTERED before the controller is removed from the list
(while a subscriber's pse_control pointer targeting it is still
safe to dereference).
With no subscribers yet, this is observably a no-op. A later change
wires the phy subsystem in as the first subscriber.
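The ordering contract above can be sketched as a small userspace model.
It assumes a one-entry "chain" and a boolean in place of
pse_controller_list membership; the model_* names are hypothetical and
are not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_event { MODEL_REGISTERED, MODEL_UNREGISTERED };

struct model_controller {
	bool listed;	/* stands in for membership of pse_controller_list */
};

typedef void (*model_cb)(enum model_event ev, struct model_controller *c);

static model_cb model_subscriber;	/* one-entry "chain" for brevity */

/* Synchronous call: the subscriber returns before the caller continues. */
static void model_call_chain(enum model_event ev, struct model_controller *c)
{
	if (model_subscriber)
		model_subscriber(ev, c);
}

static void model_controller_register(struct model_controller *c)
{
	c->listed = true;			/* add to the list first ... */
	model_call_chain(MODEL_REGISTERED, c);	/* ... then notify */
}

static void model_controller_unregister(struct model_controller *c)
{
	model_call_chain(MODEL_UNREGISTERED, c);	/* notify first ... */
	c->listed = false;				/* ... then tear down */
}

/* Test subscriber: records whether the controller was listed per event. */
static bool model_seen_listed[2];

static void model_record(enum model_event ev, struct model_controller *c)
{
	model_seen_listed[ev] = c->listed;
}

/* True if both callbacks observed a controller that was still listed. */
static bool model_both_events_saw_listed(void)
{
	struct model_controller c = { .listed = false };

	model_subscriber = model_record;
	model_controller_register(&c);
	model_controller_unregister(&c);
	model_subscriber = NULL;

	return model_seen_listed[MODEL_REGISTERED] &&
	       model_seen_listed[MODEL_UNREGISTERED] && !c.listed;
}
```

Swapping the two statements in either function would make one of the
callbacks observe an unlisted controller, which is the situation the
ordering in this patch avoids.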
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
drivers/net/pse-pd/pse_core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 80c5c6c1758c..82125502a8e3 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -1138,6 +1138,9 @@ int pse_controller_register(struct pse_controller_dev *pcdev)
list_add(&pcdev->list, &pse_controller_list);
mutex_unlock(&pse_list_mutex);
+ blocking_notifier_call_chain(&pse_controller_notifier,
+ PSE_REGISTERED, pcdev);
+
return 0;
}
EXPORT_SYMBOL_GPL(pse_controller_register);
@@ -1148,6 +1151,9 @@ EXPORT_SYMBOL_GPL(pse_controller_register);
*/
void pse_controller_unregister(struct pse_controller_dev *pcdev)
{
+ blocking_notifier_call_chain(&pse_controller_notifier,
+ PSE_UNREGISTERED, pcdev);
+
pse_flush_pw_ds(pcdev);
pse_release_pis(pcdev);
if (pcdev->irq)
--
2.53.0
* [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
` (2 preceding siblings ...)
2026-04-23 7:42 ` [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister Corey Leavitt via B4 Relay
@ 2026-04-23 7:42 ` Corey Leavitt via B4 Relay
2026-04-23 9:05 ` [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Kory Maincent
4 siblings, 0 replies; 10+ messages in thread
From: Corey Leavitt via B4 Relay @ 2026-04-23 7:42 UTC
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel, Corey Leavitt
From: Corey Leavitt <corey@leavitt.info>
Transfer ownership of phydev->psec from fwnode_mdio to the phy
subsystem itself. The phy subsystem now subscribes to the pse-pd
notifier chain and manages psec attach/detach in response to PSE
controller lifecycle events, while fwnode_mdio loses its PSE
awareness entirely.
Split phy_device_register() into a public entry point that takes
rtnl_lock() and a phy_device_register_locked() variant that assumes
rtnl is already held. Callers that already hold rtnl (the SFP
module state machine via __sfp_sm_event) use the _locked form to
avoid deadlock; all other callers use the unchanged public API.
This pair mirrors the register_netdevice() / register_netdev()
split convention already established in the core networking stack.
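A rough userspace sketch of the locked/unlocked split, under the
assumption that rtnl behaves like a non-recursive lock. The lock is
modeled as a flag, and a second acquisition returns an error instead of
blocking forever; all model_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Non-recursive lock model: taking it twice is the deadlock. */
static bool model_rtnl_held;
static int model_registered;	/* number of completed registrations */

static bool model_rtnl_trylock(void)
{
	if (model_rtnl_held)
		return false;	/* a real blocking lock would hang here */
	model_rtnl_held = true;
	return true;
}

static void model_rtnl_unlock(void)
{
	assert(model_rtnl_held);
	model_rtnl_held = false;
}

/* _locked variant: the caller must already hold the lock. */
static int model_register_locked(void)
{
	assert(model_rtnl_held);
	model_registered++;
	return 0;
}

/* Public variant: takes the lock around the _locked body. */
static int model_register(void)
{
	int err;

	if (!model_rtnl_trylock())
		return -1;	/* models the self-deadlock */
	err = model_register_locked();
	model_rtnl_unlock();
	return err;
}

/* A caller that already holds the lock, like the SFP path in the text. */
static int model_locked_caller(bool use_locked_variant)
{
	int err;

	if (!model_rtnl_trylock())
		return -1;
	err = use_locked_variant ? model_register_locked()
				 : model_register();
	model_rtnl_unlock();
	return err;
}
```

model_locked_caller(false) fails because the public variant tries to
take a lock its caller already holds; model_locked_caller(true) is the
shape the sfp.c change adopts.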
rtnl must span the full registration sequence through device_add(),
not just phy_try_attach_pse(): a PSE_REGISTERED event firing between
a narrow attach lock and device_add() would walk mdio_bus_type, find
the phy not yet on the bus, and leave it permanently unattached.
With rtnl held across the full registration sequence:
- At phy_device_register_locked(), phy_try_attach_pse() attempts
an of_pse_control_get() for phys whose DT pses phandle resolves
now. If the controller is already registered, psec is attached
before device_add() makes the phy visible on mdio_bus_type.
If the controller is not yet registered, the swallow-error path
leaves psec NULL and relies on the subsequent notifier event.
- On PSE_REGISTERED: an rtnl-guarded bus walk retries the attach
for every registered phy whose psec is still NULL. This is the
"phy was enumerated before the PSE controller loaded" case,
which is the root cause of the boot-time probe-retry storm on
systems with a modular PSE controller driver. Because the
pse_controller_notifier is fired synchronously, a concurrent
pse_controller_register() either (a) completes list_add and
releases pse_list_mutex before phy registration takes rtnl, in
which case phy_try_attach_pse() finds the controller in the
list and attaches; or (b) fires its notifier while phy
registration holds rtnl, in which case the callback blocks on
rtnl until registration returns, then walks the bus and finds
the phy fully registered (attaching if psec is still NULL).
- On PSE_UNREGISTERED: an rtnl-guarded bus walk releases every
phydev->psec that targets the departing controller before
pse_release_pis() frees pcdev->pi. Without this, a phy still
holding a pse_control reference would cause a use-after-free
in __pse_control_release's pcdev->pi[psec->id] access, and the
PSE driver module could not finish unloading while any phy
still held the module reference that __pse_control_release()
drops via module_put().
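The three cases above can be sketched as a single-threaded userspace
model. It leaves out locking entirely and assumes a fixed array in
place of the mdio_bus_type walk; every model_* name is hypothetical and
the code is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NPHYS 3

struct model_controller {
	bool listed;	/* resolvable by lookups while true */
	int refs;	/* outstanding control handles */
};

struct model_phy {
	struct model_controller *wants;	/* target of the DT phandle, or NULL */
	struct model_controller *psec;	/* attached handle, or NULL */
	bool on_bus;
};

static struct model_phy model_bus[MODEL_NPHYS];
static int model_peak_refs, model_final_refs;

/* Best-effort attach: failure is swallowed and retried on a later event. */
static void model_try_attach(struct model_phy *phy)
{
	if (!phy->wants || phy->psec || !phy->wants->listed)
		return;
	phy->wants->refs++;
	phy->psec = phy->wants;
}

static void model_phy_register(struct model_phy *phy)
{
	model_try_attach(phy);	/* attach before the phy becomes visible */
	phy->on_bus = true;
}

static void model_controller_register(struct model_controller *c)
{
	c->listed = true;
	for (size_t i = 0; i < MODEL_NPHYS; i++)	/* REGISTERED walk */
		if (model_bus[i].on_bus)
			model_try_attach(&model_bus[i]);
}

static void model_controller_unregister(struct model_controller *c)
{
	for (size_t i = 0; i < MODEL_NPHYS; i++) {	/* UNREGISTERED walk */
		if (model_bus[i].psec != c)
			continue;
		model_bus[i].psec = NULL;
		c->refs--;
	}
	c->listed = false;	/* teardown only after every handle is dropped */
}

/* phy 0 registers early, the controller loads, phy 1 registers late. */
static bool model_scenario(void)
{
	struct model_controller c = { .listed = false, .refs = 0 };
	bool attached;

	for (size_t i = 0; i < MODEL_NPHYS; i++)
		model_bus[i] = (struct model_phy){ 0 };
	model_bus[0].wants = &c;
	model_bus[1].wants = &c;	/* phy 2 has no phandle */

	model_phy_register(&model_bus[0]);	/* controller absent: psec NULL */
	model_phy_register(&model_bus[2]);
	model_controller_register(&c);		/* walk attaches phy 0 */
	model_phy_register(&model_bus[1]);	/* attaches at registration */

	model_peak_refs = c.refs;
	attached = model_bus[0].psec == &c && model_bus[1].psec == &c &&
		   !model_bus[2].psec;

	model_controller_unregister(&c);
	model_final_refs = c.refs;
	return attached;
}
```

In the model, no probe ever fails or retries: the early phy simply
stays unattached until the controller shows up, and the controller
leaves with a reference count of zero.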
Introduce phy_try_attach_pse() as the rtnl-guarded helper used by
both the register path and the notifier walk. Holding rtnl across
of_pse_control_get() is safe because no path acquires rtnl while
holding pse_list_mutex.
Expose pse_control_matches_pcdev() as a predicate so subscribers
can identify which of their held pse_control references target a
given controller, without leaking the struct pse_controller_dev *
out of pse_control opacity.
Move the final pse_control_put() of phydev->psec from
phy_device_remove() to phy_device_release(). The kobject release
callback runs only after every reference on the device has been
dropped, including the bus iterator references taken by
bus_for_each_dev() in the notifier walk, which means by the time
release fires no concurrent reader or writer of phydev->psec can
exist. The mdio_bus_type klist is set up in bus_register() with
klist_devices_get() / klist_devices_put() (drivers/base/bus.c),
which bracket each iteration step with get_device() / put_device()
on the underlying struct device; that reference defers the release
callback from firing until the walk has advanced past this phy.
Keeping phy_device_remove() unchanged avoids introducing a new
locking contract on its many callers (sfp, fixed_phy, xgbe, hns,
netsec, bcm_sf2, mdiobus_unregister).
Finally, delete fwnode_find_pse_control() and its call site in
fwnode_mdiobus_register_phy(), and drop the PSE header from
fwnode_mdio.c. This removes the probe-time -EPROBE_DEFER coupling
between mdio and pse-pd that caused the boot-hang regression on
systems with a modular PSE controller driver and a DT phy with a
pses phandle: the MDIO/DSA probe no longer sees any PSE-originated
-EPROBE_DEFER, so the probe-retry storm is gone. fwnode_mdio is
now PSE-agnostic.
Fixes: fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse control")
Signed-off-by: Corey Leavitt <corey@leavitt.info>
---
drivers/net/mdio/fwnode_mdio.c | 34 ----------
drivers/net/phy/phy_device.c | 144 ++++++++++++++++++++++++++++++++++++++---
drivers/net/phy/sfp.c | 2 +-
drivers/net/pse-pd/pse_core.c | 14 ++++
include/linux/phy.h | 2 +
include/linux/pse-pd/pse.h | 9 +++
6 files changed, 161 insertions(+), 44 deletions(-)
diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c
index ba7091518265..7bd979b59f49 100644
--- a/drivers/net/mdio/fwnode_mdio.c
+++ b/drivers/net/mdio/fwnode_mdio.c
@@ -11,33 +11,11 @@
#include <linux/fwnode_mdio.h>
#include <linux/of.h>
#include <linux/phy.h>
-#include <linux/pse-pd/pse.h>
MODULE_AUTHOR("Calvin Johnson <calvin.johnson@oss.nxp.com>");
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("FWNODE MDIO bus (Ethernet PHY) accessors");
-static struct pse_control *
-fwnode_find_pse_control(struct fwnode_handle *fwnode,
- struct phy_device *phydev)
-{
- struct pse_control *psec;
- struct device_node *np;
-
- if (!IS_ENABLED(CONFIG_PSE_CONTROLLER))
- return NULL;
-
- np = to_of_node(fwnode);
- if (!np)
- return NULL;
-
- psec = of_pse_control_get(np, phydev);
- if (PTR_ERR(psec) == -ENOENT)
- return NULL;
-
- return psec;
-}
-
static struct mii_timestamper *
fwnode_find_mii_timestamper(struct fwnode_handle *fwnode)
{
@@ -118,7 +96,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
struct fwnode_handle *child, u32 addr)
{
struct mii_timestamper *mii_ts = NULL;
- struct pse_control *psec = NULL;
struct phy_device *phy;
bool is_c45;
u32 phy_id;
@@ -159,14 +136,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
goto clean_phy;
}
- psec = fwnode_find_pse_control(child, phy);
- if (IS_ERR(psec)) {
- rc = PTR_ERR(psec);
- goto unregister_phy;
- }
-
- phy->psec = psec;
-
/* phy->mii_ts may already be defined by the PHY driver. A
* mii_timestamper probed via the device tree will still have
* precedence.
@@ -176,9 +145,6 @@ int fwnode_mdiobus_register_phy(struct mii_bus *bus,
return 0;
-unregister_phy:
- if (is_acpi_node(child) || is_of_node(child))
- phy_device_remove(phy);
clean_phy:
phy_device_free(phy);
clean_mii_ts:
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index c2cdf1ae3542..7948800e6e49 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -223,8 +223,19 @@ static void phy_mdio_device_free(struct mdio_device *mdiodev)
static void phy_device_release(struct device *dev)
{
+ struct phy_device *phydev = to_phy_device(dev);
+
+ /* bus_for_each_dev() holds get_device() across each iteration
+ * step, deferring this release callback until any in-flight PSE
+ * notifier walk has advanced past this phy. pse_control_put()
+ * takes pse_list_mutex, so this path must run in sleepable
+ * context.
+ */
+ might_sleep();
+ pse_control_put(phydev->psec);
+
fwnode_handle_put(dev->fwnode);
- kfree(to_phy_device(dev));
+ kfree(phydev);
}
static void phy_mdio_device_remove(struct mdio_device *mdiodev)
@@ -1102,14 +1113,102 @@ struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45)
}
EXPORT_SYMBOL(get_phy_device);
-/**
- * phy_device_register - Register the phy device on the MDIO bus
- * @phydev: phy_device structure to be added to the MDIO bus
+/* Best-effort attach of phydev->psec from a DT `pses = <&...>` phandle.
+ * Caller must hold rtnl. Errors are swallowed; the notifier retries
+ * at PSE_REGISTERED time.
*/
-int phy_device_register(struct phy_device *phydev)
+static void phy_try_attach_pse(struct phy_device *phydev)
+{
+ struct pse_control *psec;
+ struct device_node *np;
+
+ ASSERT_RTNL();
+
+ np = phydev->mdio.dev.of_node;
+ if (!np)
+ return;
+
+ if (phydev->psec)
+ return;
+
+ psec = of_pse_control_get(np, phydev);
+ if (IS_ERR(psec))
+ return;
+
+ phydev->psec = psec;
+}
+
+static int phy_pse_attach_one(struct device *dev, void *data __maybe_unused)
+{
+ ASSERT_RTNL();
+
+ if (dev->type != &mdio_bus_phy_type)
+ return 0;
+
+ phy_try_attach_pse(to_phy_device(dev));
+ return 0;
+}
+
+static int phy_pse_detach_one(struct device *dev, void *data)
+{
+ struct pse_controller_dev *pcdev = data;
+ struct phy_device *phydev;
+ struct pse_control *psec;
+
+ ASSERT_RTNL();
+
+ if (dev->type != &mdio_bus_phy_type)
+ return 0;
+
+ phydev = to_phy_device(dev);
+ psec = phydev->psec;
+ if (!psec || !pse_control_matches_pcdev(psec, pcdev))
+ return 0;
+
+ phydev->psec = NULL;
+ pse_control_put(psec);
+ return 0;
+}
+
+static int phy_pse_notifier_event(struct notifier_block *nb,
+ unsigned long event, void *data)
+{
+ switch (event) {
+ case PSE_REGISTERED:
+ rtnl_lock();
+ bus_for_each_dev(&mdio_bus_type, NULL, NULL,
+ phy_pse_attach_one);
+ rtnl_unlock();
+ return NOTIFY_OK;
+ case PSE_UNREGISTERED:
+ rtnl_lock();
+ bus_for_each_dev(&mdio_bus_type, NULL, data,
+ phy_pse_detach_one);
+ rtnl_unlock();
+ return NOTIFY_OK;
+ default:
+ return NOTIFY_DONE;
+ }
+}
+
+static struct notifier_block phy_pse_notifier __read_mostly = {
+ .notifier_call = phy_pse_notifier_event,
+};
+
+/**
+ * phy_device_register_locked - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Same as phy_device_register() but caller must already hold rtnl_lock().
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register_locked(struct phy_device *phydev)
{
int err;
+ ASSERT_RTNL();
+
err = mdiobus_register_device(&phydev->mdio);
if (err)
return err;
@@ -1124,6 +1223,8 @@ int phy_device_register(struct phy_device *phydev)
goto out;
}
+ phy_try_attach_pse(phydev);
+
err = device_add(&phydev->mdio.dev);
if (err) {
phydev_err(phydev, "failed to add\n");
@@ -1133,12 +1234,32 @@ int phy_device_register(struct phy_device *phydev)
return 0;
out:
- /* Assert the reset signal */
+ /* If phy_try_attach_pse() set phydev->psec before device_add()
+ * failed, the caller's phy_device_free() -> phy_device_release()
+ * chain will drop it.
+ */
phy_device_reset(phydev, 1);
-
mdiobus_unregister_device(&phydev->mdio);
return err;
}
+EXPORT_SYMBOL(phy_device_register_locked);
+
+/**
+ * phy_device_register - Register the phy device on the MDIO bus
+ * @phydev: phy_device structure to be added to the MDIO bus
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+int phy_device_register(struct phy_device *phydev)
+{
+ int err;
+
+ rtnl_lock();
+ err = phy_device_register_locked(phydev);
+ rtnl_unlock();
+
+ return err;
+}
EXPORT_SYMBOL(phy_device_register);
/**
@@ -1152,8 +1273,6 @@ EXPORT_SYMBOL(phy_device_register);
void phy_device_remove(struct phy_device *phydev)
{
unregister_mii_timestamper(phydev->mii_ts);
- pse_control_put(phydev->psec);
-
device_del(&phydev->mdio.dev);
/* Assert the reset signal */
@@ -3962,8 +4081,14 @@ static int __init phy_init(void)
if (rc)
goto err_c45;
+ rc = pse_register_notifier(&phy_pse_notifier);
+ if (rc)
+ goto err_genphy;
+
return 0;
+err_genphy:
+ phy_driver_unregister(&genphy_driver);
err_c45:
phy_driver_unregister(&genphy_c45_driver);
err_ethtool_phy_ops:
@@ -3980,6 +4105,7 @@ static int __init phy_init(void)
static void __exit phy_exit(void)
{
+ pse_unregister_notifier(&phy_pse_notifier);
phy_driver_unregister(&genphy_c45_driver);
phy_driver_unregister(&genphy_driver);
rtnl_lock();
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index bd970f753beb..d19fe0f30c5d 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -1932,7 +1932,7 @@ static int sfp_sm_probe_phy(struct sfp *sfp, int addr, bool is_c45)
/* Mark this PHY as being on a SFP module */
phy->is_on_sfp_module = true;
- err = phy_device_register(phy);
+ err = phy_device_register_locked(phy);
if (err) {
phy_device_free(phy);
dev_err(sfp->dev, "phy_device_register failed: %pe\n",
diff --git a/drivers/net/pse-pd/pse_core.c b/drivers/net/pse-pd/pse_core.c
index 82125502a8e3..a0667324a029 100644
--- a/drivers/net/pse-pd/pse_core.c
+++ b/drivers/net/pse-pd/pse_core.c
@@ -2016,3 +2016,17 @@ bool pse_has_c33(struct pse_control *psec)
return psec->pcdev->types & ETHTOOL_PSE_C33;
}
EXPORT_SYMBOL_GPL(pse_has_c33);
+
+/**
+ * pse_control_matches_pcdev - Test whether a pse_control targets a controller
+ * @psec: pse_control obtained from of_pse_control_get()
+ * @pcdev: PSE controller to compare against
+ *
+ * Return: %true if @psec was obtained from @pcdev, %false otherwise.
+ */
+bool pse_control_matches_pcdev(struct pse_control *psec,
+ struct pse_controller_dev *pcdev)
+{
+ return psec->pcdev == pcdev;
+}
+EXPORT_SYMBOL_GPL(pse_control_matches_pcdev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 199a7aaa341b..865b9baddb85 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -2158,6 +2158,8 @@ struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode);
struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode);
struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45);
int phy_device_register(struct phy_device *phy);
+/* Caller must hold rtnl_lock(); see phy_device_register() for the public form. */
+int phy_device_register_locked(struct phy_device *phy);
void phy_device_free(struct phy_device *phydev);
void phy_device_remove(struct phy_device *phydev);
int phy_get_c45_ids(struct phy_device *phydev);
diff --git a/include/linux/pse-pd/pse.h b/include/linux/pse-pd/pse.h
index 78fe3a2b1ea8..d4310ca71a3e 100644
--- a/include/linux/pse-pd/pse.h
+++ b/include/linux/pse-pd/pse.h
@@ -385,6 +385,9 @@ int pse_ethtool_set_prio(struct pse_control *psec,
bool pse_has_podl(struct pse_control *psec);
bool pse_has_c33(struct pse_control *psec);
+bool pse_control_matches_pcdev(struct pse_control *psec,
+ struct pse_controller_dev *pcdev);
+
int pse_register_notifier(struct notifier_block *nb);
int pse_unregister_notifier(struct notifier_block *nb);
@@ -438,6 +441,12 @@ static inline bool pse_has_c33(struct pse_control *psec)
return false;
}
+static inline bool pse_control_matches_pcdev(struct pse_control *psec,
+ struct pse_controller_dev *pcdev)
+{
+ return false;
+}
+
static inline int pse_register_notifier(struct notifier_block *nb)
{
return 0;
--
2.53.0
* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
2026-04-23 7:22 Corey Leavitt
@ 2026-04-23 8:40 ` Corey Leavitt
2026-04-23 12:08 ` Jonas Gorski
0 siblings, 1 reply; 10+ messages in thread
From: Corey Leavitt @ 2026-04-23 8:40 UTC (permalink / raw)
To: Oleksij Rempel, Kory Maincent, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King
Cc: Andrew Lunn, netdev, linux-kernel
Apologies for the noise -- this series was inadvertently sent twice. The
first send went out through an SMTP path that stripped the patatt
developer signature and re-encoded the bodies as quoted-printable.
Please disregard this thread and use the clean, signed copy as canonical:
https://lore.kernel.org/netdev/20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info/T/
Any v2 will land as a reply to that thread. Sorry for the extra inbox
traffic.
Thanks,
Corey
* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
` (3 preceding siblings ...)
2026-04-23 7:42 ` [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook Corey Leavitt via B4 Relay
@ 2026-04-23 9:05 ` Kory Maincent
2026-04-23 9:48 ` Corey Leavitt
4 siblings, 1 reply; 10+ messages in thread
From: Kory Maincent @ 2026-04-23 9:05 UTC (permalink / raw)
To: Corey Leavitt via B4 Relay
Cc: corey, Oleksij Rempel, Andrew Lunn, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Heiner Kallweit, Russell King,
Andrew Lunn, netdev, linux-kernel
Hello Corey,
On Thu, 23 Apr 2026 01:42:13 -0600
Corey Leavitt via B4 Relay <devnull+corey.leavitt.info@kernel.org> wrote:
> On systems where a PSE controller driver loads as a module and a
> device-tree PHY node carries a `pses = <&pse_pi>` reference,
> fwnode_mdiobus_register_phy() tries to resolve the PSE handle before
> the controller driver has probed. of_pse_control_get() returns
> -EPROBE_DEFER, the enclosing MDIO/DSA probe fails, and driver-core
> re-queues the work. The retry loop spins until the PSE driver module
> loads and its controller registers.
I will take a look at your series, but FYI there was already an RFC series
tackling this issue:
https://lore.kernel.org/lkml/20260330132952.2950531-4-github@szelinsky.de/
It raised a debate and there is currently no final solution.
> Commit fa2f0454174c ("net: pse-pd: Introduce attached_phydev to pse
> control") made each retry expensive. It reordered
> fwnode_mdiobus_register_phy() so the PHY is registered before the
> PSE lookup. Every deferral now performs a full
> phy_device_register() / phy_device_remove() cycle. On a board with a
> sufficiently tight watchdog the retry loop can starve the watchdog
> kthread. On the reporting hardware (MT7621 + gpio-wdt, 1-second
> margin) the retry loop converts a slow probe phase into a reset
> before userspace loads.
>
> The affected population today looks small. OpenWrt, where PSE
> actually ships, is still on 6.12 (pre-regression), and most
> environments with CONFIG_PSE_*=m do not have boards whose DT
> references a PSE controller from a PHY. Still, the mechanism is
> general. Any modular PSE driver combined with the documented
> `pses = <&...>` binding reproduces the retry loop. Whether it
> reaches brick-grade or merely slow/flaky boot depends on local
> watchdog timing. More exposure is expected as distribution and
> embedded kernels move to 6.13 and later.
>
> The narrow fix would be to partially revert the ordering in
> fa2f0454174c so each defer is cheap again. That keeps the same
> architecture (fwnode_mdio holding PSE knowledge, -EPROBE_DEFER
> flowing across the subsystem boundary), and any future reorder
> reintroduces the same class of bug. This series takes the larger
> fix: decouple PSE controller lookup from MDIO registration entirely.
> pse_core now publishes a BLOCKING_NOTIFIER chain with REGISTERED
> and UNREGISTERED events. phy_device subscribes, owns phydev->psec
> lifetime, and attaches PSE handles in response to controller
> lifecycle rather than during probe. fwnode_mdio loses its PSE
> awareness, and -EPROBE_DEFER no longer flows out of fwnode_mdio.
>
> Patch breakdown:
>
> 1. Scope the pse_control regulator handle to kref lifetime
> (Fixes: d83e13761d5b). A latent bug that patch 4 makes
> reachable.
> 2. Add the notifier chain (enum, head, register/unregister
> helpers). Pure infrastructure. No subscribers yet, no
> observable change.
> 3. Fire REGISTERED and UNREGISTERED events from the controller
> register/unregister paths. Still no subscribers, still no
> observable change.
> 4. Subscribe from the PHY layer, take ownership of phydev->psec
> via the notifier, and remove fwnode_find_pse_control() from
> fwnode_mdio.
>
> Patch 1 is bundled here per stable-kernel-rules section 4
> reachability guidance. On mainline today, with no notifier
> subscriber, no caller drives the dangling regulator-handle sequence.
> Patches 2 and 3 are deliberately split to separate "add
> infrastructure" from "wire it up". Happy to fold them if maintainers
> prefer the combined form.
>
> Validated on a Cudy C200P (MT7621 + IP804AR) running an OpenWrt
> build of 6.18.21 with the series applied. A lockdep build
> (CONFIG_PROVE_LOCKING + CONFIG_DEBUG_ATOMIC_SLEEP) shows no splats
> from the series' code paths during boot, PHY attach, PHY detach, or
> a full controller unbind/rebind cycle. ethtool --set-pse drives all
> four PoE-capable LAN ports, and a Ruckus H510 class-4 PD plugged
> into lan3 negotiates and receives 48 V.
>
> The C200P has no SFP cage, so the SFP path change in sfp.c
> (phy_device_register -> phy_device_register_locked) isn't exercised
> on the bench. Verified by call-graph audit: every path reaching
> sfp_sm_probe_phy() holds rtnl at entry, via sfp_timeout,
> sfp_check_state, sfp_probe, sfp_remove, or
> sfp_bus_{add,del}_upstream.
>
> Not addressed by this series: ethtool --show-pse returns "No data
> available" on DSA netdevs in 6.18, because dev->phydev is NULL for
> DSA-frontend netdevs and ethnl_req_get_phydev() therefore returns
> NULL. That's a DSA / ethtool integration quirk that predates this
> work.
>
> Sending as RFC because this is my first net-next series. I'd
> appreciate maintainer guidance on whether patch 1 should go to net
> rather than net-next, and whether the patch 2/3 split is preferred
> to the combined form.
>
> Signed-off-by: Corey Leavitt <corey@leavitt.info>
> ---
> Corey Leavitt (4):
> net: pse-pd: scope pse_control regulator handle to kref lifetime
> net: pse-pd: add notifier chain for controller lifecycle events
> net: pse-pd: fire lifecycle events on controller register/unregister
> net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook
>
> drivers/net/mdio/fwnode_mdio.c | 34 ----------
> drivers/net/phy/phy_device.c | 144 ++++++++++++++++++++++++++++++++++++++---
> drivers/net/phy/sfp.c | 2 +-
> drivers/net/pse-pd/pse_core.c | 60 ++++++++++++++++-
> include/linux/phy.h | 2 +
> include/linux/pse-pd/pse.h | 41 ++++++++++++
> 6 files changed, 236 insertions(+), 47 deletions(-)
> ---
> base-commit: 1f5ffc672165ff851063a5fd044b727ab2517ae3
> change-id: 20260422-pse-notifier-decouple-efa80d77f4be
>
> Best regards,
> --
> Corey Leavitt <corey@leavitt.info>
>
>
--
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com
* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
2026-04-23 9:05 ` [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Kory Maincent
@ 2026-04-23 9:48 ` Corey Leavitt
0 siblings, 0 replies; 10+ messages in thread
From: Corey Leavitt @ 2026-04-23 9:48 UTC (permalink / raw)
To: Kory Maincent, Carlo Szelinsky
Cc: Oleksij Rempel, Andrew Lunn, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Heiner Kallweit,
Russell King, netdev, linux-kernel
Hi Kory,
Thanks for the pointer -- I had not seen Carlo's thread; I should have
searched lore before sending and will do so before v2. Adding Carlo on
cc.
Having read it end-to-end, my read of the state as of 2026-04-13 was
that the conversation had narrowed to two open directions: propagate
EPROBE_DEFER further up into phylink/MAC probe (Andrew/Russell), or
resolve psec at PSE controller register time (your msg on 9 Apr,
"save the phandle ... then at PSE register time look for each PHY and
try to resolve every unresolved phandle"). Nothing concrete had been
posted for either.
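For readers who have not followed the earlier thread, the second
direction can be pictured with a small userspace sketch. Everything in
it is a hypothetical stand-in (struct phy, struct controller,
wanted_id, pse_notify()); only the PSE_REGISTERED / PSE_UNREGISTERED
event names come from the series, and matching by an integer id is an
assumption made to keep the example short, not how the kernel resolves
a phandle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum pse_event { PSE_REGISTERED, PSE_UNREGISTERED };

struct controller { int id; };

struct phy {
	int wanted_id;            /* controller the DT reference points at */
	struct controller *psec;  /* NULL until the reference is resolved */
};

#define NPHYS 3
static struct phy phys[NPHYS] = { { 1, NULL }, { 2, NULL }, { 1, NULL } };

/* Walk every PHY on a controller lifecycle event.
 * On PSE_REGISTERED, resolve each still-unresolved reference to @c.
 * On PSE_UNREGISTERED, drop every reference that points at @c.
 * Returns the number of PHYs whose state changed.
 */
static int pse_notify(enum pse_event ev, struct controller *c)
{
	int changed = 0;

	assert(c != NULL);

	for (size_t i = 0; i < NPHYS; i++) {
		struct phy *p = &phys[i];

		if (ev == PSE_REGISTERED && !p->psec && p->wanted_id == c->id) {
			p->psec = c;
			changed++;
		} else if (ev == PSE_UNREGISTERED && p->psec == c) {
			p->psec = NULL;
			changed++;
		}
	}
	return changed;
}
```

The point of the shape is that the PHY never has to fail its own
registration: an unresolved reference simply stays NULL until a
matching controller shows up, and a second REGISTERED event for the
same controller is a no-op.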
This RFC implements the second direction. pse_core publishes a
BLOCKING_NOTIFIER chain with REGISTERED / UNREGISTERED events,
phy_device subscribes, and psec ownership moves from fwnode_mdio probe
into the notifier handler. Concretely with respect to points raised in
the earlier thread:
- fwnode_mdio loses PSE awareness entirely, so the MDIO bus scan no
longer sees -EPROBE_DEFER from PSE lookup. Consistent with
Andrew's point that bus and device lifecycles are separate.
- psec is acquired at PSE register time, before
regulator_late_cleanup (30s) can run. Carlo's admin_state_synced
guard (his patch 1) therefore isn't needed in this model. psec
resolution happens eagerly on the REGISTERED event rather than
lazily on first ethtool access, so his patch 2 is also not needed.
And because fwnode_mdio no longer looks up PSE at all, the
non-fatal EPROBE_DEFER handling there (patch 3) drops out. This
series is a different architectural shape, not an increment on
his v2.
- Oleksij's concern about lazy resolution dropping UAPI
notifications is addressed: the notifier fires at register time,
so boot-time observer semantics are preserved.
- One caveat I already owe a fix for in v2: the attach helper in
phy_device currently treats every error from of_pse_control_get()
as retry-on-notifier, including non-transient ones. Carlo's v2
patch 3 was careful to differentiate -EPROBE_DEFER from bad-DT
errors at the fwnode_mdio lookup site (which matches his msg 1
concern about catching broken bindings at boot rather than
silently later). I need to preserve that discrimination at the
notifier-handler site -- phydev_warn() on anything other than
-EPROBE_DEFER. Trivial, but worth flagging.
- The DSA genphy force-bind sequence Carlo hit
(phy_attach_direct -> device_bind_driver -> deferred retry
skipped) does not apply, because psec attachment is not tied to
phy_probe.
- Patch 1 of this series scopes the regulator handle held by
pse_control to its own kref lifetime, fixing a latent
dangling-handle sequence that the notifier unregister path makes
reachable. This is a separate regulator-lifetime bug from the one
Carlo's patch 1 addresses.
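To make the v2 caveat above concrete, the discrimination could look
roughly like the userspace sketch below. This is not the v2 code:
classify_pse_lookup() and the attach_result enum are hypothetical
names, the fprintf() stands in for phydev_warn(), and EPROBE_DEFER is
defined locally because userspace errno.h does not provide it (517 is
the kernel-internal value).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno, absent from userspace */
#endif

/* Hypothetical outcome labels for one lookup attempt. */
enum attach_result { ATTACHED, DEFERRED, WARNED };

/* Classify the result of a PSE lookup: 0 on success, negative errno
 * on failure. Only -EPROBE_DEFER is treated as the quiet "controller
 * not registered yet" case; every other error is reported.
 */
static enum attach_result classify_pse_lookup(int err)
{
	assert(err <= 0);

	if (err == 0)
		return ATTACHED;
	if (err == -EPROBE_DEFER)
		return DEFERRED;

	/* Stand-in for phydev_warn() in the kernel. */
	fprintf(stderr, "PSE lookup failed: %d\n", err);
	return WARNED;
}
```

With this split a broken binding (for example an -EINVAL from the
lookup) is visible in the log at boot, while the expected
not-yet-registered case stays silent.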
Validated end-to-end on a Cudy C200P (MT7621 DSA + i2c IP804AR as
module), with lockdep active, across the i2c driver unbind/rebind
cycle that triggers UNREGISTERED -> REGISTERED on the live system.
The cover letter has the full evidence.
I would welcome your view on whether this is the shape you had in
mind on 9 Apr, or whether the MDI-based binding you raised with
Maxime is the better endpoint and we should be reshaping around that.
Happy to keep this RFC as the scaffolding either way.
Carlo -- your debugging work on the DSA phy_attach_direct interaction
is what made it clear this kind of approach was needed; thanks for
laying that groundwork. Would value your thoughts on the tradeoffs.
Best regards,
Corey
* Re: [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe
2026-04-23 8:40 ` Corey Leavitt
@ 2026-04-23 12:08 ` Jonas Gorski
0 siblings, 0 replies; 10+ messages in thread
From: Jonas Gorski @ 2026-04-23 12:08 UTC (permalink / raw)
To: Corey Leavitt, Oleksij Rempel, Kory Maincent, Andrew Lunn,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Heiner Kallweit, Russell King
Cc: Andrew Lunn, netdev, linux-kernel
On 23/04/2026 10:40, Corey Leavitt wrote:
> Apologies for the noise -- this series was inadvertently sent twice. The
> first send went out through an SMTP path that stripped the patatt
> developer signature and re-encoded the bodies as quoted-printable.
>
> Please disregard this thread and use the clean, signed copy as canonical:
>
> https://lore.kernel.org/netdev/20260423-pse-notifier-decouple-v1-0-86ed750a9d62@leavitt.info/T/
>
> Any v2 will land as a reply to that thread. Sorry for the extra inbox
> traffic.
https://docs.kernel.org/process/maintainer-netdev.html#resending-after-review
"The new version of patches should be posted as a separate thread, not as a
reply to the previous posting. Change log should include a link to the
previous posting (see Changes requested)."
Best regards,
Jonas
Thread overview: 10+ messages
2026-04-23 7:42 [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 1/4] net: pse-pd: scope pse_control regulator handle to kref lifetime Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 2/4] net: pse-pd: add notifier chain for controller lifecycle events Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 3/4] net: pse-pd: fire lifecycle events on controller register/unregister Corey Leavitt via B4 Relay
2026-04-23 7:42 ` [PATCH RFC net-next 4/4] net: phy: own phydev->psec via PSE notifier and remove fwnode_mdio hook Corey Leavitt via B4 Relay
2026-04-23 9:05 ` [PATCH RFC net-next 0/4] net: pse-pd: decouple controller lookup from MDIO probe Kory Maincent
2026-04-23 9:48 ` Corey Leavitt
-- strict thread matches above, loose matches on Subject: below --
2026-04-23 7:22 Corey Leavitt
2026-04-23 8:40 ` Corey Leavitt
2026-04-23 12:08 ` Jonas Gorski