[PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
@ 2023-05-22 15:58 Russell King (Oracle)
  2023-05-22 16:06 ` Russell King (Oracle)
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Russell King (Oracle) @ 2023-05-22 15:58 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev

When taking a network interface down (or removing a SFP module) after
the PHY has encountered an error, phy_stop() complains incorrectly
that it was called from HALTED state.

The reason this is incorrect is that the network driver will have
called phy_start() when the interface was brought up, and the fact
that the PHY has a problem bears no relationship to the administrative
state of the interface. Taking the interface administratively down
(which calls phy_stop()) is always the right thing to do after a
successful phy_start() call, whether or not the PHY has encountered
an error.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 11 +++++++----
 include/linux/phy.h   |  7 +++++--
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 0c0df38cd1ab..bdf00b2b2c1d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -52,6 +52,7 @@ static const char *phy_state_to_str(enum phy_state st)
 	PHY_STATE_STR(NOLINK)
 	PHY_STATE_STR(CABLETEST)
 	PHY_STATE_STR(HALTED)
+	PHY_STATE_STR(ERROR)
 	}
 
 	return NULL;
@@ -1184,7 +1185,7 @@ void phy_stop_machine(struct phy_device *phydev)
 static void phy_process_error(struct phy_device *phydev)
 {
 	mutex_lock(&phydev->lock);
-	phydev->state = PHY_HALTED;
+	phydev->state = PHY_ERROR;
 	mutex_unlock(&phydev->lock);
 
 	phy_trigger_machine(phydev);
@@ -1198,10 +1199,10 @@ static void phy_error_precise(struct phy_device *phydev,
 }
 
 /**
- * phy_error - enter HALTED state for this PHY device
+ * phy_error - enter ERROR state for this PHY device
  * @phydev: target phy_device struct
  *
- * Moves the PHY to the HALTED state in response to a read
+ * Moves the PHY to the ERROR state in response to a read
  * or write error, and tells the controller the link is down.
  * Must not be called from interrupt context, or while the
  * phydev->lock is held.
@@ -1326,7 +1327,8 @@ void phy_stop(struct phy_device *phydev)
 	struct net_device *dev = phydev->attached_dev;
 	enum phy_state old_state;
 
-	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN) {
+	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN &&
+	    phydev->state != PHY_ERROR) {
 		WARN(1, "called from state %s\n",
 		     phy_state_to_str(phydev->state));
 		return;
@@ -1443,6 +1445,7 @@ void phy_state_machine(struct work_struct *work)
 		}
 		break;
 	case PHY_HALTED:
+	case PHY_ERROR:
 		if (phydev->link) {
 			phydev->link = 0;
 			phy_link_down(phydev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 2da87a36200d..7addde5d14c0 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -497,14 +497,17 @@ struct phy_device *mdiobus_scan_c22(struct mii_bus *bus, int addr);
  * Once complete, move to UP to restart the PHY.
  * - phy_stop aborts the running test and moves to @PHY_HALTED
  *
- * @PHY_HALTED: PHY is up, but no polling or interrupts are done. Or
- * PHY is in an error state.
+ * @PHY_HALTED: PHY is up, but no polling or interrupts are done.
  * - phy_start moves to @PHY_UP
+ *
+ * @PHY_ERROR: PHY is up, but is in an error state.
+ * - phy_stop moves to @PHY_HALTED
  */
 enum phy_state {
 	PHY_DOWN = 0,
 	PHY_READY,
 	PHY_HALTED,
+	PHY_ERROR,
 	PHY_UP,
 	PHY_RUNNING,
 	PHY_NOLINK,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-05-22 15:58 [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY Russell King (Oracle)
@ 2023-05-22 16:06 ` Russell King (Oracle)
  2023-05-22 19:03 ` Florian Fainelli
  2023-05-24  7:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 14+ messages in thread
From: Russell King (Oracle) @ 2023-05-22 16:06 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev

On Mon, May 22, 2023 at 04:58:08PM +0100, Russell King (Oracle) wrote:
> When taking a network interface down (or removing a SFP module) after
> the PHY has encountered an error, phy_stop() complains incorrectly
> that it was called from HALTED state.
> 
> The reason this is incorrect is that the network driver will have
> called phy_start() when the interface was brought up, and the fact
> that the PHY has a problem bears no relationship to the administrative
> state of the interface. Taking the interface administratively down
> (which calls phy_stop()) is always the right thing to do after a
> successful phy_start() call, whether or not the PHY has encountered
> an error.

Note that I can reproduce this by repeatedly plugging and unplugging any
SFP with a PHY that we access - if one unplugs it while the PHY is being
accessed, phylib logs an error, and then the unplug event happens which
ends up correctly calling phy_stop(), which then spits out a kernel
warning.

One may suggest that this is an unlikely event, but any SFP using the
Rollball I2C protocol to access the PHY, each access can take tens of
milliseconds, which is more than enough time to hit this.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-05-22 15:58 [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY Russell King (Oracle)
  2023-05-22 16:06 ` Russell King (Oracle)
@ 2023-05-22 19:03 ` Florian Fainelli
  2023-09-04  9:50   ` Jijie Shao
  2023-05-24  7:30 ` patchwork-bot+netdevbpf
  2 siblings, 1 reply; 14+ messages in thread
From: Florian Fainelli @ 2023-05-22 19:03 UTC (permalink / raw)
  To: Russell King (Oracle), Andrew Lunn, Heiner Kallweit
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev

On 5/22/23 08:58, Russell King (Oracle) wrote:
> When taking a network interface down (or removing a SFP module) after
> the PHY has encountered an error, phy_stop() complains incorrectly
> that it was called from HALTED state.
> 
> The reason this is incorrect is that the network driver will have
> called phy_start() when the interface was brought up, and the fact
> that the PHY has a problem bears no relationship to the administrative
> state of the interface. Taking the interface administratively down
> (which calls phy_stop()) is always the right thing to do after a
> successful phy_start() call, whether or not the PHY has encountered
> an error.
> 
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>

I would argue that the entire phy_error() needs a revamp, yes it's 
important to know if we have an error "talking" to the PHY, but there is 
no much that is being actionable in other situations than pluggable SFP 
modules.
-- 
Florian


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-05-22 15:58 [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY Russell King (Oracle)
  2023-05-22 16:06 ` Russell King (Oracle)
  2023-05-22 19:03 ` Florian Fainelli
@ 2023-05-24  7:30 ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-05-24  7:30 UTC (permalink / raw)
  To: Russell King; +Cc: andrew, hkallweit1, davem, edumazet, kuba, pabeni, netdev

Hello:

This patch was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Mon, 22 May 2023 16:58:08 +0100 you wrote:
> When taking a network interface down (or removing a SFP module) after
> the PHY has encountered an error, phy_stop() complains incorrectly
> that it was called from HALTED state.
> 
> The reason this is incorrect is that the network driver will have
> called phy_start() when the interface was brought up, and the fact
> that the PHY has a problem bears no relationship to the administrative
> state of the interface. Taking the interface administratively down
> (which calls phy_stop()) is always the right thing to do after a
> successful phy_start() call, whether or not the PHY has encountered
> an error.
> 
> [...]

Here is the summary with links:
  - [net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
    https://git.kernel.org/netdev/net-next/c/59088b5a946e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-05-22 19:03 ` Florian Fainelli
@ 2023-09-04  9:50   ` Jijie Shao
  2023-09-04 13:43     ` Andrew Lunn
  2023-09-04 14:42     ` Russell King (Oracle)
  0 siblings, 2 replies; 14+ messages in thread
From: Jijie Shao @ 2023-09-04  9:50 UTC (permalink / raw)
  To: f.fainelli, Andrew Lunn
  Cc: davem, edumazet, hkallweit1, kuba, netdev, pabeni, rmk+kernel,
	shenjian15@huawei.com, liuyonglong@huawei.com, wangjie125,
	chenhao418, Hao Lan, shaojijie, wangpeiyang1@huawei.com

Hi all,
We encountered an issue when resetting our netdevice recently, it seems
related to this patch.

During our process, we stop phy first and call phy_start() later.
phy_check_link_status returns error because it read mdio failed. The
reason why it happened is that the cmdq is unusable when we reset and we
can't access to mdio.

The process and logs are showed as followed:
Process:
reset process       |    phy_state_machine           |  phy_state
==========================================================================
                     | mutex_lock(&phydev->lock);     | PHY_RUNNING
                     | ...                            |
                     | case PHY_RUNNING:              |
                     | err = phy_check_link_status()  | PHY_RUNNING
                     | ...                            |
                     | mutex_unlock(&phydev->lock)    | PHY_RUNNING
  phy_stop()         |                                |
    ...              |                                |
    mutex_lock()     |                                | PHY_RUNNING
    ...              |                                |
    phydev->state =  |                                |
      PHY_HALTED;    |                                |  PHY_HALTED
    ...              |                                |
    mutex_unlock()   |                                |  PHY_HALTED
                     | phy_error_precise():           |
                     |   mutex_lock(&phydev->lock);   | PHY_HALTED
                     |   phydev->state = PHY_ERROR;   | PHY_ERROR
                     |   mutex_unlock(&phydev->lock); | PHY_ERROR
                     |                                |
phy_start()         |                                |  PHY_ERROR
   ...               |                                |
Logs:
[ 2622.146721] hns3 0000:35:00.0 eth1: Setting reset type 6
[ 2622.155182] hns3 0000:35:00.0: received reset event, reset type is 6
[ 2622.171641] hns3 0000:35:00.0: global reset requested
[ 2622.181867] hns3 0000:35:00.0: global reset interrupt
[ 2623.351382] ------------[ cut here ]------------
[ 2623.358012] phy_check_link_status+0x0/0xe0: returned: -16
[ 2623.370106] hns3 0000:35:00.0 eth1: net stop
[ 2623.377599] WARNING: CPU: 0 PID: 10 at drivers/net/phy/phy.c:1211
phy_state_machine+0xac/0x2b8
[ 2623.386026] RTL8211F Gigabit Ethernet mii-0000:35:00.0:02: PHY state
change RUNNING -> HALTED
                 ...
[ 2623.540165] Call trace:
[ 2623.543034]  phy_state_machine+0xac/0x2b8
[ 2623.548028]  process_one_work+0x1ec/0x478
[ 2623.552732]  worker_thread+0x74/0x448
[ 2623.556855]  kthread+0x120/0x130
[ 2623.560920]  ret_from_fork+0x10/0x20
[ 2623.565355] ---[ end trace 0000000000000000 ]---
[ 2623.577722] RTL8211F Gigabit Ethernet mii-0000:35:00.0:02: PHY state
change RUNNING -> ERROR
[ 2623.590490] hns3 0000:35:00.0 eth1: link down
[ 2623.707230] hns3 0000:35:00.0: prepare wait ok
[ 2624.169139] hns3 0000:35:00.0: The firmware version is 3.10.11.25
[ 2624.501223] hns3 0000:35:00.0: phc initializes ok!
[ 2624.553486] hns3 0000:35:00.0: Reset done, hclge driver initialization
finished.
[ 2625.586470] ------------[ cut here ]------------
[ 2625.593882] called from state ERROR
[ 2625.600677] WARNING: CPU: 1 PID: 352 at drivers/net/phy/phy.c:1392
phy_start+0x50/0xc8
                 ...
[ 2625.750077] Call trace:
[ 2625.752799]  phy_start+0x50/0xc8
[ 2625.756974]  hclge_mac_start_phy+0x34/0x50 [hclge]
                 ...
[ 2625.831224] ---[ end trace 0000000000000000 ]---
[ 2625.843790] hns3 0000:35:00.0 eth1: net open

We supposed to start phy successfully after calling phy_stop. However, the
phy state is PHY_ERROR. As aboved process, we can find
phy_check_link_status is called before phy_stop, but the final phy state
is set due to an error from phy_check_link_status. Becuase we reset our
netdevice successfully, the phy should not be PHY_ERROR when we call
phy_start. So, we supposed it might be a bug.

Additionally, what can we do if the phy is in PHY_ERROR?

Thanks!
Jijie Shao

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-04  9:50   ` Jijie Shao
@ 2023-09-04 13:43     ` Andrew Lunn
  2023-09-05  8:49       ` Jijie Shao
  2023-09-04 14:42     ` Russell King (Oracle)
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Lunn @ 2023-09-04 13:43 UTC (permalink / raw)
  To: Jijie Shao
  Cc: f.fainelli, davem, edumazet, hkallweit1, kuba, netdev, pabeni,
	rmk+kernel, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com

On Mon, Sep 04, 2023 at 05:50:32PM +0800, Jijie Shao wrote:
> Hi all,
> We encountered an issue when resetting our netdevice recently, it seems
> related to this patch.
> 
> During our process, we stop phy first and call phy_start() later.
> phy_check_link_status returns error because it read mdio failed. The
> reason why it happened is that the cmdq is unusable when we reset and we
> can't access to mdio.

At what point in the flow below do you apply the reset which stops
access to the MDIO bus? Ideally you want to do phy_stop(), then apply
the reset, get the hardware working again, and then do a phy_start().

> 
> The process and logs are showed as followed:
> Process:
> reset process       |    phy_state_machine           |  phy_state
> ==========================================================================
>                     | mutex_lock(&phydev->lock);     | PHY_RUNNING
>                     | ...                            |
>                     | case PHY_RUNNING:              |
>                     | err = phy_check_link_status()  | PHY_RUNNING
>                     | ...                            |
>                     | mutex_unlock(&phydev->lock)    | PHY_RUNNING
>  phy_stop()         |                                |
>    ...              |                                |
>    mutex_lock()     |                                | PHY_RUNNING
>    ...              |                                |
>    phydev->state =  |                                |
>      PHY_HALTED;    |                                |  PHY_HALTED
>    ...              |                                |
>    mutex_unlock()   |                                |  PHY_HALTED
>                     | phy_error_precise():           |
>                     |   mutex_lock(&phydev->lock);   | PHY_HALTED
>                     |   phydev->state = PHY_ERROR;   | PHY_ERROR
>                     |   mutex_unlock(&phydev->lock); | PHY_ERROR
>                     |                                |
> phy_start()         |                                |  PHY_ERROR
>   ...               |                                |


	Andrew

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-04  9:50   ` Jijie Shao
  2023-09-04 13:43     ` Andrew Lunn
@ 2023-09-04 14:42     ` Russell King (Oracle)
  2023-09-05  8:59       ` Jijie Shao
  1 sibling, 1 reply; 14+ messages in thread
From: Russell King (Oracle) @ 2023-09-04 14:42 UTC (permalink / raw)
  To: Jijie Shao
  Cc: f.fainelli, Andrew Lunn, davem, edumazet, hkallweit1, kuba,
	netdev, pabeni, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com

On Mon, Sep 04, 2023 at 05:50:32PM +0800, Jijie Shao wrote:
> Hi all,
> We encountered an issue when resetting our netdevice recently, it seems
> related to this patch.
> 
> During our process, we stop phy first and call phy_start() later.
> phy_check_link_status returns error because it read mdio failed. The
> reason why it happened is that the cmdq is unusable when we reset and we
> can't access to mdio.

Are you suggesting that the sequence is:

phy_stop();
reset netdev
phy_start();

?

Is the reason for doing this because you've already detected an issue
with the hardware, and you're trying to recover it - and before you've
called phy_stop() the hardware is already dead?

If that is the case, I'm not really sure what you expect to happen
here. You've identified a race where the state machine is running in
unison with phy_stop(), but in this circumstance it is also possible
that the state machine could complete executing and have called
phy_error_precise() before phy_stop() has even been called. In that
case, you'll still get a warning-splat on the console from
phy_error_precise().

The only difference is that phy_stop() won't warn.

That all said, this is obviously buggy, because phy_stop() has set
the phydev state to PHY_HALTED and the state machine has unexpectedly
changed its state.

I wonder whether we should be tracking the phy_start/stop state
separately, since we've had issues with phy_stop() warning when an
error has occurred (see commit 59088b5a946e).

Maybe something like this (untested)?

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index df54c137c5f5..d57f6de8a562 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -810,7 +810,8 @@ int phy_start_cable_test(struct phy_device *phydev,
 		goto out;
 	}
 
-	if (phydev->state < PHY_UP ||
+	if (phydev->oper_state != PHY_OPER_STARTED ||
+	    phydev->state < PHY_UP ||
 	    phydev->state > PHY_CABLETEST) {
 		NL_SET_ERR_MSG(extack,
 			       "PHY not configured. Try setting interface up");
@@ -881,7 +882,8 @@ int phy_start_cable_test_tdr(struct phy_device *phydev,
 		goto out;
 	}
 
-	if (phydev->state < PHY_UP ||
+	if (phydev->oper_state != PHY_OPER_STARTED ||
+	    phydev->state < PHY_UP ||
 	    phydev->state > PHY_CABLETEST) {
 		NL_SET_ERR_MSG(extack,
 			       "PHY not configured. Try setting interface up");
@@ -1364,10 +1366,8 @@ void phy_stop(struct phy_device *phydev)
 	struct net_device *dev = phydev->attached_dev;
 	enum phy_state old_state;
 
-	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN &&
-	    phydev->state != PHY_ERROR) {
-		WARN(1, "called from state %s\n",
-		     phy_state_to_str(phydev->state));
+	if (phydev->oper_state != PHY_OPER_STARTED) {
+		WARN(1, "called when not started\n");
 		return;
 	}
 
@@ -1382,6 +1382,7 @@ void phy_stop(struct phy_device *phydev)
 	if (phydev->sfp_bus)
 		sfp_upstream_stop(phydev->sfp_bus);
 
+	phydev->oper_state = PHY_OPER_STOPPED;
 	phydev->state = PHY_HALTED;
 	phy_process_state_change(phydev, old_state);
 
@@ -1411,9 +1412,8 @@ void phy_start(struct phy_device *phydev)
 {
 	mutex_lock(&phydev->lock);
 
-	if (phydev->state != PHY_READY && phydev->state != PHY_HALTED) {
-		WARN(1, "called from state %s\n",
-		     phy_state_to_str(phydev->state));
+	if (phydev->oper_state != PHY_OPER_STOPPED) {
+		WARN(1, "called when not stopped\n");
 		goto out;
 	}
 
@@ -1423,6 +1423,7 @@ void phy_start(struct phy_device *phydev)
 	/* if phy was suspended, bring the physical link up again */
 	__phy_resume(phydev);
 
+	phydev->oper_state = PHY_OPER_STARTED;
 	phydev->state = PHY_UP;
 
 	phy_start_machine(phydev);
@@ -1442,14 +1443,18 @@ void phy_state_machine(struct work_struct *work)
 			container_of(dwork, struct phy_device, state_queue);
 	struct net_device *dev = phydev->attached_dev;
 	bool needs_aneg = false, do_suspend = false;
-	enum phy_state old_state;
+	enum phy_state old_state, state;
 	const void *func = NULL;
 	bool finished = false;
 	int err = 0;
 
 	mutex_lock(&phydev->lock);
 
-	old_state = phydev->state;
+	state = old_state = phydev->state;
+
+	/* If the PHY is stopped, then force state to halted. */
+	if (phydev->oper_state == PHY_OPER_STOPPED)
+		state = PHY_HALTED;
 
 	switch (phydev->state) {
 	case PHY_DOWN:
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5dcab361a220..b128d903adb3 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -519,6 +519,11 @@ enum phy_state {
 	PHY_CABLETEST,
 };
 
+enum phy_oper_state {
+	PHY_OPER_STOPPED,
+	PHY_OPER_STARTED,
+};
+
 #define MDIO_MMD_NUM 32
 
 /**
@@ -670,6 +675,7 @@ struct phy_device {
 	int rate_matching;
 
 	enum phy_state state;
+	enum phy_oper_state oper_state;
 
 	u32 dev_flags;
 
@@ -1221,7 +1227,8 @@ int phy_speed_down_core(struct phy_device *phydev);
  */
 static inline bool phy_is_started(struct phy_device *phydev)
 {
-	return phydev->state >= PHY_UP;
+	return phydev->oper_state == PHY_OPER_STARTED &&
+	       phydev->state >= PHY_UP;
 }
 
 void phy_resolve_aneg_pause(struct phy_device *phydev);
-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-04 13:43     ` Andrew Lunn
@ 2023-09-05  8:49       ` Jijie Shao
  2023-09-05 12:09         ` Andrew Lunn
  2023-09-05 13:48         ` Russell King (Oracle)
  0 siblings, 2 replies; 14+ messages in thread
From: Jijie Shao @ 2023-09-05  8:49 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: shaojijie, f.fainelli, davem, edumazet, hkallweit1, kuba, netdev,
	pabeni, rmk+kernel, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com


on 2023/9/4 21:43, Andrew Lunn wrote:
> On Mon, Sep 04, 2023 at 05:50:32PM +0800, Jijie Shao wrote:
>> Hi all,
>> We encountered an issue when resetting our netdevice recently, it seems
>> related to this patch.
>>
>> During our process, we stop phy first and call phy_start() later.
>> phy_check_link_status returns error because it read mdio failed. The
>> reason why it happened is that the cmdq is unusable when we reset and we
>> can't access to mdio.
> At what point in the flow below do you apply the reset which stops
> access to the MDIO bus? Ideally you want to do phy_stop(), then apply
> the reset, get the hardware working again, and then do a phy_start().
>

When we do a phy_stop(), hardware might be error and we can't access to
mdio.And our process is read/write mdio failed first, then do phy_stop(),
reset hardware and call phy_start() finally.

We note there are several times lock during phy_state_machine(). The first
is to handle phydev state. It's noting that a competition of phydev lock
happend again if phy_check_link_status() returns an error. Why we don't
held lock until changing state to PHY_ERROR if phy_check_link_status()
returns an error?

Jijie Shao


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-04 14:42     ` Russell King (Oracle)
@ 2023-09-05  8:59       ` Jijie Shao
  0 siblings, 0 replies; 14+ messages in thread
From: Jijie Shao @ 2023-09-05  8:59 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: shaojijie, f.fainelli, Andrew Lunn, davem, edumazet, hkallweit1,
	kuba, netdev, pabeni, shenjian15@huawei.com,
	liuyonglong@huawei.com, wangjie125, chenhao418, Hao Lan,
	wangpeiyang1@huawei.com


on 2023/9/4 22:42, Russell King (Oracle) wrote:
> Are you suggesting that the sequence is:
>
> phy_stop();
> reset netdev
> phy_start();
>
> ?
>
> Is the reason for doing this because you've already detected an issue
> with the hardware, and you're trying to recover it - and before you've
> called phy_stop() the hardware is already dead?

In our case, hardware is already dead before we called phy_stop().

>
> If that is the case, I'm not really sure what you expect to happen
> here. You've identified a race where the state machine is running in

What we expect is that the phy state is PHY_HALTED after we called
phy_stop(). So that we can do a phy_start() successully after resetting
our netdev.

> unison with phy_stop(), but in this circumstance it is also possible
> that the state machine could complete executing and have called
> phy_error_precise() before phy_stop() has even been called. In that
> case, you'll still get a warning-splat on the console from
> phy_error_precise().
>
> The only difference is that phy_stop() won't warn.
>
> That all said, this is obviously buggy, because phy_stop() has set
> the phydev state to PHY_HALTED and the state machine has unexpectedly
> changed its state.
>
> I wonder whether we should be tracking the phy_start/stop state
> separately, since we've had issues with phy_stop() warning when an
> error has occurred (see commit 59088b5a946e).
>
> Maybe something like this (untested)?

Thanks for your patch and it works in our case. By the way, when would you
push this patch?


Jijie Shao


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-05  8:49       ` Jijie Shao
@ 2023-09-05 12:09         ` Andrew Lunn
  2023-09-05 14:00           ` Russell King (Oracle)
  2023-09-05 13:48         ` Russell King (Oracle)
  1 sibling, 1 reply; 14+ messages in thread
From: Andrew Lunn @ 2023-09-05 12:09 UTC (permalink / raw)
  To: Jijie Shao
  Cc: f.fainelli, davem, edumazet, hkallweit1, kuba, netdev, pabeni,
	rmk+kernel, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com

> When we do a phy_stop(), hardware might be error and we can't access to
> mdio.And our process is read/write mdio failed first, then do phy_stop(),
> reset hardware and call phy_start() finally.

If the hardware/fimrware is already dead, you have to expect a stack
trace, because the once a second poll can happen, before you notice
the hardware/firmware is dead and call phy_stop().

You might want to also disconnect the PHY and reconnect it after the
reset.

But you should prioritise finding why your hardware/firmware dies,
that is the real problem, and we should not really be adding hacks to
the phylib core to work around broken hardware. Such hacks belong
since your driver, e.g. if the MDIO read/write fails, check if the
firmware still responds, and trigger your reset process immediately.

     Andrew

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-05  8:49       ` Jijie Shao
  2023-09-05 12:09         ` Andrew Lunn
@ 2023-09-05 13:48         ` Russell King (Oracle)
  2023-09-05 15:24           ` Russell King (Oracle)
  1 sibling, 1 reply; 14+ messages in thread
From: Russell King (Oracle) @ 2023-09-05 13:48 UTC (permalink / raw)
  To: Jijie Shao
  Cc: Andrew Lunn, f.fainelli, davem, edumazet, hkallweit1, kuba,
	netdev, pabeni, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com

[-- Attachment #1: Type: text/plain, Size: 2093 bytes --]

On Tue, Sep 05, 2023 at 04:49:31PM +0800, Jijie Shao wrote:
> We note there are several times lock during phy_state_machine(). The first
> is to handle phydev state. It's noting that a competition of phydev lock
> happend again if phy_check_link_status() returns an error. Why we don't
> held lock until changing state to PHY_ERROR if phy_check_link_status()
> returns an error?

You are quite correct that isn't very good. We can easily get rid of
some of this mess, but I don't think all which leaves it still open to
the race you describe.

The problem is phy_suspend().

First, it calls phy_ethtool_get_wol() which takes the phydev lock. This
can be dealt with if we save the state at probe time, and then update
the state when phy_ethtool_set_wol() is called.

Second, phy_suspend() calls ->suspend without holding the phydev lock,
and holding the lock while calling that may not be safe. Having had a
brief look over the implementations (but not delving into any PTP
function they may call) does seem to suggest that shouldn't be a big
problem, but I don't know whether holding the phydev lock while calling
PTP functions is likely to cause issues.

However, looking at that has lead me to the conclusion that there is a
lot of duplication of WoL condition testing. phy_suspend() already
avoids calling ->suspend() if either phy_ethtool_get_wol() indicates
that WoL is enabled, or if the netdev says WoL is enabled.

Many of the ->suspend() implementations, for example,
lan88xx_suspend(), dp83822_suspend(), etc test some kind of flag to
determine whether WoL is enabled and thus seem to be re-implementing
what phy_suspend() is already doing. I think there is scope to delete
this code from several drivers.

The easy bits are the patches I've attached to this email. These
won't on their own be sufficient to close the race you've identified
due to the phy_suspend() issue, but are the best we can do until we
can work out what to do about that.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

[-- Attachment #2: 0001-net-phy-always-call-phy_process_state_change-under-l.patch --]
[-- Type: text/x-diff, Size: 1297 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH net-next 1/3] net: phy: always call phy_process_state_change()
 under lock

phy_stop() calls phy_process_state_change() while holding the phydev
lock, so also arrange for phy_state_machine() to do the same, so that
this function is called with consistent locking.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index df54c137c5f5..1e5218935eb3 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1506,6 +1506,7 @@ void phy_state_machine(struct work_struct *work)
 	if (err < 0)
 		phy_error_precise(phydev, func, err);
 
+	mutex_lock(&phydev->lock);
 	phy_process_state_change(phydev, old_state);
 
 	/* Only re-schedule a PHY state machine change if we are polling the
@@ -1516,7 +1517,6 @@ void phy_state_machine(struct work_struct *work)
 	 * state machine would be pointless and possibly error prone when
 	 * called from phy_disconnect() synchronously.
 	 */
-	mutex_lock(&phydev->lock);
 	if (phy_polling_mode(phydev) && phy_is_started(phydev))
 		phy_queue_state_machine(phydev, PHY_STATE_TIME);
 	mutex_unlock(&phydev->lock);
-- 
2.30.2


[-- Attachment #3: 0002-net-phy-call-phy_error_precise-while-holding-the-loc.patch --]
[-- Type: text/x-diff, Size: 1206 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH net-next 2/3] net: phy: call phy_error_precise() while holding
 the lock

Move the locking out of phy_error_precise() and to its only call site,
merging with the locked region that has already been taken.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 1e5218935eb3..990d387b31bd 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1231,9 +1231,7 @@ static void phy_error_precise(struct phy_device *phydev,
 			      const void *func, int err)
 {
 	WARN(1, "%pS: returned: %d\n", func, err);
-	mutex_lock(&phydev->lock);
 	phy_process_error(phydev);
-	mutex_unlock(&phydev->lock);
 }
 
 /**
@@ -1503,10 +1501,10 @@ void phy_state_machine(struct work_struct *work)
 	if (err == -ENODEV)
 		return;
 
+	mutex_lock(&phydev->lock);
 	if (err < 0)
 		phy_error_precise(phydev, func, err);
 
-	mutex_lock(&phydev->lock);
 	phy_process_state_change(phydev, old_state);
 
 	/* Only re-schedule a PHY state machine change if we are polling the
-- 
2.30.2


[-- Attachment #4: 0003-net-phy-move-call-to-start-aneg.patch --]
[-- Type: text/x-diff, Size: 1044 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH net-next 3/3] net: phy: move call to start aneg

Move the call to start auto-negotiation inside the lock in the PHYLIB
state machine, calling the locked variant _phy_start_aneg(). This
avoids unnecessarily releasing and re-acquiring the lock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 990d387b31bd..5bb33af2a4cb 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1489,14 +1489,15 @@ void phy_state_machine(struct work_struct *work)
 		break;
 	}
 
+	if (needs_aneg) {
+		err = _phy_start_aneg(phydev);
+		func = &_phy_start_aneg;
+	}
+
 	mutex_unlock(&phydev->lock);
 
-	if (needs_aneg) {
-		err = phy_start_aneg(phydev);
-		func = &phy_start_aneg;
-	} else if (do_suspend) {
+	if (do_suspend)
 		phy_suspend(phydev);
-	}
 
 	if (err == -ENODEV)
 		return;
-- 
2.30.2


[-- Attachment #5: 0004-net-phy-track-PHY-WoL-enable-state.patch --]
[-- Type: text/x-diff, Size: 3759 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Subject: [PATCH net-next 4/4] net: phy: track PHY WoL enable state

Track the PHY Wake-on-LAN enable state so we don't need to call
phy_ethtool_get_wol() in phy_suspend(), thus taking the phydev lock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c        | 10 ++++++++++
 drivers/net/phy/phy_device.c | 17 ++++++++++++++---
 include/linux/phy.h          |  2 ++
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 5bb33af2a4cb..2324544aae0d 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1644,11 +1644,21 @@ EXPORT_SYMBOL(phy_ethtool_set_eee);
  */
 int phy_ethtool_set_wol(struct phy_device *phydev, struct ethtool_wolinfo *wol)
 {
+	struct ethtool_wolinfo w = { .cmd = ETHTOOL_GWOL };
 	int ret;
 
 	if (phydev->drv && phydev->drv->set_wol) {
 		mutex_lock(&phydev->lock);
 		ret = phydev->drv->set_wol(phydev, wol);
+
+		/* Read back the WoL enabled state for the PHY and update
+		 * our saved state.
+		 */
+		if (phydev->drv->get_wol) {
+			phydev->drv->get_wol(phydev, &w);
+			phydev->phy_wol_enabled = w.wolopts;
+		}
+
 		mutex_unlock(&phydev->lock);
 
 		return ret;
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 2ce74593d6e4..c874c1ec5ada 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1850,7 +1850,6 @@ EXPORT_SYMBOL(phy_detach);
 
 int phy_suspend(struct phy_device *phydev)
 {
-	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
 	struct net_device *netdev = phydev->attached_dev;
 	struct phy_driver *phydrv = phydev->drv;
 	int ret;
@@ -1858,8 +1857,9 @@ int phy_suspend(struct phy_device *phydev)
 	if (phydev->suspended)
 		return 0;
 
-	phy_ethtool_get_wol(phydev, &wol);
-	phydev->wol_enabled = wol.wolopts || (netdev && netdev->wol_enabled);
+	phydev->wol_enabled = phydev->phy_wol_enabled ||
+			      (netdev && netdev->wol_enabled);
+
 	/* If the device has WOL enabled, we cannot suspend the PHY */
 	if (phydev->wol_enabled && !(phydrv->flags & PHY_ALWAYS_CALL_SUSPEND))
 		return -EBUSY;
@@ -3251,6 +3251,14 @@ struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode)
 }
 EXPORT_SYMBOL_GPL(fwnode_get_phy_node);
 
+static void phy_update_wol_state(struct phy_device *phydev)
+{
+	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
+
+	phy_ethtool_get_wol(phydev, &wol);
+	phydev->phy_wol_enabled = wol.wolopts;
+}
+
 /**
  * phy_probe - probe and init a PHY device
  * @dev: device to probe and init
@@ -3364,6 +3372,9 @@ static int phy_probe(struct device *dev)
 	/* Set the state to READY by default */
 	phydev->state = PHY_READY;
 
+	/* Initialise phydev->wol_enabled state */
+	phy_update_wol_state(phydev);
+
 	/* Get the LEDs from the device tree, and instantiate standard
 	 * LEDs for them.
 	 */
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 5dcab361a220..514c1d077bb9 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -557,6 +557,7 @@ struct macsec_ops;
  * @downshifted_rate: Set true if link speed has been downshifted.
  * @is_on_sfp_module: Set true if PHY is located on an SFP module.
  * @mac_managed_pm: Set true if MAC driver takes of suspending/resuming PHY
+ * @phy_wol_enabled: Set to true if the PHY has Wake-on-Lan enabled.
  * @wol_enabled: Set to true if the PHY or the attached MAC have Wake-on-LAN
  * 		 enabled.
  * @state: State of the PHY for management purposes
@@ -655,6 +656,7 @@ struct phy_device {
 	unsigned downshifted_rate:1;
 	unsigned is_on_sfp_module:1;
 	unsigned mac_managed_pm:1;
+	unsigned phy_wol_enabled:1;
 	unsigned wol_enabled:1;
 
 	unsigned autoneg:1;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-05 12:09         ` Andrew Lunn
@ 2023-09-05 14:00           ` Russell King (Oracle)
  0 siblings, 0 replies; 14+ messages in thread
From: Russell King (Oracle) @ 2023-09-05 14:00 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Jijie Shao, f.fainelli, davem, edumazet, hkallweit1, kuba, netdev,
	pabeni, shenjian15@huawei.com, liuyonglong@huawei.com, wangjie125,
	chenhao418, Hao Lan, wangpeiyang1@huawei.com

On Tue, Sep 05, 2023 at 02:09:29PM +0200, Andrew Lunn wrote:
> > When we do a phy_stop(), hardware might be error and we can't access to
> > mdio.And our process is read/write mdio failed first, then do phy_stop(),
> > reset hardware and call phy_start() finally.
> 
> If the hardware/fimrware is already dead, you have to expect a stack
> trace, because the once a second poll can happen, before you notice
> the hardware/firmware is dead and call phy_stop().
> 
> You might want to also disconnect the PHY and reconnect it after the
> reset.

Andrew,

I think that's what is being tried here, but there's a race between
phy_stop() and phy_state_machine() which is screwing up phydev->state.

Honestly, the locking in phy_state_machine() is insane, allows this
race to happen, and really needs fixing... and I think that the
phydev->lock usage has become really insane over the years. We have
some driver methods now which are always called with the lock held,
others where the lock may or may not be held, and others where the
lock isn't held - and none of this is documented.

Please can you have a look at the four patches I've just posted as
attached to my previous email. I think we need to start sorting out
some of this crazyness and santising the locking.

My four patches address most of it, except the call to phy_suspend().
If that can be solved, then we can improve the locking more, and
eliminate the race entirely.

If we held the lock over the entire state machine function, then the
problem that has been reported here would not exist - phy_stop()
would not be able to "nip in" during the middle of the PHY state
machine running, and thus we would not see the PHY_HALTED state
overwritten by PHY_ERROR unexpectedly.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-05 13:48         ` Russell King (Oracle)
@ 2023-09-05 15:24           ` Russell King (Oracle)
  2023-09-06 12:59             ` Andrew Lunn
  0 siblings, 1 reply; 14+ messages in thread
From: Russell King (Oracle) @ 2023-09-05 15:24 UTC (permalink / raw)
  To: Jijie Shao
  Cc: Andrew Lunn, f.fainelli, davem, edumazet, hkallweit1, kuba,
	netdev, pabeni, shenjian15@huawei.com, liuyonglong@huawei.com,
	wangjie125, chenhao418, Hao Lan, wangpeiyang1@huawei.com

[-- Attachment #1: Type: text/plain, Size: 2335 bytes --]

On Tue, Sep 05, 2023 at 02:48:13PM +0100, Russell King (Oracle) wrote:
> On Tue, Sep 05, 2023 at 04:49:31PM +0800, Jijie Shao wrote:
> > We note there are several times lock during phy_state_machine(). The first
> > is to handle phydev state. It's noting that a competition of phydev lock
> > happend again if phy_check_link_status() returns an error. Why we don't
> > held lock until changing state to PHY_ERROR if phy_check_link_status()
> > returns an error?
> 
> You are quite correct that isn't very good. We can easily get rid of
> some of this mess, but I don't think all which leaves it still open to
> the race you describe.
> 
> The problem is phy_suspend().
> 
> First, it calls phy_ethtool_get_wol() which takes the phydev lock. This
> can be dealt with if we save the state at probe time, and then update
> the state when phy_ethtool_set_wol() is called.
> 
> Second, phy_suspend() calls ->suspend without holding the phydev lock,
> and holding the lock while calling that may not be safe. Having had a
> brief look over the implementations (but not delving into any PTP
> function they may call) does seem to suggest that shouldn't be a big
> problem, but I don't know whether holding the phydev lock while calling
> PTP functions is likely to cause issues.
> 
> However, looking at that has lead me to the conclusion that there is a
> lot of duplication of WoL condition testing. phy_suspend() already
> avoids calling ->suspend() if either phy_ethtool_get_wol() indicates
> that WoL is enabled, or if the netdev says WoL is enabled.
> 
> Many of the ->suspend() implementations, for example,
> lan88xx_suspend(), dp83822_suspend(), etc test some kind of flag to
> determine whether WoL is enabled and thus seem to be re-implementing
> what phy_suspend() is already doing. I think there is scope to delete
> this code from several drivers.
> 
> The easy bits are the patches I've attached to this email. These
> won't on their own be sufficient to close the race you've identified
> due to the phy_suspend() issue, but are the best we can do until we
> can work out what to do about that.

Having looked deeper at this, I think there may be a solution. See
these follow-on patches.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

[-- Attachment #2: 0005-net-phy-move-phy_suspend-to-end-of-phy_state_machine.patch --]
[-- Type: text/x-diff, Size: 1430 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
To: Andrew Lunn <andrew@lunn.ch>,
	Heiner Kallweit <hkallweit1@gmail.com>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 5/8] net: phy: move phy_suspend() to end of
 phy_state_machine()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

Move the call to phy_suspend() to the end of phy_state_machine() after
we release the lock.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 2324544aae0d..010e54d7ccb2 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1494,15 +1494,11 @@ void phy_state_machine(struct work_struct *work)
 		func = &_phy_start_aneg;
 	}
 
-	mutex_unlock(&phydev->lock);
-
-	if (do_suspend)
-		phy_suspend(phydev);
-
-	if (err == -ENODEV)
+	if (err == -ENODEV) {
+		mutex_unlock(&phydev->lock);
 		return;
+	}
 
-	mutex_lock(&phydev->lock);
 	if (err < 0)
 		phy_error_precise(phydev, func, err);
 
@@ -1519,6 +1515,9 @@ void phy_state_machine(struct work_struct *work)
 	if (phy_polling_mode(phydev) && phy_is_started(phydev))
 		phy_queue_state_machine(phydev, PHY_STATE_TIME);
 	mutex_unlock(&phydev->lock);
+
+	if (do_suspend)
+		phy_suspend(phydev);
 }
 
 /**
-- 
2.30.2


[-- Attachment #3: 0006-net-phy-move-phy_state_machine.patch --]
[-- Type: text/x-diff, Size: 5140 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
To: Andrew Lunn <andrew@lunn.ch>,
	Heiner Kallweit <hkallweit1@gmail.com>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 6/8] net: phy: move phy_state_machine()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

Move phy_state_machine() before phy_stop() to avoid subsequent patches
introducing forward references.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 152 +++++++++++++++++++++---------------------
 1 file changed, 76 insertions(+), 76 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 010e54d7ccb2..508a5434684a 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1353,82 +1353,6 @@ void phy_free_interrupt(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_free_interrupt);
 
-/**
- * phy_stop - Bring down the PHY link, and stop checking the status
- * @phydev: target phy_device struct
- */
-void phy_stop(struct phy_device *phydev)
-{
-	struct net_device *dev = phydev->attached_dev;
-	enum phy_state old_state;
-
-	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN &&
-	    phydev->state != PHY_ERROR) {
-		WARN(1, "called from state %s\n",
-		     phy_state_to_str(phydev->state));
-		return;
-	}
-
-	mutex_lock(&phydev->lock);
-	old_state = phydev->state;
-
-	if (phydev->state == PHY_CABLETEST) {
-		phy_abort_cable_test(phydev);
-		netif_testing_off(dev);
-	}
-
-	if (phydev->sfp_bus)
-		sfp_upstream_stop(phydev->sfp_bus);
-
-	phydev->state = PHY_HALTED;
-	phy_process_state_change(phydev, old_state);
-
-	mutex_unlock(&phydev->lock);
-
-	phy_state_machine(&phydev->state_queue.work);
-	phy_stop_machine(phydev);
-
-	/* Cannot call flush_scheduled_work() here as desired because
-	 * of rtnl_lock(), but PHY_HALTED shall guarantee irq handler
-	 * will not reenable interrupts.
-	 */
-}
-EXPORT_SYMBOL(phy_stop);
-
-/**
- * phy_start - start or restart a PHY device
- * @phydev: target phy_device struct
- *
- * Description: Indicates the attached device's readiness to
- *   handle PHY-related work.  Used during startup to start the
- *   PHY, and after a call to phy_stop() to resume operation.
- *   Also used to indicate the MDIO bus has cleared an error
- *   condition.
- */
-void phy_start(struct phy_device *phydev)
-{
-	mutex_lock(&phydev->lock);
-
-	if (phydev->state != PHY_READY && phydev->state != PHY_HALTED) {
-		WARN(1, "called from state %s\n",
-		     phy_state_to_str(phydev->state));
-		goto out;
-	}
-
-	if (phydev->sfp_bus)
-		sfp_upstream_start(phydev->sfp_bus);
-
-	/* if phy was suspended, bring the physical link up again */
-	__phy_resume(phydev);
-
-	phydev->state = PHY_UP;
-
-	phy_start_machine(phydev);
-out:
-	mutex_unlock(&phydev->lock);
-}
-EXPORT_SYMBOL(phy_start);
-
 /**
  * phy_state_machine - Handle the state machine
  * @work: work_struct that describes the work to be done
@@ -1520,6 +1444,82 @@ void phy_state_machine(struct work_struct *work)
 		phy_suspend(phydev);
 }
 
+/**
+ * phy_stop - Bring down the PHY link, and stop checking the status
+ * @phydev: target phy_device struct
+ */
+void phy_stop(struct phy_device *phydev)
+{
+	struct net_device *dev = phydev->attached_dev;
+	enum phy_state old_state;
+
+	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN &&
+	    phydev->state != PHY_ERROR) {
+		WARN(1, "called from state %s\n",
+		     phy_state_to_str(phydev->state));
+		return;
+	}
+
+	mutex_lock(&phydev->lock);
+	old_state = phydev->state;
+
+	if (phydev->state == PHY_CABLETEST) {
+		phy_abort_cable_test(phydev);
+		netif_testing_off(dev);
+	}
+
+	if (phydev->sfp_bus)
+		sfp_upstream_stop(phydev->sfp_bus);
+
+	phydev->state = PHY_HALTED;
+	phy_process_state_change(phydev, old_state);
+
+	mutex_unlock(&phydev->lock);
+
+	phy_state_machine(&phydev->state_queue.work);
+	phy_stop_machine(phydev);
+
+	/* Cannot call flush_scheduled_work() here as desired because
+	 * of rtnl_lock(), but PHY_HALTED shall guarantee irq handler
+	 * will not reenable interrupts.
+	 */
+}
+EXPORT_SYMBOL(phy_stop);
+
+/**
+ * phy_start - start or restart a PHY device
+ * @phydev: target phy_device struct
+ *
+ * Description: Indicates the attached device's readiness to
+ *   handle PHY-related work.  Used during startup to start the
+ *   PHY, and after a call to phy_stop() to resume operation.
+ *   Also used to indicate the MDIO bus has cleared an error
+ *   condition.
+ */
+void phy_start(struct phy_device *phydev)
+{
+	mutex_lock(&phydev->lock);
+
+	if (phydev->state != PHY_READY && phydev->state != PHY_HALTED) {
+		WARN(1, "called from state %s\n",
+		     phy_state_to_str(phydev->state));
+		goto out;
+	}
+
+	if (phydev->sfp_bus)
+		sfp_upstream_start(phydev->sfp_bus);
+
+	/* if phy was suspended, bring the physical link up again */
+	__phy_resume(phydev);
+
+	phydev->state = PHY_UP;
+
+	phy_start_machine(phydev);
+out:
+	mutex_unlock(&phydev->lock);
+}
+EXPORT_SYMBOL(phy_start);
+
 /**
  * phy_mac_interrupt - MAC says the link has changed
  * @phydev: phy_device struct with changed link
-- 
2.30.2


[-- Attachment #4: 0007-net-phy-split-locked-and-unlocked-section-of-phy_sta.patch --]
[-- Type: text/x-diff, Size: 4220 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
To: Andrew Lunn <andrew@lunn.ch>,
	Heiner Kallweit <hkallweit1@gmail.com>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 7/8] net: phy: split locked and unlocked section of
 phy_state_machine()
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

Split out the locked and unlocked sections of phy_state_machine() into
two separate functions which can be called inside the phydev lock and
outside the phydev lock as appropriate.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 68 ++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 26 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 508a5434684a..b63d9f9320d7 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1353,33 +1353,27 @@ void phy_free_interrupt(struct phy_device *phydev)
 }
 EXPORT_SYMBOL(phy_free_interrupt);
 
-/**
- * phy_state_machine - Handle the state machine
- * @work: work_struct that describes the work to be done
- */
-void phy_state_machine(struct work_struct *work)
+enum phy_state_work {
+	PHY_STATE_WORK_NONE,
+	PHY_STATE_WORK_ANEG,
+	PHY_STATE_WORK_SUSPEND,
+};
+
+static enum phy_state_work _phy_state_machine(struct phy_device *phydev)
 {
-	struct delayed_work *dwork = to_delayed_work(work);
-	struct phy_device *phydev =
-			container_of(dwork, struct phy_device, state_queue);
+	enum phy_state_work state_work = PHY_STATE_WORK_NONE;
 	struct net_device *dev = phydev->attached_dev;
-	bool needs_aneg = false, do_suspend = false;
-	enum phy_state old_state;
+	enum phy_state old_state = phydev->state;
 	const void *func = NULL;
 	bool finished = false;
 	int err = 0;
 
-	mutex_lock(&phydev->lock);
-
-	old_state = phydev->state;
-
 	switch (phydev->state) {
 	case PHY_DOWN:
 	case PHY_READY:
 		break;
 	case PHY_UP:
-		needs_aneg = true;
-
+		state_work = PHY_STATE_WORK_ANEG;
 		break;
 	case PHY_NOLINK:
 	case PHY_RUNNING:
@@ -1391,7 +1385,7 @@ void phy_state_machine(struct work_struct *work)
 		if (err) {
 			phy_abort_cable_test(phydev);
 			netif_testing_off(dev);
-			needs_aneg = true;
+			state_work = PHY_STATE_WORK_ANEG;
 			phydev->state = PHY_UP;
 			break;
 		}
@@ -1399,7 +1393,7 @@ void phy_state_machine(struct work_struct *work)
 		if (finished) {
 			ethnl_cable_test_finished(phydev);
 			netif_testing_off(dev);
-			needs_aneg = true;
+			state_work = PHY_STATE_WORK_ANEG;
 			phydev->state = PHY_UP;
 		}
 		break;
@@ -1409,19 +1403,17 @@ void phy_state_machine(struct work_struct *work)
 			phydev->link = 0;
 			phy_link_down(phydev);
 		}
-		do_suspend = true;
+		state_work = PHY_STATE_WORK_SUSPEND;
 		break;
 	}
 
-	if (needs_aneg) {
+	if (state_work == PHY_STATE_WORK_ANEG) {
 		err = _phy_start_aneg(phydev);
 		func = &_phy_start_aneg;
 	}
 
-	if (err == -ENODEV) {
-		mutex_unlock(&phydev->lock);
-		return;
-	}
+	if (err == -ENODEV)
+		return state_work;
 
 	if (err < 0)
 		phy_error_precise(phydev, func, err);
@@ -1438,12 +1430,36 @@ void phy_state_machine(struct work_struct *work)
 	 */
 	if (phy_polling_mode(phydev) && phy_is_started(phydev))
 		phy_queue_state_machine(phydev, PHY_STATE_TIME);
-	mutex_unlock(&phydev->lock);
 
-	if (do_suspend)
+	return state_work;
+}
+
+/* unlocked part of the PHY state machine */
+static void _phy_state_machine_post_work(struct phy_device *phydev,
+					 enum phy_state_work state_work)
+{
+	if (state_work == PHY_STATE_WORK_SUSPEND)
 		phy_suspend(phydev);
 }
 
+/**
+ * phy_state_machine - Handle the state machine
+ * @work: work_struct that describes the work to be done
+ */
+void phy_state_machine(struct work_struct *work)
+{
+	struct delayed_work *dwork = to_delayed_work(work);
+	struct phy_device *phydev =
+			container_of(dwork, struct phy_device, state_queue);
+	enum phy_state_work state_work;
+
+	mutex_lock(&phydev->lock);
+	state_work = _phy_state_machine(phydev);
+	mutex_unlock(&phydev->lock);
+
+	_phy_state_machine_post_work(phydev, state_work);
+}
+
 /**
  * phy_stop - Bring down the PHY link, and stop checking the status
  * @phydev: target phy_device struct
-- 
2.30.2


[-- Attachment #5: 0008-net-phy-convert-phy_stop-to-use-split-state-machine.patch --]
[-- Type: text/x-diff, Size: 1498 bytes --]

From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
To: Andrew Lunn <andrew@lunn.ch>,
	Heiner Kallweit <hkallweit1@gmail.com>
Bcc: linux@mail.armlinux.org.uk
Subject: [PATCH net-next 8/8] net: phy: convert phy_stop() to use split state
 machine
MIME-Version: 1.0
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

Convert phy_stop() to use the new locked-section and unlocked-section
parts of the PHY state machine.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/net/phy/phy.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index b63d9f9320d7..5a5156f9788b 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1467,6 +1467,7 @@ void phy_state_machine(struct work_struct *work)
 void phy_stop(struct phy_device *phydev)
 {
 	struct net_device *dev = phydev->attached_dev;
+	enum phy_state_work state_work;
 	enum phy_state old_state;
 
 	if (!phy_is_started(phydev) && phydev->state != PHY_DOWN &&
@@ -1490,9 +1491,10 @@ void phy_stop(struct phy_device *phydev)
 	phydev->state = PHY_HALTED;
 	phy_process_state_change(phydev, old_state);
 
+	state_work = _phy_state_machine(phydev);
 	mutex_unlock(&phydev->lock);
 
-	phy_state_machine(&phydev->state_queue.work);
+	_phy_state_machine_post_work(phydev, state_work);
 	phy_stop_machine(phydev);
 
 	/* Cannot call flush_scheduled_work() here as desired because
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY
  2023-09-05 15:24           ` Russell King (Oracle)
@ 2023-09-06 12:59             ` Andrew Lunn
  0 siblings, 0 replies; 14+ messages in thread
From: Andrew Lunn @ 2023-09-06 12:59 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Jijie Shao, f.fainelli, davem, edumazet, hkallweit1, kuba, netdev,
	pabeni, shenjian15@huawei.com, liuyonglong@huawei.com, wangjie125,
	chenhao418, Hao Lan, wangpeiyang1@huawei.com

> Having looked deeper at this, I think there may be a solution. See
> these follow-on patches.

Hi Russell

I'm on vacation at the moment with limited time and network access.

> Move the call to phy_suspend() to the end of phy_state_machine() after
> we release the lock.

I know this is a quick RFC exploration of the problem space, but it
would be good to comment about 'Why?'. Suspend and resume has had
deadlock issues in the past, which is why they don't take the lock.

> Split out the locked and unlocked sections of phy_state_machine() into
> two separate functions which can be called inside the phydev lock and
> outside the phydev lock as appropriate.

Again, i think some mention of suspend/resume would be good, since
that is what is causing these issues. Maybe we also need to add a
comment next to struct phy_device lock about what the lock should be
protecting.

	Andrew

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-09-06 12:59 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-22 15:58 [PATCH net-next] net: phy: avoid kernel warning dump when stopping an errored PHY Russell King (Oracle)
2023-05-22 16:06 ` Russell King (Oracle)
2023-05-22 19:03 ` Florian Fainelli
2023-09-04  9:50   ` Jijie Shao
2023-09-04 13:43     ` Andrew Lunn
2023-09-05  8:49       ` Jijie Shao
2023-09-05 12:09         ` Andrew Lunn
2023-09-05 14:00           ` Russell King (Oracle)
2023-09-05 13:48         ` Russell King (Oracle)
2023-09-05 15:24           ` Russell King (Oracle)
2023-09-06 12:59             ` Andrew Lunn
2023-09-04 14:42     ` Russell King (Oracle)
2023-09-05  8:59       ` Jijie Shao
2023-05-24  7:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).