* [PATCH] net: phy: Fix premature resume by a PHY driver
@ 2025-07-18 15:42 Abid Ali
2025-07-18 16:10 ` Russell King (Oracle)
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Abid Ali @ 2025-07-18 15:42 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit, Russell King, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: netdev, linux-kernel, Abid Ali
There are possibilities for phy_resume to be executed when the ethernet
interface is initially taken UP after bootup. This is harmless in most
cases, but the respective PHY driver`s resume callback cannot have any
logic that should only be executed if it was previously suspended.
In stmmac for instance, 2 entry points of phy_resume:
1. stmmac_open->phylink_of_phy_connect->phy_attach_direct->phy_resume
commit 1211ce530771 ("net: phy: resume/suspend PHYs on attach/detach")
This is not needed at the initial interface UP but required if the PHY
may suspend when the interface is taken DOWN.
2. stmmac_open->phylink_start->phy_start->__phy_resume
commit 9e573cfc35c6 ("net: phy: start interrupts in phy_start")
This patch does not introduce the __phy_resume in phy_start but removes
it from being conditional to PHY_HALTED. Now it fails to ensure if it
really needs to resume.
Prevent these duplicate access and provide logic exclusivity for resume
callback in a PHY driver.
Signed-off-by: Abid Ali <dev.nuvorolabs@gmail.com>
---
drivers/net/phy/phy_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 73f9cb2e2844..68583bb74aec 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1846,7 +1846,7 @@ int __phy_resume(struct phy_device *phydev)
lockdep_assert_held(&phydev->lock);
- if (!phydrv || !phydrv->resume)
+ if (!phydrv || !phydrv->resume && phydev->suspended)
return 0;
ret = phydrv->resume(phydev);
---
base-commit: 347e9f5043c89695b01e66b3ed111755afcf1911
change-id: 20250718-phy_resume-cc86109396cc
Best regards,
--
Abid Ali <dev.nuvorolabs@gmail.com>
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-18 15:42 [PATCH] net: phy: Fix premature resume by a PHY driver Abid Ali
@ 2025-07-18 16:10 ` Russell King (Oracle)
2025-07-19 6:25 ` Abid Ali
2025-07-19 5:37 ` kernel test robot
2025-07-20 18:09 ` Dan Carpenter
2 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2025-07-18 16:10 UTC (permalink / raw)
To: Abid Ali
Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Fri, Jul 18, 2025 at 03:42:22PM +0000, Abid Ali wrote:
> There are possibilities for phy_resume to be executed when the ethernet
> interface is initially taken UP after bootup. This is harmless in most
> cases, but the respective PHY driver`s resume callback cannot have any
> logic that should only be executed if it was previously suspended.
Sorry but no. The PHY will be "resumed" from boot, even if it wasn't
"suspended". So the idea that resume should only be called if it was
previously suspended is incorrect.
E.g. .ndo_open -> ... -> phy_attach_direct() -> phy_resume() ->
phydrv->resume()
During this path, the PHY may or may not be suspended, depending on
the state of the hardware when control was passed to the kernel,
which includes kexec().
PHY drivers must cope with an already functional PHY when their
resume() method is called.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-18 15:42 [PATCH] net: phy: Fix premature resume by a PHY driver Abid Ali
2025-07-18 16:10 ` Russell King (Oracle)
@ 2025-07-19 5:37 ` kernel test robot
2025-07-20 18:09 ` Dan Carpenter
2 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2025-07-19 5:37 UTC (permalink / raw)
To: Abid Ali, Andrew Lunn, Heiner Kallweit, Russell King,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: llvm, oe-kbuild-all, netdev, linux-kernel, Abid Ali
Hi Abid,
kernel test robot noticed the following build warnings:
[auto build test WARNING on 347e9f5043c89695b01e66b3ed111755afcf1911]
url: https://github.com/intel-lab-lkp/linux/commits/Abid-Ali/net-phy-Fix-premature-resume-by-a-PHY-driver/20250718-234858
base: 347e9f5043c89695b01e66b3ed111755afcf1911
patch link: https://lore.kernel.org/r/20250718-phy_resume-v1-1-9c6b59580bee%40gmail.com
patch subject: [PATCH] net: phy: Fix premature resume by a PHY driver
config: i386-buildonly-randconfig-004-20250719 (https://download.01.org/0day-ci/archive/20250719/202507191322.YG6cNwF6-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250719/202507191322.YG6cNwF6-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202507191322.YG6cNwF6-lkp@intel.com/
All warnings (new ones prefixed by >>):
>> drivers/net/phy/phy_device.c:1849:33: warning: '&&' within '||' [-Wlogical-op-parentheses]
1849 | if (!phydrv || !phydrv->resume && phydev->suspended)
| ~~ ~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~
drivers/net/phy/phy_device.c:1849:33: note: place parentheses around the '&&' expression to silence this warning
1849 | if (!phydrv || !phydrv->resume && phydev->suspended)
| ^
| ( )
1 warning generated.
vim +1849 drivers/net/phy/phy_device.c
1841
1842 int __phy_resume(struct phy_device *phydev)
1843 {
1844 const struct phy_driver *phydrv = phydev->drv;
1845 int ret;
1846
1847 lockdep_assert_held(&phydev->lock);
1848
> 1849 if (!phydrv || !phydrv->resume && phydev->suspended)
1850 return 0;
1851
1852 ret = phydrv->resume(phydev);
1853 if (!ret)
1854 phydev->suspended = false;
1855
1856 return ret;
1857 }
1858 EXPORT_SYMBOL(__phy_resume);
1859
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-18 16:10 ` Russell King (Oracle)
@ 2025-07-19 6:25 ` Abid Ali
2025-07-19 7:48 ` Russell King (Oracle)
0 siblings, 1 reply; 9+ messages in thread
From: Abid Ali @ 2025-07-19 6:25 UTC (permalink / raw)
To: Abid Ali
Cc: Russell King, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Fri, Jul 18, 2025 at 05:10:54 PM +0100, Russell King (Oracle) wrote:
> Sorry but no. The PHY will be "resumed" from boot, even if it wasn't
> "suspended". So the idea that resume should only be called if it was
> previously suspended is incorrect.
> E.g. .ndo_open -> ... -> phy_attach_direct() -> phy_resume() ->
> phydrv->resume()
I do point this path out and there is also a second call
(2) .ndo_open -> phylink_start -> phy_start -> __phy_resume
This would mean 2 calls to the PHY resume every time an interface is
taken UP is expected behaviour?.
> During this path, the PHY may or may not be suspended, depending on
> the state of the hardware when control was passed to the kernel,
> which includes kexec().
> PHY drivers must cope with an already functional PHY when their
> resume() method is called.
This is not what I expected, but if it is by design I do not see a
need to fight it. Just to make it clear, if we need to reset a PHY
after it returns from suspend(or any code thats dependant), the driver`s
callback should provide this guarantee?.
if yes, this was just a misconception of relating it to resume callbacks
provided by the PM subsystem where such behaviour is not exepcted
Obvious logic error in the patch but this patch is not required as it stands.
- if (!phydrv || !phydrv->resume && phydev->suspended)
+ if (!phydrv || !phydrv->resume || !phydev->suspended)
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-19 6:25 ` Abid Ali
@ 2025-07-19 7:48 ` Russell King (Oracle)
2025-07-19 11:34 ` Abid Ali
0 siblings, 1 reply; 9+ messages in thread
From: Russell King (Oracle) @ 2025-07-19 7:48 UTC (permalink / raw)
To: Abid Ali
Cc: Andrew Lunn, Heiner Kallweit, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Sat, Jul 19, 2025 at 06:25:47AM +0000, Abid Ali wrote:
> On Fri, Jul 18, 2025 at 05:10:54 PM +0100, Russell King (Oracle) wrote:
> > Sorry but no. The PHY will be "resumed" from boot, even if it wasn't
> > "suspended". So the idea that resume should only be called if it was
> > previously suspended is incorrect.
>
> > E.g. .ndo_open -> ... -> phy_attach_direct() -> phy_resume() ->
> > phydrv->resume()
>
> I do point this path out and there is also a second call
> (2) .ndo_open -> phylink_start -> phy_start -> __phy_resume
> This would mean 2 calls to the PHY resume every time an interface is
> taken UP is expected behaviour?.
The whole point is this:
> > During this path, the PHY may or may not be suspended, depending on
> > the state of the hardware when control was passed to the kernel,
> > which includes kexec().
Thus, the resume function *must* cope with an already resumed PHY,
and thus adding extra complexity to try to ignore calling the resume
function if it wasn't previously suspended is likely to cause
regressions - phydrv->suspended will be clear for the initial call
to ->resume(). Thus, if the PHY was suspended at boot time, it won't
be resumed when one attempts to bring up the interface initially.
Just fix your driver's resume function.
> > PHY drivers must cope with an already functional PHY when their
> > resume() method is called.
>
> This is not what I expected, but if it is by design I do not see a
> need to fight it. Just to make it clear, if we need to reset a PHY
> after it returns from suspend(or any code thats dependant), the driver`s
> callback should provide this guarantee?.
Hardware or software reset?
How much a software reset disrupts the PHY is PHY dependent. E.g. there
are PHYs that need to be software reset for configuration and
advertisement changes, but all the software configuration is preserved
over such a reset.
So I can't answer your question. It depends how disruptive the reset
you are talking about is to your PHY implementation.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-19 7:48 ` Russell King (Oracle)
@ 2025-07-19 11:34 ` Abid Ali
2025-07-19 15:32 ` Russell King (Oracle)
0 siblings, 1 reply; 9+ messages in thread
From: Abid Ali @ 2025-07-19 11:34 UTC (permalink / raw)
To: Abid Ali
Cc: Russell King, Andrew Lunn, Heiner Kallweit, David S . Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Sat, Jul 19, 2025 at 08:48:20 AM +0100, Russell King (Oracle) wrote:
> > I do point this path out and there is also a second call
> > (2) .ndo_open -> phylink_start -> phy_start -> __phy_resume
> > This would mean 2 calls to the PHY resume every time an interface is
> > taken UP is expected behaviour?.
>
> The whole point is this:
>
> > > During this path, the PHY may or may not be suspended, depending on
> > > the state of the hardware when control was passed to the kernel,
> > > which includes kexec().
>
> Thus, the resume function *must* cope with an already resumed PHY,
> and thus adding extra complexity to try to ignore calling the resume
> function if it wasn't previously suspended is likely to cause
> regressions - phydrv->suspended will be clear for the initial call
> to ->resume(). Thus, if the PHY was suspended at boot time, it won't
> be resumed when one attempts to bring up the interface initially.
yea, I get your point.
> Hardware or software reset?
>
> How much a software reset disrupts the PHY is PHY dependent. E.g. there
> are PHYs that need to be software reset for configuration and
> advertisement changes, but all the software configuration is preserved
> over such a reset.
The PHY we have loses power when the kernel PM goes to suspend and we
need have a hardware reset upon its bootup in resume.
As an unintentional consequence this ended with 2 additional
resets (reset-delay-us in dts + 2 PHY resume) at boot->interface-UP.
In the end the "phydev->state" in the driver`s resume callback was used to
prevent it and checking further, it was evident that there were 2
intentional calls for phy_resume from .ndo_open which didnt look obvious.
This particular scenario was not the point of the commit but rather
having some protection for phy_resume but I guess its not possible.
To keep it simple, these would be my present understanding.
1. Should the PHY driver be able handle consecutive resume callbacks?
a. yes. It would have to be taken care in the driver.
2. Why does phy_resume exec twice in .ndo_open with PHYLINK API?
a. can happen but still dont have clarity on why .ndo_open does this.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-19 11:34 ` Abid Ali
@ 2025-07-19 15:32 ` Russell King (Oracle)
0 siblings, 0 replies; 9+ messages in thread
From: Russell King (Oracle) @ 2025-07-19 15:32 UTC (permalink / raw)
To: Abid Ali
Cc: Andrew Lunn, Heiner Kallweit, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
On Sat, Jul 19, 2025 at 11:34:50AM +0000, Abid Ali wrote:
> The PHY we have loses power when the kernel PM goes to suspend and we
> need have a hardware reset upon its bootup in resume.
Other PHY drivers are fine with this.
> As an unintentional consequence this ended with 2 additional
> resets (reset-delay-us in dts + 2 PHY resume) at boot->interface-UP.
Presumably, the resets occur because your PHY driver (which phy driver,
and are the patches for this merged?) is causing the hardware reset to
occur?
> In the end the "phydev->state" in the driver`s resume callback was used to
> prevent it and checking further, it was evident that there were 2
> intentional calls for phy_resume from .ndo_open which didnt look obvious.
>
> This particular scenario was not the point of the commit but rather
> having some protection for phy_resume but I guess its not possible.
> To keep it simple, these would be my present understanding.
>
> 1. Should the PHY driver be able handle consecutive resume callbacks?
> a. yes. It would have to be taken care in the driver.
Correct, but I think we need full details.
> 2. Why does phy_resume exec twice in .ndo_open with PHYLINK API?
> a. can happen but still dont have clarity on why .ndo_open does this.
Nothing to do with phylink. Exactly the same would happen with drivers
that use the phylib API, attaching the PHY in .ndo_open and then
calling phy_start(). Phylink is just a wrapper around these.
The problem I have with the idea of changing the behaviour in core
code is that we can't test every driver that is making use of phylib
(whether directly or via phylink.) It would be a monumental task to
do that level of testing.
So, if there's another clean way to solve the issue, I would much
rather that approach was taken.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-18 15:42 [PATCH] net: phy: Fix premature resume by a PHY driver Abid Ali
2025-07-18 16:10 ` Russell King (Oracle)
2025-07-19 5:37 ` kernel test robot
@ 2025-07-20 18:09 ` Dan Carpenter
2025-07-21 6:07 ` Abid Ali
2 siblings, 1 reply; 9+ messages in thread
From: Dan Carpenter @ 2025-07-20 18:09 UTC (permalink / raw)
To: oe-kbuild, Abid Ali, Andrew Lunn, Heiner Kallweit, Russell King,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: lkp, oe-kbuild-all, netdev, linux-kernel, Abid Ali
Hi Abid,
kernel test robot noticed the following build warnings:
url: https://github.com/intel-lab-lkp/linux/commits/Abid-Ali/net-phy-Fix-premature-resume-by-a-PHY-driver/20250718-234858
base: 347e9f5043c89695b01e66b3ed111755afcf1911
patch link: https://lore.kernel.org/r/20250718-phy_resume-v1-1-9c6b59580bee%40gmail.com
patch subject: [PATCH] net: phy: Fix premature resume by a PHY driver
config: sparc-randconfig-r071-20250719 (https://download.01.org/0day-ci/archive/20250720/202507200229.iOChX4Rp-lkp@intel.com/config)
compiler: sparc-linux-gcc (GCC) 14.3.0
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
| Closes: https://lore.kernel.org/r/202507200229.iOChX4Rp-lkp@intel.com/
smatch warnings:
drivers/net/phy/phy_device.c:1852 __phy_resume() error: we previously assumed 'phydrv->resume' could be null (see line 1849)
vim +1852 drivers/net/phy/phy_device.c
9c2c2e62df3fa3 Andrew Lunn 2018-02-27 1842 int __phy_resume(struct phy_device *phydev)
481b5d938b4a60 Sebastian Hesselbarth 2013-12-13 1843 {
0bd199fd9c19aa Russell King (Oracle 2024-02-02 1844) const struct phy_driver *phydrv = phydev->drv;
8a8f8281e7e7a8 Heiner Kallweit 2020-03-26 1845 int ret;
481b5d938b4a60 Sebastian Hesselbarth 2013-12-13 1846
e6e918d4eb93f4 Heiner Kallweit 2021-01-06 1847 lockdep_assert_held(&phydev->lock);
f5e64032a799d4 Russell King 2017-12-12 1848
9421d84b1b3e16 Abid Ali 2025-07-18 @1849 if (!phydrv || !phydrv->resume && phydev->suspended)
^^^^^^^^^^^^^^^
This checks for if the resume pointer is NULL, but if the resume is
NULL but suspend is also NULL. I'm surprised that the compiler allows
us to write that comparison without adding parenthesis. I thought that
it would complain about && having higher precedence.
8a8f8281e7e7a8 Heiner Kallweit 2020-03-26 1850 return 0;
8a477a6fb6a336 Florian Fainelli 2015-01-26 1851
8a8f8281e7e7a8 Heiner Kallweit 2020-03-26 @1852 ret = phydrv->resume(phydev);
^^^^^^^^^^^^^^
Then this will crash.
8a8f8281e7e7a8 Heiner Kallweit 2020-03-26 1853 if (!ret)
8a477a6fb6a336 Florian Fainelli 2015-01-26 1854 phydev->suspended = false;
8a477a6fb6a336 Florian Fainelli 2015-01-26 1855
8a477a6fb6a336 Florian Fainelli 2015-01-26 1856 return ret;
481b5d938b4a60 Sebastian Hesselbarth 2013-12-13 1857 }
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] net: phy: Fix premature resume by a PHY driver
2025-07-20 18:09 ` Dan Carpenter
@ 2025-07-21 6:07 ` Abid Ali
0 siblings, 0 replies; 9+ messages in thread
From: Abid Ali @ 2025-07-21 6:07 UTC (permalink / raw)
To: Abid Ali
Cc: Dan Carpenter, Russell King, Andrew Lunn, Heiner Kallweit,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
netdev, linux-kernel
Hi Dan,
On Sun, Jul 20, 2025 at 21:09:19 +0300, Dan Carpenter wrote:
> smatch warnings:
> drivers/net/phy/phy_device.c:1852 __phy_resume() error: we previously assumed 'phydrv->resume' could be null (see line 1849)
>
> vim +1852 drivers/net/phy/phy_device.c
>
> 9421d84b1b3e16 Abid Ali 2025-07-18 @1849 if (!phydrv || !phydrv->resume && phydev->suspended)
> ^^^^^^^^^^^^^^^
> This checks for if the resume pointer is NULL, but if the resume is
> NULL but suspend is also NULL. I'm surprised that the compiler allows
> us to write that comparison without adding parenthesis. I thought that
> it would complain about && having higher precedence.
>
> 8a8f8281e7e7a8 Heiner Kallweit 2020-03-26 @1852 ret = phydrv->resume(phydev);
> ^^^^^^^^^^^^^^
> Then this will crash.
yea, there is an obvious error here and have pointed it in the thread.
I have not gone forward with the V2 for now as we there are other concerns
thats not easy to address.
LINK: https://lore.kernel.org/netdev/aHu6kzOpaoDFR8BM@shell.armlinux.org.uk/
For now the idea is to deal with issue pointed in the commit in the driver`s
resume callback itself.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-07-21 6:08 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-18 15:42 [PATCH] net: phy: Fix premature resume by a PHY driver Abid Ali
2025-07-18 16:10 ` Russell King (Oracle)
2025-07-19 6:25 ` Abid Ali
2025-07-19 7:48 ` Russell King (Oracle)
2025-07-19 11:34 ` Abid Ali
2025-07-19 15:32 ` Russell King (Oracle)
2025-07-19 5:37 ` kernel test robot
2025-07-20 18:09 ` Dan Carpenter
2025-07-21 6:07 ` Abid Ali
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).