From: Heiner Kallweit <hkallweit1@gmail.com>
To: Russell King <rmk+kernel@armlinux.org.uk>,
Andrew Lunn <andrew@lunn.ch>,
Florian Fainelli <f.fainelli@gmail.com>
Cc: "David S. Miller" <davem@davemloft.net>, netdev@vger.kernel.org
Subject: Re: [PATCH net] net: phy: make phy_error() report which PHY has failed
Date: Tue, 17 Dec 2019 22:41:34 +0100 [thread overview]
Message-ID: <c96f14cd-7139-ebc7-9562-2f92d8b044fc@gmail.com> (raw)
In-Reply-To: <E1ihCLZ-0001Vo-Nw@rmk-PC.armlinux.org.uk>
On 17.12.2019 13:53, Russell King wrote:
> phy_error() is called from phy_interrupt() or phy_state_machine(), and
> uses WARN_ON() to print a backtrace. The backtrace is not useful when
> reporting a PHY error.
>
> However, a system may contain multiple ethernet PHYs, and phy_error()
> gives no clue which one caused the problem.
>
> Replace WARN_ON() with a call to phydev_err() so that we can see which
> PHY had an error, and also inform the user that we are halting the PHY.
>
> Fixes: fa7b28c11bbf ("net: phy: print stack trace in phy_error")
> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
> There is another related problem in this area. If an error is detected
> while the PHY is running, phy_error() moves to PHY_HALTED state. If we
> try to take the network device down, then:
>
> void phy_stop(struct phy_device *phydev)
> {
> if (!phy_is_started(phydev)) {
> WARN(1, "called from state %s\n",
> phy_state_to_str(phydev->state));
> return;
> }
>
> triggers, and we never do any of the phy_stop() cleanup. I'm not sure
> what the best way to solve this is - introducing a PHY_ERROR state may
> be a solution, but I think we want some phy_is_started() sites to
> return true for it and others to return false.
>
> Heiner - you introduced the above warning, could you look at improving
> this case so we don't print a warning and taint the kernel when taking
> a network device down after phy_error() please?
>
I think we need both types of information:
- the affected PHY device
- the stack trace to see where the issue was triggered
In general it's not fully clear yet what's the appropriate reaction
after a PHY error. Few reasons for PHY errors I see:
- MDIO bus may not be accessible, e.g. because parent device is in
power-down mode
- Frequently polling is used to determine end of a MDIO operation.
If timeout for polling is too low, then we may end up with an
-ETIMEDOUT.
In case of singular timeouts they may be acceptable or not.
- If we miss a single PHY status poll, then this may be acceptable.
- But if e.g. a relevant setting in config_init fails, then this
may not be acceptable.
The current behavior has been existing for the last 15 years,
and I'm just aware of one issue with PHY errors. The case I've
seen was triggered by timeouts, and adjusting the timeouts
fixed it: c3b084c24c8a (net: fec: Adjust ENET MDIO timeouts)
> Thanks.
>
> drivers/net/phy/phy.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index 49300fb59757..06fbca959383 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -663,7 +663,7 @@ void phy_stop_machine(struct phy_device *phydev)
> */
> static void phy_error(struct phy_device *phydev)
> {
> - WARN_ON(1);
> + phydev_err(phydev, "Error detected, halting PHY\n");
>
> mutex_lock(&phydev->lock);
> phydev->state = PHY_HALTED;
>
next prev parent reply other threads:[~2019-12-17 21:41 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-17 12:53 [PATCH net] net: phy: make phy_error() report which PHY has failed Russell King
2019-12-17 21:41 ` Heiner Kallweit [this message]
2019-12-17 23:34 ` Russell King - ARM Linux admin
2019-12-18 20:54 ` Heiner Kallweit
2019-12-18 22:09 ` Russell King - ARM Linux admin
2019-12-19 7:10 ` Heiner Kallweit
2019-12-19 17:06 ` Russell King - ARM Linux admin
2019-12-20 18:46 ` Florian Fainelli
2019-12-20 22:28 ` Heiner Kallweit
2019-12-19 20:50 ` David Miller
2019-12-19 21:05 ` Andrew Lunn
2019-12-19 22:14 ` David Miller
2019-12-19 21:45 ` Russell King - ARM Linux admin
2019-12-20 9:18 ` Andrew Lunn
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c96f14cd-7139-ebc7-9562-2f92d8b044fc@gmail.com \
--to=hkallweit1@gmail.com \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=f.fainelli@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=rmk+kernel@armlinux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).