netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: kernel test robot <oliver.sang@intel.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org, Jakub Kicinski <kuba@kernel.org>,
	Jacob Keller <jacob.e.keller@intel.com>,
	netdev@vger.kernel.org, Kory Maincent <kory.maincent@bootlin.com>
Subject: Re: [linus:master] [net]  03abf2a7c6: WARNING:suspicious_RCU_usage
Date: Wed, 5 Feb 2025 17:19:13 +0000	[thread overview]
Message-ID: <Z6OdkdI2ss19FyVT@shell.armlinux.org.uk> (raw)
In-Reply-To: <202502051331.7587ac82-lkp@intel.com>

On Wed, Feb 05, 2025 at 02:08:04PM +0800, kernel test robot wrote:
> kernel test robot noticed "WARNING:suspicious_RCU_usage" on:
> 
> commit: 03abf2a7c65451e663b078b0ed1bfa648cd9380f ("net: phylink: add EEE management")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

I think there's multiple issues here that need addressing:

1) calling phy_detach() in a context that phy_attach() is allowed
   causes this warning, which seems absurd (being able to attach but
   not detach on error is a problem.)

This is the root cause of the issue, and others have run into this same
problem. There's already been an effort to address this:
   https://lore.kernel.org/r/20250117141446.1076951-1-kory.maincent@bootlin.com
   https://lore.kernel.org/r/20250117173645.1107460-1-kory.maincent@bootlin.com
   https://lore.kernel.org/r/20250120141926.1290763-1-kory.maincent@bootlin.com
and I think the conclusion is that the RTNL had to be held while calling
phy_detach().

2) phy_modify_mmd() returning -EPERM. Having traced through the code,
   this comes from my swphy.c which returns -1 (eww). However, as this
   code was extracted from fixed_phy.c, and the emulation is provided
   for userspace, this is part of the uAPI of the kernel and can't be
   changed.

3) the blamed commit introduces a call to phy_modify_mmd() to set the
   clock-stop bit, which ought not be done unless phylink managed EEE
   is being used.

(2) and (3) together is what ends up causing:

> [   19.646149][   T22] dsa-loop fixed-0:1f lan1 (uninitialized): failed to connect to PHY: -EPERM
> [   19.647542][   T22] dsa-loop fixed-0:1f lan1 (uninitialized): error -1 setting up PHY for tree 0, switch 0, port 0
> [   19.649283][   T22] dsa-loop fixed-0:1f lan2 (uninitialized): PHY [dsa-0.0:01] driver [Generic PHY] (irq=POLL)
> [   19.650853][   T22] dsa-loop fixed-0:1f lan2 (uninitialized): failed to connect to PHY: -EPERM
> [   19.652238][   T22] dsa-loop fixed-0:1f lan2 (uninitialized): error -1 setting up PHY for tree 0, switch 0, port 1
> [   19.653856][   T22] dsa-loop fixed-0:1f lan3 (uninitialized): PHY [dsa-0.0:02] driver [Generic PHY] (irq=POLL)
> [   19.655392][   T22] dsa-loop fixed-0:1f lan3 (uninitialized): failed to connect to PHY: -EPERM
> [   19.656689][   T22] dsa-loop fixed-0:1f lan3 (uninitialized): error -1 setting up PHY for tree 0, switch 0, port 2
> [   19.658308][   T22] dsa-loop fixed-0:1f lan4 (uninitialized): PHY [dsa-0.0:03] driver [Generic PHY] (irq=POLL)
> [   19.659841][   T22] dsa-loop fixed-0:1f lan4 (uninitialized): failed to connect to PHY: -EPERM
> [   19.661168][   T22] dsa-loop fixed-0:1f lan4 (uninitialized): error -1 setting up PHY for tree 0, switch 0, port 3
> [   19.663018][   T22] DSA: tree 0 setup
> [   19.663591][   T22] dsa-loop fixed-0:1f: DSA mockup driver: 0x1f

which then causes phy_detach() to be called, which then triggers the
"suspicious RCU" warning.

This has merely revealed a problem in the error handling since Kory's
commit on the 12th December, and actually has nothing to do with the
blamed commit, other than it revealing the latent problem.

The "hold RTNL" solution isn't trivial to implement here - phylink's
PHY connection functions can be called with RTNL already held, so it
isn't a simple case of throwing locking at phylink (which will cause
a deadlock) - it needs every phylink user to be audited and individual
patches to take the RTNL in the driver generated as necessary. I'm not
sure when I'll be able to do that. It's also a locking change for this
API - going from not needing the RTNL to requiring it.

This is probably going to result in more kernel warnings being
generated when I throw in ASSERT_RTNL() into phylink paths that could
call phy_detach(). Sounds joyful.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

  reply	other threads:[~2025-02-05 17:19 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-05  6:08 [linus:master] [net] 03abf2a7c6: WARNING:suspicious_RCU_usage kernel test robot
2025-02-05 17:19 ` Russell King (Oracle) [this message]
2025-03-27 13:09   ` Kory Maincent

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z6OdkdI2ss19FyVT@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=andrew@lunn.ch \
    --cc=jacob.e.keller@intel.com \
    --cc=kory.maincent@bootlin.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).