From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: "Rafał Miłecki" <zajec5@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	Network Development <netdev@vger.kernel.org>,
	Florian Fainelli <f.fainelli@gmail.com>,
	BCM Kernel Feedback <bcm-kernel-feedback-list@broadcom.com>,
	Vivek Unune <npcomplete13@gmail.com>
Subject: Re: Lockup in phy_probe() for MDIO device (Broadcom's switch)
Date: Thu, 30 Sep 2021 11:40:43 +0100
Message-ID: <YVWUKwEXrd39t8iw@shell.armlinux.org.uk>
In-Reply-To: <5715f818-a279-d514-dcac-73a94c1d30ef@gmail.com>

On Thu, Sep 30, 2021 at 12:30:52PM +0200, Rafał Miłecki wrote:
> On 30.09.2021 12:17, Russell King (Oracle) wrote:
> > On Thu, Sep 30, 2021 at 11:58:21AM +0200, Rafał Miłecki wrote:
> > > This isn't necessarily a PHY / MDIO regression. It could be some core
> > > change that exposed a PHY / MDIO bug.
> > 
> > I think what's going on is that the switch device is somehow being
> > probed by phylib. It looks to me like we don't check that the mdio
> > device being matched in phy_bus_match() is actually a PHY (by
> > checking whether mdiodev->flags & MDIO_DEVICE_FLAG_PHY is true
> > before proceeding with any matching.)
> > 
> > We do, however, check the driver side. This looks to me like a problem
> > especially when the mdio bus can contain a mixture of PHY devices and
> > non-PHY devices. However, I would expect this to also be blowing up in
> > the mainline kernel as well - but it doesn't seem to.
> > 
> > Maybe Andrew can provide a reason why this doesn't happen - maybe we've
> > just been lucky with out-of-bounds read accesses (to the non-existent
> > phy_device wrapped around the mdio_device?)
> 
> I'll see if I can use buildroot to test unmodified kernel.
> 
> 
> > If my theory is correct, this patch should solve your issue:
> > 
> > diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> > index ba5ad86ec826..dac017174ab1 100644
> > --- a/drivers/net/phy/phy_device.c
> > +++ b/drivers/net/phy/phy_device.c
> > @@ -462,7 +462,8 @@ static int phy_bus_match(struct device *dev, struct device_driver *drv)
> >   	const int num_ids = ARRAY_SIZE(phydev->c45_ids.device_ids);
> >   	int i;
> > -	if (!(phydrv->mdiodrv.flags & MDIO_DEVICE_IS_PHY))
> > +	if (!(phydrv->mdiodrv.flags & MDIO_DEVICE_IS_PHY) ||
> > +	    !(phydev->mdio.flags & MDIO_DEVICE_FLAG_PHY))
> >   		return 0;
> >   	if (phydrv->match_phy_device)
> > 
> 
> Unfortunately this doesn't seem to help

Hmm.

In phy_probe, can you add:

	WARN_ON(!(phydev->mdio.flags & MDIO_DEVICE_FLAG_PHY));

just to make sure we have a real PHY device there please? Maybe also
print the value of phydev->mdio.flags.

MDIO_DEVICE_FLAG_PHY is set by phy_device_create() before the mutex is
initialised, so if it is set, the lock should be initialised.
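
To illustrate that reasoning outside the kernel, here is a rough userspace
sketch. It is not the phylib code: the struct layouts, the model_* names and
the flag value are all made up for the example. It only shows that when the
flag is set before the lock is initialised at creation time, a probe that sees
the flag can rely on the lock, and a bare mdio device without the flag must
never be treated as a phy_device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed value for this sketch; the kernel has its own definition. */
#define MODEL_FLAG_PHY 0x1u

/* Stand-in for struct mdio_device: only the field discussed here. */
struct model_mdio_device {
	unsigned int flags;
};

/* Stand-in for struct phy_device: wraps the mdio device and owns a lock. */
struct model_phy_device {
	struct model_mdio_device mdio;
	bool lock_initialised;	/* models mutex_init() having run */
};

#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Models PHY creation: flag first, lock initialisation afterwards. */
static void model_phy_create(struct model_phy_device *phydev)
{
	assert(phydev != NULL);
	memset(phydev, 0, sizeof(*phydev));
	phydev->mdio.flags = MODEL_FLAG_PHY;
	phydev->lock_initialised = true;
}

/*
 * Models the check suggested for phy_probe(): returns 0 for a real PHY
 * whose lock is usable, -1 for a bare mdio device that has no wrapping
 * phy_device at all.
 */
static int model_phy_probe(struct model_mdio_device *mdiodev)
{
	struct model_phy_device *phydev;

	if (!(mdiodev->flags & MODEL_FLAG_PHY))
		return -1;	/* do not walk back to a non-existent wrapper */

	phydev = model_container_of(mdiodev, struct model_phy_device, mdio);
	return phydev->lock_initialised ? 0 : -1;
}
```

In this model, a device set up by model_phy_create() passes the probe, while
a standalone model_mdio_device with flags == 0 is rejected before the
container_of step is ever taken.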

Maybe also print mdiodev->flags in mdio_device_register(), so we can
see what is being registered and the flags being used for that
device.
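
As a sketch of the kind of output that would be useful here, the userspace
helper below formats one line per device. The helper name, the message layout
and the flag value are assumptions for illustration only; in the kernel this
would be a debug print on the device at registration time.

```c
#include <stdio.h>

/* Assumed value for this sketch, not taken from the kernel headers. */
#define DBG_FLAG_PHY 0x1u

/*
 * Formats what a debug print at registration time might show: the bus
 * address, the raw flags, and whether the PHY flag is set.
 * Returns the snprintf() result.
 */
static int dbg_format_mdio_flags(char *buf, size_t len, int addr,
				 unsigned int flags)
{
	return snprintf(buf, len, "mdio addr %d: flags=0x%x (%s)", addr, flags,
			(flags & DBG_FLAG_PHY) ? "PHY" : "not a PHY");
}
```

A switch registered as a plain mdio device would then show up as "not a PHY",
which is the case we want to rule in or out.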

Could it be that openwrt is carrying a patch that is causing this
issue?

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
