From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: David Epping <david.epping@missinglinkelectronics.com>
Cc: Andrew Lunn <andrew@lunn.ch>, Vladimir Oltean <olteanv@gmail.com>,
Heiner Kallweit <hkallweit1@gmail.com>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
UNGLinuxDriver@microchip.com
Subject: Re: [PATCH net v2 0/3] net: phy: mscc: support VSC8501
Date: Tue, 23 May 2023 15:32:06 +0100 [thread overview]
Message-ID: <ZGzOZm0oJimLHCDA@shell.armlinux.org.uk> (raw)
In-Reply-To: <20230523133236.GA7185@nucnuc.mle>
On Tue, May 23, 2023 at 03:32:36PM +0200, David Epping wrote:
> On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote:
> > > - I left the mutex_lock(&phydev->lock) in the
> > > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
> > > is required to repeatedly access phydev->interface and
> > > phy_interface_is_rgmii(phydev) in a consistent way.
> >
> > Just adding to Russell comment.
> >
> > As a general rule of thumb, if your driver is doing something which no
> > other driver is doing, you have to consider if it is correct. A PHY
> > driver taking phydev->lock is very unusual. So at minimum you should
> > be able to explain why it is needed. And when it comes to locking,
> > locking is hard, so you really should understand it.
> >
> > Now the mscc is an odd device, because it has multiple PHYs in the
> > package, and a number of registers are shared between these PHYs. So
> > it does have different locking requirements to most PHYs. However, i
> > don't think that is involved here. Those oddities are hidden behind
> > phy_base_write() and phy_base_read().
> >
> > Andrew
>
> Russell, Andrew,
>
> as you stated, locking is hard, and I am not in detail familiar with
> the mscc driver and the supported PHYs behavior. Also, I only have
> VSC8501, the single PHY chip, and none of the multi PHY chips to test.
> And testing these corner cases and race conditions is hard anyways.
> Thus my current patch is not touching the locking code at all, and
> assumes the current mainline code is correct in that regard.
> Because I don't understand all implications, I'm hesitant to change
> the existing locking scheme.
> Maybe this can be a separate patch? In the current patch set I'm not
> making the situation worse (unlike the first one where I added locks
> which Russell pointed out).
> If you insist and all agree the locks should be removed with this
> patch set, I'll update it of course.
Reading through this driver, IMHO it's clear that the original author
didn't have much idea about locking.
Your assumption that taking phydev->lock in vsc85xx_rgmii_set_skews()
protects phydev->interface is provably false, because:
static int vsc8584_config_init(struct phy_device *phydev)
{
...
if (phy_interface_is_rgmii(phydev)) {
ret = vsc85xx_rgmii_set_skews(phydev, VSC8572_RGMII_CNTL,
VSC8572_RGMII_RX_DELAY_MASK,
VSC8572_RGMII_TX_DELAY_MASK);
This accesses phydev->interface without holding phydev->lock,
before entering vsc85xx_rgmii_set_skews().
The second place that vsc85xx_rgmii_set_skews() is called from is
vsc85xx_default_config() which also accesses phydev->interface,
again without taking the phydev->lock.
So both paths into vsc85xx_rgmii_set_skews() have already read
phydev->interface without taking the lock. If this was what the
lock in vsc85xx_rgmii_set_skews() was protecting, then surely it
would need to protect those reads as well! It doesn't.
Also, with knowledge of phylib, I can say that this lock is
completely unnecessary when accessing phydev->interface in any
PHY driver .config_init method, which is the only place that
vsc85xx_rgmii_set_skews() is called from.
Having read the rest of the driver, it would appear that phydev->lock
is being abused to protect register accesses. This is evidenced by
the following, where I also set out why it's wrong:
vsc85xx_led_cntl_set()... which should be using phy_modify(), not
phy_read()..modify..phy_write(), which is open to races e.g. from
userspace MDIO access. phydev->lock doesn't solve anything there.
vsc85xx_edge_rate_cntl_set()... which correctly uses
phy_modify_paged() which itself will correctly prevent racy accesses
by taking the MDIO bus lock. It makes no accesses to anything else,
so phydev->lock here is entirely unnecessary.
vsc85xx_mac_if_set()... which is another case of racy access in the
same way as vsc85xx_led_cntl_set().
vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set()... both of
which IMHO show a complete misunderstanding for locking. At least
both of these functions are safe from other threads accessing the
bus because they correctly use phy_select_page()...phy_restore_page()
(which uses the MDIO bus lock to guarantee no other access will
happen.) BTW, I'm the author of phy_select_page()...phy_restore_page()
which were added to ensure that PHY drivers can _safely_ access
paged registers without fear of anything else disrupting accesses
to those paged registers, not even from userspace.
Essentially, taking phydev->lock offers absolutely zero protection
against another thread making accesses to the PHYs registers. The
*only* lock which prevents concurrent access to registers on devices
on a MDIO bus is the MDIO bus lock.
The only locking decision that I can see in this driver that is
correct is in phy_base_(read|write) which correctly demand that the
MDIO bus lock is held.
Oh my, things get even more "fun"...
vsc8574_config_pre_init() requires the MDIO bus lock to be held when
its called. This function uses request_firmware(), which can call out
to userspace and then *block* waiting for userspace to respond. This
will block *all* access to any device on that MDIO bus until the
firmware request has been satisfied.
Sorry, but the locking in this driver is nothing but a mess.
I'm sorry that you're the one who's modifying the driver when we've
spotted this, but please can you add a patch which first removes
phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on
top please?
At least that starts to reduce the amount of broken locking in this
driver.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
next prev parent reply other threads:[~2023-05-23 14:32 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-23 9:04 [PATCH net v2 0/3] net: phy: mscc: support VSC8501 David Epping
2023-05-23 9:04 ` [PATCH net v2 1/3] net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE David Epping
2023-05-23 9:04 ` [PATCH net v2 2/3] net: phy: mscc: add support for VSC8501 David Epping
2023-05-23 9:04 ` [PATCH net v2 3/3] net: phy: mscc: enable VSC8501/2 RGMII RX clock David Epping
2023-05-23 9:12 ` [PATCH net v2 0/3] net: phy: mscc: support VSC8501 Russell King (Oracle)
2023-05-23 13:16 ` Andrew Lunn
2023-05-23 13:32 ` David Epping
2023-05-23 14:32 ` Russell King (Oracle) [this message]
2023-05-23 14:57 ` David Epping
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZGzOZm0oJimLHCDA@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=UNGLinuxDriver@microchip.com \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=david.epping@missinglinkelectronics.com \
--cc=edumazet@google.com \
--cc=hkallweit1@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.