From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: David Epping <david.epping@missinglinkelectronics.com>
Cc: Andrew Lunn <andrew@lunn.ch>, Vladimir Oltean <olteanv@gmail.com>,
Heiner Kallweit <hkallweit1@gmail.com>,
"David S . Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
UNGLinuxDriver@microchip.com
Subject: Re: [PATCH net v2 0/3] net: phy: mscc: support VSC8501
Date: Tue, 23 May 2023 15:32:06 +0100 [thread overview]
Message-ID: <ZGzOZm0oJimLHCDA@shell.armlinux.org.uk> (raw)
In-Reply-To: <20230523133236.GA7185@nucnuc.mle>
On Tue, May 23, 2023 at 03:32:36PM +0200, David Epping wrote:
> On Tue, May 23, 2023 at 03:16:51PM +0200, Andrew Lunn wrote:
> > > - I left the mutex_lock(&phydev->lock) in the
> > > vsc85xx_update_rgmii_cntl() function, as I'm not sure whether it
> > > is required to repeatedly access phydev->interface and
> > > phy_interface_is_rgmii(phydev) in a consistent way.
> >
> > Just adding to Russell comment.
> >
> > As a general rule of thumb, if your driver is doing something which no
> > other driver is doing, you have to consider if it is correct. A PHY
> > driver taking phydev->lock is very unusual. So at minimum you should
> > be able to explain why it is needed. And when it comes to locking,
> > locking is hard, so you really should understand it.
> >
> > Now the mscc is an odd device, because it has multiple PHYs in the
> > package, and a number of registers are shared between these PHYs. So
> > it does have different locking requirements to most PHYs. However, i
> > don't think that is involved here. Those oddities are hidden behind
> > phy_base_write() and phy_base_read().
> >
> > Andrew
>
> Russell, Andrew,
>
> as you stated, locking is hard, and I am not in detail familiar with
> the mscc driver and the supported PHYs behavior. Also, I only have
> VSC8501, the single PHY chip, and none of the multi PHY chips to test.
> And testing these corner cases and race conditions is hard anyways.
> Thus my current patch is not touching the locking code at all, and
> assumes the current mainline code is correct in that regard.
> Because I don't understand all implications, I'm hesitant to change
> the existing locking scheme.
> Maybe this can be a separate patch? In the current patch set I'm not
> making the situation worse (unlike the first one where I added locks
> which Russell pointed out).
> If you insist and all agree the locks should be removed with this
> patch set, I'll update it of course.
Reading through this driver, IMHO it's clear that the original author
didn't have much idea about locking.
Your assumption that taking phydev->lock in vsc85xx_rgmii_set_skews()
protects phydev->interface is provably false, because:
static int vsc8584_config_init(struct phy_device *phydev)
{
...
if (phy_interface_is_rgmii(phydev)) {
ret = vsc85xx_rgmii_set_skews(phydev, VSC8572_RGMII_CNTL,
VSC8572_RGMII_RX_DELAY_MASK,
VSC8572_RGMII_TX_DELAY_MASK);
This accesses phydev->interface without holding phydev->lock,
before entering vsc85xx_rgmii_set_skews().
The second place that vsc85xx_rgmii_set_skews() is called from is
vsc85xx_default_config() which also accesses phydev->interface,
again without taking the phydev->lock.
So both paths into vsc85xx_rgmii_set_skews() have already read
phydev->interface without taking the lock. If this was what the
lock in vsc85xx_rgmii_set_skews() was protecting, then surely it
would need to protect those reads as well! It doesn't.
Also, with knowledge of phylib, I can say that this lock is
completely unnecessary when accessing phydev->interface in any
PHY driver .config_init method, which is the only place that
vsc85xx_rgmii_set_skews() is called from.
Having read the rest of the driver, it would appear that phydev->lock
is being abused to protect register accesses. This is evidenced by
the following, where I also set out why it's wrong:
vsc85xx_led_cntl_set()... which should be using phy_modify(), not
phy_read()..modify..phy_write(), which is open to races e.g. from
userspace MDIO access. phydev->lock doesn't solve anything there.
vsc85xx_edge_rate_cntl_set()... which correctly uses
phy_modify_paged() which itself will correctly prevent racy accesses
by taking the MDIO bus lock. It makes no accesses to anything else,
so phydev->lock here is entirely unnecessary.
vsc85xx_mac_if_set()... which is another case of racy access in the
same way as vsc85xx_led_cntl_set().
vsc8531_pre_init_seq_set() and vsc85xx_eee_init_seq_set()... both of
which IMHO show a complete misunderstanding for locking. At least
both of these functions are safe from other threads accessing the
bus because they correctly use phy_select_page()...phy_restore_page()
(which uses the MDIO bus lock to guarantee no other access will
happen.) BTW, I'm the author of phy_select_page()...phy_restore_page()
which were added to ensure that PHY drivers can _safely_ access
paged registers without fear of anything else disrupting accesses
to those paged registers, not even from userspace.
Essentially, taking phydev->lock offers absolutely zero protection
against another thread making accesses to the PHYs registers. The
*only* lock which prevents concurrent access to registers on devices
on a MDIO bus is the MDIO bus lock.
The only locking decision that I can see in this driver that is
correct is in phy_base_(read|write) which correctly demand that the
MDIO bus lock is held.
Oh my, things get even more "fun"...
vsc8574_config_pre_init() requires the MDIO bus lock to be held when
its called. This function uses request_firmware(), which can call out
to userspace and then *block* waiting for userspace to respond. This
will block *all* access to any device on that MDIO bus until the
firmware request has been satisfied.
Sorry, but the locking in this driver is nothing but a mess.
I'm sorry that you're the one who's modifying the driver when we've
spotted this, but please can you add a patch which first removes
phydev->lock from vsc85xx_rgmii_set_skews() and then your patch on
top please?
At least that starts to reduce the amount of broken locking in this
driver.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
next prev parent reply other threads:[~2023-05-23 14:32 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-23 9:04 [PATCH net v2 0/3] net: phy: mscc: support VSC8501 David Epping
2023-05-23 9:04 ` [PATCH net v2 1/3] net: phy: mscc: add VSC8502 to MODULE_DEVICE_TABLE David Epping
2023-05-23 9:04 ` [PATCH net v2 2/3] net: phy: mscc: add support for VSC8501 David Epping
2023-05-23 9:04 ` [PATCH net v2 3/3] net: phy: mscc: enable VSC8501/2 RGMII RX clock David Epping
2023-05-23 9:12 ` [PATCH net v2 0/3] net: phy: mscc: support VSC8501 Russell King (Oracle)
2023-05-23 13:16 ` Andrew Lunn
2023-05-23 13:32 ` David Epping
2023-05-23 14:32 ` Russell King (Oracle) [this message]
2023-05-23 14:57 ` David Epping
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZGzOZm0oJimLHCDA@shell.armlinux.org.uk \
--to=linux@armlinux.org.uk \
--cc=UNGLinuxDriver@microchip.com \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=david.epping@missinglinkelectronics.com \
--cc=edumazet@google.com \
--cc=hkallweit1@gmail.com \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=olteanv@gmail.com \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).