netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Russell King (Oracle)" <linux@armlinux.org.uk>
To: Christian Marangi <ansuelsmth@gmail.com>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Rob Herring <robh@kernel.org>,
	Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Conor Dooley <conor+dt@kernel.org>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Daniel Golle <daniel@makrotopia.org>,
	netdev@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, upstream@airoha.com
Subject: Re: [net-next PATCH 3/6] net: phylink: Correctly handle PCS probe defer from PCS provider
Date: Wed, 19 Mar 2025 19:31:04 +0000	[thread overview]
Message-ID: <Z9sbeNTNy0dYhCgu@shell.armlinux.org.uk> (raw)
In-Reply-To: <67db005c.df0a0220.f7398.ba6b@mx.google.com>

On Wed, Mar 19, 2025 at 06:35:21PM +0100, Christian Marangi wrote:
> On Wed, Mar 19, 2025 at 05:02:50PM +0000, Russell King (Oracle) wrote:
> > My thoughts are that if a PCS goes away after a MAC driver has "got"
> > it, then:
> > 
> > 1. we need to recognise that those PHY interfaces and/or link modes
> >    are no longer available.
> > 2. if the PCS was in-use, then the link needs to be taken down at
> >    minimum and the .pcs_disable() method needs to be called to
> >    release any resources that .pcs_enable() enabled (e.g. irq masks,
> >    power enables, etc.)
> > 3. the MAC driver needs to be notified that the PCS pointer it
> >    stashed is no longer valid, so it doesn't return it for
> >    mac_select_pcs().
> 
> But why we need all these indirect handling and checks if we can
> make use of .remove and shutdown the interface. A removal of a PCS
> should cause the entire link to go down, isn't a dev_close enough to
> propagate this? If and when the interface will came up checks are done
> again and it will fail to go UP if PCS can't be found.
> 
> I know it's a drastic approach to call dev_close but link is down anyway
> so lets reinit everything from scratch. It should handle point 2 and 3
> right?

Let's look at what dev_close() does. This is how it's documented:

 * dev_close() - shutdown an interface
 * @dev: device to shutdown
 *
 * This function moves an active device into down state. A
 * %NETDEV_GOING_DOWN is sent to the netdev notifier chain. The device
 * is then deactivated and finally a %NETDEV_DOWN is sent to the notifier
 * chain.

So, this is equivalent to userspace doing:

# ip li set dev ethX down

and nothing prevents userspace doing:

# ip li set dev ethX up

after that call to dev_close() has returned.

If this happens, then the netdev driver's .ndo_open will be called,
which will then call phylink_start(), and that will attempt to bring
the link back up. That will call .mac_select_pcs(), which _if_ the
PCS is still "published" means it is _still_ accessible.

So your call that results in dev_close() with the PCS still being
published is ineffectual.

It's *no* different from this crap in the stmmac driver:

        stmmac_stop_all_dma(priv);
        stmmac_mac_set(priv, priv->ioaddr, false);
        unregister_netdev(ndev);

because *until* that unregister_netdev() call has completed, _userspace_
still has control over the netdev, and can do whatever it damn well
pleases.

Look, this is very very very simple.

If something is published to another part of the code, it is
discoverable, and it can be used or manipulated by new users.

If we wish to take something away, then first, it must be
unpublished to prevent new users discovering the resource. Then
existing users need to be dealt with in a safe way. Only at that
point can we be certain that there are no users, and thus the
underlying device begin to be torn down.

It's entirely logical!

> For point 1, additional entry like available_interface? And gets updated
> once a PCS gets removed??? Or if we don't like the parsing hell we map
> every interface to a PCS pointer? (not worth the wasted space IMHO)

At the moment, MAC drivers that I've updated will do things like:

                phy_interface_or(priv->phylink_config.supported_interfaces,
                                 priv->phylink_config.supported_interfaces,
                                 pcs->supported_interfaces);

phylink_config.supported_interfaces is the set of interface modes that
the MAC _and_ PCS subsystem supports. It's not just the MAC, it's both
together.

So, if a PCS is going away, then clearing the interface modes that the
PCS was providing would make sense - but there's a problem here. What
if the PCS is a bought-in bit of IP where the driver supports many modes
but the MAC doesn't use it for all those modes. So... which interface
modes get cleared is up to the MAC driver to decide.

> > There's probably a bunch more that needs to happen, and maybe need
> > to consider how to deal with "pcs came back".. but I haven't thought
> > that through yet.
> 
> Current approach supports PCS came back as we check the global provider
> list and the PCS is reachable again there.
> (we tasted various scenario with unbind/bind while the interface was
> up/down)

... because you look up the PCS in the mac_select_pcs() callback which
leads to a different race to what we have today, this time inside the
phylink code which thankfully phylink prints an error which is *NEVER*
supposed to happen.

Just trading one problem for another problem.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

  reply	other threads:[~2025-03-19 19:31 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-18 23:58 [net-next PATCH 0/6] net: pcs: Introduce support for PCS OF Christian Marangi
2025-03-18 23:58 ` [net-next PATCH 1/6] net: phylink: reset PCS-Phylink double reference on phylink_stop Christian Marangi
2025-03-18 23:58 ` [net-next PATCH 2/6] net: pcs: Implement OF support for PCS driver Christian Marangi
2025-03-19  9:11   ` Christian Marangi
2025-03-19  9:25   ` Christian Marangi
2025-03-19 15:17   ` Russell King (Oracle)
2025-03-19 16:03     ` Christian Marangi
2025-03-19 16:26       ` Russell King (Oracle)
2025-03-19 17:05   ` kernel test robot
2025-04-01 20:59   ` Sean Anderson
2025-03-18 23:58 ` [net-next PATCH 3/6] net: phylink: Correctly handle PCS probe defer from PCS provider Christian Marangi
2025-03-19 15:58   ` Russell King (Oracle)
2025-03-19 16:18     ` Christian Marangi
2025-03-19 17:02       ` Russell King (Oracle)
2025-03-19 17:35         ` Christian Marangi
2025-03-19 19:31           ` Russell King (Oracle) [this message]
2025-03-27 17:37             ` Christian Marangi
2025-03-27 18:08               ` Russell King (Oracle)
2025-03-28  8:59               ` Russell King (Oracle)
2025-03-18 23:58 ` [net-next PATCH 4/6] dt-bindings: net: ethernet-controller: permit to define multiple PCS Christian Marangi
2025-03-21 16:18   ` Rob Herring
2025-03-27 15:49     ` Christian Marangi
2025-04-01 20:12       ` Sean Anderson
2025-03-18 23:58 ` [net-next PATCH 5/6] net: pcs: airoha: add PCS driver for Airoha SoC Christian Marangi
2025-03-19  9:13   ` Christian Marangi
2025-03-19 20:41   ` kernel test robot
2025-03-20  1:54   ` kernel test robot
2025-03-21  6:35   ` kernel test robot
2025-03-18 23:58 ` [net-next PATCH 6/6] dt-bindings: net: pcs: Document support for Airoha Ethernet PCS Christian Marangi
2025-03-21 16:22   ` Rob Herring
2025-03-19 17:29 ` [net-next PATCH 0/6] net: pcs: Introduce support for PCS OF Russell King (Oracle)
2025-03-19 17:44   ` Christian Marangi
2025-04-02  0:14 ` Sean Anderson
2025-04-02 15:08   ` Christian Marangi (Ansuel)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9sbeNTNy0dYhCgu@shell.armlinux.org.uk \
    --to=linux@armlinux.org.uk \
    --cc=andrew+netdev@lunn.ch \
    --cc=ansuelsmth@gmail.com \
    --cc=conor+dt@kernel.org \
    --cc=daniel@makrotopia.org \
    --cc=davem@davemloft.net \
    --cc=devicetree@vger.kernel.org \
    --cc=edumazet@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=krzk+dt@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=p.zabel@pengutronix.de \
    --cc=pabeni@redhat.com \
    --cc=robh@kernel.org \
    --cc=upstream@airoha.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).