From: Christian Marangi <ansuelsmth@gmail.com>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Rob Herring <robh@kernel.org>,
Krzysztof Kozlowski <krzk+dt@kernel.org>,
Conor Dooley <conor+dt@kernel.org>,
Heiner Kallweit <hkallweit1@gmail.com>,
Philipp Zabel <p.zabel@pengutronix.de>,
Daniel Golle <daniel@makrotopia.org>,
netdev@vger.kernel.org, devicetree@vger.kernel.org,
linux-kernel@vger.kernel.org, upstream@airoha.com
Subject: Re: [net-next PATCH 3/6] net: phylink: Correctly handle PCS probe defer from PCS provider
Date: Thu, 27 Mar 2025 18:37:19 +0100 [thread overview]
Message-ID: <67e58cd2.7b0a0220.289480.1e35@mx.google.com> (raw)
In-Reply-To: <Z9sbeNTNy0dYhCgu@shell.armlinux.org.uk>
On Wed, Mar 19, 2025 at 07:31:04PM +0000, Russell King (Oracle) wrote:
> On Wed, Mar 19, 2025 at 06:35:21PM +0100, Christian Marangi wrote:
> > On Wed, Mar 19, 2025 at 05:02:50PM +0000, Russell King (Oracle) wrote:
> > > My thoughts are that if a PCS goes away after a MAC driver has "got"
> > > it, then:
> > >
> > > 1. we need to recognise that those PHY interfaces and/or link modes
> > > are no longer available.
> > > 2. if the PCS was in-use, then the link needs to be taken down at
> > > minimum and the .pcs_disable() method needs to be called to
> > > release any resources that .pcs_enable() enabled (e.g. irq masks,
> > > power enables, etc.)
> > > 3. the MAC driver needs to be notified that the PCS pointer it
> > > stashed is no longer valid, so it doesn't return it for
> > > mac_select_pcs().
> >
> > But why we need all these indirect handling and checks if we can
> > make use of .remove and shutdown the interface. A removal of a PCS
> > should cause the entire link to go down, isn't a dev_close enough to
> > propagate this? If and when the interface will came up checks are done
> > again and it will fail to go UP if PCS can't be found.
> >
> > I know it's a drastic approach to call dev_close but link is down anyway
> > so lets reinit everything from scratch. It should handle point 2 and 3
> > right?
>
> Let's look at what dev_close() does. This is how it's documented:
>
> * dev_close() - shutdown an interface
> * @dev: device to shutdown
> *
> * This function moves an active device into down state. A
> * %NETDEV_GOING_DOWN is sent to the netdev notifier chain. The device
> * is then deactivated and finally a %NETDEV_DOWN is sent to the notifier
> * chain.
>
> So, this is equivalent to userspace doing:
>
> # ip li set dev ethX down
>
> and nothing prevents userspace doing:
>
> # ip li set dev ethX up
>
> after that call to dev_close() has returned.
>
> If this happens, then the netdev driver's .ndo_open will be called,
> which will then call phylink_start(), and that will attempt to bring
> the link back up. That will call .mac_select_pcs(), which _if_ the
> PCS is still "published" means it is _still_ accessible.
>
> So your call that results in dev_close() with the PCS still being
> published is ineffectual.
>
> It's *no* different from this crap in the stmmac driver:
>
> stmmac_stop_all_dma(priv);
> stmmac_mac_set(priv, priv->ioaddr, false);
> unregister_netdev(ndev);
>
> because *until* that unregister_netdev() call has completed, _userspace_
> still has control over the netdev, and can do whatever it damn well
> pleases.
>
> Look, this is very very very simple.
>
> If something is published to another part of the code, it is
> discoverable, and it can be used or manipulated by new users.
>
> If we wish to take something away, then first, it must be
> unpublished to prevent new users discovering the resource. Then
> existing users need to be dealt with in a safe way. Only at that
> point can we be certain that there are no users, and thus the
> underlying device begin to be torn down.
>
> It's entirely logical!
>
OK so (I think this was also suggested in the more specific PCS patch)
- 1. unpublish the PCS from the provider
- 2. put down the link...
I feel point 2 is the big effort here to solve. Mainly your problem is
the fact that phylink_major_config should not handle PROBE_DEFER and
should always have all the expected PCS available. (returned from
mac_select_pcs)
So the validation MUST ALWAYS be done before reaching that code path.
That means that when a PCS is removed, the entire phylink should be
refreshed and reevaluated. And at the same time lock userspace from
doing anything fancy (as there might be a possibility for
phylink_major_config)
Daniel at some point in the brainstorm process suggested that we might
need something like phylink_impair() to lock it while it's getting
""refreshed"". Do you think that might be a good path for this?
One of the first implementation of this called phylink_stop (not
dev_stop) so maybe I should reconsider keeping everything phylink
related. But that wouldn't put the interface down from userspace if I'm
not wrong.
It's point 3 (of the old list) "the MAC driver needs to be notified that
the PCS pointer it stashed is no longer valid, so it doesn't return it for
mac_select_pcs()." my problem. I still feel MAC should not track PCS but
only react on the presence (or absence) of them.
And this point is really connected to point 1 so I guess point 1 is the
first to handle, before this. (I also feel it will magically solved once
point 1 is handled)
> > For point 1, additional entry like available_interface? And gets updated
> > once a PCS gets removed??? Or if we don't like the parsing hell we map
> > every interface to a PCS pointer? (not worth the wasted space IMHO)
>
> At the moment, MAC drivers that I've updated will do things like:
>
> phy_interface_or(priv->phylink_config.supported_interfaces,
> priv->phylink_config.supported_interfaces,
> pcs->supported_interfaces);
>
> phylink_config.supported_interfaces is the set of interface modes that
> the MAC _and_ PCS subsystem supports. It's not just the MAC, it's both
> together.
>
> So, if a PCS is going away, then clearing the interface modes that the
> PCS was providing would make sense - but there's a problem here. What
> if the PCS is a bought-in bit of IP where the driver supports many modes
> but the MAC doesn't use it for all those modes. So... which interface
> modes get cleared is up to the MAC driver to decide.
>
Should we add an OP to handle removal of PCS from a MAC? Like
.mac_release_pcs ? I might be wrong but isn't that giving too much
freedom to the driver?
I need to recheck how the interface validation work and what values are
used but with this removal thing on the table, supported_interfaces OR
with the PCS supported_interface might be problematic and maybe the
original values should be stored somewhere.
> > > There's probably a bunch more that needs to happen, and maybe need
> > > to consider how to deal with "pcs came back".. but I haven't thought
> > > that through yet.
> >
> > Current approach supports PCS came back as we check the global provider
> > list and the PCS is reachable again there.
> > (we tasted various scenario with unbind/bind while the interface was
> > up/down)
>
> ... because you look up the PCS in the mac_select_pcs() callback which
> leads to a different race to what we have today, this time inside the
> phylink code which thankfully phylink prints an error which is *NEVER*
> supposed to happen.
>
I want to make sure tho you are ok with the usage of .mac_select_pcs
for re-evaluation task.
Maybe a better approach is to introduce .mac_get_pcs and enforce the
usage only on validation phase? (aka in phylink_validate_mac_and_pcs)
AFAIK in that phase .mac_select_pcs can return errors if the requested
interface is not possible for one reason or another.
What do you think? In short 2 additional OP that with the select one
result in:
- get
- select
- release
--
Ansuel
next prev parent reply other threads:[~2025-03-27 17:37 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-18 23:58 [net-next PATCH 0/6] net: pcs: Introduce support for PCS OF Christian Marangi
2025-03-18 23:58 ` [net-next PATCH 1/6] net: phylink: reset PCS-Phylink double reference on phylink_stop Christian Marangi
2025-03-18 23:58 ` [net-next PATCH 2/6] net: pcs: Implement OF support for PCS driver Christian Marangi
2025-03-19 9:11 ` Christian Marangi
2025-03-19 9:25 ` Christian Marangi
2025-03-19 15:17 ` Russell King (Oracle)
2025-03-19 16:03 ` Christian Marangi
2025-03-19 16:26 ` Russell King (Oracle)
2025-03-19 17:05 ` kernel test robot
2025-04-01 20:59 ` Sean Anderson
2025-03-18 23:58 ` [net-next PATCH 3/6] net: phylink: Correctly handle PCS probe defer from PCS provider Christian Marangi
2025-03-19 15:58 ` Russell King (Oracle)
2025-03-19 16:18 ` Christian Marangi
2025-03-19 17:02 ` Russell King (Oracle)
2025-03-19 17:35 ` Christian Marangi
2025-03-19 19:31 ` Russell King (Oracle)
2025-03-27 17:37 ` Christian Marangi [this message]
2025-03-27 18:08 ` Russell King (Oracle)
2025-03-28 8:59 ` Russell King (Oracle)
2025-03-18 23:58 ` [net-next PATCH 4/6] dt-bindings: net: ethernet-controller: permit to define multiple PCS Christian Marangi
2025-03-21 16:18 ` Rob Herring
2025-03-27 15:49 ` Christian Marangi
2025-04-01 20:12 ` Sean Anderson
2025-03-18 23:58 ` [net-next PATCH 5/6] net: pcs: airoha: add PCS driver for Airoha SoC Christian Marangi
2025-03-19 9:13 ` Christian Marangi
2025-03-19 20:41 ` kernel test robot
2025-03-20 1:54 ` kernel test robot
2025-03-21 6:35 ` kernel test robot
2025-03-18 23:58 ` [net-next PATCH 6/6] dt-bindings: net: pcs: Document support for Airoha Ethernet PCS Christian Marangi
2025-03-21 16:22 ` Rob Herring
2025-03-19 17:29 ` [net-next PATCH 0/6] net: pcs: Introduce support for PCS OF Russell King (Oracle)
2025-03-19 17:44 ` Christian Marangi
2025-04-02 0:14 ` Sean Anderson
2025-04-02 15:08 ` Christian Marangi (Ansuel)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=67e58cd2.7b0a0220.289480.1e35@mx.google.com \
--to=ansuelsmth@gmail.com \
--cc=andrew+netdev@lunn.ch \
--cc=conor+dt@kernel.org \
--cc=daniel@makrotopia.org \
--cc=davem@davemloft.net \
--cc=devicetree@vger.kernel.org \
--cc=edumazet@google.com \
--cc=hkallweit1@gmail.com \
--cc=krzk+dt@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=netdev@vger.kernel.org \
--cc=p.zabel@pengutronix.de \
--cc=pabeni@redhat.com \
--cc=robh@kernel.org \
--cc=upstream@airoha.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.