Building the Linux kernel with Clang and LLVM
 help / color / mirror / Atom feed
From: Sean Anderson <sean.anderson@linux.dev>
To: Daniel Golle <daniel@makrotopia.org>
Cc: Christian Marangi <ansuelsmth@gmail.com>,
	Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Rob Herring <robh@kernel.org>,
	Krzysztof Kozlowski <krzk+dt@kernel.org>,
	Conor Dooley <conor+dt@kernel.org>,
	Lorenzo Bianconi <lorenzo@kernel.org>,
	Heiner Kallweit <hkallweit1@gmail.com>,
	Russell King <linux@armlinux.org.uk>,
	Philipp Zabel <p.zabel@pengutronix.de>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <nick.desaulniers+lkml@gmail.com>,
	Bill Wendling <morbo@google.com>,
	Justin Stitt <justinstitt@google.com>,
	netdev@vger.kernel.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-mediatek@lists.infradead.org, llvm@lists.linux.dev
Subject: Re: [net-next PATCH v4 03/11] net: phylink: introduce internal phylink PCS handling
Date: Mon, 19 May 2025 19:28:34 -0400	[thread overview]
Message-ID: <7bb37bea-917c-4082-9eaa-063c4d97833b@linux.dev> (raw)
In-Reply-To: <cdfbbdca-001b-4ed5-92ff-40fd3a8e3341@linux.dev>

On 5/19/25 14:10, Sean Anderson wrote:
> On 5/13/25 23:00, Daniel Golle wrote:
>> On Tue, May 13, 2025 at 03:23:32PM -0400, Sean Anderson wrote:
>>> On 5/13/25 15:03, Daniel Golle wrote:
>>> > just instead of having many
>>> > more or less identical implementations of .mac_select_pcs, this
>>> > functionality is moved into phylink. As a nice side-effect that also
>>> > makes managing the life-cycle of the PCS more easy, so we won't need all
>>> > the wrappers for all the PCS OPs.
>>> 
>>> I think the wrapper approach is very obviously correct. This way has me
>>> worried about exciting new concurrency bugs.
>> 
>> You may not be surprised to read that this was also our starting point 2
>> months ago, I had implemented support for standalone PCS very similar to
>> the approach you have published now, using refcnt'ed instances and
>> locked wrapper functions for all OPs. My approach, like yours, was to
>> create a new subsystem for standalone PCS drivers which is orthogonal to
>> phylink and only requires very few very small changes to phylink itself.
>> It was a draft and not as complete and well-documented like your series
>> now, of course.
>> 
>> I've then shared that implementation with Christian and some other
>> experienced OpenWrt developers and we concluded that having phylink handle
>> the PCS lifecycle and PCS selection would be the better and more elegant
>> approach for multiple reasons:
>>  - The lifetime management of the wrapper instances becomes tricky:
>>    We would either have to live with them being allocated by the
>>    MAC-driver (imagine test-case doing unbind and then bind in a loop
>>    for a while -- we would end up oom). Or we need some kind of garbage
>>    collecting mechanism which frees the wrapper once refcnt is zero --
>>    and as .select_pcs would 'get' the PCS (ie. bump refcnt) we'd need a
>>    'put' equivalent (eg. a .pcs_destroy() OP) in phylink.
>> 
>>    Russell repeatedly pointed me to the possibility of a PCS
>>    "disappearing" (and potentially "reappearing" some time later), and
>>    in this case it is unclear who would then ever call pcs_put(), or
>>    even notify the Ethernet driver or phylink about the PCS now being
>>    available (again). Using device_link_add(), like it is done in
>>    pcs-rzn1-miic.c, prevents the worst (ie. use-after-free), but also
>>    impacts all other netdevs exposed by the same Ethernet driver
>>    instance, and has a few other rather ugly implications.
> 
> SRCU neatly solves the lifetime management issues. The wrapper lives as
> long as anyone (provider or user) holds a reference. A PCS can disappear
> at any point and everything still works (although the link goes down).
> Device links are only an optimization; they cannot be relied on for
> correctness.
> 
>>  - phylink currently expects .mac_select_pcs to never fail. But we may
>>    need a mechanism similar to probe deferral in case the PCS is not
>>    yet available.
> 
> Which is why you grab the PCS in probe. If you want to be more dynamic,
> you can do it in netdev open like is done for PHYs.
> 
>>    Your series partially solves this in patch 11/11 "of: property: Add
>>    device link support for PCS", but also that still won't make the link
>>    come back in case of a PCS showing up late to the party, eg. due to
>>    constraints such as phy drivers (drivers/phy, not drivers/net/phy)
>>    waiting for nvmem providers, or PCS instances "going away" and
>>    "coming back" later.
> 
> This all works correctly due to device links. The only case that doesn't
> work automatically is something like
> 
> MAC built-in
>   MDIO built-in
>     PCS module
> 
> where the PCS module gets loaded late. In that case you have to manually
> re-probe the MAC. I think the best way to address this would be to grab
> the PCS in netdev open so that the MAC can probe without the PCS.
> 
>>  - removal of a PCS instance (eg. via sysfs unbind) would still
>>    require changes to phylink. there is no phylink function to
>>    impair the link in this case, and using dev_close() is a bit ugly,
>>    and also won't bring the link back up once the PCS (re-)appears.
> 
> This works just fine. There are two cases:
> 
> - If the PCS has an IRQ, we notify phylink and then it polls the PCS
>   (see below).
> - If the PCS is polled, phylink will call pcs_get_state and see that the
>   link is down.
> 
> Either way, the link goes down. But bringing the link back up is pretty
> unusual anyway. Unlike PHYs (which theoretically can be on removable
> busses) PCSs are generally permanently attached to their MACs. The only
> removable scenario I can think of is if the PCS is on an FPGA and the
> MAC is not.
> 
> So if the PCS goes away, the MAC is likely to follow shortly after
> (since the whole thing is on a removable bus). Or someone has manually
> removed the PCS, in which case I think it's reasonable to have them
> manually remove the MAC as well. If you really want to support this,
> then just grab the PCS in netdev open.

So I had a closer look at this and unfortunately it isn't as easy as
just grabbing the PCS in ndo_open. The problem is that we need to
know the supported interfaces before phylink_create. The interfaces are
validated and are visible to userspace as soon as the netdev is
registered. And we can't just defer phylink_create to ndo_open because a
lot of the ethtool ops are implemented with phylink. So this would
probably need something like phylink_update_supported_interfaces().

But TBH I don't think this use case is very relevant. As I said above,
it only affects FPGA reconfiguration and people manually unbinding
drivers. Either way I think they are savvy enough to reprobe the netdev.

--Sean

>>  - phylink anyway is the only user of PCS drivers, and will very likely
>>    always be. So why create another subsystem?
> 
> To avoid adding overhead for the majority of PCSs where the PCS is built
> into the MAC and literally can't be removed. We only pay the price for
> dynamicism on the drivers where it matters.
> 
>> All that being said I also see potential problems with Christians
>> current implementation as it doesn't prevent the Ethernet driver to
>> still store a pointer to struct phylink_pcs (returned eg. from
>> fwnode_pcs_get()).
>> 
>> Hence I would like to see an even more tight integration with phylink,
>> in the sense that pointers to 'struct phylink_pcs' should never be
>> exposed to the MAC driver, as only in that way we can be sure that
>> phylink, and only phylink, is responsible for reacting to a PCS "going
>> away".
> 
> OK, but then how does the MAC select the PCS? If there are multiple PCSs
> then ultimately someone has to configure a mux somewhere.
> 
>> Ie. instead of fwnode_phylink_pcs_parse() handing pointers to struct
>> phylink_pcs to the Ethernet driver, so it can use it to populate struct
>> phylink_config available_pcs member, this should be the responsibility
>> of phylink alltogether, directly populating the list of available PCS in
>> phylink's private structure.
>> 
>> Similarly, there should not be fwnode_pcs_get() but rather phylink
>> providing a function fwnode_phylink_pcs_register(phylink, fwnode) which
>> directly adds the PCS referenced to the internal list of available PCS.
> 
> This is difficult to work with for existing drivers. Many of them have
> non-standard ways of looking up their PCS that they need to support for
> backwards-compatibility. And some of them create the PCS themselves
> (such as if they are PCI devices with internal MDIO busses). It's much
> easier for the MAC to create or look up the PCS itself and then hand it
> off to phylink.
> 
>> I hope we can pick the best of all the suggested implementations, and
>> together come up with something even better.
> 
> Sure. And I think we were starting from a clean slate then this would be
> the obvious way to do things. But we must support existing drivers and
> provide an upgrade path for them. This is why I favor an incremental
> approach.
> 
> --Sean

  reply	other threads:[~2025-05-19 23:28 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-11 20:12 [net-next PATCH v4 00/11] net: pcs: Introduce support for fwnode PCS Christian Marangi
2025-05-11 20:12 ` [net-next PATCH v4 01/11] net: phy: introduce phy_interface_copy helper Christian Marangi
2025-05-13 18:18   ` Sean Anderson
2025-05-11 20:12 ` [net-next PATCH v4 02/11] net: phylink: keep and use MAC supported_interfaces in phylink struct Christian Marangi
2025-05-11 20:12 ` [net-next PATCH v4 03/11] net: phylink: introduce internal phylink PCS handling Christian Marangi
2025-05-13 18:18   ` Sean Anderson
2025-05-13 19:03     ` Daniel Golle
2025-05-13 19:23       ` Sean Anderson
2025-05-13 19:42         ` Sean Anderson
2025-05-14  3:00         ` Daniel Golle
2025-05-19 18:10           ` Sean Anderson
2025-05-19 23:28             ` Sean Anderson [this message]
2025-05-11 20:12 ` [net-next PATCH v4 04/11] net: phylink: add phylink_release_pcs() to externally release a PCS Christian Marangi
2025-05-11 20:12 ` [net-next PATCH v4 05/11] net: pcs: implement Firmware node support for PCS driver Christian Marangi
2025-05-13 18:40   ` Sean Anderson
2025-05-11 20:12 ` [net-next PATCH v4 06/11] net: phylink: support late PCS provider attach Christian Marangi
2025-05-11 20:12 ` [net-next PATCH v4 07/11] dt-bindings: net: ethernet-controller: permit to define multiple PCS Christian Marangi
2025-05-13 18:16   ` Sean Anderson
2025-05-14 21:32   ` Rob Herring
2025-05-11 20:12 ` [net-next PATCH v4 08/11] net: phylink: add .pcs_link_down PCS OP Christian Marangi
2025-05-13 18:44   ` Sean Anderson
2025-05-11 20:12 ` [net-next PATCH v4 09/11] dt-bindings: net: pcs: Document support for Airoha Ethernet PCS Christian Marangi
2025-05-13 18:25   ` Sean Anderson
2025-05-11 20:12 ` [net-next PATCH v4 10/11] net: pcs: airoha: add PCS driver for Airoha SoC Christian Marangi
2025-05-13  4:04   ` kernel test robot
2025-05-11 20:12 ` [net-next PATCH v4 11/11] net: airoha: add phylink support for GDM2/3/4 Christian Marangi
2025-05-12  9:21   ` Lorenzo Bianconi
2025-05-12  9:27     ` Christian Marangi
2025-05-12 12:53       ` Lorenzo Bianconi
2025-05-13 18:50   ` Sean Anderson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7bb37bea-917c-4082-9eaa-063c4d97833b@linux.dev \
    --to=sean.anderson@linux.dev \
    --cc=andrew+netdev@lunn.ch \
    --cc=ansuelsmth@gmail.com \
    --cc=conor+dt@kernel.org \
    --cc=daniel@makrotopia.org \
    --cc=davem@davemloft.net \
    --cc=devicetree@vger.kernel.org \
    --cc=edumazet@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=justinstitt@google.com \
    --cc=krzk+dt@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mediatek@lists.infradead.org \
    --cc=linux@armlinux.org.uk \
    --cc=llvm@lists.linux.dev \
    --cc=lorenzo@kernel.org \
    --cc=morbo@google.com \
    --cc=nathan@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=nick.desaulniers+lkml@gmail.com \
    --cc=p.zabel@pengutronix.de \
    --cc=pabeni@redhat.com \
    --cc=robh@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox