* Re: [RFC] comparing the propesed implementation for standalone PCS drivers
From: Sean Anderson @ 2025-07-10 22:50 UTC
To: Simon Horman
Cc: Daniel Golle, netdev, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Maxime Chevallier, Russell King,
Vineeth Karumanchi, Heiner Kallweit, linux-kernel, Kory Maincent,
Christian Marangi, Lei Wei, Michal Simek, Radhey Shyam Pandey,
Robert Hancock, John Crispin, Felix Fietkau, Robert Marko
On 7/9/25 09:52, Simon Horman wrote:
> On Fri, Jun 13, 2025 at 12:06:23PM -0400, Sean Anderson wrote:
>> On 6/13/25 08:55, Daniel Golle wrote:
>> > Hi netdev folks,
>> >
>> > there are currently 2 competing implementations for the groundworks to
>> > support standalone PCS drivers.
>> >
>> > https://patchwork.kernel.org/project/netdevbpf/list/?series=970582&state=%2A&archive=both
>> >
>> > https://patchwork.kernel.org/project/netdevbpf/list/?series=961784&state=%2A&archive=both
>> >
>> > They both kinda stalled due to a lack of feedback in the past 2 months
>> > since they have been published.
>> >
>> > Merging the 2 implementations is not a viable option due to rather large
>> > architecture differences:
>> >
>> >                                 | Sean                  | Ansuel
>> > --------------------------------+-----------------------+-----------------------
>> > Architecture                    | Standalone subsystem  | Built into phylink
>> > Need OPs wrapped                | Yes                   | No
>> > resource lifecycle              | New subsystem         | phylink
>> > Supports hot remove             | Yes                   | Yes
>> > Supports hot add                | Yes (*)               | Yes
>> > provides generic select_pcs     | No                    | Yes
>> > support for #pcs-cell-cells     | No                    | Yes
>> > allows migrating legacy drivers | Yes                   | Yes
>> > comes with tested migrations    | Yes                   | No
>> >
>> > (*) requires MAC driver to also unload and subsequent re-probe for link
>> > to work again
>> >
>> > Obviously both architectures have pros and cons, here an incomplete and
>> > certainly biased list (please help completing it and discussing all
>> > details):
>> >
>> > Standalone Subsystem (Sean)
>> >
>> > pros
>> > ====
>> > * phylink code (mostly) untouched
>> > * doesn't burden systems which don't use dedicated PCS drivers
>> > * series provides tested migrations for all Ethernet drivers currently
>> > using dedicated PCS drivers
>> >
>> > cons
>> > ====
>> > * needs wrapper for each PCS OP
>> > * more complex resource management (malloc/free)
>> > * hot add and PCS showing up late (eg. due to deferred probe) are
>> > problematic
>> > * phylink is anyway the only user of that new subsystem
>>
>> I mean, if you want I can move the whole thing to live in phylink.c, but
>> that just enlarges the kernel if PCSs are not being used. The reverse
>> criticism can be made for Ansuel's series: most phylink users do not
>> have "dynamic" PCSs but the code is intimately integrated with phylink
>> anyway.
>
> At the risk of stating the obvious it seems to me that a key decision
> that needs to be made is whether a new subsystem is the correct direction.
>
> If I understand things correctly it seems that not creating a new subsystem
> is likely to lead to a simpler implementation, at least in the near term.
It's really more of an unusual PCS driver with some routines for
registering and looking up devices. I would like to note that Ansuel's
approach has those same registration and lookup functions.
> While doing so lends itself towards greater flexibility in terms of users,
> I'd suggest a cleaner abstraction layer, and possibly a smaller footprint
> (I assume space consumed by unused code) for cases where PCS is not used.
I think the greatest strength of my implementation is its clean
interface. The rest of phylink doesn't know or care whether the PCS is a
traditional one (tied to the lifetime of the netdev) or whether it is
dynamically looked up.
> On the last point, I do wonder if there are other approaches to managing
> the footprint. And if so, that may tip the balance towards a new subsystem.
>
>
> Another way of framing this is: Say, hypothetically, Sean was to move his
> implementation into phylink.c. Then we might be able to have a clearer
> discussion of the merits of each implementation. Possibly driving towards
> common ground. But it seems hard to do so if we're unsure if there should
> be a new subsystem or not.
I really think it's just cosmetic. For example, in my implementation we have
/* pcs/core.c */
static void pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
			  struct phylink_link_state *state)
{
	struct pcs_wrapper *wrapper = pcs_to_wrapper(pcs);
	struct phylink_pcs *wrapped;

	guard(srcu)(&pcs_srcu);
	wrapped = srcu_dereference(wrapper->wrapped, &pcs_srcu);
	if (wrapped)
		wrapped->ops->pcs_get_state(wrapped, neg_mode, state);
	else
		state->link = 0;
}

/* phylink.c */
static void phylink_mac_pcs_get_state(struct phylink *pl,
				      struct phylink_link_state *state)
{
	struct phylink_pcs *pcs;

	/* ... snip ... */

	pcs = pl->pcs;
	if (pcs)
		pcs->ops->pcs_get_state(pcs, pl->pcs_neg_mode, state);
	else
		state->link = 0;
}
and that would turn into
/* phylink.c */
static void phylink_mac_pcs_get_state(struct phylink *pl,
				      struct phylink_link_state *state)
{
	struct pcs_wrapper *wrapper;
	struct phylink_pcs *pcs;

	/* ... snip ... */

	guard(srcu)(&pcs_srcu);
	if (pl->pcs && pl->pcs->ops == &pcs_wrapper_ops) {
		/* Special case: unwrap a dynamically-looked-up PCS */
		wrapper = pcs_to_wrapper(pl->pcs);
		pcs = srcu_dereference(wrapper->wrapped, &pcs_srcu);
	} else {
		pcs = pl->pcs;
	}

	if (pcs)
		pcs->ops->pcs_get_state(pcs, pl->pcs_neg_mode, state);
	else
		state->link = 0;
}
and TBH I like the former much better since we avoid special-casing the
wrapper stuff. We still have to do the wrapper stuff because the MAC
owns the PCS and we can't prevent it from passing phylink a stale PCS
pointer. Now, we could make phylink own the PCS, but that means going
with Ansuel's approach. And the main problem with phylink owning the PCS
is that it complicates lookup for existing MACs that need to accommodate
a variety of nonstandard ways of looking up a PCS for backwards
compatibility. The only real way to do it is something like
/* In mac_probe() or whatever */
scoped_guard(mutex, &pcs_remove_lock) {
	/* Just imagine some terrible contortions for compatibility here */
	struct phylink_pcs *pcs = pcs_get(dev, "my_pcs");

	if (IS_ERR(pcs))
		return PTR_ERR(pcs);

	list_add(&pcs->list, &config.pcs_list);
	/* phylink_create() returns a struct phylink * or an ERR_PTR */
	pl = phylink_create(&config, dev->fwnode, interface,
			    &mac_phylink_ops);
	if (IS_ERR(pl))
		return PTR_ERR(pl);
}
/* At this point the PCS could have already been removed */
but even then the MAC has no idea how to mux the correct PCS. If you
have more than one dynamically-looked-up PCS they can't be
differentiated because they are both opaque pointers that may point to
stale memory at any time.
This is why I favor a wrapper approach: we can allocate some memory
that's tied to the lifetime of the MAC rather than the lifetime of the
PCS. Then we don't have to worry about whether the PCS is still valid
and we can get on with our lives.
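The lifetime argument above can be illustrated with a small user-space
sketch. It is not code from either series: every type and function name
below (struct pcs, struct pcs_wrapper, wrapper_detach() and so on) is a
hypothetical stand-in, and a plain pointer is used where the kernel code
would need SRCU to guard against concurrent removal. The wrapper is owned
by the MAC, so the pointer the MAC hands out never goes stale; only the
inner pointer to the real PCS is cleared when that PCS goes away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's phylink types. */
struct link_state {
	bool link;
};

struct pcs;

struct pcs_ops {
	void (*get_state)(struct pcs *pcs, struct link_state *state);
};

struct pcs {
	const struct pcs_ops *ops;
};

/* Owned by the MAC, so it stays valid for as long as the MAC does. */
struct pcs_wrapper {
	struct pcs pcs;       /* what the MAC hands to the link layer */
	struct pcs *wrapped;  /* real PCS, or NULL once it has been removed */
};

static void wrapper_get_state(struct pcs *pcs, struct link_state *state)
{
	/* pcs is the first member, so the cast recovers the wrapper. */
	struct pcs_wrapper *w = (struct pcs_wrapper *)pcs;

	assert(state != NULL);
	if (w->wrapped)
		w->wrapped->ops->get_state(w->wrapped, state);
	else
		state->link = false;  /* PCS gone: report link down */
}

static const struct pcs_ops wrapper_ops = {
	.get_state = wrapper_get_state,
};

static void wrapper_init(struct pcs_wrapper *w, struct pcs *real)
{
	w->pcs.ops = &wrapper_ops;
	w->wrapped = real;
}

/* Called when the real PCS driver unbinds (no locking in this sketch). */
static void wrapper_detach(struct pcs_wrapper *w)
{
	w->wrapped = NULL;
}

/* A trivial "real" PCS that always reports link up. */
static void always_up_get_state(struct pcs *pcs, struct link_state *state)
{
	(void)pcs;
	state->link = true;
}

static const struct pcs_ops always_up_ops = {
	.get_state = always_up_get_state,
};
```

A caller only ever sees &w.pcs: before wrapper_detach() the call is
forwarded to the real PCS, and afterwards it reports link down instead of
touching freed memory.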
--Sean
>> > phylink-managed standalone PCS drivers (Ansuel)
>> >
>> > pros
>> > ====
>> > * trivial resource management
>>
>> Actually, I would say the resource management is much more complex and
>> difficult to follow due to being spread out over many different
>> functions.
>>
>> > * no wrappers needed
>> > * full support for hot-add and deferred probe
>> > * avoids code duplication by providing generic select_pcs
>> > implementation
>> > * supports devices which provide more than one PCS port per device
>> > ('#pcs-cell-cells')
>> >
>> > cons
>> > ====
>> > * inclusion in phylink means more (dead) code on platforms not using
>> > dedicated PCS
>> > * series does not provide migrations for existing drivers
>> > (but that can be done after)
>> > * probably a bit harder to review as one needs to know phylink very well
>> >
>> >
>> > It would be great if more people can take a look and help deciding the
>> > general direction to go.
>>
>> I also encourage netdev maintainers to have a look; Russell does not
>> seem to have the time to review either system.
>>
>> > There are many drivers awaiting merge which require such
>> > infrastructure (most are fine with either of the two), some for more
>> > than a year by now.
>>
>> This is the major thing. PCS drivers should have been supported from the
>> start of phylink, and the longer there is no solution the more legacy
>> code there is to migrate.
>
> This seems to be something we can all agree on :)
* Re: [RFC] comparing the propesed implementation for standalone PCS drivers
From: Christian Marangi (Ansuel) @ 2025-07-10 23:44 UTC
To: Simon Horman
Cc: Sean Anderson, Daniel Golle, netdev, Andrew Lunn,
David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Maxime Chevallier, Russell King, Vineeth Karumanchi,
Heiner Kallweit, linux-kernel, Kory Maincent, Lei Wei,
Michal Simek, Radhey Shyam Pandey, Robert Hancock, John Crispin,
Felix Fietkau, Robert Marko
On Wed, Jul 9, 2025 at 15:52, Simon Horman
<horms@kernel.org> wrote:
>
> On Fri, Jun 13, 2025 at 12:06:23PM -0400, Sean Anderson wrote:
> > On 6/13/25 08:55, Daniel Golle wrote:
> > > Hi netdev folks,
> > >
> > > there are currently 2 competing implementations for the groundworks to
> > > support standalone PCS drivers.
> > >
> > > https://patchwork.kernel.org/project/netdevbpf/list/?series=970582&state=%2A&archive=both
> > >
> > > https://patchwork.kernel.org/project/netdevbpf/list/?series=961784&state=%2A&archive=both
> > >
> > > They both kinda stalled due to a lack of feedback in the past 2 months
> > > since they have been published.
> > >
> > > Merging the 2 implementations is not a viable option due to rather large
> > > architecture differences:
> > >
> > >                                 | Sean                  | Ansuel
> > > --------------------------------+-----------------------+-----------------------
> > > Architecture                    | Standalone subsystem  | Built into phylink
> > > Need OPs wrapped                | Yes                   | No
> > > resource lifecycle              | New subsystem         | phylink
> > > Supports hot remove             | Yes                   | Yes
> > > Supports hot add                | Yes (*)               | Yes
> > > provides generic select_pcs     | No                    | Yes
> > > support for #pcs-cell-cells     | No                    | Yes
> > > allows migrating legacy drivers | Yes                   | Yes
> > > comes with tested migrations    | Yes                   | No
> > >
> > > (*) requires MAC driver to also unload and subsequent re-probe for link
> > > to work again
> > >
> > > Obviously both architectures have pros and cons, here an incomplete and
> > > certainly biased list (please help completing it and discussing all
> > > details):
> > >
> > > Standalone Subsystem (Sean)
> > >
> > > pros
> > > ====
> > > * phylink code (mostly) untouched
> > > * doesn't burden systems which don't use dedicated PCS drivers
> > > * series provides tested migrations for all Ethernet drivers currently
> > > using dedicated PCS drivers
> > >
> > > cons
> > > ====
> > > * needs wrapper for each PCS OP
> > > * more complex resource management (malloc/free)
> > > * hot add and PCS showing up late (eg. due to deferred probe) are
> > > problematic
> > > * phylink is anyway the only user of that new subsystem
> >
> > I mean, if you want I can move the whole thing to live in phylink.c, but
> > that just enlarges the kernel if PCSs are not being used. The reverse
> > criticism can be made for Ansuel's series: most phylink users do not
> > > have "dynamic" PCSs but the code is intimately integrated with phylink
> > anyway.
>
> At the risk of stating the obvious it seems to me that a key decision
> that needs to be made is whether a new subsystem is the correct direction.
>
If you want to expand on it a bit, it's about a new subsystem + making
things more deterministic.
> If I understand things correctly it seems that not creating a new subsystem
> is likely to lead to a simpler implementation, at least in the near term.
> While doing so lends itself towards greater flexibility in terms of users,
> I'd suggest a cleaner abstraction layer, and possibly a smaller footprint
> (I assume space consumed by unused code) for cases where PCS is not used.
>
Funnily enough, almost all implementations have an attached PCS, whether
it's something very basic or something more advanced (this is normally
100% of cases when 10G is supported).
So cases where a PCS is not used are very few, and where it's not used
it's just an empty pointer and a bitmask of PHY interfaces.
> On the last point, I do wonder if there are other approaches to managing
> the footprint. And if so, that may tip the balance towards a new subsystem.
>
>
> Another way of framing this is: Say, hypothetically, Sean was to move his
> implementation into phylink.c. Then we might be able to have a clearer
> discussion of the merits of each implementation. Possibly driving towards
> common ground. But it seems hard to do so if we're unsure if there should
> be a new subsystem or not.
>
Honestly speaking, this case is very similar to other situations where
Russell had to intervene because the implementation had reached a critical
point (a recent example is EEE, where the only solution was to give phylink
more information so that it could make the correct decision, preventing MAC
drivers from doing strange, broken stuff).
I still think that PCS handling in phylink should be improved.
For example, there is a big problem where phylink doesn't know exactly
which interfaces are supported by the PCS and which by the MAC, because
MAC drivers implement the common pattern of ORing together the interfaces
supported by the MAC and by the different PCSs.
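To make the ORing problem concrete, here is a minimal user-space sketch.
It does not use the real phylink API: the enum, the struct mac_caps layout
and both helper functions are hypothetical names invented for
illustration. It shows that once the MAC driver merges everything into one
mask, the information about which component supports which interface is
gone, whereas separate per-PCS masks still allow that lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface modes; not the kernel's phy_interface_t. */
enum iface {
	IFACE_SGMII,
	IFACE_1000BASEX,
	IFACE_USXGMII,
	IFACE_10GBASER,
};

#define IFACE_BIT(i)	(UINT32_C(1) << (i))
#define NUM_PCS		2

struct mac_caps {
	uint32_t mac_ifaces;           /* modes the MAC itself can drive */
	uint32_t pcs_ifaces[NUM_PCS];  /* modes each attached PCS can drive */
};

/* The common pattern: one flat mask, ORed together by the MAC driver. */
static uint32_t merged_ifaces(const struct mac_caps *caps)
{
	uint32_t mask;
	size_t i;

	assert(caps != NULL);
	mask = caps->mac_ifaces;
	for (i = 0; i < NUM_PCS; i++)
		mask |= caps->pcs_ifaces[i];
	return mask;
}

/*
 * With the per-PCS masks kept separate, the core can work out which PCS
 * (if any) handles a mode. Returns the PCS index, or -1 if none does.
 */
static int pcs_for_iface(const struct mac_caps *caps, enum iface mode)
{
	size_t i;

	assert(caps != NULL);
	for (i = 0; i < NUM_PCS; i++)
		if (caps->pcs_ifaces[i] & IFACE_BIT(mode))
			return (int)i;
	return -1;
}
```

Given only the result of merged_ifaces(), there is no way to answer the
question pcs_for_iface() answers, which is the information loss described
above.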
I feel that even if the wrapper solution gets accepted, phylink requires a
big overhaul for PCS handling. (And Russell has more or less already
started it by filling in some conditions for when select_pcs fails on an
interface change.)
Things are getting complex enough that in some scenarios the PCS
might fail calibration or might "explode" after a while, and phylink
is currently not designed for that.
It's also worth considering that for a 1 gigabit connection something
may fall back from USXGMII to SGMII in this extreme case,
and I feel phylink should be able to handle that smoothly.
This is really just to give some context, hoping it gets some traction,
on why we really need to start fixing the problem and putting effort
into it. (My opinion is that it will only get worse; I'm scared to see
the complexity of things when 10G+ stuff reaches the consumer or
prosumer market.)
> > > phylink-managed standalone PCS drivers (Ansuel)
> > >
> > > pros
> > > ====
> > > * trivial resource management
> >
> > Actually, I would say the resource management is much more complex and
> > difficult to follow due to being spread out over many different
> > functions.
> >
> > > * no wrappers needed
> > > * full support for hot-add and deferred probe
> > > * avoids code duplication by providing generic select_pcs
> > > implementation
> > > * supports devices which provide more than one PCS port per device
> > > ('#pcs-cell-cells')
> > >
> > > cons
> > > ====
> > > * inclusion in phylink means more (dead) code on platforms not using
> > > dedicated PCS
> > > * series does not provide migrations for existing drivers
> > > (but that can be done after)
> > > * probably a bit harder to review as one needs to know phylink very well
> > >
> > >
> > > It would be great if more people can take a look and help deciding the
> > > general direction to go.
> >
> > I also encourage netdev maintainers to have a look; Russell does not
> > seem to have the time to review either system.
> >
> > > There are many drivers awaiting merge which require such
> > > infrastructure (most are fine with either of the two), some for more
> > > than a year by now.
> >
> > This is the major thing. PCS drivers should have been supported from the
> > start of phylink, and the longer there is no solution the more legacy
> > code there is to migrate.
>
> This seems to be something we can all agree on :)