From: Vladimir Oltean <olteanv@gmail.com>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: "Marek Behún" <kabel@kernel.org>,
netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>,
"Jakub Kicinski" <kuba@kernel.org>,
"David Miller" <davem@davemloft.net>,
"Rob Herring" <robh+dt@kernel.org>,
devicetree@vger.kernel.org
Subject: Re: [PATCH net-next 8/8] net: phy: marvell10g: select host interface configuration
Date: Thu, 18 Nov 2021 17:09:51 +0200 [thread overview]
Message-ID: <20211118150951.jzwl5jickilxbfhy@skbuf> (raw)
In-Reply-To: <YZZnkEn76a3Q0hAY@shell.armlinux.org.uk>
On Thu, Nov 18, 2021 at 02:47:44PM +0000, Russell King (Oracle) wrote:
> On Thu, Nov 18, 2021 at 04:20:39PM +0200, Vladimir Oltean wrote:
> > On Thu, Nov 18, 2021 at 01:22:18PM +0000, Russell King (Oracle) wrote:
> > > On Thu, Nov 18, 2021 at 02:03:34PM +0200, Vladimir Oltean wrote:
> > > > On Wed, Nov 17, 2021 at 11:50:50PM +0100, Marek Behún wrote:
> > > > > +static int mv3310_select_mactype(unsigned long *interfaces)
> > > > > +{
> > > > > + if (test_bit(PHY_INTERFACE_MODE_USXGMII, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_USXGMII;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_SGMII, interfaces) &&
> > > > > + test_bit(PHY_INTERFACE_MODE_10GBASER, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_SGMII, interfaces) &&
> > > > > + test_bit(PHY_INTERFACE_MODE_RXAUI, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_RXAUI;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_SGMII, interfaces) &&
> > > > > + test_bit(PHY_INTERFACE_MODE_XAUI, interfaces))
> > > > > + return MV_V2_3310_PORT_CTRL_MACTYPE_XAUI;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_10GBASER, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER_RATE_MATCH;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_RXAUI, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_RXAUI_RATE_MATCH;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_XAUI, interfaces))
> > > > > + return MV_V2_3310_PORT_CTRL_MACTYPE_XAUI_RATE_MATCH;
> > > > > + else if (test_bit(PHY_INTERFACE_MODE_SGMII, interfaces))
> > > > > + return MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER;
> > > > > + else
> > > > > + return -1;
> > > > > +}
> > > > > +
> > > >
> > > > I would like to understand this heuristic better. Both its purpose and
> > > > its implementation.
> > > >
> > > > It says:
> > > > (a) If the intersection between interface modes supported by the MAC and
> > > > the PHY contains USXGMII, then use USXGMII as a MACTYPE
> > > > (b) Otherwise, if the intersection contains both 10GBaseR and SGMII, then
> > > > use 10GBaseR as MACTYPE
> > > > (...)
> > > > (c) Otherwise, if the intersection contains just 10GBaseR (no SGMII), then
> > > > use 10GBaseR with rate matching as MACTYPE
> > > > (...)
> > > > (d) Otherwise, if the intersection contains just SGMII (no 10GBaseR), then
> > > > use 10GBaseR as MACTYPE (no rate matching).
> > >
> > > What is likely confusing you is a misinterpretation of the constant.
> > > MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER actually means the PHY will
> > > choose between 10GBASE-R, 5GBASE-R, 2500BASE-X, and SGMII depending
> > > on the speed negotiated by the media. In this setting, the PHY
> > > dictates which interface mode will be used.
> > >
> > > I could have named "MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER" as
> > > "MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER_5GBASER_2500BASEX_SGMII_AUTONEG_ON".
> > > Similar with "MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER_NO_SGMII_AN", which
> > > would be
> > > "MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER_5GBASER_2500BASEX_SGMII_AUTONEG_OFF".
> > > And "MV_V2_3310_PORT_CTRL_MACTYPE_XAUI" would be
> > > "MV_V2_3310_PORT_CTRL_MACTYPE_XAUI_5GBASER_2500BASEX_SGMII_AUTONEG_ON".
> > >
> > > Clearly using such long identifiers would have been rediculous,
> > > especially the second one at 74 characters.
> >
> > True, but at least there could be a comment above each definition.
> > There's no size limit to that.
> >
> > > > First of all, what is MACTYPE exactly? And what is the purpose of
> > > > changing it? What would happen if this configuration remained fixed, as
> > > > it were?
> > >
> > > The PHY defines the MAC interface mode depending on the MACTYPE
> > > setting selected and the results of the media side negotiation.
> > >
> > > I think the above answers your remaining questions.
> >
> > Ok, so going back to case (d). You said that the full name would be
> > MV_V2_33X0_PORT_CTRL_MACTYPE_10GBASER_5GBASER_2500BASEX_SGMII_AUTONEG_ON.
> > This means that when the only interface mode supported by the host would
> > be SGMII, the PHY's MACTYPE is still configured to use 2500basex,
> > 5gbaser, 10gbaser for the higher link speeds. Clearly this won't work.
> > But on the other hand, the phylink validate method will remove
> > 2500baseT, 5000baseT, 1000baseT from the advertising mask of the PHY, so
> > the system will never end up operating at those speeds, so it should be fine.
>
> I think you mean 10000baseT rather than 1000baseT. With that correction,
> you are then correct - with the media restricted to 1G or slower speeds,
> and the phy in MACTYPE mode 4 (aka 10GBASE-R as the fastest interface
> mode) it will permanently be talking SGMII to the host.
Yes, I missed a zero.
> > The reason why I'm looking at these patches is to see whether they would
> > bring something useful to Aquantia PHYs. These come with firmware on a
> > flash that is customized by Aquantia themselves based on the specifications
> > of a single board. These PHYs have an ability which is very similar to
> > what I'm seeing here, which is to select, for each negotiated link speed
> > on the media side, the SERDES protocol to use on the system side. This
> > is pre-programmed by the firmware, but could be fixed up by the
> > operating system if done carefully.
> >
> > The way Layerscape boards use Aquantia PHYs is to always select the
> > "rate matching" option and keep the SERDES protocol fixed, just
> > configure the mini MAC inside the PHY to emit PAUSE frames towards the
> > system to keep the data rate under control. We would be using these PHYs
> > with the generic C45 driver, which would be mostly enough except for
> > lack of PHY interrupts, because the firmware already configures
> > everything.
> >
> > But on the other hand it gets a bit tiring, especially for PHYs on riser
> > cards, to have to change firmware in order to test a different SERDES
> > protocol, so we were experimenting with some changes in the PHY driver
> > that would basically keep the firmware image fixed, and just fix up the
> > configuration it made, and do things like "use 2500base-x for the
> > 2500base-T speed, and sgmii for 1000base-T/100base-TX". The ability for
> > a PHY to work on a board where its firmware image wasn't specifically
> > designed for it comes in handy sometimes.
>
> You're going to get into problems with this on Layerscape, because
> reconfiguring the Serdes etc is something I've tried to highlight
> as being necessary to NXP since SolidRun started using LX2160A. I
> think there's some slow progress towards that, but it's so slow that
> I've basically given up caring about it on the Honeycomb/Clearfog CX
> boards now.
>
> All the SFP cages on my Honeycomb have been configured for the most
> useful mode to me - 1000BASE-X/SGMII, and I've given up caring about
> USXGMII/10GBASE-R on those ports.
Speaking of that, do you know of any SFP modules that would use USXGMII?
It doesn't appear to be listed in the spec sheet when looking for that.
> > I see that this patch set basically introduces the phydev->host_interfaces
> > bitmap which is an attempt to find the answer to that question. But when
> > will we know enough about phydev->host_interfaces in order to safely
> > make decisions in the PHY driver based on it? phylink sets it, phylib
> > does not.
>
> It won't be something phylib could set because phylib doesn't know
> the capabilities of its user - it's information that would need to be
> provided to phylib.
So you're saying it would be in phylib's best interest to not set it at
all, not even to a single bit corresponding to phydev->interface. So PHY
drivers could work out this way whether they should operate in backwards
compatibility mode or they could change MACTYPE at will.
> > And many Aquantia systems use the generic PHY driver, as mentioned.
> > Additionally, there are old device trees at play here, which only define
> > the initial SERDES protocol. Would we be changing the behavior for those,
> > in that we would be configuring the PHY to keep the SERDES protocol
> > fixed whereas it would have dynamically changed before?
>
> We have the same situation on Macchiatobin. The 88X3310 there defaults
> to MACTYPE mode 4, and we've supported this for years with DT describing
> the interface as 10GBASE-R - because we haven't actually cared very much
> what DT says up to this point for the 88X3310. As I said in my previous
> reply, the 88X3310 effectively dictates what the PHY interface mode will
> be, and that is communicated back through phylib to whoever is using
> phylib.
So what is the full backwards compatibility strategy with old DT blobs?
Is it in this patch set? I didn't notice it.
> > Another question is what to do if there are multiple ways of
> > establishing a system-side link. For example 1000 Mbps can be achieved
> > either through SGMII, or USXGMII with symbol replication, or 2500base-x
> > with flow control, or 10GBaseR with flow control. And I want to test
> > them all. What would I need to do to change the SERDES protocol from one
> > value to the other? Changing the phy-mode array in the device tree would
> > be one option, but that may not always be possible.
>
> First point to make here is that rate adaption at the PHY is really
> not well supported in Linux, and there is no way to know via phylib if
> a PHY is capable or not of rate adaption.
>
> Today, if you have a 10GBASE-R link between a PHY doing rate adaption
> and the "MAC", then what you will get from phylib is:
>
> phydev->interface = PHY_INTERFACE_MODE_10GBASER;
> phydev->speed = SPEED_1000; // result of media negotiation
> phydev->duplex = DUPLEX_FULL; // result of media negotiation
> phydev->pause = ...; // result of media negotiation
> phydev->asym_pause = ...; // result of media negotiation
>
> which will, for the majority of implementations, result in the MAC being
> forced to a 1G speed, possibly with or without pause enabled.
>
> Due to this, if phylink is being used, the parameters given to
> mac_link_up/pcs_link_up will be the result of the media negotiation, not
> what is required on the actual link.
>
> You mention "10GBaseR with flow control" but there is another
> possibility that exists in real hardware out there. "10GBaseR without
> flow control" and in that case, the MAC needs to pace its transmission
> for the media speed (which is a good reason why mac_link_up should be
> given the result of the media negotiation so it can do transmission
> pacing.)
>
> I have a follow-up to the response I gave to Sean Anderson on rate-
> adapting PHYs that I need to finish and send, and it would be better
> to have any discussion on this topic after I've sent that reply and
> follow-up to that reply.
Ok, how would the MAC pace itself to send at a lower data rate, if the
SERDES protocol is 10G and the PHY doesn't send it PAUSE frames back?
At least Layerscape systems can't do this AFAIK.
I can watch for updates on this on the other thread.
next prev parent reply other threads:[~2021-11-18 15:09 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-11-17 22:50 [PATCH net-next 0/8] Extend `phy-mode` to string array Marek Behún
2021-11-17 22:50 ` [PATCH net-next 1/8] dt-bindings: ethernet-controller: support multiple PHY connection types Marek Behún
2021-11-18 16:27 ` Andrew Lunn
2021-11-17 22:50 ` [PATCH net-next 2/8] net: Update documentation for *_get_phy_mode() functions Marek Behún
2021-11-18 16:28 ` Andrew Lunn
2021-11-17 22:50 ` [PATCH net-next 3/8] device property: add helper function for getting phy mode bitmap Marek Behún
2021-11-18 16:31 ` Andrew Lunn
2021-11-17 22:50 ` [PATCH net-next 4/8] net: phylink: update supported_interfaces with modes from fwnode Marek Behún
2021-11-18 9:05 ` Russell King (Oracle)
2021-11-18 16:33 ` Andrew Lunn
2021-11-18 17:08 ` Russell King (Oracle)
2021-11-18 17:33 ` Marek Behún
2021-11-18 17:39 ` Russell King (Oracle)
2021-11-18 20:38 ` Sean Anderson
2021-11-17 22:50 ` [PATCH net-next 5/8] net: phylink: pass supported PHY interface modes to phylib Marek Behún
2021-11-18 9:32 ` Russell King (Oracle)
2021-11-18 16:27 ` Andrew Lunn
2021-11-18 16:28 ` Russell King (Oracle)
2021-11-17 22:50 ` [PATCH net-next 6/8] net: phy: marvell10g: Use generic macro for supported interfaces Marek Behún
2021-11-18 9:33 ` Russell King (Oracle)
2021-11-18 16:27 ` Andrew Lunn
2021-11-17 22:50 ` [PATCH net-next 7/8] net: phy: marvell10g: Use tabs instead of spaces for indentation Marek Behún
2021-11-18 9:34 ` Russell King (Oracle)
2021-11-17 22:50 ` [PATCH net-next 8/8] net: phy: marvell10g: select host interface configuration Marek Behún
2021-11-18 9:41 ` Russell King (Oracle)
2021-11-18 13:46 ` Marek Behún
2021-11-18 14:25 ` Russell King (Oracle)
2021-11-18 12:03 ` Vladimir Oltean
2021-11-18 13:22 ` Russell King (Oracle)
2021-11-18 14:20 ` Vladimir Oltean
2021-11-18 14:47 ` Russell King (Oracle)
2021-11-18 15:09 ` Vladimir Oltean [this message]
2021-11-18 15:20 ` Marek Behún
2021-11-18 15:59 ` Russell King (Oracle)
2021-11-18 13:56 ` Marek Behún
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20211118150951.jzwl5jickilxbfhy@skbuf \
--to=olteanv@gmail.com \
--cc=andrew@lunn.ch \
--cc=davem@davemloft.net \
--cc=devicetree@vger.kernel.org \
--cc=kabel@kernel.org \
--cc=kuba@kernel.org \
--cc=linux@armlinux.org.uk \
--cc=netdev@vger.kernel.org \
--cc=robh+dt@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox