netdev.vger.kernel.org archive mirror
* Aquantia PHY in OCSGMII mode?
@ 2025-07-31 14:59 Alexander Wilhelm
  2025-07-31 15:14 ` Andrew Lunn
  2025-07-31 17:16 ` Vladimir Oltean
  0 siblings, 2 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-07-31 14:59 UTC
  To: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Russell King, netdev, linux-kernel

Hello devs,

I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
working with the Aquantia AQR115 PHY. The existing driver already supports the
AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
be non-standard.

* Is it possible to use this mode with the current driver?
* If yes, what would be the correct DTS entry?
* If not, I’d be willing to implement support. Could you suggest a good starting point?

Any hints or guidance would be greatly appreciated.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 14:59 Aquantia PHY in OCSGMII mode? Alexander Wilhelm
@ 2025-07-31 15:14 ` Andrew Lunn
  2025-07-31 16:02   ` Russell King (Oracle)
  2025-07-31 17:16 ` Vladimir Oltean
  1 sibling, 1 reply; 53+ messages in thread
From: Andrew Lunn @ 2025-07-31 15:14 UTC
  To: Alexander Wilhelm
  Cc: Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Russell King, netdev, linux-kernel

On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> Hello devs,
> 
> I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> working with the Aquantia AQR115 PHY. The existing driver already supports the
> AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> be non-standard.
> 
> * Is it possible to use this mode with the current driver?
> * If yes, what would be the correct DTS entry?
> * If not, I’d be willing to implement support. Could you suggest a good starting point?

If the media is using 2500BaseT, the host side generally needs to be
using 2500BaseX. There is code which mangles OCSGMII into
2500BaseX. You will need that for the AQR115.

You also need a MAC driver which says it supports 2500BaseX.  There is
signalling between the PHY and the MAC about how the host interface
should be configured, either SGMII for <= 1G or 2500BaseX for
2.5G.
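
As a minimal userspace sketch of that switching, assuming the PHY does not
use rate adaption: the host interface simply follows the negotiated media
speed. The enum and both function names are hypothetical and only model the
behaviour described above; they are not kernel API. The returned strings are
the device tree "phy-mode" spellings of the two interface modes.

```c
/* Hypothetical stand-ins for the kernel's PHY_INTERFACE_MODE_* values. */
enum host_if {
	HOST_IF_UNKNOWN,
	HOST_IF_SGMII,
	HOST_IF_2500BASEX,
};

/*
 * Pick the host-side interface for a negotiated media speed in Mbps,
 * assuming the host interface is switched per speed (no rate adaption):
 * SGMII up to 1G, 2500BASE-X for 2.5G.
 */
static enum host_if host_if_for_speed(int speed_mbps)
{
	switch (speed_mbps) {
	case 10:
	case 100:
	case 1000:
		return HOST_IF_SGMII;
	case 2500:
		return HOST_IF_2500BASEX;
	default:
		return HOST_IF_UNKNOWN;
	}
}

/* Device tree "phy-mode" spelling for each modelled interface. */
static const char *host_if_name(enum host_if mode)
{
	switch (mode) {
	case HOST_IF_SGMII:
		return "sgmii";
	case HOST_IF_2500BASEX:
		return "2500base-x";
	default:
		return "unknown";
	}
}
```

Under these assumptions a 2.5G media link selects "2500base-x" and anything
at 1G or below selects "sgmii".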

Just watch out for the hardware being broken, e.g:

static int aqr105_get_features(struct phy_device *phydev)
{
        int ret;

        /* Normal feature discovery */
        ret = genphy_c45_pma_read_abilities(phydev);
        if (ret)
                return ret;

        /* The AQR105 PHY misses to indicate the 2.5G and 5G modes, so add them
         * here
         */
        linkmode_set_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
                         phydev->supported);
        linkmode_set_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
                         phydev->supported);

The AQR115 might support 2.5G, but does it actually announce it
supports 2.5G?

	 Andrew


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 15:14 ` Andrew Lunn
@ 2025-07-31 16:02   ` Russell King (Oracle)
  2025-08-01  5:44     ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-07-31 16:02 UTC
  To: Andrew Lunn
  Cc: Alexander Wilhelm, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 05:14:28PM +0200, Andrew Lunn wrote:
> On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> > Hello devs,
> > 
> > I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> > working with the Aquantia AQR115 PHY. The existing driver already supports the
> > AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> > goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> > be non-standard.
> > 
> > * Is it possible to use this mode with the current driver?
> > * If yes, what would be the correct DTS entry?
> > * If not, I’d be willing to implement support. Could you suggest a good starting point?
> 
> If the media is using 2500BaseT, the host side generally needs to be
> using 2500BaseX. There is code which mangles OCSGMII into
> 2500BaseX. You will need that for the AQR115.
> 
> You also need a MAC driver which says it supports 2500BaseX.  There is
> signalling between the PHY and the MAC about how the host interface
> should be configured, either SGMII for <= 1G or 2500BaseX for
> 2.5G.

Not necessarily - if the PHY is configured for rate adaption, then it
will stay at 2500Base-X and issue pause frames to the MAC driver to
pace it appropriately.

Given that it _may_ use rate adaption, I would recommend that the MAC
driver uses phylink to get all the implementation correct for that
(one then just needs the MAC driver to do exactly what phylink tells
it to do, no playing any silly games).
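
The difference rate adaption makes can be sketched in a few lines of
userspace C. This is only a model under stated assumptions: struct
link_model and both helpers are hypothetical names, not phylink API. With
rate adaption the MAC stays at the host interface speed and has to honour
pause frames from the PHY; without it the MAC follows the media speed.

```c
#include <stdbool.h>

/* Hypothetical model of a resolved link; not a kernel structure. */
struct link_model {
	int media_speed;	/* speed negotiated on the copper side, Mbps */
	int host_speed;		/* fixed speed of the host interface, Mbps */
	bool rate_matching;	/* PHY paces the MAC with pause frames */
};

/* Speed the MAC has to be configured for. */
static int mac_speed(const struct link_model *link)
{
	return link->rate_matching ? link->host_speed : link->media_speed;
}

/*
 * Pacing is only needed when the media is slower than the host
 * interface, and it only works if the MAC obeys received pause frames.
 */
static bool mac_must_honour_rx_pause(const struct link_model *link)
{
	return link->rate_matching && link->media_speed < link->host_speed;
}
```

For a 1G media link behind a 2500Base-X host interface, this model keeps
the MAC at 2500 Mbps with pause reception required when rate adaption is
on, and drops the MAC to 1000 Mbps when it is off.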

> Just watch out for the hardware being broken, e.g:
> 
> static int aqr105_get_features(struct phy_device *phydev)
> {
>         int ret;
> 
>         /* Normal feature discovery */
>         ret = genphy_c45_pma_read_abilities(phydev);
>         if (ret)
>                 return ret;
> 
>         /* The AQR105 PHY misses to indicate the 2.5G and 5G modes, so add them
>          * here
>          */
>         linkmode_set_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
>                          phydev->supported);
>         linkmode_set_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
>                          phydev->supported);
> 
> The AQR115 might support 2.5G, but does it actually announce it
> supports 2.5G?

I believe it is capable of advertising 2500BASE-T (otherwise it would
be pretty silly to set the bit in the supported mask.) However, given
that this is a firmware driven PHY, it likely depends on the firmware
build.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 14:59 Aquantia PHY in OCSGMII mode? Alexander Wilhelm
  2025-07-31 15:14 ` Andrew Lunn
@ 2025-07-31 17:16 ` Vladimir Oltean
  2025-07-31 19:26   ` Russell King (Oracle)
  2025-08-01  5:53   ` Alexander Wilhelm
  1 sibling, 2 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-07-31 17:16 UTC
  To: Alexander Wilhelm
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Russell King, netdev, linux-kernel

Hi Alexander,

On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> Hello devs,
> 
> I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> working with the Aquantia AQR115 PHY. The existing driver already supports the
> AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> be non-standard.
> 
> * Is it possible to use this mode with the current driver?
> * If yes, what would be the correct DTS entry?
> * If not, I’d be willing to implement support. Could you suggest a good starting point?
> 
> Any hints or guidance would be greatly appreciated.
> 
> 
> Best regards
> Alexander Wilhelm
> 

In addition to what Andrew and Russell said:

The Aquantia PHY driver is a bit unlike other PHY drivers, in that it
prefers not to change the hardware configuration, and work with the
provisioning of the firmware.

Do you know that the PHY firmware was built for OCSGMII, or do you just
intend to use OCSGMII knowing that the hardware capability is there?
Because the driver reads the VEND1_GLOBAL_CFG registers in
aqr107_fill_interface_modes(). These registers tell Linux what host
interface mode to use for each negotiated link speed on the media side.

If you haven't already,

[ and I guess you haven't, because you can find there this translation
  which clearly shows that OCSGMII corresponds to what Linux treats as
  2500base-x:

		case VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII:
			interface = PHY_INTERFACE_MODE_2500BASEX;
			break;

]

then you can instrument this function and see what host interface mode
it detects as configured for VEND1_GLOBAL_CFG_2_5G.
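
A minimal userspace sketch of that decoding step follows. It is not the
driver's code: the helper name and the mask are assumptions (the SERDES mode
is assumed to occupy bits 2:0 of the register), and only the two modes
relevant to this thread are modelled. The values 3 and 4 are those of the
driver's VEND1_GLOBAL_CFG_SERDES_MODE_SGMII and _OCSGMII defines as quoted
elsewhere in this thread.

```c
#include <stdint.h>

#define VEND1_GLOBAL_CFG_SERDES_MODE_SGMII	3
#define VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII	4

/* Assumption: the SERDES mode field sits in bits 2:0 of the register. */
#define CFG_SERDES_MODE_MASK	0x7u

/*
 * Translate a raw VEND1_GLOBAL_CFG_* register value into the device
 * tree "phy-mode" string of the host interface it selects.
 */
static const char *serdes_mode_to_phy_mode(uint16_t cfg)
{
	switch (cfg & CFG_SERDES_MODE_MASK) {
	case VEND1_GLOBAL_CFG_SERDES_MODE_SGMII:
		return "sgmii";
	case VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII:
		return "2500base-x";
	default:
		return "unknown";	/* other modes are not modelled here */
	}
}
```

Printing the result of such a helper for the value read from
VEND1_GLOBAL_CFG_2_5G is one way to instrument the function as suggested.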


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 17:16 ` Vladimir Oltean
@ 2025-07-31 19:26   ` Russell King (Oracle)
  2025-08-01  5:50     ` Alexander Wilhelm
                       ` (2 more replies)
  2025-08-01  5:53   ` Alexander Wilhelm
  1 sibling, 3 replies; 53+ messages in thread
From: Russell King (Oracle) @ 2025-07-31 19:26 UTC
  To: Vladimir Oltean
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 08:16:42PM +0300, Vladimir Oltean wrote:
> Hi Alexander,
> 
> On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> > Hello devs,
> > 
> > I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> > working with the Aquantia AQR115 PHY. The existing driver already supports the
> > AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> > goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> > be non-standard.
> > 
> > * Is it possible to use this mode with the current driver?
> > * If yes, what would be the correct DTS entry?
> > * If not, I’d be willing to implement support. Could you suggest a good starting point?
> > 
> > Any hints or guidance would be greatly appreciated.
> > 
> > 
> > Best regards
> > Alexander Wilhelm
> > 
> 
> In addition to what Andrew and Russell said:
> 
> The Aquantia PHY driver is a bit unlike other PHY drivers, in that it
> prefers not to change the hardware configuration, and work with the
> provisioning of the firmware.

I'll state here that this is a design decision of the PHY driver.
It is possible to reconfigure the PHY (I have code in the PHY
driver to do it, so I can test the module on the Armada 388 based
Clearfog platform).

Essentially, in aqr107_fill_interface_modes() I do this:

+       phy_set_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1, MDIO_CTRL1_LPOWER);
+       mdelay(10);
+       phy_write_mmd(phydev, MDIO_MMD_VEND1, 0x31a, 2);
+       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_10M,
+                     VEND1_GLOBAL_CFG_SGMII_AN |
+                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
+       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_100M,
+                     VEND1_GLOBAL_CFG_SGMII_AN |
+                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
+       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_1G,
+                     VEND1_GLOBAL_CFG_SGMII_AN |
+                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
+       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_2_5G,
+                     VEND1_GLOBAL_CFG_SGMII_AN |
+                     VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII);
+       phy_clear_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1,
+                          MDIO_CTRL1_LPOWER);

with:

 #define VEND1_GLOBAL_CFG_SERDES_MODE_XFI       0
 #define VEND1_GLOBAL_CFG_SERDES_MODE_SGMII     3
 #define VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII   4
+#define VEND1_GLOBAL_CFG_SERDES_MODE_LOW_POWER 5
 #define VEND1_GLOBAL_CFG_SERDES_MODE_XFI5G     6
+#define VEND1_GLOBAL_CFG_SERDES_MODE_XFI20G    7
+#define VEND1_GLOBAL_CFG_SGMII_AN              BIT(3)
+#define VEND1_GLOBAL_CFG_SERDES_SILENT         BIT(6)

and this works. So... we could actually reconfigure the PHY independent
of what was programmed into the firmware.
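
To make the written values concrete, here is a small userspace sketch that
builds the same per-speed table from the defines quoted above. The struct,
table and lookup helper are hypothetical names introduced for illustration;
only the three VEND1_GLOBAL_CFG_* values come from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define BIT(n)	(1u << (n))

/* Field values as quoted in the patch above. */
#define VEND1_GLOBAL_CFG_SERDES_MODE_SGMII	3
#define VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII	4
#define VEND1_GLOBAL_CFG_SGMII_AN		BIT(3)

/* Hypothetical pairing of a media speed with its register value. */
struct cfg_entry {
	int speed_mbps;
	uint16_t value;
};

static const struct cfg_entry cfg_table[] = {
	{   10, VEND1_GLOBAL_CFG_SGMII_AN | VEND1_GLOBAL_CFG_SERDES_MODE_SGMII },
	{  100, VEND1_GLOBAL_CFG_SGMII_AN | VEND1_GLOBAL_CFG_SERDES_MODE_SGMII },
	{ 1000, VEND1_GLOBAL_CFG_SGMII_AN | VEND1_GLOBAL_CFG_SERDES_MODE_SGMII },
	{ 2500, VEND1_GLOBAL_CFG_SGMII_AN | VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII },
};

/* Return the register value for a media speed, or -1 if not in the table. */
static int cfg_value_for_speed(int speed_mbps)
{
	size_t i;

	for (i = 0; i < sizeof(cfg_table) / sizeof(cfg_table[0]); i++)
		if (cfg_table[i].speed_mbps == speed_mbps)
			return cfg_table[i].value;

	return -1;
}
```

With these definitions the 10M, 100M and 1G entries evaluate to 0x000b
(BIT(3) | 3) and the 2.5G entry to 0x000c (BIT(3) | 4).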

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 16:02   ` Russell King (Oracle)
@ 2025-08-01  5:44     ` Alexander Wilhelm
  2025-08-04 14:53       ` Andrew Lunn
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01  5:44 UTC
  To: Russell King (Oracle)
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 05:02:49PM +0100, Russell King (Oracle) wrote:
> On Thu, Jul 31, 2025 at 05:14:28PM +0200, Andrew Lunn wrote:
> > On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> > > Hello devs,
> > > 
> > > I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> > > working with the Aquantia AQR115 PHY. The existing driver already supports the
> > > AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> > > goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> > > be non-standard.
> > > 
> > > * Is it possible to use this mode with the current driver?
> > > * If yes, what would be the correct DTS entry?
> > > * If not, I’d be willing to implement support. Could you suggest a good starting point?
> > 
> > > If the media is using 2500BaseT, the host side generally needs to be
> > > using 2500BaseX. There is code which mangles OCSGMII into
> > > 2500BaseX. You will need that for the AQR115.
> > > 
> > > You also need a MAC driver which says it supports 2500BaseX.  There is
> > > signalling between the PHY and the MAC about how the host interface
> > > should be configured, either SGMII for <= 1G or 2500BaseX for
> > > 2.5G.
> 
> Not necessarily - if the PHY is configured for rate adaption, then it
> will stay at 2500Base-X and issue pause frames to the MAC driver to
> pace it appropriately.

Thanks a lot for the support. Rate adaption, which Aquantia calls AQrate, is
exactly what I want to use. The host interface runs in overclocked SGMII mode
and the PHY somehow limits the pace of the traffic coming from the MAC.

> 
> Given that it _may_ use rate adaption, I would recommend that the MAC
> driver uses phylink to get all the implementation correct for that
> (one then just needs the MAC driver to do exactly what phylink tells
> it to do, no playing any silly games).
> 
> > Just watch out for the hardware being broken, e.g:
> > 
> > static int aqr105_get_features(struct phy_device *phydev)
> > {
> >         int ret;
> > 
> >         /* Normal feature discovery */
> >         ret = genphy_c45_pma_read_abilities(phydev);
> >         if (ret)
> >                 return ret;
> > 
> >         /* The AQR105 PHY misses to indicate the 2.5G and 5G modes, so add them
> >          * here
> >          */
> >         linkmode_set_bit(ETHTOOL_LINK_MODE_5000baseT_Full_BIT,
> >                          phydev->supported);
> >         linkmode_set_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT,
> >                          phydev->supported);
> > 
> > The AQR115 might support 2.5G, but does it actually announce it
> > supports 2.5G?
> 
> I believe it is capable of advertising 2500BASE-T (otherwise it would
> be pretty silly to set the bit in the supported mask.) However, given
> that this is a firmware driven PHY, it likely depends on the firmware
> build.

I don't see any firmware problems. I have one of the latest builds, and from
what I understand, the firmware consists of a base image plus a provisioning
table. This table is only a kind of pre-configuration, which means I can
override the entire PHY configuration to suit my needs.

By the way, I already have a 2.5G link in U-Boot, but did not manage to set
lower speeds. Now I am trying to achieve at least the same under Linux.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 19:26   ` Russell King (Oracle)
@ 2025-08-01  5:50     ` Alexander Wilhelm
  2025-08-01 11:01     ` Vladimir Oltean
  2025-08-01 11:13     ` Vladimir Oltean
  2 siblings, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01  5:50 UTC
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> On Thu, Jul 31, 2025 at 08:16:42PM +0300, Vladimir Oltean wrote:
> > Hi Alexander,
> > 
> > On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> > > Hello devs,
> > > 
> > > I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> > > working with the Aquantia AQR115 PHY. The existing driver already supports the
> > > AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> > > goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> > > be non-standard.
> > > 
> > > * Is it possible to use this mode with the current driver?
> > > * If yes, what would be the correct DTS entry?
> > > * If not, I’d be willing to implement support. Could you suggest a good starting point?
> > > 
> > > Any hints or guidance would be greatly appreciated.
> > > 
> > > 
> > > Best regards
> > > Alexander Wilhelm
> > > 
> > 
> > In addition to what Andrew and Russell said:
> > 
> > The Aquantia PHY driver is a bit unlike other PHY drivers, in that it
> > prefers not to change the hardware configuration, and work with the
> > provisioning of the firmware.
> 
> I'll state here that this is a design decision of the PHY driver.
> It is possible to reconfigure the PHY (I have code in the PHY
> driver to do it, so I can test the module on the Armada 388 based
> Clearfog platform).
> 
> Essentially, in aqr107_fill_interface_modes() I do this:
> 
> +       phy_set_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1, MDIO_CTRL1_LPOWER);
> +       mdelay(10);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, 0x31a, 2);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_10M,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_100M,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_1G,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_2_5G,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII);
> +       phy_clear_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1,
> +                          MDIO_CTRL1_LPOWER);
> 
> with:
> 
>  #define VEND1_GLOBAL_CFG_SERDES_MODE_XFI       0
>  #define VEND1_GLOBAL_CFG_SERDES_MODE_SGMII     3
>  #define VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII   4
> +#define VEND1_GLOBAL_CFG_SERDES_MODE_LOW_POWER 5
>  #define VEND1_GLOBAL_CFG_SERDES_MODE_XFI5G     6
> +#define VEND1_GLOBAL_CFG_SERDES_MODE_XFI20G    7
> +#define VEND1_GLOBAL_CFG_SGMII_AN              BIT(3)
> +#define VEND1_GLOBAL_CFG_SERDES_SILENT         BIT(6)
> 
> and this works. So... we could actually reconfigure the PHY independent
> of what was programmed into the firmware.

Thanks, that is a good idea. I'll check how the firmware is configured and
override the PHY configuration to suit my needs.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 17:16 ` Vladimir Oltean
  2025-07-31 19:26   ` Russell King (Oracle)
@ 2025-08-01  5:53   ` Alexander Wilhelm
  1 sibling, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01  5:53 UTC
  To: Vladimir Oltean
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Russell King, netdev, linux-kernel

On Thu, Jul 31, 2025 at 08:16:42PM +0300, Vladimir Oltean wrote:
> Hi Alexander,
> 
> On Thu, Jul 31, 2025 at 04:59:09PM +0200, Alexander Wilhelm wrote:
> > Hello devs,
> > 
> > I'm fairly new to Ethernet PHY drivers and would appreciate your help. I'm
> > working with the Aquantia AQR115 PHY. The existing driver already supports the
> > AQR115C, so I reused that code for the AQR115, assuming minimal differences. My
> > goal is to enable 2.5G link speed. The PHY supports OCSGMII mode, which seems to
> > be non-standard.
> > 
> > * Is it possible to use this mode with the current driver?
> > * If yes, what would be the correct DTS entry?
> > * If not, I’d be willing to implement support. Could you suggest a good starting point?
> > 
> > Any hints or guidance would be greatly appreciated.
> > 
> > 
> > Best regards
> > Alexander Wilhelm
> > 
> 
> In addition to what Andrew and Russell said:
> 
> The Aquantia PHY driver is a bit unlike other PHY drivers, in that it
> prefers not to change the hardware configuration, and work with the
> provisioning of the firmware.
> 
> Do you know that the PHY firmware was built for OCSGMII, or do you just
> intend to use OCSGMII knowing that the hardware capability is there?
> Because the driver reads the VEND1_GLOBAL_CFG registers in
> aqr107_fill_interface_modes(). These registers tell Linux what host
> interface mode to use for each negotiated link speed on the media side.
> 
> If you haven't already,
> 
> [ and I guess you haven't, because you can find there this translation
>   which clearly shows that OCSGMII corresponds to what Linux treats as
>   2500base-x:
> 
> 		case VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII:
> 			interface = PHY_INTERFACE_MODE_2500BASEX;
> 			break;
> 
> ]
> 
> then you can instrument this function and see what host interface mode
> it detects as configured for VEND1_GLOBAL_CFG_2_5G.

Thank you, Vladimir. I had already seen the function in the source code, but
wasn't really sure how the DTS needs to be configured. Now I see that 2500BASEX
should be used. I'll check how the PHY registers are configured and look
further into what the MAC driver is doing.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 19:26   ` Russell King (Oracle)
  2025-08-01  5:50     ` Alexander Wilhelm
@ 2025-08-01 11:01     ` Vladimir Oltean
  2025-08-01 11:54       ` Alexander Wilhelm
  2025-08-01 11:13     ` Vladimir Oltean
  2 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-01 11:01 UTC
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> and this works. So... we could actually reconfigure the PHY independent
> of what was programmed into the firmware.

It does work indeed, the trouble will be adding this code to the common
mainline kernel driver and then watching various boards break after their
known-good firmware provisioning was overwritten, from a source of unknown
applicability to their system.

Also, there are some registers which cannot be modified over MDIO, like
those involved in reconfiguring the AQR412C from 4x single-port USXGMII
lanes to 1x quad-port 10G-QXGMII lane (or "MUSX", as Aquantia call this).
I am actually investigating this currently - trying to find a way for
Linux to distinguish "MUSX" from USXGMII, in order to upstream a first
user of PHY_INTERFACE_MODE_10G_QXGMII.


* Re: Aquantia PHY in OCSGMII mode?
  2025-07-31 19:26   ` Russell King (Oracle)
  2025-08-01  5:50     ` Alexander Wilhelm
  2025-08-01 11:01     ` Vladimir Oltean
@ 2025-08-01 11:13     ` Vladimir Oltean
  2 siblings, 0 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-01 11:13 UTC
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> Essentially, in aqr107_fill_interface_modes() I do this:
> 
> +       phy_set_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1, MDIO_CTRL1_LPOWER);
> +       mdelay(10);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, 0x31a, 2);

By the way, you can add:
#define VEND1_GLOBAL_STARTUP_RATE		0x031a
#define VEND1_GLOBAL_STARTUP_RATE_1G		2

according to:
https://github.com/nxp-qoriq/linux/blob/lf-6.12.20-2.0.0/drivers/net/phy/aquantia/aquantia.h#L45-L54

> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_10M,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_100M,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_1G,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_SGMII);
> +       phy_write_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_CFG_2_5G,
> +                     VEND1_GLOBAL_CFG_SGMII_AN |
> +                     VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII);
> +       phy_clear_bits_mmd(phydev, MDIO_MMD_VEND1, MDIO_CTRL1,
> +                          MDIO_CTRL1_LPOWER);


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 11:01     ` Vladimir Oltean
@ 2025-08-01 11:54       ` Alexander Wilhelm
  2025-08-01 11:58         ` Russell King (Oracle)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01 11:54 UTC
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Fri, Aug 01, 2025 at 02:01:06PM +0300, Vladimir Oltean wrote:
> On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> > and this works. So... we could actually reconfigure the PHY independent
> > of what was programmed into the firmware.
> 
> It does work indeed, the trouble will be adding this code to the common
> mainline kernel driver and then watching various boards break after their
> known-good firmware provisioning was overwritten, from a source of unknown
> applicability to their system.

You're right. I've now selected a firmware that uses a different provisioning
table, which already configures the PHY for 2500BASE-X with Flow Control.
According to the documentation, it should support all modes: 10M, 100M, 1G, and
2.5G.

It seems the issue lies with the MAC, as it doesn't appear to handle the
configured PHY_INTERFACE_MODE_2500BASEX correctly. I'm currently investigating
this further.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 11:54       ` Alexander Wilhelm
@ 2025-08-01 11:58         ` Russell King (Oracle)
  2025-08-01 12:06           ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-01 11:58 UTC
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 01:54:29PM +0200, Alexander Wilhelm wrote:
> On Fri, Aug 01, 2025 at 02:01:06PM +0300, Vladimir Oltean wrote:
> > On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> > > and this works. So... we could actually reconfigure the PHY independent
> > > of what was programmed into the firmware.
> > 
> > It does work indeed, the trouble will be adding this code to the common
> > mainline kernel driver and then watching various boards break after their
> > known-good firmware provisioning was overwritten, from a source of unknown
> > applicability to their system.
> 
> You're right. I've now selected a firmware that uses a different provisioning
> table, which already configures the PHY for 2500BASE-X with Flow Control.
> According to the documentation, it should support all modes: 10M, 100M, 1G, and
> 2.5G.
> 
> It seems the issue lies with the MAC, as it doesn't appear to handle the
> configured PHY_INTERFACE_MODE_2500BASEX correctly. I'm currently investigating
> this further.

Which MAC driver, and is it using phylink?

phylib doesn't know about rate matching, so either you need to replicate
much of what phylink does, or make sure the MAC driver uses phylink.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 11:58         ` Russell King (Oracle)
@ 2025-08-01 12:06           ` Alexander Wilhelm
  2025-08-01 12:23             ` Russell King (Oracle)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01 12:06 UTC
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 12:58:23PM +0100, Russell King (Oracle) wrote:
> On Fri, Aug 01, 2025 at 01:54:29PM +0200, Alexander Wilhelm wrote:
> > On Fri, Aug 01, 2025 at 02:01:06PM +0300, Vladimir Oltean wrote:
> > > On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> > > > and this works. So... we could actually reconfigure the PHY independent
> > > > of what was programmed into the firmware.
> > > 
> > > It does work indeed, the trouble will be adding this code to the common
> > > mainline kernel driver and then watching various boards break after their
> > > known-good firmware provisioning was overwritten, from a source of unknown
> > > applicability to their system.
> > 
> > You're right. I've now selected a firmware that uses a different provisioning
> > table, which already configures the PHY for 2500BASE-X with Flow Control.
> > According to the documentation, it should support all modes: 10M, 100M, 1G, and
> > 2.5G.
> > 
> > It seems the issue lies with the MAC, as it doesn't appear to handle the
> > configured PHY_INTERFACE_MODE_2500BASEX correctly. I'm currently investigating
> > this further.
> 
> Which MAC driver, and is it using phylink?

If I understand it correctly, then yes. It is the Freescale FMAN driver, which
is called through phylink callbacks like the following:

    static const struct phylink_mac_ops memac_mac_ops = {
            .validate = memac_validate,
            .mac_select_pcs = memac_select_pcs,
            .mac_prepare = memac_prepare,
            .mac_config = memac_mac_config,
            .mac_link_up = memac_link_up,
            .mac_link_down = memac_link_down,
    };


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 12:06           ` Alexander Wilhelm
@ 2025-08-01 12:23             ` Russell King (Oracle)
  2025-08-01 12:36               ` Alexander Wilhelm
  2025-08-01 13:04               ` Vladimir Oltean
  0 siblings, 2 replies; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-01 12:23 UTC
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 02:06:16PM +0200, Alexander Wilhelm wrote:
> On Fri, Aug 01, 2025 at 12:58:23PM +0100, Russell King (Oracle) wrote:
> > On Fri, Aug 01, 2025 at 01:54:29PM +0200, Alexander Wilhelm wrote:
> > > Am Fri, Aug 01, 2025 at 02:01:06PM +0300 schrieb Vladimir Oltean:
> > > > On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> > > > > and this works. So... we could actually reconfigure the PHY independent
> > > > > of what was programmed into the firmware.
> > > > 
> > > > It does work indeed, the trouble will be adding this code to the common
> > > > mainline kernel driver and then watching various boards break after their
> > > > known-good firmware provisioning was overwritten, from a source of unknown
> > > > applicability to their system.
> > > 
> > > You're right. I've now selected a firmware that uses a different provisioning
> > > table, which already configures the PHY for 2500BASE-X with Flow Control.
> > > According to the documentation, it should support all modes: 10M, 100M, 1G, and
> > > 2.5G.
> > > 
> > > It seems the issue lies with the MAC, as it doesn't appear to handle the
> > > configured PHY_INTERFACE_MODE_2500BASEX correctly. I'm currently investigating
> > > this further.
> > 
> > Which MAC driver, and is it using phylink?
> 
> If I understand it correctly, then yes. It is a Freescale FMan driver that is
> called through phylink callbacks like the following:
> 
>     static const struct phylink_mac_ops memac_mac_ops = {
>             .validate = memac_validate,
>             .mac_select_pcs = memac_select_pcs,
>             .mac_prepare = memac_prepare,
>             .mac_config = memac_mac_config,
>             .mac_link_up = memac_link_up,
>             .mac_link_down = memac_link_down,
>     };

Thanks.

It looks like memac_select_pcs() and memac_prepare() fail to
handle 2500BASEX despite memac_initialization() suggesting the
SGMII PCS supports 2500BASEX.

It would also be good if the driver could use
pcs->supported_interfaces, which states which modes the PCS layer
supports.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 12:23             ` Russell King (Oracle)
@ 2025-08-01 12:36               ` Alexander Wilhelm
  2025-08-01 13:04               ` Vladimir Oltean
  1 sibling, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-01 12:36 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> On Fri, Aug 01, 2025 at 02:06:16PM +0200, Alexander Wilhelm wrote:
> > On Fri, Aug 01, 2025 at 12:58:23PM +0100, Russell King (Oracle) wrote:
> > > On Fri, Aug 01, 2025 at 01:54:29PM +0200, Alexander Wilhelm wrote:
> > > > On Fri, Aug 01, 2025 at 02:01:06PM +0300, Vladimir Oltean wrote:
> > > > > On Thu, Jul 31, 2025 at 08:26:43PM +0100, Russell King (Oracle) wrote:
> > > > > > and this works. So... we could actually reconfigure the PHY independent
> > > > > > of what was programmed into the firmware.
> > > > > 
> > > > > It does work indeed, the trouble will be adding this code to the common
> > > > > mainline kernel driver and then watching various boards break after their
> > > > > known-good firmware provisioning was overwritten, from a source of unknown
> > > > > applicability to their system.
> > > > 
> > > > You're right. I've now selected a firmware that uses a different provisioning
> > > > table, which already configures the PHY for 2500BASE-X with Flow Control.
> > > > According to the documentation, it should support all modes: 10M, 100M, 1G, and
> > > > 2.5G.
> > > > 
> > > > It seems the issue lies with the MAC, as it doesn't appear to handle the
> > > > configured PHY_INTERFACE_MODE_2500BASEX correctly. I'm currently investigating
> > > > this further.
> > > 
> > > Which MAC driver, and is it using phylink?
> > 
> > If I understand it correctly, then yes. It is a Freescale FMan driver that is
> > called through phylink callbacks like the following:
> > 
> >     static const struct phylink_mac_ops memac_mac_ops = {
> >             .validate = memac_validate,
> >             .mac_select_pcs = memac_select_pcs,
> >             .mac_prepare = memac_prepare,
> >             .mac_config = memac_mac_config,
> >             .mac_link_up = memac_link_up,
> >             .mac_link_down = memac_link_down,
> >     };
> 
> Thanks.
> 
> It looks like memac_select_pcs() and memac_prepare() fail to
> handle 2500BASEX despite memac_initialization() suggesting the
> SGMII PCS supports 2500BASEX.
> 
> It would also be good if the driver could use
> pcs->supported_interfaces, which states which modes the PCS layer
> supports.

Thank you for your detailed support, Russell. I believe I now have a good
understanding of the next steps. I'll respond later once I’ve made some progress
and have results to share.


Best regards
Alexander Wilhelm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 12:23             ` Russell King (Oracle)
  2025-08-01 12:36               ` Alexander Wilhelm
@ 2025-08-01 13:04               ` Vladimir Oltean
  2025-08-01 14:02                 ` Russell King (Oracle)
  2025-08-04  6:17                 ` Alexander Wilhelm
  1 sibling, 2 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-01 13:04 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> It looks like memac_select_pcs() and memac_prepare() fail to
> handle 2500BASEX despite memac_initialization() suggesting the
> SGMII PCS supports 2500BASEX.

Thanks for pointing this out, it seems to be a regression introduced by
commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").

If there are no other volunteers, I can offer to submit a patch if
Alexander confirms this fixes his setup.

> It would also be good if the driver could use
> pcs->supported_interfaces, which states which modes the PCS layer
> supports.

The current algorithm in lynx_pcs_create() is too optimistic and
advertises host interfaces which the PCS may not actually support.

static const phy_interface_t lynx_interfaces[] = {
	PHY_INTERFACE_MODE_SGMII,
	PHY_INTERFACE_MODE_QSGMII,
	PHY_INTERFACE_MODE_1000BASEX,
	PHY_INTERFACE_MODE_2500BASEX,
	PHY_INTERFACE_MODE_10GBASER,
	PHY_INTERFACE_MODE_USXGMII,
};

	for (i = 0; i < ARRAY_SIZE(lynx_interfaces); i++)
		__set_bit(lynx_interfaces[i], lynx->pcs.supported_interfaces);

I am concerned that if we add logic to the MAC driver which does:

		phy_interface_or(config->supported_interfaces,
				 config->supported_interfaces,
				 pcs->supported_interfaces);

then we depart from the physical reality of the board and may end up
accepting a host interface which we should have rejected.

There is downstream code which refines lynx_pcs_create() to this:

	/* In case we have access to the SerDes phy/lane, then ask the SerDes
	 * driver what interfaces are supported based on the current PLL
	 * configuration.
	 */
	for (int i = 0; i < ARRAY_SIZE(lynx_interfaces); i++) {
		phy_interface_t iface = lynx_interfaces[i];

		err = phy_validate(lynx->serdes[PRIMARY_LANE],
				   PHY_MODE_ETHERNET, iface, NULL);
		if (err)
			continue;

		__set_bit(iface, supported_interfaces);
	}

but the infrastructure (the SerDes driver) is currently lacking upstream.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 13:04               ` Vladimir Oltean
@ 2025-08-01 14:02                 ` Russell King (Oracle)
  2025-08-01 14:37                   ` Vladimir Oltean
  2025-08-04  6:17                 ` Alexander Wilhelm
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-01 14:02 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > It looks like memac_select_pcs() and memac_prepare() fail to
> > handle 2500BASEX despite memac_initialization() suggesting the
> > SGMII PCS supports 2500BASEX.
> 
> Thanks for pointing this out, it seems to be a regression introduced by
> commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> 
> If there are no other volunteers, I can offer to submit a patch if
> Alexander confirms this fixes his setup.
> 
> > It would also be good if the driver could use
> > pcs->supported_interfaces, which states which modes the PCS layer
> > supports.
> 
> The current algorithm in lynx_pcs_create() is too optimistic and
> advertises host interfaces which the PCS may not actually support.
> 
> static const phy_interface_t lynx_interfaces[] = {
> 	PHY_INTERFACE_MODE_SGMII,
> 	PHY_INTERFACE_MODE_QSGMII,
> 	PHY_INTERFACE_MODE_1000BASEX,
> 	PHY_INTERFACE_MODE_2500BASEX,
> 	PHY_INTERFACE_MODE_10GBASER,
> 	PHY_INTERFACE_MODE_USXGMII,
> };
> 
> 	for (i = 0; i < ARRAY_SIZE(lynx_interfaces); i++)
> 		__set_bit(lynx_interfaces[i], lynx->pcs.supported_interfaces);
> 
> I am concerned that if we add logic to the MAC driver which does:
> 
> 		phy_interface_or(config->supported_interfaces,
> 				 config->supported_interfaces,
> 				 pcs->supported_interfaces);
> 
> then we depart from the physical reality of the board and may end up
> accepting a host interface which we should have rejected.
> 
> There is downstream code which refines lynx_pcs_create() to this:
> 
> 	/* In case we have access to the SerDes phy/lane, then ask the SerDes
> 	 * driver what interfaces are supported based on the current PLL
> 	 * configuration.
> 	 */
> 	for (int i = 0; i < ARRAY_SIZE(lynx_interfaces); i++) {
> 		phy_interface_t iface = lynx_interfaces[i];
> 
> 		err = phy_validate(lynx->serdes[PRIMARY_LANE],
> 				   PHY_MODE_ETHERNET, iface, NULL);
> 		if (err)
> 			continue;
> 
> 		__set_bit(iface, supported_interfaces);
> 	}
> 
> but the infrastructure (the SerDes driver) is currently lacking upstream.

It looks like the SerDes driver is managed by the MAC (it validates
each mode against the serdes PHY driver's validate function, serdes
being mac_dev->fman_mac->serdes). If this SerDes doesn't exist, then
only mac_dev->phy_if is supported.

So, I don't think there's any need for the Lynx to reach out to the
SerDes in mainline as it currently stands.

As the SerDes also dictates which modes are supported and is managed
by fman, I'd suggest for mainline that the code needs to implement the
following pseudocode:

	config->supported_interfaces = mac_support |
				(pcs->supported_interfaces &
				serdes_supported_interfaces);

rather than the simple "or pcs->supported_interfaces into the
supported bitmap" that we can do in other drivers.
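
The pseudocode above can be sketched in plain C roughly as follows. This
is only an illustration: the IF_* bit names and memac_supported() are
made up for this example and stand in for the phy_interface_t bitmaps
and helpers the kernel really uses.

```c
/* Hypothetical one-bit-per-mode stand-ins for phy_interface_t values. */
enum {
	IF_SGMII     = 1u << 0,
	IF_QSGMII    = 1u << 1,
	IF_1000BASEX = 1u << 2,
	IF_2500BASEX = 1u << 3,
	IF_10GBASER  = 1u << 4,
	IF_USXGMII   = 1u << 5,
	IF_RGMII     = 1u << 6,
};

/*
 * Modes the MAC handles without a PCS are always accepted. Modes that
 * need the PCS are accepted only if the SerDes can also run them.
 */
static unsigned int memac_supported(unsigned int mac_support,
				    unsigned int pcs_support,
				    unsigned int serdes_support)
{
	return mac_support | (pcs_support & serdes_support);
}
```

With an optimistic PCS mask containing every serial mode and a SerDes
that only validates SGMII and 1000base-x, 2500base-x stays out of the
result, which is the point of the intersection.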

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 14:02                 ` Russell King (Oracle)
@ 2025-08-01 14:37                   ` Vladimir Oltean
  0 siblings, 0 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-01 14:37 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Fri, Aug 01, 2025 at 03:02:14PM +0100, Russell King (Oracle) wrote:
> It looks like the SerDes driver is managed by the MAC (it validates
> each mode against the serdes PHY driver's validate function, serdes
> being mac_dev->fman_mac->serdes). If this SerDes doesn't exist, then
> only mac_dev->phy_if is supported.
> 
> So, I don't think there's any need for the Lynx to reach out to the
> SerDes in mainline as it currently stands.
> 
> As the SerDes also dictates which modes are supported and is managed
> by fman, I'd suggest for mainline that the code needs to implement the
> following pseudocode:
> 
> 	config->supported_interfaces = mac_support |
> 				(pcs->supported_interfaces &
> 				serdes_supported_interfaces);
> 
> rather than the simple "or pcs->supported_interfaces into the
> supported bitmap" that we can do in other drivers.

The PCS needs to reach out to the SerDes lane in the more developed
downstream code due to the need to manage the lane (software-driven link
training according to 802.3 clause 72) for backplane link modes. The
AN/LT block is grouped together with the PCS, not with the MAC.

This design decision also makes it so that the other non-critical lane
management tasks (initialization, power management, determining supported
interface modes, reconfiguration upon major reconfig) are done only once
in a central place (the PCS driver) rather than replicated at the
following PCS consumer sites (MAC drivers), which all need these features,
preferably with a unified behavior:
- drivers/net/dsa/ocelot/seville_vsc9953.c
- drivers/net/dsa/ocelot/felix_vsc9959.c
- drivers/net/ethernet/freescale/dpaa2/dpaa2-mac.c
- drivers/net/ethernet/freescale/fman/fman_memac.c
- drivers/net/ethernet/freescale/enetc/enetc_pf.c

So, in downstream, yes, the MAC acquires the SerDes lane using
devm_of_phy_optional_get(), but it just passes it to the PCS and lets it
do the above.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01 13:04               ` Vladimir Oltean
  2025-08-01 14:02                 ` Russell King (Oracle)
@ 2025-08-04  6:17                 ` Alexander Wilhelm
  2025-08-04 10:01                   ` Vladimir Oltean
  1 sibling, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-04  6:17 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > It looks like memac_select_pcs() and memac_prepare() fail to
> > handle 2500BASEX despite memac_initialization() suggesting the
> > SGMII PCS supports 2500BASEX.
> 
> Thanks for pointing this out, it seems to be a regression introduced by
> commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> 
> If there are no other volunteers, I can offer to submit a patch if
> Alexander confirms this fixes his setup.

I'd be happy to help by applying the patch on my system and running some tests.
Please let me know if there are any specific steps or scenarios you'd like me to
focus on.

Best regards
Alexander Wilhelm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04  6:17                 ` Alexander Wilhelm
@ 2025-08-04 10:01                   ` Vladimir Oltean
  2025-08-04 13:01                     ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-04 10:01 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2019 bytes --]

On Mon, Aug 04, 2025 at 08:17:47AM +0200, Alexander Wilhelm wrote:
> On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> > On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > > It looks like memac_select_pcs() and memac_prepare() fail to
> > > handle 2500BASEX despite memac_initialization() suggesting the
> > > SGMII PCS supports 2500BASEX.
> > 
> > Thanks for pointing this out, it seems to be a regression introduced by
> > commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> > 
> > If there are no other volunteers, I can offer to submit a patch if
> > Alexander confirms this fixes his setup.
> 
> I'd be happy to help by applying the patch on my system and running some tests.
> Please let me know if there are any specific steps or scenarios you'd like me to
> focus on.
> 
> Best regards
> Alexander Wilhelm

Please find the attached patch.

You should only need something like below (assuming LS1046A fm1-mac9,
may be different in your case) in your board device tree:

	ethernet@f0000 { /* 10GEC1 */
		phy-handle = <&aqr115_phy>;
		phy-connection-type = "2500base-x";
	};

because the pcsphy-handle should have already been added by qoriq-fman3-0-10g-0.dtsi
or fsl-ls1046-post.dtsi.

For debugging, I recommend dumping /proc/device-tree/soc/fman@1a00000/ethernet@f0000/
(node may change for different MAC) to make sure that all the required
properties are there, i.e. phy-handle, phy-connection-type, pcsphy-handle.
Either inspect the device tree through the filesystem, or save it to a
text file using "dtc -I fs -O dts -o running.dts /proc/device-tree/".

I especially recommend instrumenting the live device tree, because I
don't know what bootloader version you are using, and whether it has
device tree fixups enabled (which mainly add status = "disabled" to
unused FMan ports, but also change the phy-connection-type in some cases).

managed = "in-band-status" is not needed and should not be added. The
PCS only supports LINK_INBAND_DISABLE for 2500base-x.

[-- Attachment #2: 0001-net-dpaa-fman_memac-complete-phylink-support-with-25.patch --]
[-- Type: text/x-diff, Size: 4032 bytes --]

From 2b4d48c93d317cccafc8128e33f18fab244d5bce Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Mon, 4 Aug 2025 11:15:26 +0300
Subject: [PATCH] net: dpaa: fman_memac: complete phylink support with
 2500base-x

The DPAA phylink conversion in the following commits partially developed
code for handling the 2500base-x host interface mode (called "2.5G
SGMII" in LS1043A/LS1046A reference manuals).

- 0fc83bd79589 ("net: fman: memac: Add serdes support")
- 5d93cfcf7360 ("net: dpaa: Convert to phylink")

In principle, having phy-interface-mode = "2500base-x" and a pcsphy-handle
(unnamed or with pcs-handle-names = "sgmii") in the MAC device tree node
results in PHY_INTERFACE_MODE_2500BASEX being set in phylink_config ::
supported_interfaces, but this isn't sufficient.

Because memac_select_pcs() returns no PCS for PHY_INTERFACE_MODE_2500BASEX,
the Lynx PCS code never engages. There's a chance the PCS driver doesn't
have any configuration to change for 2500base-x fixed-link (based on
bootloader pre-initialization), but there's an even higher chance that
this is not the case, and the PCS remains misconfigured.

More importantly, memac_if_mode() does not handle
PHY_INTERFACE_MODE_2500BASEX, and it should be telling the mEMAC to
configure itself in GMII mode (which is upclocked by the PCS). Currently
it prints a WARN_ON() and returns zero, aka IF_MODE_10G (incorrect).

The additional case statement in memac_prepare() for calling
phy_set_mode_ext() does not make any difference, because there is no
generic PHY driver for the Lynx 10G SerDes from LS1043A/LS1046A. But we
add it nonetheless, for consistency.

Regarding the question "did 2500base-x ever work with the FMan mEMAC
mainline code prior to the phylink conversion?" - the answer is more
nuanced.

For context, the previous phylib-based implementation was unable to
describe the fixed-link speed as 2500, because the software PHY
implementation is limited to 1G. However, improperly describing the link
as an sgmii fixed-link with speed = <1000> would have resulted in a
functional 2.5G speed, because there is no other difference than the
SerDes lane clock net frequency (3.125 GHz for 2500base-x) - all the
other higher-level settings are the same, and the SerDes lane frequency
is currently handled by the RCW.

But this hack cannot be extended towards a phylib PHY such as Aquantia
operating in OCSGMII, because the latter requires phy-mode = "2500base-x",
which the mEMAC driver did not support prior to the phylink conversion.
So I do not really consider this a regression, just completing support
for a missing feature.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/ethernet/freescale/fman/fman_memac.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/freescale/fman/fman_memac.c b/drivers/net/ethernet/freescale/fman/fman_memac.c
index 0291093f2e4e..b3e25234512e 100644
--- a/drivers/net/ethernet/freescale/fman/fman_memac.c
+++ b/drivers/net/ethernet/freescale/fman/fman_memac.c
@@ -649,6 +649,7 @@ static u32 memac_if_mode(phy_interface_t interface)
 		return IF_MODE_GMII | IF_MODE_RGMII;
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 	case PHY_INTERFACE_MODE_QSGMII:
 		return IF_MODE_GMII;
 	case PHY_INTERFACE_MODE_10GBASER:
@@ -667,6 +668,7 @@ static struct phylink_pcs *memac_select_pcs(struct phylink_config *config,
 	switch (iface) {
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 		return memac->sgmii_pcs;
 	case PHY_INTERFACE_MODE_QSGMII:
 		return memac->qsgmii_pcs;
@@ -685,6 +687,7 @@ static int memac_prepare(struct phylink_config *config, unsigned int mode,
 	switch (iface) {
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 	case PHY_INTERFACE_MODE_QSGMII:
 	case PHY_INTERFACE_MODE_10GBASER:
 		return phy_set_mode_ext(memac->serdes, PHY_MODE_ETHERNET,
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 10:01                   ` Vladimir Oltean
@ 2025-08-04 13:01                     ` Alexander Wilhelm
  2025-08-04 13:41                       ` Vladimir Oltean
  2025-08-04 14:22                       ` Russell King (Oracle)
  0 siblings, 2 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-04 13:01 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Mon, Aug 04, 2025 at 01:01:39PM +0300, Vladimir Oltean wrote:
> On Mon, Aug 04, 2025 at 08:17:47AM +0200, Alexander Wilhelm wrote:
> > On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> > > On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > > > It looks like memac_select_pcs() and memac_prepare() fail to
> > > > handle 2500BASEX despite memac_initialization() suggesting the
> > > > SGMII PCS supports 2500BASEX.
> > > 
> > > Thanks for pointing this out, it seems to be a regression introduced by
> > > commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> > > 
> > > If there are no other volunteers, I can offer to submit a patch if
> > > Alexander confirms this fixes his setup.
> > 
> > I'd be happy to help by applying the patch on my system and running some tests.
> > Please let me know if there are any specific steps or scenarios you'd like me to
> > focus on.
> > 
> > Best regards
> > Alexander Wilhelm
> 
> Please find the attached patch.
[...]

Hi Vladimir,

I’ve applied the patch you provided, but it doesn’t seem to fully resolve the
issue -- or perhaps I’ve misconfigured something. I’m encountering the following
error during initialization:

    mdio_bus 0x0000000ffe4e7000:00: AN not supported on 3.125GHz SerDes lane
    fsl_dpaa_mac ffe4e6000.ethernet eth0: pcs_config failed: -EOPNOTSUPP

The relevant code is located in `drivers/net/pcs/pcs-lynx.c`, within the
`lynx_pcs_config(...)` function. In the case of 2500BASE-X with in-band
autonegotiation enabled, the function logs an error and returns -EOPNOTSUPP.

From what I can tell, autonegotiation isn’t supported on a 3.125GHz SerDes lane
when using 2500BASE-X. What I’m unclear about is how this setup is supposed to
work in practice. My understanding is that on the host side, communication
always uses OCSGMII with flow control, allowing speed pacing via pause frames.
But what about the line side -- should it be configurable, or is it expected to
operate in a fixed mode?

> For debugging, I recommend dumping /proc/device-tree/soc/fman@1a00000/ethernet@f0000/
> (node may change for different MAC) to make sure that all the required
> properties are there, i.e. phy-handle, phy-connection-type, pcsphy-handle.
[...]

I decompiled the running device tree. Below is an excerpt from the resulting file:

    /dts-v1/;

    / {
        soc@ffe000000 {
            fman@400000 {
                ethernet@e6000 {
                    phy-handle = <0x0f>;
                    compatible = "fsl,fman-memac";
                    mac-address = [00 00 5b 05 a2 cb];
                    local-mac-address = [00 00 5b 05 a2 cb];
                    fsl,fman-ports = <0x0b 0x0c>;
                    ptp-timer = <0x0a>;
                    status = "okay";
                    pcsphy-handle = <0x0d 0x0e>;
                    reg = <0xe6000 0x1000>;
                    phy-connection-type = "2500base-x";
                    sleep = <0x10 0x10000000>;
                    pcs-handle-names = "sgmii", "qsgmii";
                    cell-index = <0x03>;
                };

                mdio@fd000 {
                    fsl,erratum-a009885;
                    compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio";
                    status = "okay";
                    #address-cells = <0x01>;
                    #size-cells = <0x00>;
                    reg = <0xfd000 0x1000>;

                    ethernet-phy@7 {
                        compatible = "ethernet-phy-id31c3.1c63";
                        phandle = <0x0f>;
                        reg = <0x07>;
                    };
                };
            };
        };
    };

Let me know how I can assist further -- do you need any additional information from my side?


Best regards
Alexander Wilhelm

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 13:01                     ` Alexander Wilhelm
@ 2025-08-04 13:41                       ` Vladimir Oltean
  2025-08-04 14:47                         ` Alexander Wilhelm
  2025-08-04 14:22                       ` Russell King (Oracle)
  1 sibling, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-04 13:41 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

Hi Alexander,

On Mon, Aug 04, 2025 at 03:01:44PM +0200, Alexander Wilhelm wrote:
> > Please find the attached patch.
> [...]
> 
> Hi Vladimir,
> 
> I’ve applied the patch you provided, but it doesn’t seem to fully resolve the
> issue -- or perhaps I’ve misconfigured something. I’m encountering the following
> error during initialization:
> 
>     mdio_bus 0x0000000ffe4e7000:00: AN not supported on 3.125GHz SerDes lane
>     fsl_dpaa_mac ffe4e6000.ethernet eth0: pcs_config failed: -EOPNOTSUPP
> 
> The relevant code is located in `drivers/net/pcs/pcs-lynx.c`, within the
> `lynx_pcs_config(...)` function. In the case of 2500BASE-X with in-band
> autonegotiation enabled, the function logs an error and returns -EOPNOTSUPP.

Once I saw this I immediately realized my mistake. More details at the end.

> From what I can tell, autonegotiation isn’t supported on a 3.125GHz SerDes lane
> when using 2500BASE-X. What I’m unclear about is how this setup is supposed to
> work in practice. My understanding is that on the host side, communication
> always uses OCSGMII with flow control, allowing speed pacing via pause frames.
> But what about the line side -- should it be configurable, or is it expected to
> operate in a fixed mode?

So there are two "auto-negotiation" processes involved.


 +-----+ internal +-----+          2500base-x       +-----------+  2.5GBase-T  +------------+
 | MAC |==========| PCS |===========================| Local PHY |==============| Remote PHY | ...
 +-----+ GMII not +-----+   in-band autonegotiation +-----------+   clause 28  +------------+
    represented in the                                           autonegotiation
        device tree                  (1)                              (2)

In the context of this error, it is about the in-band auto-negotiation (1).
This is what managed = "in-band-status" describes.

Actually "in-band autonegotiation" is more of an umbrella term whose
exact meaning depends on the phy-interface-mode, i.e. the host-side
interface of the phylib PHY.

For 2500base-x, it refers to the state machines from IEEE 802.3 clause 37,
through which the two ends of this link segment exchange their support
of pause frames and duplex, through special 8b/10b code words.

Actually there is another form of "in-band autonegotiation" commonly in
use, where Cisco took the 802.3 clause 37 mechanism but modified just
the content and purpose of the exchanged messages. This is notably used
for SGMII and USXGMII. In Cisco's reinterpretation, the in-band code
words sent by the PHY on side (1) contain info about the link speed and
duplex negotiated by this device on side (2). And the in-band code words
sent by the MAC-side PCS on side (1) just contain an ACK that it
received the message and is going to reconfigure itself to the line-side
speed.

Whereas the "normal" form of in-band auto-negotiation for 2500base-x is
used over optical links and is truly a symmetric capability exchange,
the Cisco modified form for SGMII/USXGMII only carries useful information
one way (from PHY to MAC), and nothing is really "negotiated". A generic
mechanism was made domain-specific.
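
To make the difference between the two word formats concrete, here is a
small decoding sketch. The bit masks are written from memory and are
meant to mirror the LPA_1000X* and LPA_SGMII_* definitions in the
kernel's mii.h; the struct and function names are invented for this
example, so treat it as an illustration rather than driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Clause 37 base page: a symmetric ability advertisement. */
#define C37_FULL_DUPLEX		0x0020
#define C37_HALF_DUPLEX		0x0040
#define C37_PAUSE		0x0080
#define C37_ASYM_PAUSE		0x0100

/* Cisco SGMII word as sent by the PHY: a one-way status report. */
#define SGMII_SPEED_MASK	0x0c00
#define SGMII_SPEED_100		0x0400
#define SGMII_SPEED_1000	0x0800
#define SGMII_FULL_DUPLEX	0x1000
#define SGMII_LINK		0x8000

struct c37_abilities {
	bool full_duplex, half_duplex, pause, asym_pause;
};

struct sgmii_status {
	bool link, full_duplex;
	int speed;	/* Mbps, 0 if the speed field is reserved */
};

static struct c37_abilities decode_c37(uint16_t word)
{
	struct c37_abilities a = {
		.full_duplex = word & C37_FULL_DUPLEX,
		.half_duplex = word & C37_HALF_DUPLEX,
		.pause = word & C37_PAUSE,
		.asym_pause = word & C37_ASYM_PAUSE,
	};
	return a;
}

static struct sgmii_status decode_sgmii(uint16_t word)
{
	struct sgmii_status s = {
		.link = word & SGMII_LINK,
		.full_duplex = word & SGMII_FULL_DUPLEX,
	};

	switch (word & SGMII_SPEED_MASK) {
	case 0:			s.speed = 10; break;
	case SGMII_SPEED_100:	s.speed = 100; break;
	case SGMII_SPEED_1000:	s.speed = 1000; break;
	default:		s.speed = 0; break;
	}
	return s;
}
```

The clause 37 word only says what each end is able to do, while the
SGMII word carries an already resolved speed and duplex from the line
side.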

The auto-negotiation process which you are concerned about, the one
which dictates the line-side link mode to be used, is process (2) in the
diagram above, and happens independently of process (1).

The exchange (1) of code words is what the Lynx PCS doesn't support when
operating at 2.5G. It has no implications on process (2). It just means
that the PCS doesn't support being told in-band (over the SerDes lanes)
what speed, duplex and flow control settings to use. But it only supports
2.5G for speed anyway, full duplex, and the flow control needs to be
resolved out of band (by reading PHY registers over MDIO) and written
into PCS registers.
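
As a rough sketch of what "resolved out of band" means for flow control:
once the line-side advertisements have been read from the PHY over MDIO,
the pause resolution is a small truth table. The code below follows the
usual 802.3 resolution rules in the same spirit as the kernel's
mii_resolve_flowctrl_fdx(); the names are local to this example, and the
bit positions are the clause 28 pause bits as I remember them.

```c
#include <stdint.h>

#define ADV_PAUSE_CAP	0x0400	/* symmetric pause advertised */
#define ADV_PAUSE_ASYM	0x0800	/* asymmetric pause advertised */

#define FLOW_TX		0x01	/* we may send pause frames */
#define FLOW_RX		0x02	/* we honour received pause frames */

/* Resolve full-duplex pause from local and link partner advertisements. */
static uint8_t resolve_pause(uint16_t local_adv, uint16_t partner_adv)
{
	uint8_t cap = 0;

	if (local_adv & partner_adv & ADV_PAUSE_CAP) {
		cap = FLOW_TX | FLOW_RX;
	} else if (local_adv & partner_adv & ADV_PAUSE_ASYM) {
		if (local_adv & ADV_PAUSE_CAP)
			cap = FLOW_RX;
		else if (partner_adv & ADV_PAUSE_CAP)
			cap = FLOW_TX;
	}

	return cap;
}
```

The result is what would then be programmed into the MAC/PCS side,
since the in-band channel cannot deliver it at 2.5G.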

The limitation is more relevant for a fibre optic link than for the
Aquantia PHY case. I'm not even sure whether Aquantia PHYs send in-band
code words over OCSGMII at all (I only tested in combination with the
Lynx PCS, which wouldn't see them anyway), and if they do, what format
they have.

> > For debugging, I recommend dumping /proc/device-tree/soc/fman@1a00000/ethernet@f0000/
> > (node may change for different MAC) to make sure that all the required
> > properties are there, i.e. phy-handle, phy-connection-type, pcsphy-handle.
> [...]
> 
> I decompiled the running device tree. Below is an excerpt from the resulting file:
> 
>     /dts-v1/;
> 
>     / {
>         soc@ffe000000 {
>             fman@400000 {
>                 ethernet@e6000 {
>                     phy-handle = <0x0f>;
>                     compatible = "fsl,fman-memac";
>                     mac-address = [00 00 5b 05 a2 cb];
>                     local-mac-address = [00 00 5b 05 a2 cb];
>                     fsl,fman-ports = <0x0b 0x0c>;
>                     ptp-timer = <0x0a>;
>                     status = "okay";
>                     pcsphy-handle = <0x0d 0x0e>;
>                     reg = <0xe6000 0x1000>;
>                     phy-connection-type = "2500base-x";
>                     sleep = <0x10 0x10000000>;
>                     pcs-handle-names = "sgmii", "qsgmii";
>                     cell-index = <0x03>;
>                 };
> 
>                 mdio@fd000 {
>                     fsl,erratum-a009885;
>                     compatible = "fsl,fman-memac-mdio", "fsl,fman-xmdio";
>                     status = "okay";
>                     #address-cells = <0x01>;
>                     #size-cells = <0x00>;
>                     reg = <0xfd000 0x1000>;
> 
>                     ethernet-phy@7 {
>                         compatible = "ethernet-phy-id31c3.1c63";
>                         phandle = <0x0f>;
>                         reg = <0x07>;
>                     };
>                 };
>             };
>         };
>     };
> 
> Let me know how I can assist further -- do you need any additional information from my side?

The device tree dump looks ok.

I said there should be no managed = "in-band-status" property in the
device tree, and indeed there is none.

What I failed to consider is that the FMan mEMAC driver sets phylink's
"default_an_inband" property to true, making it as if the device tree
node had this property anyway.

The driver needs to be further patched to prevent that from happening.
Here's a line that needs to be squashed into the previous change, could
you please retest with it?

--- a/drivers/net/ethernet/freescale/fman/fman_memac.c
+++ b/drivers/net/ethernet/freescale/fman/fman_memac.c
@@ -1229,6 +1229,7 @@ int memac_initialization(struct mac_device *mac_dev,
 	 * those configurations modes don't use in-band autonegotiation.
 	 */
 	if (!of_property_present(mac_node, "managed") &&
+	    mac_dev->phy_if != PHY_INTERFACE_MODE_2500BASEX &&
 	    mac_dev->phy_if != PHY_INTERFACE_MODE_MII &&
 	    !phy_interface_mode_is_rgmii(mac_dev->phy_if))
 		mac_dev->phylink_config.default_an_inband = true;


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 13:01                     ` Alexander Wilhelm
  2025-08-04 13:41                       ` Vladimir Oltean
@ 2025-08-04 14:22                       ` Russell King (Oracle)
  2025-08-04 14:51                         ` Alexander Wilhelm
  2025-08-04 14:56                         ` Vladimir Oltean
  1 sibling, 2 replies; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-04 14:22 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Mon, Aug 04, 2025 at 03:01:44PM +0200, Alexander Wilhelm wrote:
> On Mon, Aug 04, 2025 at 01:01:39PM +0300, Vladimir Oltean wrote:
> > On Mon, Aug 04, 2025 at 08:17:47AM +0200, Alexander Wilhelm wrote:
> > > On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> > > > On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > > > > It looks like memac_select_pcs() and memac_prepare() fail to
> > > > > handle 2500BASEX despite memac_initialization() suggesting the
> > > > > SGMII PCS supports 2500BASEX.
> > > > 
> > > > Thanks for pointing this out, it seems to be a regression introduced by
> > > > commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> > > > 
> > > > If there are no other volunteers, I can offer to submit a patch if
> > > > Alexander confirms this fixes his setup.
> > > 
> > > I'd be happy to help by applying the patch on my system and running some tests.
> > > Please let me know if there are any specific steps or scenarios you'd like me to
> > > focus on.
> > > 
> > > Best regards
> > > Alexander Wilhelm
> > 
> > Please find the attached patch.
> [...]
> 
> Hi Vladimir,
> 
> I’ve applied the patch you provided, but it doesn’t seem to fully resolve the
> issue -- or perhaps I’ve misconfigured something. I’m encountering the following
> error during initialization:
> 
>     mdio_bus 0x0000000ffe4e7000:00: AN not supported on 3.125GHz SerDes lane
>     fsl_dpaa_mac ffe4e6000.ethernet eth0: pcs_config failed: -EOPNOTSUPP

We're falling foul of the historic crap that 2500base-X is (802.3 were
very very late to the party in "standardising" it, and only did so after
many different implementations with varying capabilities were already on
the market.)

aquantia_main.c needs to implement the .inband_caps() method, and
report what its actual capabilities are for the supplied interface
mode according to how it has been provisioned.
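
For illustration, the shape of such a method might look like the sketch
below. This is standalone C under stated assumptions, not the aquantia
driver: the SKETCH_INBAND_* flags stand in for the kernel's
LINK_INBAND_* bits, and `enum sketch_iface`, `struct sketch_provisioning`
and its fields are hypothetical placeholders for whatever the driver
would read back from the provisioned registers.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's LINK_INBAND_* capability bits. */
#define SKETCH_INBAND_DISABLE (1u << 0) /* can operate without in-band */
#define SKETCH_INBAND_ENABLE  (1u << 1) /* can operate with in-band */

/* Hypothetical, reduced set of host interface modes. */
enum sketch_iface {
	SKETCH_IFACE_SGMII,
	SKETCH_IFACE_2500BASEX,
	SKETCH_IFACE_OTHER,
};

/* Hypothetical snapshot of what the firmware provisioning selected. */
struct sketch_provisioning {
	bool sgmii_inband;   /* in-band signalling provisioned for SGMII */
	bool ocsgmii_inband; /* in-band signalling provisioned for 2500base-x */
};

/* Report in-band capabilities for the interface mode being asked about. */
unsigned int sketch_inband_caps(const struct sketch_provisioning *p,
				enum sketch_iface iface)
{
	switch (iface) {
	case SKETCH_IFACE_SGMII:
		return p->sgmii_inband ? SKETCH_INBAND_ENABLE
				       : SKETCH_INBAND_DISABLE;
	case SKETCH_IFACE_2500BASEX:
		return p->ocsgmii_inband ? SKETCH_INBAND_ENABLE
					 : SKETCH_INBAND_DISABLE;
	default:
		return 0; /* nothing known about this mode */
	}
}
```

In this sketch a return value of 0 means "no information", so the caller
would have to fall back to its own defaults.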

> 
> The relevant code is located in `drivers/net/pcs/pcs-lynx.c`, within the
> `lynx_pcs_config(...)` function. In the case of 2500BASE-X with in-band
> autonegotiation enabled, the function logs an error and returns -EOPNOTSUPP.
> 
> From what I can tell, autonegotiation isn’t supported on a 3.125GHz SerDes lane
> when using 2500BASE-X.

Due to the lack of early standardisation, some manufacturers require
AN, some have it optional, others simply do not support it.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 13:41                       ` Vladimir Oltean
@ 2025-08-04 14:47                         ` Alexander Wilhelm
  2025-08-04 16:00                           ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-04 14:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Mon, Aug 04, 2025 at 04:41:15PM +0300, Vladimir Oltean wrote:
> Hi Alexander,
> 
> On Mon, Aug 04, 2025 at 03:01:44PM +0200, Alexander Wilhelm wrote:
> > > Please find the attached patch.
> > [...]
> > 
> > Hi Vladimir,
> > 
> > I’ve applied the patch you provided, but it doesn’t seem to fully resolve the
> > issue -- or perhaps I’ve misconfigured something. I’m encountering the following
> > error during initialization:
> > 
> >     mdio_bus 0x0000000ffe4e7000:00: AN not supported on 3.125GHz SerDes lane
> >     fsl_dpaa_mac ffe4e6000.ethernet eth0: pcs_config failed: -EOPNOTSUPP
> > 
> > The relevant code is located in `drivers/net/pcs/pcs-lynx.c`, within the
> > `lynx_pcs_config(...)` function. In the case of 2500BASE-X with in-band
> > autonegotiation enabled, the function logs an error and returns -EOPNOTSUPP.
> 
> Once I saw this I immediately realized my mistake. More details at the end.
> 
> > From what I can tell, autonegotiation isn’t supported on a 3.125GHz SerDes lane
> > when using 2500BASE-X. What I’m unclear about is how this setup is supposed to
> > work in practice. My understanding is that on the host side, communication
> > always uses OCSGMII with flow control, allowing speed pacing via pause frames.
> > But what about the line side -- should it be configurable, or is it expected to
> > operate in a fixed mode?
> 
> So there are two "auto-negotiation" processes involved.
> 
> 
>  +-----+ internal +-----+          2500base-x       +-----------+  2.5GBase-T  +------------+
>  | MAC |==========| PCS |===========================| Local PHY |==============| Remote PHY | ...
>  +-----+ GMII not +-----+   in-band autonegotiation +-----------+   clause 28  +------------+
>     represented in the                                           autonegotiation
>         device tree                  (1)                              (2)
[...]

Hi Vladimir,

thank you for the detailed explanation. I feel I now have a clearer
understanding of what's happening under the hood.

> What I failed to consider is that the FMan mEMAC driver sets phylink's
> "default_an_inband" property to true, making it as if the device tree
> node had this property anyway.
> 
> The driver needs to be further patched to prevent that from happening.
> Here's a line that needs to be squashed into the previous change, could
> you please retest with it?
> 
> --- a/drivers/net/ethernet/freescale/fman/fman_memac.c
> +++ b/drivers/net/ethernet/freescale/fman/fman_memac.c
> @@ -1229,6 +1229,7 @@ int memac_initialization(struct mac_device *mac_dev,
>  	 * those configurations modes don't use in-band autonegotiation.
>  	 */
>  	if (!of_property_present(mac_node, "managed") &&
> +	    mac_dev->phy_if != PHY_INTERFACE_MODE_2500BASEX &&
>  	    mac_dev->phy_if != PHY_INTERFACE_MODE_MII &&
>  	    !phy_interface_mode_is_rgmii(mac_dev->phy_if))
>  		mac_dev->phylink_config.default_an_inband = true;

I’ve applied this patch as well, which brought me a step further. Unfortunately,
I still don’t get a ping response, although the configuration looks correct to
me. Below are the logs and the `ethtool` output I’m seeing:

    user@host:~# logread | grep eth
    kern.info kernel: [   20.777530] fsl_dpaa_mac ffe4e6000.ethernet: FMan MEMAC
    kern.info kernel: [   20.782840] fsl_dpaa_mac ffe4e6000.ethernet: FMan MAC address: 00:00:5b:05:a2:cb
    kern.info kernel: [   20.793126] fsl_dpaa_mac ffe4e6000.ethernet eth0: Probed interface eth0
    kern.info kernel: [   31.058431] usbcore: registered new interface driver cdc_ether
    user.notice netifd: Added device handler type: veth
    kern.info kernel: [   48.171837] fsl_dpaa_mac ffe4e6000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)
    kern.info kernel: [   48.171861] fsl_dpaa_mac ffe4e6000.ethernet eth0: configuring for phy/2500base-x link mode
    kern.info kernel: [   48.181338] br-lan: port 1(eth0) entered blocking state
    kern.info kernel: [   48.181363] br-lan: port 1(eth0) entered disabled state
    kern.info kernel: [   48.181399] fsl_dpaa_mac ffe4e6000.ethernet eth0: entered allmulticast mode
    kern.info kernel: [   48.181577] fsl_dpaa_mac ffe4e6000.ethernet eth0: entered promiscuous mode
    kern.info kernel: [   53.304459] fsl_dpaa_mac ffe4e6000.ethernet eth0: Link is Up - 2.5Gbps/Full - flow control rx/tx
    kern.info kernel: [   53.304629] br-lan: port 1(eth0) entered blocking state
    kern.info kernel: [   53.304642] br-lan: port 1(eth0) entered forwarding state
    daemon.notice netifd: Network device 'eth0' link is up
    daemon.info lldpd[6849]: libevent 2.1.12-stable initialized with epoll method
    daemon.info charon: 10[KNL] flags changed for fe80::200:5bff:fe05:a2cb on eth0

    user@host:~# ethtool eth0
    Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Full
                                100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Full
                                100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  100baseT/Full
                                             1000baseT/Full
                                             10000baseT/Full
                                             2500baseT/Full
                                             5000baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 7
        Transceiver: external
        Auto-negotiation: on
        MDI-X: on
        Current message level: 0x00002037 (8247)
                               drv probe link ifdown ifup hw
        Link detected: yes


I will continue investigating why the ping isn’t working and will share any new
findings as soon as I have them. Thanks again for your support!


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 14:22                       ` Russell King (Oracle)
@ 2025-08-04 14:51                         ` Alexander Wilhelm
  2025-08-04 14:56                         ` Vladimir Oltean
  1 sibling, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-04 14:51 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Mon, Aug 04, 2025 at 03:22:50PM +0100, Russell King (Oracle) wrote:
> On Mon, Aug 04, 2025 at 03:01:44PM +0200, Alexander Wilhelm wrote:
> > On Mon, Aug 04, 2025 at 01:01:39PM +0300, Vladimir Oltean wrote:
> > > On Mon, Aug 04, 2025 at 08:17:47AM +0200, Alexander Wilhelm wrote:
> > > > On Fri, Aug 01, 2025 at 04:04:20PM +0300, Vladimir Oltean wrote:
> > > > > On Fri, Aug 01, 2025 at 01:23:44PM +0100, Russell King (Oracle) wrote:
> > > > > > It looks like memac_select_pcs() and memac_prepare() fail to
> > > > > > handle 2500BASEX despite memac_initialization() suggesting the
> > > > > > SGMII PCS supports 2500BASEX.
> > > > > 
> > > > > Thanks for pointing this out, it seems to be a regression introduced by
> > > > > commit 5d93cfcf7360 ("net: dpaa: Convert to phylink").
> > > > > 
> > > > > If there are no other volunteers, I can offer to submit a patch if
> > > > > Alexander confirms this fixes his setup.
> > > > 
> > > > I'd be happy to help by applying the patch on my system and running some tests.
> > > > Please let me know if there are any specific steps or scenarios you'd like me to
> > > > focus on.
> > > > 
> > > > Best regards
> > > > Alexander Wilhelm
> > > 
> > > Please find the attached patch.
> > [...]
> > 
> > Hi Vladimir,
> > 
> > I’ve applied the patch you provided, but it doesn’t seem to fully resolve the
> > issue -- or perhaps I’ve misconfigured something. I’m encountering the following
> > error during initialization:
> > 
> >     mdio_bus 0x0000000ffe4e7000:00: AN not supported on 3.125GHz SerDes lane
> >     fsl_dpaa_mac ffe4e6000.ethernet eth0: pcs_config failed: -EOPNOTSUPP
> 
> We're falling foul of the historic crap that 2500base-X is (802.3 were
> very very late to the party in "standardising" it, and only did so after
> many different implementations with varying capabilities were already on
> the market.)
> 
> aquantia_main.c needs to implement the .inband_caps() method, and
> report what its actual capabilities are for the supplied interface
> mode according to how it has been provisioned.

Good to know, thank you. I found an implementation in the Marvell drivers that I
can use as a reference.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-01  5:44     ` Alexander Wilhelm
@ 2025-08-04 14:53       ` Andrew Lunn
  0 siblings, 0 replies; 53+ messages in thread
From: Andrew Lunn @ 2025-08-04 14:53 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

> I don't see any firmware problems. I have one of the latest builds, and from what
> I understand, the firmware consists of a base image plus a provisioning table.
> But this table is a kind of pre-configuration. That means I can override the
> entire PHY configuration to my needs.

This pre-configuration is what I don't like about these PHYs. It means
you cannot assume any register has the value the datasheet says it
should have after a reset. The driver might work for you, but not for
others because your firmware has different pre-configurations to other
firmware. So in effect, the driver needs to write every single
register with a known value...

	Andrew


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 14:22                       ` Russell King (Oracle)
  2025-08-04 14:51                         ` Alexander Wilhelm
@ 2025-08-04 14:56                         ` Vladimir Oltean
  1 sibling, 0 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-04 14:56 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Mon, Aug 04, 2025 at 03:22:50PM +0100, Russell King (Oracle) wrote:
> We're falling foul of the historic crap that 2500base-X is (802.3 were
> very very late to the party in "standardising" it, and only did so after
> many different implementations with varying capabilities were already on
> the market.)
> 
> aquantia_main.c needs to implement the .inband_caps() method, and
> report what its actual capabilities are for the supplied interface
> mode according to how it has been provisioned.

I have some patches for that which need testing, because I don't yet
fully understand why there are 2 different settings for this operation,
and how they interact.

- Bit 3 of the aqr_global_cfg_regs[] registers (1e.310, 1e.31b, 1e.31c,
  1e.31d, 1e.31e, 1e.31f) is "System Interface Autoneg". There's one of
  these for each supported media side link speed. We have to filter for
  those media link speeds where the translated VEND1_GLOBAL_CFG_SERDES_MODE
  matches the phy_interface_t given to .inband_caps(), and warn on
  inconsistent settings (the same phy_interface_t is provisioned with
  inband enabled at speed X, and disabled at speed Y). I'm crossing my
  fingers this warning isn't going to fire on OCSGMII/2500base-x on live
  systems, but who knows. I am unlikely to be able to find out whether
  setting or unsetting this bit makes any difference for OCSGMII, since
  my PCS does not see the 16-bit config words.

- Bit 3 of register 4.C441 is "USX Autoneg Control For MAC". It is not
  clear whether this is an alternative or an additional configuration
  specific to USXGMII. This bit I can test.
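
The filtering and consistency check described in the first item could be
sketched roughly as follows. This is standalone C, not the patches
mentioned above: `struct sketch_speed_cfg` is a hypothetical per-speed
snapshot in which the SerDes mode has already been translated to an
interface type, the SKETCH_INBAND_* flags are stand-ins for the kernel's
LINK_INBAND_* bits, and the only detail taken from the registers is that
bit 3 is "System Interface Autoneg".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SYS_IF_AUTONEG (1u << 3) /* "System Interface Autoneg" */

#define SKETCH_INBAND_DISABLE (1u << 0)
#define SKETCH_INBAND_ENABLE  (1u << 1)

/* Hypothetical, reduced set of host interface modes. */
enum sketch_iface {
	SKETCH_IFACE_SGMII,
	SKETCH_IFACE_2500BASEX,
	SKETCH_IFACE_OTHER,
};

/* Hypothetical snapshot of one per-media-speed configuration register. */
struct sketch_speed_cfg {
	enum sketch_iface iface; /* translated from the SerDes mode field */
	uint16_t reg_val;        /* raw register contents */
};

/*
 * Collect the in-band setting of every media speed that uses the wanted
 * interface. *inconsistent is set when the same interface is provisioned
 * with in-band enabled at one speed and disabled at another.
 */
unsigned int sketch_scan_inband(const struct sketch_speed_cfg *cfg, size_t n,
				enum sketch_iface want, bool *inconsistent)
{
	unsigned int caps = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cfg[i].iface != want)
			continue;
		if (cfg[i].reg_val & SKETCH_SYS_IF_AUTONEG)
			caps |= SKETCH_INBAND_ENABLE;
		else
			caps |= SKETCH_INBAND_DISABLE;
	}

	*inconsistent = caps == (SKETCH_INBAND_ENABLE | SKETCH_INBAND_DISABLE);
	return caps;
}
```

A caller would warn when `*inconsistent` is set, and would treat a result
of 0 as "this interface is not used at any media speed".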

There is some non-trivial consolidation which needs to be dealt with
first. The driver does not call aqr107_fill_interface_modes() for many
of the PHY IDs for which it could do that. And we can't implement
.inband_caps() except for those PHYs where we know that the registers
read by aqr107_fill_interface_modes() are accessible. I think I do have
those consolidation patches in a reasonably good state.


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 14:47                         ` Alexander Wilhelm
@ 2025-08-04 16:00                           ` Vladimir Oltean
  2025-08-04 16:02                             ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-04 16:00 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 3950 bytes --]

On Mon, Aug 04, 2025 at 04:47:03PM +0200, Alexander Wilhelm wrote:
> I’ve applied this patch as well, which brought me a step further. Unfortunately,
> I still don’t get a ping response, although the configuration looks correct to
> me. Below are the logs and the `ethtool` output I’m seeing:
> 
>     user@host:~# logread | grep eth
>     kern.info kernel: [   20.777530] fsl_dpaa_mac ffe4e6000.ethernet: FMan MEMAC
>     kern.info kernel: [   20.782840] fsl_dpaa_mac ffe4e6000.ethernet: FMan MAC address: 00:00:5b:05:a2:cb
>     kern.info kernel: [   20.793126] fsl_dpaa_mac ffe4e6000.ethernet eth0: Probed interface eth0
>     kern.info kernel: [   31.058431] usbcore: registered new interface driver cdc_ether
>     user.notice netifd: Added device handler type: veth
>     kern.info kernel: [   48.171837] fsl_dpaa_mac ffe4e6000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)
>     kern.info kernel: [   48.171861] fsl_dpaa_mac ffe4e6000.ethernet eth0: configuring for phy/2500base-x link mode
>     kern.info kernel: [   48.181338] br-lan: port 1(eth0) entered blocking state
>     kern.info kernel: [   48.181363] br-lan: port 1(eth0) entered disabled state
>     kern.info kernel: [   48.181399] fsl_dpaa_mac ffe4e6000.ethernet eth0: entered allmulticast mode
>     kern.info kernel: [   48.181577] fsl_dpaa_mac ffe4e6000.ethernet eth0: entered promiscuous mode
>     kern.info kernel: [   53.304459] fsl_dpaa_mac ffe4e6000.ethernet eth0: Link is Up - 2.5Gbps/Full - flow control rx/tx
>     kern.info kernel: [   53.304629] br-lan: port 1(eth0) entered blocking state
>     kern.info kernel: [   53.304642] br-lan: port 1(eth0) entered forwarding state
>     daemon.notice netifd: Network device 'eth0' link is up
>     daemon.info lldpd[6849]: libevent 2.1.12-stable initialized with epoll method
>     daemon.info charon: 10[KNL] flags changed for fe80::200:5bff:fe05:a2cb on eth0
> 
>     user@host:~# ethtool eth0
>     Settings for eth0:
>         Supported ports: [ TP MII ]
>         Supported link modes:   10baseT/Full
>                                 100baseT/Full
>                                 1000baseT/Full
>                                 2500baseT/Full
>         Supported pause frame use: Symmetric Receive-only
>         Supports auto-negotiation: Yes
>         Supported FEC modes: Not reported
>         Advertised link modes:  10baseT/Full
>                                 100baseT/Full
>                                 1000baseT/Full
>                                 2500baseT/Full
>         Advertised pause frame use: Symmetric Receive-only
>         Advertised auto-negotiation: Yes
>         Advertised FEC modes: Not reported
>         Link partner advertised link modes:  100baseT/Full
>                                              1000baseT/Full
>                                              10000baseT/Full
>                                              2500baseT/Full
>                                              5000baseT/Full
>         Link partner advertised pause frame use: Symmetric Receive-only
>         Link partner advertised auto-negotiation: Yes
>         Link partner advertised FEC modes: Not reported
>         Speed: 2500Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 7
>         Transceiver: external
>         Auto-negotiation: on
>         MDI-X: on
>         Current message level: 0x00002037 (8247)
>                                drv probe link ifdown ifup hw
>         Link detected: yes
> 
> 
> I will continue investigating why the ping isn’t working and will share any new
> findings as soon as I have them. Thanks again for your support!

Can you apply the following patch, which adds support for ethtool
counters coming from the mEMAC, and dump them?

ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0'

Could you then compare this to:

ethtool --phy-statistics eth0 | grep -v ': 0'

?

[-- Attachment #2: 0001-net-fman_memac-report-structured-ethtool-counters.patch --]
[-- Type: text/x-diff, Size: 7675 bytes --]

From 899d6147cc1f70f579dda2f51d9dd38d697f85b9 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Mon, 4 Aug 2025 18:45:48 +0300
Subject: [PATCH] net: fman_memac: report structured ethtool counters

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 .../ethernet/freescale/dpaa/dpaa_ethtool.c    | 45 ++++++++++
 .../net/ethernet/freescale/fman/fman_memac.c  | 87 +++++++++++++++++++
 drivers/net/ethernet/freescale/fman/mac.h     | 14 +++
 3 files changed, 146 insertions(+)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
index 0c588e03b15e..c24ba6cbcb95 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
@@ -465,6 +465,47 @@ static int dpaa_set_coalesce(struct net_device *dev,
 	return res;
 }
 
+static void dpaa_get_pause_stats(struct net_device *net_dev,
+				 struct ethtool_pause_stats *s)
+{
+	struct dpaa_priv *priv = netdev_priv(net_dev);
+	struct mac_device *mac_dev = priv->mac_dev;
+
+	if (mac_dev->get_pause_stats)
+		mac_dev->get_pause_stats(mac_dev->fman_mac, s);
+}
+
+static void dpaa_get_rmon_stats(struct net_device *net_dev,
+				struct ethtool_rmon_stats *s,
+				const struct ethtool_rmon_hist_range **ranges)
+{
+	struct dpaa_priv *priv = netdev_priv(net_dev);
+	struct mac_device *mac_dev = priv->mac_dev;
+
+	if (mac_dev->get_rmon_stats)
+		mac_dev->get_rmon_stats(mac_dev->fman_mac, s, ranges);
+}
+
+static void dpaa_get_eth_ctrl_stats(struct net_device *net_dev,
+				    struct ethtool_eth_ctrl_stats *s)
+{
+	struct dpaa_priv *priv = netdev_priv(net_dev);
+	struct mac_device *mac_dev = priv->mac_dev;
+
+	if (mac_dev->get_eth_ctrl_stats)
+		mac_dev->get_eth_ctrl_stats(mac_dev->fman_mac, s);
+}
+
+static void dpaa_get_eth_mac_stats(struct net_device *net_dev,
+				   struct ethtool_eth_mac_stats *s)
+{
+	struct dpaa_priv *priv = netdev_priv(net_dev);
+	struct mac_device *mac_dev = priv->mac_dev;
+
+	if (mac_dev->get_eth_mac_stats)
+		mac_dev->get_eth_mac_stats(mac_dev->fman_mac, s);
+}
+
 const struct ethtool_ops dpaa_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_RX_USECS |
 				     ETHTOOL_COALESCE_RX_MAX_FRAMES,
@@ -485,4 +526,8 @@ const struct ethtool_ops dpaa_ethtool_ops = {
 	.get_ts_info = dpaa_get_ts_info,
 	.get_coalesce = dpaa_get_coalesce,
 	.set_coalesce = dpaa_set_coalesce,
+	.get_pause_stats = dpaa_get_pause_stats,
+	.get_rmon_stats = dpaa_get_rmon_stats,
+	.get_eth_ctrl_stats = dpaa_get_eth_ctrl_stats,
+	.get_eth_mac_stats = dpaa_get_eth_mac_stats,
 };
diff --git a/drivers/net/ethernet/freescale/fman/fman_memac.c b/drivers/net/ethernet/freescale/fman/fman_memac.c
index d32ffd6be7b1..c84f0336c94c 100644
--- a/drivers/net/ethernet/freescale/fman/fman_memac.c
+++ b/drivers/net/ethernet/freescale/fman/fman_memac.c
@@ -900,6 +900,89 @@ static int memac_set_exception(struct fman_mac *memac,
 	return 0;
 }
 
+static u64 memac_read64(void __iomem *reg)
+{
+	u32 low, high, tmp;
+
+	do {
+		high = ioread32be(reg + 4);
+		low = ioread32be(reg);
+		tmp = ioread32be(reg + 4);
+	} while (high != tmp);
+
+	return ((u64)high << 32) | low;
+}
+
+static void memac_get_pause_stats(struct fman_mac *memac,
+				  struct ethtool_pause_stats *s)
+{
+	s->tx_pause_frames = memac_read64(&memac->regs->txpf_l);
+	s->rx_pause_frames = memac_read64(&memac->regs->rxpf_l);
+}
+
+static const struct ethtool_rmon_hist_range memac_rmon_ranges[] = {
+	{   64,   64 },
+	{   65,  127 },
+	{  128,  255 },
+	{  256,  511 },
+	{  512, 1023 },
+	{ 1024, 1518 },
+	{ 1519, 9600 },
+	{},
+};
+
+static void memac_get_rmon_stats(struct fman_mac *memac,
+				 struct ethtool_rmon_stats *s,
+				 const struct ethtool_rmon_hist_range **ranges)
+{
+	s->undersize_pkts = memac_read64(&memac->regs->rund_l);
+	s->oversize_pkts = memac_read64(&memac->regs->rovr_l);
+	s->fragments = memac_read64(&memac->regs->rfrg_l);
+	s->jabbers = memac_read64(&memac->regs->rjbr_l);
+
+	s->hist[0] = memac_read64(&memac->regs->r64_l);
+	s->hist[1] = memac_read64(&memac->regs->r127_l);
+	s->hist[2] = memac_read64(&memac->regs->r255_l);
+	s->hist[3] = memac_read64(&memac->regs->r511_l);
+	s->hist[4] = memac_read64(&memac->regs->r1023_l);
+	s->hist[5] = memac_read64(&memac->regs->r1518_l);
+	s->hist[6] = memac_read64(&memac->regs->r1519x_l);
+
+	s->hist_tx[0] = memac_read64(&memac->regs->t64_l);
+	s->hist_tx[1] = memac_read64(&memac->regs->t127_l);
+	s->hist_tx[2] = memac_read64(&memac->regs->t255_l);
+	s->hist_tx[3] = memac_read64(&memac->regs->t511_l);
+	s->hist_tx[4] = memac_read64(&memac->regs->t1023_l);
+	s->hist_tx[5] = memac_read64(&memac->regs->t1518_l);
+	s->hist_tx[6] = memac_read64(&memac->regs->t1519x_l);
+
+	*ranges = memac_rmon_ranges;
+}
+
+static void memac_get_eth_ctrl_stats(struct fman_mac *memac,
+				     struct ethtool_eth_ctrl_stats *s)
+{
+	s->MACControlFramesTransmitted = memac_read64(&memac->regs->tcnp_l);
+	s->MACControlFramesReceived = memac_read64(&memac->regs->rcnp_l);
+}
+
+static void memac_get_eth_mac_stats(struct fman_mac *memac,
+				    struct ethtool_eth_mac_stats *s)
+{
+	s->FramesTransmittedOK = memac_read64(&memac->regs->tfrm_l);
+	s->FramesReceivedOK = memac_read64(&memac->regs->rfrm_l);
+	s->FrameCheckSequenceErrors = memac_read64(&memac->regs->rfcs_l);
+	s->AlignmentErrors = memac_read64(&memac->regs->raln_l);
+	s->OctetsTransmittedOK = memac_read64(&memac->regs->teoct_l);
+	s->FramesLostDueToIntMACXmitError = memac_read64(&memac->regs->terr_l);
+	s->OctetsReceivedOK = memac_read64(&memac->regs->reoct_l);
+	s->FramesLostDueToIntMACRcvError = memac_read64(&memac->regs->rdrntp_l);
+	s->MulticastFramesXmittedOK = memac_read64(&memac->regs->tmca_l);
+	s->BroadcastFramesXmittedOK = memac_read64(&memac->regs->tbca_l);
+	s->MulticastFramesReceivedOK = memac_read64(&memac->regs->rmca_l);
+	s->BroadcastFramesReceivedOK = memac_read64(&memac->regs->rbca_l);
+}
+
 static int memac_init(struct fman_mac *memac)
 {
 	struct memac_cfg *memac_drv_param;
@@ -1092,6 +1175,10 @@ int memac_initialization(struct mac_device *mac_dev,
 	mac_dev->set_tstamp		= memac_set_tstamp;
 	mac_dev->enable			= memac_enable;
 	mac_dev->disable		= memac_disable;
+	mac_dev->get_pause_stats	= memac_get_pause_stats;
+	mac_dev->get_rmon_stats		= memac_get_rmon_stats;
+	mac_dev->get_eth_ctrl_stats	= memac_get_eth_ctrl_stats;
+	mac_dev->get_eth_mac_stats	= memac_get_eth_mac_stats;
 
 	mac_dev->fman_mac = memac_config(mac_dev, params);
 	if (!mac_dev->fman_mac)
diff --git a/drivers/net/ethernet/freescale/fman/mac.h b/drivers/net/ethernet/freescale/fman/mac.h
index 955ace338965..63c2c5b4f99e 100644
--- a/drivers/net/ethernet/freescale/fman/mac.h
+++ b/drivers/net/ethernet/freescale/fman/mac.h
@@ -16,6 +16,11 @@
 #include "fman.h"
 #include "fman_mac.h"
 
+struct ethtool_eth_ctrl_stats;
+struct ethtool_eth_mac_stats;
+struct ethtool_pause_stats;
+struct ethtool_rmon_stats;
+struct ethtool_rmon_hist_range;
 struct fman_mac;
 struct mac_priv_s;
 
@@ -46,6 +51,15 @@ struct mac_device {
 				 enet_addr_t *eth_addr);
 	int (*remove_hash_mac_addr)(struct fman_mac *mac_dev,
 				    enet_addr_t *eth_addr);
+	void (*get_pause_stats)(struct fman_mac *memac,
+				struct ethtool_pause_stats *s);
+	void (*get_rmon_stats)(struct fman_mac *memac,
+			       struct ethtool_rmon_stats *s,
+			       const struct ethtool_rmon_hist_range **ranges);
+	void (*get_eth_ctrl_stats)(struct fman_mac *memac,
+				   struct ethtool_eth_ctrl_stats *s);
+	void (*get_eth_mac_stats)(struct fman_mac *memac,
+				  struct ethtool_eth_mac_stats *s);
 
 	void (*update_speed)(struct mac_device *mac_dev, int speed);
 
-- 
2.43.0



* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 16:00                           ` Vladimir Oltean
@ 2025-08-04 16:02                             ` Vladimir Oltean
  2025-08-05  7:59                               ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-04 16:02 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Mon, Aug 04, 2025 at 07:00:37PM +0300, Vladimir Oltean wrote:
> Can you apply the following patch, which adds support for ethtool
> counters coming from the mEMAC, and dump them?
> 
> ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0'

I forgot to mention how to show flow control counters:

ethtool -I --show-pause eth0


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-04 16:02                             ` Vladimir Oltean
@ 2025-08-05  7:59                               ` Alexander Wilhelm
  2025-08-05 10:20                                 ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-05  7:59 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Mon, Aug 04, 2025 at 07:02:34PM +0300, Vladimir Oltean wrote:
> On Mon, Aug 04, 2025 at 07:00:37PM +0300, Vladimir Oltean wrote:
> > Can you apply the following patch, which adds support for ethtool
> > counters coming from the mEMAC, and dump them?
> > 
> > ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0'
> 
> I forgot to mention how to show flow control counters:
> 
> ethtool -I --show-pause eth0

Hi Vladimir,

Thank you for providing the patch. I was able to apply it successfully and
retrieve the desired statistics using the following command:

    user@host-A:~# ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0' && ethtool --phy-statistics eth0 | grep -v ': 0' && ethtool -I --show-pause eth0
    Standard stats for eth0:
    eth-mac-FramesTransmittedOK: 2188
    eth-mac-OctetsTransmittedOK: 337884
    eth-mac-MulticastFramesXmittedOK: 76
    eth-mac-BroadcastFramesXmittedOK: 2112
    tx-rmon-etherStatsPkts64to64Octets: 16
    tx-rmon-etherStatsPkts65to127Octets: 1587
    tx-rmon-etherStatsPkts128to255Octets: 63
    tx-rmon-etherStatsPkts256to511Octets: 522
    PHY statistics:
         sgmii_rx_good_frames: 20785
         sgmii_rx_false_carrier_events: 1
         sgmii_tx_good_frames: 21120
         sgmii_tx_bad_frames: 52
    Pause parameters for eth0:
    Autonegotiate:  on
    RX:             off
    TX:             off
    RX negotiated: on
    TX negotiated: on
    Statistics:
      tx_pause_frames: 0
      rx_pause_frames: 0

I have a ping running in the background and can observe that MAC frames and
TX-RMON packets are continuously increasing. However, the PHY statistics remain
unchanged. I suspect the current SGMII frames originate from U-Boot, as I load
the firmware image via `netboot`. These statistics were recorded at 2.5G speed,
but the same behavior is also visible at 1G.

Do you think the issue still lies within the MAC driver, or could it be related
to the Aquantia driver or firmware?


Best regards,
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-05  7:59                               ` Alexander Wilhelm
@ 2025-08-05 10:20                                 ` Vladimir Oltean
  2025-08-05 12:44                                   ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-05 10:20 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1554 bytes --]

On Tue, Aug 05, 2025 at 09:59:57AM +0200, Alexander Wilhelm wrote:
> I have a ping running in the background and can observe that MAC frames and
> TX-RMON packets are continuously increasing. However, the PHY statistics remain
> unchanged. I suspect the current SGMII frames originate from U-Boot, as I load
> the firmware image via `netboot`. These statistics were recorded at 2.5G speed,
> but the same behavior is also visible at 1G.
> 
> Do you think the issue still lies within the MAC driver, or could it be related
> to the Aquantia driver or firmware?

So the claim is that in U-Boot, the exact same link with the exact same
PHY firmware works, right? Yet in Linux, MAC transmit counters increase,
but nothing comes across on the PHY side of the MII link? What about
packets sent from the link partner (the remote board connected to the PHY)?
Do packets sent from that board result in an increase of PHY counters,
and MAC RX counters?

For sure this is the correct port ("ffe4e6000.ethernet" corresponds to fm1-mac4,
port name in U-Boot would be "FM1@DTSEC4")? What SoC is this on? T1 something?
What SRDS_PRTCL_S1 value is in the RCW? I'd like to trace back the steps
in order to establish that the link works at 2.5G with autoneg disabled
on both ends. It seems to me there is either a lack of connectivity
between the MAC used in Linux and the PHY, or a protocol mismatch.

Could you please also apply this PHY debugging patch and let us know
what the Global System Configuration registers contain after the
firmware applies the provisioning?

[-- Attachment #2: 0001-net-phy-aquantia-dump-Global-System-Configuration-re.patch --]
[-- Type: text/x-diff, Size: 2728 bytes --]

From 17b74539f4f1fe2c335505443d797a9e2ae1fab8 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Tue, 5 Aug 2025 12:54:01 +0300
Subject: [PATCH] net: phy: aquantia: dump Global System Configuration
 registers

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/phy/aquantia/aquantia.h      |  5 +++++
 drivers/net/phy/aquantia/aquantia_main.c | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h
index 0c78bfabace5..9d02f9f0b8b7 100644
--- a/drivers/net/phy/aquantia/aquantia.h
+++ b/drivers/net/phy/aquantia/aquantia.h
@@ -55,10 +55,15 @@
 #define VEND1_GLOBAL_CFG_SERDES_MODE_SGMII	3
 #define VEND1_GLOBAL_CFG_SERDES_MODE_OCSGMII	4
 #define VEND1_GLOBAL_CFG_SERDES_MODE_XFI5G	6
+#define VEND1_GLOBAL_CFG_AUTONEG_ENA		BIT(3)
+#define VEND1_GLOBAL_CFG_TRAINING_ENA		BIT(4)
+#define VEND1_GLOBAL_CFG_RESET_ON_TRANSITION	BIT(5)
+#define VEND1_GLOBAL_CFG_SERDES_SILENCE		BIT(6)
 #define VEND1_GLOBAL_CFG_RATE_ADAPT		GENMASK(8, 7)
 #define VEND1_GLOBAL_CFG_RATE_ADAPT_NONE	0
 #define VEND1_GLOBAL_CFG_RATE_ADAPT_USX		1
 #define VEND1_GLOBAL_CFG_RATE_ADAPT_PAUSE	2
+#define VEND1_GLOBAL_CFG_MACSEC_ENABLE		BIT(9)
 
 /* Vendor specific 1, MDIO_MMD_VEND2 */
 #define VEND1_GLOBAL_CONTROL2			0xc001
diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index 77a48635d7bf..72329e328f27 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -987,6 +987,15 @@ static const u16 aqr_global_cfg_regs[] = {
 	VEND1_GLOBAL_CFG_10G
 };
 
+static const int aqr_global_cfg_speeds[] = {
+	SPEED_10,
+	SPEED_100,
+	SPEED_1000,
+	SPEED_2500,
+	SPEED_5000,
+	SPEED_10000,
+};
+
 static int aqr107_fill_interface_modes(struct phy_device *phydev)
 {
 	unsigned long *possible = phydev->possible_interfaces;
@@ -1007,6 +1016,15 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
 		serdes_mode = FIELD_GET(VEND1_GLOBAL_CFG_SERDES_MODE, val);
 		rate_adapt = FIELD_GET(VEND1_GLOBAL_CFG_RATE_ADAPT, val);
 
+		phydev_info(phydev, "Speed %d SerDes mode %d autoneg %d training %d reset on transition %d silence %d rate adapt %d macsec %d\n",
+			    aqr_global_cfg_speeds[i], serdes_mode,
+			    !!(val & VEND1_GLOBAL_CFG_AUTONEG_ENA),
+			    !!(val & VEND1_GLOBAL_CFG_TRAINING_ENA),
+			    !!(val & VEND1_GLOBAL_CFG_RESET_ON_TRANSITION),
+			    !!(val & VEND1_GLOBAL_CFG_SERDES_SILENCE),
+			    rate_adapt,
+			    !!(val & VEND1_GLOBAL_CFG_MACSEC_ENABLE));
+
 		switch (serdes_mode) {
 		case VEND1_GLOBAL_CFG_SERDES_MODE_XFI:
 			if (rate_adapt == VEND1_GLOBAL_CFG_RATE_ADAPT_USX)
-- 
2.43.0



* Re: Aquantia PHY in OCSGMII mode?
  2025-08-05 10:20                                 ` Vladimir Oltean
@ 2025-08-05 12:44                                   ` Alexander Wilhelm
  2025-08-06 14:58                                     ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-05 12:44 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Tue, Aug 05, 2025 at 01:20:56PM +0300, Vladimir Oltean wrote:
> On Tue, Aug 05, 2025 at 09:59:57AM +0200, Alexander Wilhelm wrote:
> > I have a ping running in the background and can observe that MAC frames and
> > TX-RMON packets are continuously increasing. However, the PHY statistics remain
> > unchanged. I suspect the current SGMII frames originate from U-Boot, as I load
> > the firmware image via `netboot`. These statistics were recorded at 2.5G speed,
> > but the same behavior is also visible at 1G.
> > 
> > Do you think the issue still lies within the MAC driver, or could it be related
> > to the Aquantia driver or firmware?
> 
> So the claim is that in U-Boot, the exact same link with the exact same
> PHY firmware works, right? Yet in Linux, MAC transmit counters increase,
> but nothing comes across on the PHY side of the MII link? What about
> packets sent from the link partner (the remote board connected to the PHY)?
> Do packets sent from that board result in an increase of PHY counters,
> and MAC RX counters?
> 
> For sure this is the correct port ("ffe4e6000.ethernet" corresponds to fm1-mac4,
> port name in U-Boot would be "FM1@DTSEC4")? What SoC is this on? T1 something?
> What SRDS_PRTCL_S1 value is in the RCW? I'd like to trace back the steps
> in order to establish that the link works at 2.5G with autoneg disabled
> on both ends. It seems to me there is either a lack of connectivity
> between the MAC used in Linux and the PHY, or a protocol mismatch.

Hi Vladimir,

Thank you, you just solved my problem. I had indeed made an error in the DTS
file by selecting the wrong MAC.

I'm using the T1023E SoC with an e5500 core. SRDS_PRTCL_S1 is set to 0x135,
which corresponds to the following lane assignment:

* A: Aurora (5G/2.5G)
* B: sg.m3 (2.5G)
* C: PCIe2 (5G/2.5G)
* D: PCIe1 (5G/2.5G)

My PHY is connected to `FM1@DTSEC3` (MAC3), not to the previously configured
MAC4. I fixed it by using ethernet `eth@ffe4e4000`. Now ping works in both
directions at the 2.5G and 1G speed settings. I cannot test 10M because my host
does not support it. 100M still does not work, and it does not work in U-Boot
either.

> Could you please also apply this PHY debugging patch and let us know
> what the Global System Configuration registers contain after the
> firmware applies the provisioning?

The patch is applied. Here is the register log:

    user@host:~# logread | grep AQR115
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10 SerDes mode 4 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 100 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 1000 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 2500 SerDes mode 4 autoneg 1 training 1 reset on transition 0 silence 1 rate adapt 0 macsec 0
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 5000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
    Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 0 macsec 0
    fsl_dpaa_mac ffe4e4000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)

During a 100M transfer, I see the MAC TX frame counter and the SGMII TX good
frames counter increasing. However, received frames are counted as SGMII RX bad
frames, and the MAC RX frame counter does not increase. The TX/RX pause frame
counters always stay at 0, regardless of whether ping works (1G/2.5G) or not
(100M). Do you have any idea what is going on here?

    user@host:~# ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0' && ethtool --phy-statistics eth0 | grep -v ': 0' && ethtool -I --show-pause eth0
    Standard stats for eth0:
    eth-mac-FramesTransmittedOK: 529
    eth-mac-FramesReceivedOK: 67
    eth-mac-OctetsTransmittedOK: 79287
    eth-mac-OctetsReceivedOK: 9787
    eth-mac-MulticastFramesXmittedOK: 43
    eth-mac-BroadcastFramesXmittedOK: 451
    eth-mac-MulticastFramesReceivedOK: 32
    eth-mac-BroadcastFramesReceivedOK: 1
    rx-rmon-etherStatsPkts64to64Octets: 3
    rx-rmon-etherStatsPkts65to127Octets: 42
    rx-rmon-etherStatsPkts128to255Octets: 18
    rx-rmon-etherStatsPkts256to511Octets: 4
    tx-rmon-etherStatsPkts64to64Octets: 5
    tx-rmon-etherStatsPkts65to127Octets: 385
    tx-rmon-etherStatsPkts128to255Octets: 26
    tx-rmon-etherStatsPkts256to511Octets: 113
    PHY statistics:
         sgmii_rx_good_frames: 21149
         sgmii_rx_bad_frames: 176
         sgmii_rx_false_carrier_events: 1
         sgmii_tx_good_frames: 21041
         sgmii_tx_line_collisions: 1
    Pause parameters for eth0:
    Autonegotiate:	on
    RX:		off
    TX:		off
    RX negotiated: on
    TX negotiated: on
    Statistics:
      tx_pause_frames: 0
      rx_pause_frames: 0


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-05 12:44                                   ` Alexander Wilhelm
@ 2025-08-06 14:58                                     ` Vladimir Oltean
  2025-08-07  5:56                                       ` Alexander Wilhelm
  2025-08-27  5:57                                       ` Alexander Wilhelm
  0 siblings, 2 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-06 14:58 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Tue, Aug 05, 2025 at 02:44:15PM +0200, Alexander Wilhelm wrote:
> Patch is applied. Here are the registers log:
> 
>     user@host:~# logread | grep AQR115
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10 SerDes mode 4 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 100 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 1000 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 2500 SerDes mode 4 autoneg 1 training 1 reset on transition 0 silence 1 rate adapt 0 macsec 0
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 5000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
>     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 0 macsec 0
>     fsl_dpaa_mac ffe4e4000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)
> 
> While 100M transfer, I see the MAC TX frame increasing and SGMII TX good frames
> increasing. But the receiving frames are counted as SGMII RX bad frames and MAC
> RX frames counter does not increase. The TX/RX pause frames always stay at 0,
> independently whether ping is working with 1G/2.5G or not with 100M. Do you have
> any idea here?
> 
>     user@host:~# ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0' && ethtool --phy-statistics eth0 | grep -v ': 0' && ethtool -I --show-pause eth0
>     Standard stats for eth0:
>     eth-mac-FramesTransmittedOK: 529
>     eth-mac-FramesReceivedOK: 67
>     eth-mac-OctetsTransmittedOK: 79287
>     eth-mac-OctetsReceivedOK: 9787
>     eth-mac-MulticastFramesXmittedOK: 43
>     eth-mac-BroadcastFramesXmittedOK: 451
>     eth-mac-MulticastFramesReceivedOK: 32
>     eth-mac-BroadcastFramesReceivedOK: 1
>     rx-rmon-etherStatsPkts64to64Octets: 3
>     rx-rmon-etherStatsPkts65to127Octets: 42
>     rx-rmon-etherStatsPkts128to255Octets: 18
>     rx-rmon-etherStatsPkts256to511Octets: 4
>     tx-rmon-etherStatsPkts64to64Octets: 5
>     tx-rmon-etherStatsPkts65to127Octets: 385
>     tx-rmon-etherStatsPkts128to255Octets: 26
>     tx-rmon-etherStatsPkts256to511Octets: 113
>     PHY statistics:
>          sgmii_rx_good_frames: 21149
>          sgmii_rx_bad_frames: 176
>          sgmii_rx_false_carrier_events: 1
>          sgmii_tx_good_frames: 21041
>          sgmii_tx_line_collisions: 1
>     Pause parameters for eth0:
>     Autonegotiate:	on
>     RX:		off
>     TX:		off
>     RX negotiated: on
>     TX negotiated: on
>     Statistics:
>       tx_pause_frames: 0
>       rx_pause_frames: 0

Sorry, I am not fluent enough with the Aquantia PHYs to be further
helpful here.

I made a procedural mistake by suggesting that you print selected
fields of the Global System Configuration registers instead of the raw
register values. I am unable to say with the required certainty whether
the configuration for 100M and 1G is identical or not. The printed
fields are the same, however there could still be differences in the
unprinted bits (looking at bit 12 'Low Delay Jitter'). That's something
you should explore further.

As for the MAC RX counters not increasing at all: the mEMAC has a catch-all
RERR counter which increments for each frame received with any of a wide
variety of errors (except for undersized/fragment frames):
- FIFO overflow error
- CRC error
- Payload length error
- Jabber and oversized error
- Alignment error (if supported)
- Reception of PHY/PCS error indication
The structured ethtool statistics API doesn't seem to have a counter for
received frame errors in general, only for specific errors. So I didn't
export it in the patch I sent. It's possible that this counter is
incrementing (but the more specific RFCS/RALN/... counters apparently not).

In any case, the T1023 host configuration is literally unchanged from
1G to 100M, so I am suspecting a misconfiguration in the Aquantia
provisioning somewhere. Maybe an FAE or AE from Marvell can help you
further with this issue.

If you do contact them, please also ask them to fix the discrepancy
where the Global System Configuration register for speed 2500 has
"autoneg 1", but all the other speeds have "autoneg 0" for the same
SerDes mode 4. It is precisely this concern that I was expressing to
Russell that makes it difficult to implement .inband_caps() based on
reading PHY registers.


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-06 14:58                                     ` Vladimir Oltean
@ 2025-08-07  5:56                                       ` Alexander Wilhelm
  2025-08-27  5:57                                       ` Alexander Wilhelm
  1 sibling, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-07  5:56 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Wed, Aug 06, 2025 at 05:58:56PM +0300, Vladimir Oltean wrote:
> On Tue, Aug 05, 2025 at 02:44:15PM +0200, Alexander Wilhelm wrote:
> > Patch is applied. Here are the registers log:
> > 
> >     user@host:~# logread | grep AQR115
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10 SerDes mode 4 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 100 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 1000 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 2500 SerDes mode 4 autoneg 1 training 1 reset on transition 0 silence 1 rate adapt 0 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 5000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 0 macsec 0
> >     fsl_dpaa_mac ffe4e4000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)
> > 
> > While 100M transfer, I see the MAC TX frame increasing and SGMII TX good frames
> > increasing. But the receiving frames are counted as SGMII RX bad frames and MAC
> > RX frames counter does not increase. The TX/RX pause frames always stay at 0,
> > independently whether ping is working with 1G/2.5G or not with 100M. Do you have
> > any idea here?
> > 
> >     user@host:~# ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0' && ethtool --phy-statistics eth0 | grep -v ': 0' && ethtool -I --show-pause eth0
> >     Standard stats for eth0:
> >     eth-mac-FramesTransmittedOK: 529
> >     eth-mac-FramesReceivedOK: 67
> >     eth-mac-OctetsTransmittedOK: 79287
> >     eth-mac-OctetsReceivedOK: 9787
> >     eth-mac-MulticastFramesXmittedOK: 43
> >     eth-mac-BroadcastFramesXmittedOK: 451
> >     eth-mac-MulticastFramesReceivedOK: 32
> >     eth-mac-BroadcastFramesReceivedOK: 1
> >     rx-rmon-etherStatsPkts64to64Octets: 3
> >     rx-rmon-etherStatsPkts65to127Octets: 42
> >     rx-rmon-etherStatsPkts128to255Octets: 18
> >     rx-rmon-etherStatsPkts256to511Octets: 4
> >     tx-rmon-etherStatsPkts64to64Octets: 5
> >     tx-rmon-etherStatsPkts65to127Octets: 385
> >     tx-rmon-etherStatsPkts128to255Octets: 26
> >     tx-rmon-etherStatsPkts256to511Octets: 113
> >     PHY statistics:
> >          sgmii_rx_good_frames: 21149
> >          sgmii_rx_bad_frames: 176
> >          sgmii_rx_false_carrier_events: 1
> >          sgmii_tx_good_frames: 21041
> >          sgmii_tx_line_collisions: 1
> >     Pause parameters for eth0:
> >     Autonegotiate:	on
> >     RX:		off
> >     TX:		off
> >     RX negotiated: on
> >     TX negotiated: on
> >     Statistics:
> >       tx_pause_frames: 0
> >       rx_pause_frames: 0
> 
> Sorry, I am not fluent enough with the Aquantia PHYs to be further
> helpful here.
> 
> I have made a procedural mistake by suggesting you to print select
> fields of the Global System Configuration registers instead of the raw
> register values. I am unable to say with the required certainty whether
> the configuration for 100M and 1G is identical or not. The printed
> fields are the same, however there could still be differences in the
> unprinted bits (looking at bit 12 'Low Delay Jitter'). That's something
> you should explore further.
> 
> About MAC RX counters not increasing at all. The mEMAC has a catch-all
> RERR counter which increments for each frame received with a wider
> variety of errors (except for undersized/fragment frames):
> - FIFO overflow error
> - CRC error
> - Payload length error
> - Jabber and oversized error
> - Alignment error (if supported)
> - Reception of PHY/PCS error indication
> The structured ethtool statistics API doesn't seem to have a counter for
> received frame errors in general, only for specific errors. So I didn't
> export it in the patch I sent. It's possible that this counter is
> incrementing (but the more specific RFCS/RALN/... counters apparently not).
> 
> In any case, the T1023 host configuration is literally unchanged from
> 1G to 100M, so I am suspecting a misconfiguration in the Aquantia
> provisioning somewhere. Maybe an FAE or AE from Marvell can help you
> further with this issue.
> 
> If you do contact them, please also request them to fix the discrepancy
> where the Global System Configuration register for speed 2500 has
> "autoneg 1", but all the other speeds have "autoneg 0" for the same
> SerDes mode 4. It is precisely this concern that I was expressing to
> Russell that makes it difficult to implement .inband_caps() based on
> reading PHY registers.

Hi Vladimir,

Now that I have 1G/2.5G working, I’ll be able to run some additional tests with
the device. I’ll revisit the 100M topic later and analyze it more thoroughly. I
suspect that some register settings in U-Boot might be causing issues when the
PHY is initialized again in the kernel.

I truly appreciate your detailed support, it has been incredibly helpful in
moving things forward.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-06 14:58                                     ` Vladimir Oltean
  2025-08-07  5:56                                       ` Alexander Wilhelm
@ 2025-08-27  5:57                                       ` Alexander Wilhelm
  2025-08-27  7:31                                         ` Vladimir Oltean
  2025-08-27  8:08                                         ` Russell King (Oracle)
  1 sibling, 2 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-27  5:57 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Wed, Aug 06, 2025 at 05:58:56PM +0300, Vladimir Oltean wrote:
> On Tue, Aug 05, 2025 at 02:44:15PM +0200, Alexander Wilhelm wrote:
> > Patch is applied. Here are the registers log:
> > 
> >     user@host:~# logread | grep AQR115
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10 SerDes mode 4 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 100 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 1000 SerDes mode 4 autoneg 0 training 1 reset on transition 0 silence 1 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 2500 SerDes mode 4 autoneg 1 training 1 reset on transition 0 silence 1 rate adapt 0 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 5000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 2 macsec 0
> >     Aquantia AQR115 0x0000000ffe4fd000:07: Speed 10000 SerDes mode 0 autoneg 0 training 0 reset on transition 0 silence 0 rate adapt 0 macsec 0
> >     fsl_dpaa_mac ffe4e4000.ethernet eth0: PHY [0x0000000ffe4fd000:07] driver [Aquantia AQR115] (irq=POLL)
> > 
> > While 100M transfer, I see the MAC TX frame increasing and SGMII TX good frames
> > increasing. But the receiving frames are counted as SGMII RX bad frames and MAC
> > RX frames counter does not increase. The TX/RX pause frames always stay at 0,
> > independently whether ping is working with 1G/2.5G or not with 100M. Do you have
> > any idea here?
> > 
> >     user@host:~# ethtool -S eth0 --groups eth-mac eth-phy eth-ctrl rmon | grep -v ': 0' && ethtool --phy-statistics eth0 | grep -v ': 0' && ethtool -I --show-pause eth0
> >     Standard stats for eth0:
> >     eth-mac-FramesTransmittedOK: 529
> >     eth-mac-FramesReceivedOK: 67
> >     eth-mac-OctetsTransmittedOK: 79287
> >     eth-mac-OctetsReceivedOK: 9787
> >     eth-mac-MulticastFramesXmittedOK: 43
> >     eth-mac-BroadcastFramesXmittedOK: 451
> >     eth-mac-MulticastFramesReceivedOK: 32
> >     eth-mac-BroadcastFramesReceivedOK: 1
> >     rx-rmon-etherStatsPkts64to64Octets: 3
> >     rx-rmon-etherStatsPkts65to127Octets: 42
> >     rx-rmon-etherStatsPkts128to255Octets: 18
> >     rx-rmon-etherStatsPkts256to511Octets: 4
> >     tx-rmon-etherStatsPkts64to64Octets: 5
> >     tx-rmon-etherStatsPkts65to127Octets: 385
> >     tx-rmon-etherStatsPkts128to255Octets: 26
> >     tx-rmon-etherStatsPkts256to511Octets: 113
> >     PHY statistics:
> >          sgmii_rx_good_frames: 21149
> >          sgmii_rx_bad_frames: 176
> >          sgmii_rx_false_carrier_events: 1
> >          sgmii_tx_good_frames: 21041
> >          sgmii_tx_line_collisions: 1
> >     Pause parameters for eth0:
> >     Autonegotiate:	on
> >     RX:		off
> >     TX:		off
> >     RX negotiated: on
> >     TX negotiated: on
> >     Statistics:
> >       tx_pause_frames: 0
> >       rx_pause_frames: 0
> 
> Sorry, I am not fluent enough with the Aquantia PHYs to be further
> helpful here.

Hi Vladimir,

One of our hardware engineers has looked into the issue with the 100M link and
found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
host side. For both 1G and 100M operation, it enables pause rate adaptation.
However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
10x symbol replication instead.

We’re exploring a workaround where the MAC is configured to believe it’s
operating at 1G, so it continues using pause rate adaptation, since flow control
is handled by the PHY. Given your deep expertise with Freescale MACs, I’d really
value your opinion on whether this approach makes sense or if you’ve seen
similar configurations before.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  5:57                                       ` Alexander Wilhelm
@ 2025-08-27  7:31                                         ` Vladimir Oltean
  2025-08-27  8:41                                           ` Alexander Wilhelm
  2025-08-27  8:08                                         ` Russell King (Oracle)
  1 sibling, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-27  7:31 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

Hi Alexander,

On Wed, Aug 27, 2025 at 07:57:28AM +0200, Alexander Wilhelm wrote:
> Hi Vladimir,
> 
> One of our hardware engineers has looked into the issue with the 100M link and
> found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
> host side. For both 1G and 100M operation, it enables pause rate adaptation.
> However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
> 10x symbol replication instead.
> 
> We’re exploring a workaround where the MAC is configured to believe it’s
> operating at 1G, so it continues using pause rate adaptation, since flow control

Why at 1G and not at 2.5G?

> is handled by the PHY. Given your deep expertise with Freescale MACs, I’d really
> value your opinion on whether this approach makes sense or if you’ve seen
> similar configurations before.
> 
> Best regards
> Alexander Wilhelm

To be crystal clear, are you talking about the T1023 FMan mEMAC as being
the one which at 100M uses 10x symbol replication? Because the AQR115
PHY also contains a MAC block inside - this is what provides the MACsec
and rate adaptation functionality.

And if so, I don't know _how_ that can be - in mainline there is no code
that would reconfigure the SerDes lane from 2500base-x to SGMII. These
use different baud rates, so the lane would need to be moved to a
different PLL which provides the required clock net. Or are you using a
different kernel code base where this logic exists?

Also, I don't understand _why_ the FMan mEMAC would change its protocol
from 2500base-x to SGMII. It certainly doesn't do that by itself.
Rate adaptation is handled by phylink (phylink_link_up() sets rx_pause
unconditionally to true when in RATE_MATCH_PAUSE mode), and the MAC
should be kept in the same configuration for different media-side speeds.

Could you print phy_modes(state->interface) in memac_mac_config(), as
well as phy_modes(interface), speed, duplex, tx_pause, rx_pause in
memac_link_up()? This is to confirm that the mEMAC configuration is
identical when the PHY links at 1G and 100M.


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  5:57                                       ` Alexander Wilhelm
  2025-08-27  7:31                                         ` Vladimir Oltean
@ 2025-08-27  8:08                                         ` Russell King (Oracle)
  2025-08-27  8:32                                           ` Alexander Wilhelm
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-27  8:08 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Aug 27, 2025 at 07:57:28AM +0200, Alexander Wilhelm wrote:
> Hi Vladimir,
> 
> One of our hardware engineers has looked into the issue with the 100M link and
> found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
> host side. For both 1G and 100M operation, it enables pause rate adaptation.
> However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
> 10x symbol replication instead.

This sounds like a misunderstanding, specifically:

"our MAC only applies rate adaptation for 1G links. For 100M, it uses
10x symbol replication instead."

It is the PHY that does rate adaption, so the MAC doesn't need to
support other speeds. Therefore, if the PHY is using a 2.5Gbps link
to the MAC with rate adaption for 100M, then the MAC needs to operate
at that 2.5Gbps speed.

You don't program the MAC differently depending on the media side
speed, unlike when rate adaption is not being used.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  8:08                                         ` Russell King (Oracle)
@ 2025-08-27  8:32                                           ` Alexander Wilhelm
  2025-08-27  8:45                                             ` Russell King (Oracle)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-27  8:32 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

Am Wed, Aug 27, 2025 at 09:08:24AM +0100 schrieb Russell King (Oracle):
> On Wed, Aug 27, 2025 at 07:57:28AM +0200, Alexander Wilhelm wrote:
> > Hi Vladimir,
> > 
> > One of our hardware engineers has looked into the issue with the 100M link and
> > found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
> > host side. For both 1G and 100M operation, it enables pause rate adaptation.
> > However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
> > 10x symbol replication instead.
> 
> This sounds like a misunderstanding, specifically:
> 
> "our MAC only applies rate adaptation for 1G links. For 100M, it uses
> 10x symbol replication instead."
> 
> It is the PHY that does rate adaption, so the MAC doesn't need to
> support other speeds. Therefore, if the PHY is using a 2.5Gbps link
> to the MAC with rate adaption for 100M, then the MAC needs to operate
> at that 2.5Gbps speed.
> 
> You don't program the MAC differently depending on the media side
> speed, unlike when rate adaption is not being used.

You're right. The flow control with rate adaptation is controlled by PHY. The
MAC should remain on the 2.5Gbps speed. Therefore I wonder why it uses 10x
symbol repetition.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  7:31                                         ` Vladimir Oltean
@ 2025-08-27  8:41                                           ` Alexander Wilhelm
  2025-08-27  8:47                                             ` Russell King (Oracle)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-27  8:41 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

Am Wed, Aug 27, 2025 at 10:31:20AM +0300 schrieb Vladimir Oltean:
> Hi Alexander,
> 
> On Wed, Aug 27, 2025 at 07:57:28AM +0200, Alexander Wilhelm wrote:
> > Hi Vladimir,
> > 
> > One of our hardware engineers has looked into the issue with the 100M link and
> > found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
> > host side. For both 1G and 100M operation, it enables pause rate adaptation.
> > However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
> > 10x symbol replication instead.
> > 
> > We’re exploring a workaround where the MAC is configured to believe it’s
> > operating at 1G, so it continues using pause rate adaptation, since flow control
> 
> Why at 1G and not at 2.5G?

Good point. Actually it is 2.5G, but the source code does not really
differentiate between them. All register configurations are the same for both 1G
and 2.5G.
[...]

> To be crystal clear, are you talking about the T1023 FMan mEMAC as being
> the one which at 100M uses 10x symbol replication? Because the AQR115
> PHY also contains a MAC block inside - this is what provides the MACsec
> and rate adaptation functionality.

Exactly that is what our hardware engineer measured.

> And if so, I don't know _how_ can that be - in mainline there is no code
> that would reconfigure the SerDes lane from 2500base-x to SGMII. These
> use different baud rates, so the lane would need to be moved to a
> different PLL which provides the required clock net. Or are you using a
> different kernel code base where this logic exists?

That is the problem I'm trying to understand. I've also not seen any code that
changes that. I'm using OpenWRT v24.10.0 with default kernel v6.6.73. The only
patches applied are the ones you've provided to me.

> Also, I don't understand _why_ would the FMan mEMAC change its protocol
> from 2500base-x to SGMII. It certainly doesn't do that by itself.
> Rate adaptation is handled by phylink (phylink_link_up() sets rx_pause
> unconditionally to true when in RATE_MATCH_PAUSE mode), and the MAC
> should be kept in the same configuration for different media-side speeds.
> 
> Could you print phy_modes(state->interface) in memac_mac_config(), as
> well as phy_modes(interface), speed, duplex, tx_pause, rx_pause in
> memac_link_up()? This is to confirm that the mEMAC configuration is
> identical when the PHY links at 1G and 100M.

Sure. I set the speed on the host connected to the DUT. Here are the logs:

Started with 2.5G:

    fsl_dpaa_mac: [DEBUG] <memac_mac_config> called
    fsl_dpaa_mac: [DEBUG] * phy_modes(state->interface): 2500base-x
    fsl_dpaa_mac: [DEBUG] <memac_link_up> called
    fsl_dpaa_mac: [DEBUG] * mode: 0
    fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
    fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
    fsl_dpaa_mac: [DEBUG] * speed: 2500
    fsl_dpaa_mac: [DEBUG] * duplex: 1
    fsl_dpaa_mac: [DEBUG] * tx_pause: 1
    fsl_dpaa_mac: [DEBUG] * rx_pause: 1

Set to 1G:

    fsl_dpaa_mac: [DEBUG] <memac_link_down> called
    fsl_dpaa_mac: [DEBUG] <memac_link_up> called
    fsl_dpaa_mac: [DEBUG] * mode: 0
    fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
    fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
    fsl_dpaa_mac: [DEBUG] * speed: 2500
    fsl_dpaa_mac: [DEBUG] * duplex: 1
    fsl_dpaa_mac: [DEBUG] * tx_pause: 1
    fsl_dpaa_mac: [DEBUG] * rx_pause: 1

Set to 100M:

    fsl_dpaa_mac: [DEBUG] <memac_link_down> called
    fsl_dpaa_mac: [DEBUG] <memac_link_up> called
    fsl_dpaa_mac: [DEBUG] * mode: 0
    fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
    fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
    fsl_dpaa_mac: [DEBUG] * speed: 2500
    fsl_dpaa_mac: [DEBUG] * duplex: 1
    fsl_dpaa_mac: [DEBUG] * tx_pause: 1
    fsl_dpaa_mac: [DEBUG] * rx_pause: 1


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  8:32                                           ` Alexander Wilhelm
@ 2025-08-27  8:45                                             ` Russell King (Oracle)
  0 siblings, 0 replies; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-27  8:45 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Aug 27, 2025 at 10:32:27AM +0200, Alexander Wilhelm wrote:
> Am Wed, Aug 27, 2025 at 09:08:24AM +0100 schrieb Russell King (Oracle):
> > On Wed, Aug 27, 2025 at 07:57:28AM +0200, Alexander Wilhelm wrote:
> > > Hi Vladimir,
> > > 
> > > One of our hardware engineers has looked into the issue with the 100M link and
> > > found the following: the Aquantia AQR115 always uses 2500BASE-X (GMII) on the
> > > host side. For both 1G and 100M operation, it enables pause rate adaptation.
> > > However, our MAC only applies rate adaptation for 1G links. For 100M, it uses a
> > > 10x symbol replication instead.
> > 
> > This sounds like a misunderstanding, specifically:
> > 
> > "our MAC only applies rate adaptation for 1G links. For 100M, it uses
> > 10x symbol replication instead."
> > 
> > It is the PHY that does rate adaption, so the MAC doesn't need to
> > support other speeds. Therefore, if the PHY is using a 2.5Gbps link
> > to the MAC with rate adaption for 100M, then the MAC needs to operate
> > at that 2.5Gbps speed.
> > 
> > You don't program the MAC differently depending on the media side
> > speed, unlike when rate adaption is not being used.
> 
> You're right. The flow control with rate adaptation is controlled by PHY. The
> MAC should remain on the 2.5Gbps speed. Therefore I wonder why it uses 10x
> symbol repetition.

As I say, someone is misunderstanding something. The PHY controls what
happens at the different media speeds.

I wonder whether the hardware engineer is thinking that the PHY is
configured for SGMII mode at 100Mbps - which is controlled by vendor
1 register 0x31b. If the 3 LSBs are 4, then it's using "OCSGMII"
otherwise if 3, then it'll be as the hardware engineer states.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  8:41                                           ` Alexander Wilhelm
@ 2025-08-27  8:47                                             ` Russell King (Oracle)
  2025-08-27  9:03                                               ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-27  8:47 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Aug 27, 2025 at 10:41:11AM +0200, Alexander Wilhelm wrote:
> Set to 100M:
> 
>     fsl_dpaa_mac: [DEBUG] <memac_link_down> called
>     fsl_dpaa_mac: [DEBUG] <memac_link_up> called
>     fsl_dpaa_mac: [DEBUG] * mode: 0
>     fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
>     fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
>     fsl_dpaa_mac: [DEBUG] * speed: 2500
>     fsl_dpaa_mac: [DEBUG] * duplex: 1
>     fsl_dpaa_mac: [DEBUG] * tx_pause: 1
>     fsl_dpaa_mac: [DEBUG] * rx_pause: 1

So the PHY reported that it's using 2500base-X ("OCSGMII") for 100M,
which means 0x31b 3 LSBs are 4. Your hardware engineer appears to be
incorrect in his statement.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  8:47                                             ` Russell King (Oracle)
@ 2025-08-27  9:03                                               ` Alexander Wilhelm
  2025-08-27  9:13                                                 ` Russell King (Oracle)
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-08-27  9:03 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

Am Wed, Aug 27, 2025 at 09:47:49AM +0100 schrieb Russell King (Oracle):
> On Wed, Aug 27, 2025 at 10:41:11AM +0200, Alexander Wilhelm wrote:
> > Set to 100M:
> > 
> >     fsl_dpaa_mac: [DEBUG] <memac_link_down> called
> >     fsl_dpaa_mac: [DEBUG] <memac_link_up> called
> >     fsl_dpaa_mac: [DEBUG] * mode: 0
> >     fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
> >     fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
> >     fsl_dpaa_mac: [DEBUG] * speed: 2500
> >     fsl_dpaa_mac: [DEBUG] * duplex: 1
> >     fsl_dpaa_mac: [DEBUG] * tx_pause: 1
> >     fsl_dpaa_mac: [DEBUG] * rx_pause: 1
> 
> So the PHY reported that it's using 2500base-X ("OCSGMII") for 100M,
> which means 0x31b 3 LSBs are 4. Your hardware engineer appears to be
> incorrect in his statement.

I asked the hardware engineer again. The point is that the MAC does not set
SGMII for 100M. It still uses 2500base-x but with 10x packet repetition. He could
measure and prove that with a logic analyzer.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  9:03                                               ` Alexander Wilhelm
@ 2025-08-27  9:13                                                 ` Russell King (Oracle)
  2025-08-28  9:28                                                   ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-08-27  9:13 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Vladimir Oltean, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Aug 27, 2025 at 11:03:42AM +0200, Alexander Wilhelm wrote:
> Am Wed, Aug 27, 2025 at 09:47:49AM +0100 schrieb Russell King (Oracle):
> > On Wed, Aug 27, 2025 at 10:41:11AM +0200, Alexander Wilhelm wrote:
> > > Set to 100M:
> > > 
> > >     fsl_dpaa_mac: [DEBUG] <memac_link_down> called
> > >     fsl_dpaa_mac: [DEBUG] <memac_link_up> called
> > >     fsl_dpaa_mac: [DEBUG] * mode: 0
> > >     fsl_dpaa_mac: [DEBUG] * phy_mode(interface): 2500base-x
> > >     fsl_dpaa_mac: [DEBUG] * memac_if_mode: 00000002 (IF_MODE_GMII)
> > >     fsl_dpaa_mac: [DEBUG] * speed: 2500
> > >     fsl_dpaa_mac: [DEBUG] * duplex: 1
> > >     fsl_dpaa_mac: [DEBUG] * tx_pause: 1
> > >     fsl_dpaa_mac: [DEBUG] * rx_pause: 1
> > 
> > So the PHY reported that it's using 2500base-X ("OCSGMII") for 100M,
> > which means 0x31b 3 LSBs are 4. Your hardware engineer appears to be
> > incorrect in his statement.
> 
> I asked the hardware engineer again. The point is that the MAC does not set
> > SGMII for 100M. It still uses 2500base-x but with 10x packet repetition.

No one uses symbol repetition when in 2500base-x mode. Nothing supports
it. Every device datasheet I've read states clearly that symbol
repetition is unsupported when operating at 2.5Gbps.

Also think about what this means. If the link is operating at 2.5Gbps
with a 10x symbol repetition, that means the link would be passing
250Mbps. That's not compatible with _anything_.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-27  9:13                                                 ` Russell King (Oracle)
@ 2025-08-28  9:28                                                   ` Vladimir Oltean
  2025-10-02  5:54                                                     ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-08-28  9:28 UTC (permalink / raw)
  To: Alexander Wilhelm, Russell King (Oracle)
  Cc: Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Aug 27, 2025 at 10:13:59AM +0100, Russell King (Oracle) wrote:
> On Wed, Aug 27, 2025 at 11:03:42AM +0200, Alexander Wilhelm wrote:
> > I asked the hardware engineer again. The point is that the MAC does not set
> > SGMII for 100M. It still uses 2500base-x but with 10x packet repetition.
> 
> No one uses symbol repetition when in 2500base-x mode. Nothing supports
> it. Every device datasheet I've read states clearly that symbol
> repetition is unsupported when operating at 2.5Gbps.
> 
> Also think about what this means. If the link is operating at 2.5Gbps
> with a 10x symbol repetition, that means the link would be passing
> 250Mbps. That's not compatible with _anything_.

FWIW, claim 5 of this active Cisco patent suggests dividing frames into
2 segments, replicating symbols from the first segment twice and symbols
from the second segment three times.
https://patents.google.com/patent/US7356047B1/en

I'm completely unaware of any implementations of this either, though.

To remain on topic, I don't see how the hardware engineer's claim can be
true. The PCS symbol replication is done through the IF_MODE_SPEED
field, which lynx_pcs_link_up_2500basex() sets to SGMII_SPEED_2500 (same
as SGMII_SPEED_1000, i.e. no replication). You can confirm that the
IF_MODE register has the expected value by putting a print.


* Re: Aquantia PHY in OCSGMII mode?
  2025-08-28  9:28                                                   ` Vladimir Oltean
@ 2025-10-02  5:54                                                     ` Alexander Wilhelm
  2025-10-07 14:08                                                       ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-10-02  5:54 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Thu, Aug 28, 2025 at 12:28:59PM +0300, Vladimir Oltean wrote:
> On Wed, Aug 27, 2025 at 10:13:59AM +0100, Russell King (Oracle) wrote:
> > On Wed, Aug 27, 2025 at 11:03:42AM +0200, Alexander Wilhelm wrote:
> > > I asked the hardware engineer again. The point is that the MAC does not set
> > > SGMII for 100M. It still uses 2500base-x but with 10x packet repetition.
> > 
> > No one uses symbol repetition when in 2500base-x mode. Nothing supports
> > it. Every device datasheet I've read states clearly that symbol
> > repetition is unsupported when operating at 2.5Gbps.
> > 
> > Also think about what this means. If the link is operating at 2.5Gbps
> > with a 10x symbol repetition, that means the link would be passing
> > 250Mbps. That's not compatible with _anything_.
> 
> FWIW, claim 5 of this active Cisco patent suggests dividing frames into
> 2 segments, replicating symbols from the first segment twice and symbols
> from the second segment three times.
> https://patents.google.com/patent/US7356047B1/en
> 
> I'm completely unaware of any implementations of this either, though.
> 
> To remain on topic, I don't see how the hardware engineer's claim can be
> true. The PCS symbol replication is done through the IF_MODE_SPEED
> field, which lynx_pcs_link_up_2500basex() sets to SGMII_SPEED_2500 (same
> as SGMII_SPEED_1000, i.e. no replication). You can confirm that the
> IF_MODE register has the expected value by putting a print.

Hi Vladimir,

Thank you for the hint about the IF_MODE register; I finally found the
root cause of my issue. Unfortunately, my U-Boot implementation was setting
the `IF_MODE_SGMII_EN` and `IF_MODE_USE_SGMII_AN` bits. This caused a 10x
symbol replication when operating at 100M speed. At the same time, the
`pcs-lynx` driver never modified these bits when 2500Base-X was configured.

I was able to fix this in U-Boot. Additionally, I explicitly cleared these
bits in the Lynx driver whenever 2500Base-X is configured (see patch
below). I’d like to hear your opinion on this: do you think this patch is
necessary, or could there be scenarios where these flags should remain set
for 2500Base-X?


Best regards
Alexander Wilhelm
---
 drivers/net/pcs/pcs-lynx.c | 26 ++++++++++++++++++++------
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
index 23b40e9eacbb..2774c62fb0db 100644
--- a/drivers/net/pcs/pcs-lynx.c
+++ b/drivers/net/pcs/pcs-lynx.c
@@ -169,6 +169,25 @@ static int lynx_pcs_config_giga(struct mdio_device *pcs,
 					  neg_mode);
 }
 
+static int lynx_pcs_config_2500basex(struct mdio_device *pcs,
+				     unsigned int neg_mode)
+{
+	int err;
+
+	if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
+		dev_err(&pcs->dev,
+			"AN not supported on 3.125GHz SerDes lane\n");
+		return -EOPNOTSUPP;
+	}
+
+	err = mdiodev_modify(pcs, IF_MODE,
+			     IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN, 0);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int lynx_pcs_config_usxgmii(struct mdio_device *pcs,
 				   const unsigned long *advertising,
 				   unsigned int neg_mode)
@@ -201,12 +220,7 @@ static int lynx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 		return lynx_pcs_config_giga(lynx->mdio, ifmode, advertising,
 					    neg_mode);
 	case PHY_INTERFACE_MODE_2500BASEX:
-		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
-			dev_err(&lynx->mdio->dev,
-				"AN not supported on 3.125GHz SerDes lane\n");
-			return -EOPNOTSUPP;
-		}
-		break;
+		return lynx_pcs_config_2500basex(lynx->mdio, neg_mode);
 	case PHY_INTERFACE_MODE_USXGMII:
 		return lynx_pcs_config_usxgmii(lynx->mdio, advertising,
 					       neg_mode);

base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
-- 
2.43.0


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-02  5:54                                                     ` Alexander Wilhelm
@ 2025-10-07 14:08                                                       ` Vladimir Oltean
  2025-10-08  7:47                                                         ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-10-07 14:08 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

Hi Alexander,

On Thu, Oct 02, 2025 at 07:54:48AM +0200, Alexander Wilhelm wrote:
> Hi Vladimir,
> 
> Thank you for the hint about the IF_MODE register; I finally found the
> root cause of my issue. Unfortunately, my U-Boot implementation was setting
> the `IF_MODE_SGMII_EN` and `IF_MODE_USE_SGMII_AN` bits. This caused a 10x
> symbol replication when operating at 100M speed. At the same time, the
> `pcs-lynx` driver never modified these bits when 2500Base-X was configured.
> 
> I was able to fix this in U-Boot. Additionally, I explicitly cleared these
> bits in the Lynx driver whenever 2500Base-X is configured (see patch
> below). I’d like to hear your expertise on this: do you think this patch is
> necessary, or could there be scenarios where these flags should remain set
> for 2500Base-X?
> 
> 
> Best regards
> Alexander Wilhelm
> ---
>  drivers/net/pcs/pcs-lynx.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
> index 23b40e9eacbb..2774c62fb0db 100644
> --- a/drivers/net/pcs/pcs-lynx.c
> +++ b/drivers/net/pcs/pcs-lynx.c
> @@ -169,6 +169,25 @@ static int lynx_pcs_config_giga(struct mdio_device *pcs,
>  					  neg_mode);
>  }
>  
> +static int lynx_pcs_config_2500basex(struct mdio_device *pcs,
> +				     unsigned int neg_mode)
> +{
> +	int err;
> +
> +	if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> +		dev_err(&pcs->dev,
> +			"AN not supported on 3.125GHz SerDes lane\n");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	err = mdiodev_modify(pcs, IF_MODE,
> +			     IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN, 0);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
>  static int lynx_pcs_config_usxgmii(struct mdio_device *pcs,
>  				   const unsigned long *advertising,
>  				   unsigned int neg_mode)
> @@ -201,12 +220,7 @@ static int lynx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
>  		return lynx_pcs_config_giga(lynx->mdio, ifmode, advertising,
>  					    neg_mode);
>  	case PHY_INTERFACE_MODE_2500BASEX:
> -		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> -			dev_err(&lynx->mdio->dev,
> -				"AN not supported on 3.125GHz SerDes lane\n");
> -			return -EOPNOTSUPP;
> -		}
> -		break;
> +		return lynx_pcs_config_2500basex(lynx->mdio, neg_mode);
>  	case PHY_INTERFACE_MODE_USXGMII:
>  		return lynx_pcs_config_usxgmii(lynx->mdio, advertising,
>  					       neg_mode);
> 
> base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
> -- 
> 2.43.0

Sorry for the delay. What you have found are undoubtedly two major bugs,
causing the Lynx PCS to operate in undefined behaviour territory.
Nonetheless, while your finding has helped me discover many previously
unknown facts about the hardware IP, I still cannot replicate exactly
your reported behaviour. In order to fully process things, I would like
to ask a few more clarification questions.

Is your U-Boot implementation based on NXP's dtsec_configure_serdes()?
https://github.com/u-boot/u-boot/blob/master/drivers/net/fm/eth.c#L57
Why would U-Boot set IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN only when
the AQR115 resolves only to 100M, but not in the other cases (which do
not have this problem)? Or does it do it irrespective of resolved media
side link speed? Simply put: what did the code that you fixed up look like?

With the U-Boot fix reverted, could you please replicate the broken
setup with AQR115 linking at 100Mbps, and add the following function in
Linux drivers/net/pcs/pcs-lynx.c?

static void lynx_pcs_debug(struct mdio_device *pcs)
{
	int bmsr = mdiodev_read(pcs, MII_BMSR);
	int bmcr = mdiodev_read(pcs, MII_BMCR);
	int adv = mdiodev_read(pcs, MII_ADVERTISE);
	int lpa = mdiodev_read(pcs, MII_LPA);
	int if_mode = mdiodev_read(pcs, IF_MODE);

	dev_info(&pcs->dev,
		 "BMSR 0x%x, BMCR 0x%x, ADV 0x%x, LPA 0x%x, IF_MODE 0x%x\n",
		 bmsr, bmcr, adv, lpa, if_mode);
}

and call it from:

static void lynx_pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
			       struct phylink_link_state *state)
{
	struct lynx_pcs *lynx = phylink_pcs_to_lynx(pcs);

	lynx_pcs_debug(lynx->mdio); // <- here

	switch (state->interface) {
	...

With this, I would like to know:
(a) what is the IF_MODE register content outside of the IF_MODE_SGMII_EN
    and IF_MODE_USE_SGMII_AN bits.
(b) what is the SGMII code word advertised by the AQR115 in OCSGMII mode.

Then if you could replicate this test for 1Gbps medium link speed, it
would be great.


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-07 14:08                                                       ` Vladimir Oltean
@ 2025-10-08  7:47                                                         ` Alexander Wilhelm
  2025-10-08 11:10                                                           ` Vladimir Oltean
  0 siblings, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-10-08  7:47 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Tue, Oct 07, 2025 at 05:08:19PM +0300, Vladimir Oltean wrote:
> Hi Alexander,
[...]
> Sorry for the delay. What you have found are undoubtedly two major bugs,
> causing the Lynx PCS to operate in undefined behaviour territory.
> Nonetheless, while your finding has helped me discover many previously
> unknown facts about the hardware IP, I still cannot replicate exactly
> your reported behaviour. In order to fully process things, I would like
> to ask a few more clarification questions.

Sure.

> Is your U-Boot implementation based on NXP's dtsec_configure_serdes()?
> https://github.com/u-boot/u-boot/blob/master/drivers/net/fm/eth.c#L57

Unfortunately, I am working with an older U-Boot version v2016.07. However,
the bug I fixed was not part of the official U-Boot codebase, it was
introduced by our team:

    value = PHY_SGMII_IF_MODE_SGMII;
    value |= PHY_SGMII_IF_MODE_AN;

I added the missing `if` condition as follows:

    if (!sgmii_2500) {
        value = PHY_SGMII_IF_MODE_SGMII;
        value |= PHY_SGMII_IF_MODE_AN;
    }

With the official U-Boot codebase, ping does not work at any of the
speeds:

    value = PHY_SGMII_IF_MODE_SGMII;
    if (!sgmii_2500)
        value |= PHY_SGMII_IF_MODE_AN;

> Why would U-Boot set IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN only when
> the AQR115 resolves only to 100M, but not in the other cases (which do
> not have this problem)? Or does it do it irrespective of resolved media
> side link speed? Simply put: what did the code that you fixed up look like?

In our implementation, the SGMII flags were always set in U-Boot,
regardless of the negotiated link speed. My assumption is that the SGMII
mode configuration results in a behavior where only a 100M link applies the
10x symbol replication, while 1G does not. For a 2.5G link, the behavior
ends up being the same as 1G, since there is no actual SGMII mode for 2.5G.

> With the U-Boot fix reverted, could you please replicate the broken
> setup with AQR115 linking at 100Mbps, and add the following function in
> Linux drivers/net/pcs/pcs-lynx.c?
> 
> static void lynx_pcs_debug(struct mdio_device *pcs)
> {
> 	int bmsr = mdiodev_read(pcs, MII_BMSR);
> 	int bmcr = mdiodev_read(pcs, MII_BMCR);
> 	int adv = mdiodev_read(pcs, MII_ADVERTISE);
> 	int lpa = mdiodev_read(pcs, MII_LPA);
> 	int if_mode = mdiodev_read(pcs, IF_MODE);
> 
> 	dev_info(&pcs->dev, "BMSR 0x%x, BMCR 0x%x, ADV 0x%x, LPA 0x%x, IF_MODE 0x%x\n", bmsr, bmcr, adv, lpa, if_mode);
> }
> 
> and call it from:
> 
> static void lynx_pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
> 			       struct phylink_link_state *state)
> {
> 	struct lynx_pcs *lynx = phylink_pcs_to_lynx(pcs);
> 
> 	lynx_pcs_debug(lynx->mdio); // <- here
> 
> 	switch (state->interface) {
> 	...
> 
> With this, I would like to know:
> (a) what is the IF_MODE register content outside of the IF_MODE_SGMII_EN
>     and IF_MODE_USE_SGMII_AN bits.
> (b) what is the SGMII code word advertised by the AQR115 in OCSGMII mode.
> 
> Then if you could replicate this test for 1Gbps medium link speed, it
> would be great.

For now, I have reverted both the U-Boot and kernel fixes and added debug
outputs for further analysis. Unfortunately the function
`lynx_pcs_get_state` is never called in my kernel code. Therefore I put the
debug function into `lynx_pcs_config`. Here is the output:

    mdio_bus 0x0000000ffe4e5000:00: BMSR 0x29, BMCR 0x1140, ADV 0x4001, LPA 0xdc01, IF_MODE 0x3

I hope it'll help to analyze the problem further.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08  7:47                                                         ` Alexander Wilhelm
@ 2025-10-08 11:10                                                           ` Vladimir Oltean
  2025-10-08 12:52                                                             ` Russell King (Oracle)
  2025-10-08 13:28                                                             ` Alexander Wilhelm
  0 siblings, 2 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-10-08 11:10 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel


On Wed, Oct 08, 2025 at 09:47:28AM +0200, Alexander Wilhelm wrote:
> On Tue, Oct 07, 2025 at 05:08:19PM +0300, Vladimir Oltean wrote:
> > Hi Alexander,
> [...]
> > Sorry for the delay. What you have found are undoubtedly two major bugs,
> > causing the Lynx PCS to operate in undefined behaviour territory.
> > Nonetheless, while your finding has helped me discover many previously
> > unknown facts about the hardware IP, I still cannot replicate exactly
> > your reported behaviour. In order to fully process things, I would like
> > to ask a few more clarification questions.
> 
> Sure.
> 
> > Is your U-Boot implementation based on NXP's dtsec_configure_serdes()?
> > https://github.com/u-boot/u-boot/blob/master/drivers/net/fm/eth.c#L57
> 
> Unfortunately, I am working with an older U-Boot version v2016.07. However,
> the bug I fixed was not part of the official U-Boot codebase, it was
> introduced by our team:
> 
>     value = PHY_SGMII_IF_MODE_SGMII;
>     value |= PHY_SGMII_IF_MODE_AN;
> 
> I added the missing `if` condition as follows:
> 
>     if (!sgmii_2500) {
>         value = PHY_SGMII_IF_MODE_SGMII;
>         value |= PHY_SGMII_IF_MODE_AN;
>     }
> 
> With the official U-Boot codebase I don't get a ping at any of the
> speeds:
> 
>     value = PHY_SGMII_IF_MODE_SGMII;
>     if (!sgmii_2500)
>         value |= PHY_SGMII_IF_MODE_AN;
> 
> > Why would U-Boot set IF_MODE_SGMII_EN | IF_MODE_USE_SGMII_AN only when
> > the AQR115 resolves only to 100M, but not in the other cases (which do
> > not have this problem)? Or does it do it irrespective of resolved media
> > side link speed? Simply put: what did the code that you fixed up look like?
> 
> In our implementation, the SGMII flags were always set in U-Boot,
> regardless of the negotiated link speed. My assumption is that the SGMII
> mode configuration results in a behavior where only a 100M link applies the
> 10x symbol replication, while 1G does not. For a 2.5G link, the behavior
> ends up being the same as 1G, since there is no actual SGMII mode for 2.5G.

Yes, this assumption seems to hold water thus far, but I have to
validate it by seeing the debugging print for 1G/2.5G, once we figure
out the debug printing aspect.

> > With the U-Boot fix reverted, could you please replicate the broken
> > setup with AQR115 linking at 100Mbps, and add the following function in
> > Linux drivers/net/pcs/pcs-lynx.c?
> > 
> > static void lynx_pcs_debug(struct mdio_device *pcs)
> > {
> > 	int bmsr = mdiodev_read(pcs, MII_BMSR);
> > 	int bmcr = mdiodev_read(pcs, MII_BMCR);
> > 	int adv = mdiodev_read(pcs, MII_ADVERTISE);
> > 	int lpa = mdiodev_read(pcs, MII_LPA);
> > 	int if_mode = mdiodev_read(pcs, IF_MODE);
> > 
> > 	dev_info(&pcs->dev, "BMSR 0x%x, BMCR 0x%x, ADV 0x%x, LPA 0x%x, IF_MODE 0x%x\n", bmsr, bmcr, adv, lpa, if_mode);
> > }
> > 
> > and call it from:
> > 
> > static void lynx_pcs_get_state(struct phylink_pcs *pcs, unsigned int neg_mode,
> > 			       struct phylink_link_state *state)
> > {
> > 	struct lynx_pcs *lynx = phylink_pcs_to_lynx(pcs);
> > 
> > 	lynx_pcs_debug(lynx->mdio); // <- here
> > 
> > 	switch (state->interface) {
> > 	...
> > 
> > With this, I would like to know:
> > (a) what is the IF_MODE register content outside of the IF_MODE_SGMII_EN
> >     and IF_MODE_USE_SGMII_AN bits.
> > (b) what is the SGMII code word advertised by the AQR115 in OCSGMII mode.
> > 
> > Then if you could replicate this test for 1Gbps medium link speed, it
> > would be great.
> 
> For now, I have reverted both the U-Boot and kernel fixes and added debug
> outputs for further analysis. Unfortunately the function
> `lynx_pcs_get_state` is never called in my kernel code. Therefore I put the
> debug function into `lynx_pcs_config`. Here is the output:
> 
>     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x29, BMCR 0x1140, ADV 0x4001, LPA 0xdc01, IF_MODE 0x3
> 
> I hope it'll help to analyze the problem further.

Correct. lynx_pcs_get_state() is only called for MLO_AN_INBAND (managed = "in-band-status"),
which the Lynx PCS driver does not currently support for 2500base-x.

However, I don't fully trust the positioning of the debug print into lynx_pcs_config().
The BMCR, ADV and IF_MODE registers look plausible, as if lynx_pcs_config() did what it
was supposed to do, but LPA (link config code word coming from AQR115) looks strange.
Field 11:10 (COP_SPD) is 0b11, which is a reserved value, neither 1G nor 100M nor 10M.
Maybe this is the mythical "SGMII 2500" auto-negotiation? Anyway, I don't think there is
any standard for it, and even if there was, the Lynx PCS doesn't implement it.

I'm surprised your AQR115 would transmit in-band code words for OCSGMII. None of the Aquantia
PHYs I've tested on were able to do that, and I'm not sure what register controls that.
If we look at your previous debugging output of the global system configuration registers:
https://lore.kernel.org/lkml/aJH8n0zheqB8tWzb@FUE-ALEWI-WINX/
we see that for 100M line side, the PHY uses "SerDes mode 4 autoneg 0". I also tried modifying
aqr_gen2_config_inband() to set VEND1_GLOBAL_CFG_AUTONEG_ENA for OCSGMII, but it didn't appear
to change anything, so that's probably not the setting. I'll have to ask somebody at Marvell.

In any case, contrary to my previous beliefs and according to your finding plus my parallel
testing, the Lynx PCS actually supports in-band auto-negotiation at 2500 data rate - both
2500base-x auto-negotiation and SGMII auto-negotiation (to the extent that this is a thing
that actually makes sense - it doesn't).

With IF_MODE=3 (SGMII_EN | USE_SGMII_AN), the PCS will automatically reconfigure the data path
for the speed decoded in hardware from the LPA_SGMII_SPD_MASK bits. Apparently it does this
for the lane data rate of 2500 just as it does for 1000, except that the
LPA_SGMII_SPD_MASK bits need to be 0b10 (gigabit) for traffic to pass. Otherwise, it tries to
perform symbol replication (as per your hw engineer's claim), and that didn't work in my testing
either (* details at the end).

I'm not saying that IF_MODE=3 is a valid configuration when the lane data rate is 2500.
It absolutely isn't, and your patch which changes IF_MODE to 0 seems ok. I'm just trying to
understand and then re-explain what the PCS does when configured in this mode, based on the
evidence.

Specifically when LPA_SGMII_SPD_MASK/COP_SPD is 0b11, it isn't documented how it would behave,
I don't have a protocol analyzer to count replicated symbols, and I'm unable to obtain a
functional data path to measure bandwidth with iperf3. Your hardware engineer's claim remains
the most trustworthy source of information we have.

Regarding lynx_pcs_get_state(): I actually was working on the patch attached, which I had in my
tree and I didn't realize it would impact your testing. I would kindly ask you to apply it as well.
Applying it alone would be enough to fix the IF_MODE=3 problem, but fixing the problem is not
what we want, instead we want to see the MII_LPA register value at lynx_pcs_get_state() time,
and for multiple link speeds.

For that, please break the link again, by making the following changes on top:

1. Configure IF_MODE=3 (SGMII autoneg format) for 2500base-x:

diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
index a88cbe67cc9d..ea42b8d813f3 100644
--- a/drivers/net/pcs/pcs-lynx.c
+++ b/drivers/net/pcs/pcs-lynx.c
@@ -152,11 +152,10 @@ static int lynx_pcs_config_giga(struct mdio_device *pcs,
 		mdiodev_write(pcs, LINK_TIMER_HI, link_timer >> 16);
 	}

-	if (interface == PHY_INTERFACE_MODE_1000BASEX ||
-	    interface == PHY_INTERFACE_MODE_2500BASEX) {
+	if (interface == PHY_INTERFACE_MODE_1000BASEX) {
 		if_mode = 0;
 	} else {
-		/* SGMII and QSGMII */
+		/* SGMII, QSGMII and (incorrectly) 2500base-x */
 		if_mode = IF_MODE_SGMII_EN;
 		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED)
 			if_mode |= IF_MODE_USE_SGMII_AN;

2. Edit your MAC OF node in the device tree and add:

&mac {
	managed = "in-band-status";  // this
	phy-mode = "2500base-x";
};

This will reliably cause the same behaviour as before, but with no dependency on U-Boot.

Because aqr_gen2_inband_caps() (code added recently to net-next) returns 0 (unknown)
for 2500base-x, phylink doesn't know whether or not the PHY will send in-band code words.
It will have to trust the firmware description with managed = "in-band-status", which will
lead neg_mode to be PHYLINK_PCS_NEG_INBAND_ENABLED, which will cause IF_MODE_USE_SGMII_AN
to be set.

Note: this is something else we'll have to look at later too. What bits control "OCSGMII"
link codeword transmission in Aquantia PHYs?

Your earlier assumption about why 100M is broken but 1G / 2.5G are not
only holds water if, as a result of testing at these other link speeds,
you find the MII_LPA register to contain 0b10 (Gigabit) in bits 11:10
(COP_SPD / LPA_SGMII_SPD_MASK). In other words, it worked, but purely by accident,
because phylink thought 2500base-x would not use autoneg, yet it did,
and by some miracle the SGMII format coincided and resulted in no change
in the link characteristics.

Regarding my patch vs yours, my thoughts on this topic are: the bug is
old, the PCS driver never worked if the registers were not as expected
(this is not a regression), and your patch is incomplete if MII_BMCR
also contains significant differences. I would recommend submitting
mine, as a new feature to net-next, when it reopens for patches for 6.19.
I've credited you with Co-developed-by due to the significance of your
findings. Thanks.

(*) When testing forced 10x SGMII symbol replication on a 3.125 Gbps
lane over a pair of optical SFP+ modules connected between two LS1028A Lynx PCS
blocks, I can see packets being transmitted, but on the receiver, the
RFRG mEMAC counter increases (rx_fragments). The documentation says for this:
"Incremented for each packet which is shorter than 64 octets received with
a wrong CRC. (Fragments are not delivered to the FIFO interface.)"
Without dedicated equipment, I'm unsure how to push this further.

[-- Attachment #2: 0001-net-pcs-lynx-accept-in-band-autoneg-for-2500base-x.patch --]
[-- Type: text/x-diff, Size: 4689 bytes --]

From 19a375a81f3dae953410f8da607e89c34095a21c Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Fri, 3 Oct 2025 17:20:20 +0300
Subject: [PATCH] net: pcs: lynx: accept in-band autoneg for 2500base-x

Testing in two circumstances:

1. back to back optical SFP+ connection between two LS1028A-QDS ports
   with the SCH-26908 riser card
2. T1042 with on-board AQR115 PHY using "OCSGMII", as per
   https://lore.kernel.org/lkml/aIuEvaSCIQdJWcZx@FUE-ALEWI-WINX/

strongly suggests that enabling in-band auto-negotiation is actually
possible when the lane baud rate is 3.125 Gbps.

It was previously thought that this would not be the case, because it
was only tested on 2500base-x links with on-board Aquantia PHYs, where
it was noticed that MII_LPA is always reported as zero, and it was
thought that this is because of the PCS.

Test case #1 above shows it is not, and the configured MII_ADVERTISE on
system A ends up in the MII_LPA on system B, when in 2500base-x mode
(IF_MODE=0).

Test case #2, which uses "SGMII" auto-negotiation (IF_MODE=3) for the
3.125 Gbps lane, is actually a misconfiguration, but it is what led to
the discovery.

There is actually an old bug in the Lynx PCS driver - it expects all
register values to contain their default out-of-reset values, as if the
PCS were initialized by the Reset Configuration Word (RCW) settings.
There are 2 cases in which this is problematic:
- if the bootloader (or previous kexec-enabled Linux) wrote a different
  IF_MODE value
- if dynamically changing the SerDes protocol from 1000base-x to
  2500base-x, e.g. by replacing the optical SFP module.

Specifically in test case #2, an accidental alignment between the
bootloader configuring the PCS to expect SGMII in-band code words, and
the AQR115 PHY actually transmitting SGMII in-band code words when
operating in the "OCSGMII" system interface protocol, led to the PCS
transmitting replicated symbols at 3.125 Gbps baud rate. This could only
have happened if the PCS saw and reacted to the SGMII code words in the
first place.

Since test #2 is invalid from a protocol perspective (there seems to be
no standard way of negotiating the data rate of 2500 Mbps with SGMII,
and the lower data rates should remain 10/100/1000), in-band auto-negotiation
for 2500base-x effectively means Clause 37 (i.e. IF_MODE=0).

Make 2500base-x be treated like 1000base-x in this regard, by removing
all prior limitations and calling lynx_pcs_config_giga().

This adds a new feature: LINK_INBAND_ENABLE and at the same time fixes
the Lynx PCS's long standing problem that the registers (specifically
IF_MODE, but others could be misconfigured as well) are not written by
the driver to the known valid values for 2500base-x.

Co-developed-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Signed-off-by: Alexander Wilhelm <alexander.wilhelm@westermo.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 drivers/net/pcs/pcs-lynx.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
index 677f92883976..a88cbe67cc9d 100644
--- a/drivers/net/pcs/pcs-lynx.c
+++ b/drivers/net/pcs/pcs-lynx.c
@@ -40,12 +40,12 @@ static unsigned int lynx_pcs_inband_caps(struct phylink_pcs *pcs,
 {
 	switch (interface) {
 	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
 		return LINK_INBAND_DISABLE | LINK_INBAND_ENABLE;
 
 	case PHY_INTERFACE_MODE_10GBASER:
-	case PHY_INTERFACE_MODE_2500BASEX:
 		return LINK_INBAND_DISABLE;
 
 	case PHY_INTERFACE_MODE_USXGMII:
@@ -152,7 +152,8 @@ static int lynx_pcs_config_giga(struct mdio_device *pcs,
 		mdiodev_write(pcs, LINK_TIMER_HI, link_timer >> 16);
 	}
 
-	if (interface == PHY_INTERFACE_MODE_1000BASEX) {
+	if (interface == PHY_INTERFACE_MODE_1000BASEX ||
+	    interface == PHY_INTERFACE_MODE_2500BASEX) {
 		if_mode = 0;
 	} else {
 		/* SGMII and QSGMII */
@@ -202,15 +203,9 @@ static int lynx_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
 	case PHY_INTERFACE_MODE_1000BASEX:
 	case PHY_INTERFACE_MODE_SGMII:
 	case PHY_INTERFACE_MODE_QSGMII:
+	case PHY_INTERFACE_MODE_2500BASEX:
 		return lynx_pcs_config_giga(lynx->mdio, ifmode, advertising,
 					    neg_mode);
-	case PHY_INTERFACE_MODE_2500BASEX:
-		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
-			dev_err(&lynx->mdio->dev,
-				"AN not supported on 3.125GHz SerDes lane\n");
-			return -EOPNOTSUPP;
-		}
-		break;
 	case PHY_INTERFACE_MODE_USXGMII:
 	case PHY_INTERFACE_MODE_10G_QXGMII:
 		return lynx_pcs_config_usxgmii(lynx->mdio, ifmode, advertising,
-- 
2.34.1



* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08 11:10                                                           ` Vladimir Oltean
@ 2025-10-08 12:52                                                             ` Russell King (Oracle)
  2025-10-08 13:00                                                               ` Vladimir Oltean
  2025-10-08 13:28                                                             ` Alexander Wilhelm
  1 sibling, 1 reply; 53+ messages in thread
From: Russell King (Oracle) @ 2025-10-08 12:52 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Oct 08, 2025 at 02:10:59PM +0300, Vladimir Oltean wrote:
> 1. Configure IF_MODE=3 (SGMII autoneg format) for 2500base-x:

Just to be clear, we're not going to accept SGMII autoneg format for
2500base-X.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08 12:52                                                             ` Russell King (Oracle)
@ 2025-10-08 13:00                                                               ` Vladimir Oltean
  0 siblings, 0 replies; 53+ messages in thread
From: Vladimir Oltean @ 2025-10-08 13:00 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Alexander Wilhelm, Andrew Lunn, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Wed, Oct 08, 2025 at 01:52:41PM +0100, Russell King (Oracle) wrote:
> On Wed, Oct 08, 2025 at 02:10:59PM +0300, Vladimir Oltean wrote:
> > 1. Configure IF_MODE=3 (SGMII autoneg format) for 2500base-x:
> 
> Just to be clear, we're not going to accept SGMII autoneg format for
> 2500base-X.

The "good" change is in the attachments of my previous email. The inline
diff on which you've commented is solely for further debugging of the
original issue, and is supposed to be applied on top of the "good"
change, and undo it (and in the process, get lynx_pcs_get_state() to be
called).

I know we're not going to accept SGMII autoneg format for 2500base-x, I
also said so in a few places in the email and in the "good" patch's
commit message.


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08 11:10                                                           ` Vladimir Oltean
  2025-10-08 12:52                                                             ` Russell King (Oracle)
@ 2025-10-08 13:28                                                             ` Alexander Wilhelm
  2025-10-08 14:55                                                               ` Vladimir Oltean
  1 sibling, 1 reply; 53+ messages in thread
From: Alexander Wilhelm @ 2025-10-08 13:28 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Wed, Oct 08, 2025 at 02:10:59PM +0300, Vladimir Oltean wrote:
[...]
> For that, please break the link again, by making the following changes on top:
> 
> 1. Configure IF_MODE=3 (SGMII autoneg format) for 2500base-x:
> 
> diff --git a/drivers/net/pcs/pcs-lynx.c b/drivers/net/pcs/pcs-lynx.c
> index a88cbe67cc9d..ea42b8d813f3 100644
> --- a/drivers/net/pcs/pcs-lynx.c
> +++ b/drivers/net/pcs/pcs-lynx.c
> @@ -152,11 +152,10 @@ static int lynx_pcs_config_giga(struct mdio_device *pcs,
>  		mdiodev_write(pcs, LINK_TIMER_HI, link_timer >> 16);
>  	}
> 
> -	if (interface == PHY_INTERFACE_MODE_1000BASEX ||
> -	    interface == PHY_INTERFACE_MODE_2500BASEX) {
> +	if (interface == PHY_INTERFACE_MODE_1000BASEX) {
>  		if_mode = 0;
>  	} else {
> -		/* SGMII and QSGMII */
> +		/* SGMII, QSGMII and (incorrectly) 2500base-x */
>  		if_mode = IF_MODE_SGMII_EN;
>  		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED)
>  			if_mode |= IF_MODE_USE_SGMII_AN;
> 
> 2. Edit your MAC OF node in the device tree and add:
> 
> &mac {
> 	managed = "in-band-status";  // this
> 	phy-mode = "2500base-x";
> };
> 
> This will reliably cause the same behaviour as before, but with no dependency on U-Boot.

I have the broken 100M link state again (IF_MODE=3). Below are the debug
details I was able to observe:

* With 2.5G link:

    mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xdc01, IF_MODE 0x3

* With 1G link:

    mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd801, IF_MODE 0x3

* With 100M link:

    mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd401, IF_MODE 0x3

[...]
> Regarding my patch vs yours, my thoughts on this topic are: the bug is
> old, the PCS driver never worked if the registers were not as expected
> (this is not a regression), and your patch is incomplete if MII_BMCR
> also contains significant differences. I would recommend submitting
> mine, as a new feature to net-next, when it reopens for patches for 6.19.
> I've credited you with Co-developed-by due to the significance of your
> findings. Thanks.

Sure, thank you.


Best regards
Alexander Wilhelm


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08 13:28                                                             ` Alexander Wilhelm
@ 2025-10-08 14:55                                                               ` Vladimir Oltean
  2025-10-09  6:05                                                                 ` Alexander Wilhelm
  0 siblings, 1 reply; 53+ messages in thread
From: Vladimir Oltean @ 2025-10-08 14:55 UTC (permalink / raw)
  To: Alexander Wilhelm
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Wed, Oct 08, 2025 at 03:28:07PM +0200, Alexander Wilhelm wrote:
> I have the broken 100M link state again (IF_MODE=3). Below are the debug
> details I was able to observe:
> 
> * With 2.5G link:
> 
>     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xdc01, IF_MODE 0x3
> 
> * With 1G link:
> 
>     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd801, IF_MODE 0x3
> 
> * With 100M link:
> 
>     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd401, IF_MODE 0x3

Ok, this is why I didn't trust the print from lynx_pcs_config(). BMSR was 0x29
in your previous log (no link) and is 0x2d now. Also, the LPA for 100M is
different (I trust this one).
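
For reference, the two BMSR readings differ in a single bit. Below is a
minimal sketch, assuming the PCS follows the standard Clause 22 status
register layout (the bit values match the kernel's mii.h). The helper names
are made up for illustration and are not part of pcs-lynx.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard Clause 22 BMSR bits, same values as in the kernel's mii.h */
#define BMSR_LSTATUS		0x0004	/* link status */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation complete */

/* Hypothetical helper: true if the status register reports link up */
static inline bool bmsr_link_up(uint16_t bmsr)
{
	return (bmsr & BMSR_LSTATUS) != 0;
}

/* Hypothetical helper: true if auto-negotiation is reported complete */
static inline bool bmsr_an_complete(uint16_t bmsr)
{
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}
```

With the values from the two logs, bmsr_link_up(0x29) is false and
bmsr_link_up(0x2d) is true; 0x29 and 0x2d differ only in BMSR_LSTATUS.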

We have:
2.5G link: LPA_SGMII_SPD_MASK bits = 0b11 => undefined behaviour, reserved value
1G link: LPA_SGMII_SPD_MASK bits = 0b10 => 1G, the only proper value (by coincidence, of course)
100M link: LPA_SGMII_SPD_MASK bits = 0b01 => 100M, PHY practically requests 10x symbol replication, and the Lynx PCS obliges

So the AQR115 PHY uses the SGMII base page format, and with the IF_MODE=0 fix,
the Lynx PCS uses the Clause 37 base page format.
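
The decoding above can be reproduced with a few lines of C. This is only a
sketch: the mask and speed values are the LPA_SGMII_* constants from the
kernel's mii.h, while sgmii_lpa_speed() is a hypothetical helper written
for this illustration, not an existing kernel function:

```c
#include <stdint.h>

/* SGMII code word speed field (bits 11:10), values as in the kernel's mii.h */
#define LPA_SGMII_SPD_MASK	0x0c00
#define LPA_SGMII_10		0x0000	/* 0b00 */
#define LPA_SGMII_100		0x0400	/* 0b01 */
#define LPA_SGMII_1000		0x0800	/* 0b10 */

/*
 * Hypothetical helper: return the speed in Mbps requested by an SGMII
 * code word, or -1 for the reserved 0b11 encoding.
 */
static inline int sgmii_lpa_speed(uint16_t lpa)
{
	switch (lpa & LPA_SGMII_SPD_MASK) {
	case LPA_SGMII_10:
		return 10;
	case LPA_SGMII_100:
		return 100;
	case LPA_SGMII_1000:
		return 1000;
	default:
		return -1;	/* 0b11: reserved */
	}
}
```

Fed with the three LPA values from the logs, this gives -1 (reserved) for
0xdc01, 1000 for 0xd801 and 100 for 0xd401.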

We know that in-band autoneg is enabled in the AQR115 PHY and we don't
know how to disable it, and we know that for traffic to pass, one of two
things must happen:

1. In-band autoneg must complete (as required by LINK_INBAND_ENABLE).
   This happens when we have managed = "in-band-status" in the device tree.
   - From the AQR115 perspective, SGMII AN completes if it receives a base page
     with the ACK bit set. Since SGMII and Clause 37 are compatible in this
     regard (the ACK bit is in the same position, bit 14), the Lynx PCS
     fulfills what the AQR115 expects.
   - From the Lynx PCS perspective, clause 37 AN also completes if it
     receives a base page with the ACK bit set. Which again it does, but
     the SGMII code word overlaps in strange ways (Next Page and Remote
     Fault 1 end up being set, neither Half Duplex nor Full Duplex bits
     are set), so the Lynx PCS may behave in unpredictable ways.
2. In-band autoneg fails, but the AQR115 PHY falls back to full data
   rate anyway (as permitted by LINK_INBAND_BYPASS). This happens when
   we do _not_ have managed = "in-band-status" in the device tree.
   The Lynx PCS does not respond with code words having the ACK bit set,
   and does not generate clause 37 code words of its own, instead goes
   to data mode directly. AQR115 eventually goes to data mode too.

I expect that your setup works through #2 right now.
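
To illustrate the overlap mentioned in #1, here is a rough sketch of how a
Clause 37 receiver would read the SGMII code words from the logs. The bit
positions are the Clause 37 base page ones. The C37_* names, the struct and
c37_decode() are assumptions made up for this example and do not exist in
the driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Clause 37 (1000base-x) base page bits; names are illustrative only */
#define C37_FD		0x0020	/* bit 5: full duplex */
#define C37_HD		0x0040	/* bit 6: half duplex */
#define C37_RF1		0x1000	/* bit 12: remote fault 1 */
#define C37_RF2		0x2000	/* bit 13: remote fault 2 */
#define C37_ACK		0x4000	/* bit 14: acknowledge */
#define C37_NP		0x8000	/* bit 15: next page */

struct c37_view {
	bool full_duplex;
	bool half_duplex;
	bool rf1;
	bool rf2;
	bool ack;
	bool next_page;
};

/* Hypothetical helper: interpret a received code word as Clause 37 */
static inline struct c37_view c37_decode(uint16_t page)
{
	struct c37_view v = {
		.full_duplex = (page & C37_FD) != 0,
		.half_duplex = (page & C37_HD) != 0,
		.rf1 = (page & C37_RF1) != 0,
		.rf2 = (page & C37_RF2) != 0,
		.ack = (page & C37_ACK) != 0,
		.next_page = (page & C37_NP) != 0,
	};

	return v;
}
```

For all three logged values (0xdc01, 0xd801, 0xd401) this yields ACK, Next
Page and Remote Fault 1 set, with neither duplex bit set. In the SGMII
format, bit 15 means "link up" and bit 12 means "full duplex", which is
where the stray Next Page and Remote Fault 1 come from.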

The symbol replication aspect is now clarified, there is a new question mark caused by
the 0b11 speed bits also empirically passing traffic despite being a reserved value,
and in order to gain a bit more control over things and make them more robust, we need
to see how the PHY driver can implement aqr_gen2_inband_caps() and aqr_gen2_config_inband()
for PHY_INTERFACE_MODE_2500BASEX, and fix up the base pages the PHY is sending
(the current format is broken per all known standards).

Thanks a lot for testing.


* Re: Aquantia PHY in OCSGMII mode?
  2025-10-08 14:55                                                               ` Vladimir Oltean
@ 2025-10-09  6:05                                                                 ` Alexander Wilhelm
  0 siblings, 0 replies; 53+ messages in thread
From: Alexander Wilhelm @ 2025-10-09  6:05 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Russell King (Oracle), Andrew Lunn, Heiner Kallweit,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel

On Wed, Oct 08, 2025 at 05:55:49PM +0300, Vladimir Oltean wrote:
> On Wed, Oct 08, 2025 at 03:28:07PM +0200, Alexander Wilhelm wrote:
> > I have the broken 100M link state again (IF_MODE=3). Below are the debug
> > details I was able to observe:
> > 
> > * With 2.5G link:
> > 
> >     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xdc01, IF_MODE 0x3
> > 
> > * With 1G link:
> > 
> >     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd801, IF_MODE 0x3
> > 
> > * With 100M link:
> > 
> >     mdio_bus 0x0000000ffe4e5000:00: BMSR 0x2d, BMCR 0x1140, ADV 0x41a0, LPA 0xd401, IF_MODE 0x3
> 
> Ok, this is why I didn't trust the print from lynx_pcs_config(). BMSR was 0x29
> in your previous log (no link) and is 0x2d now. Also, the LPA for 100M is
> different (I trust this one).
> 
> We have:
> 2.5G link: LPA_SGMII_SPD_MASK bits = 0b11 => undefined behaviour, reserved value
> 1G link: LPA_SGMII_SPD_MASK bits = 0b10 => 1G, the only proper value (by coincidence, of course)
> 100M link: LPA_SGMII_SPD_MASK bits = 0b01 => 100M, PHY practically requests 10x symbol replication, and the Lynx PCS obliges
> 
> So the AQR115 PHY uses the SGMII base page format, and with the IF_MODE=0 fix,
> the Lynx PCS uses the Clause 37 base page format.
> 
> We know that in-band autoneg is enabled in the AQR115 PHY and we don't
> know how to disable it, and we know that for traffic to pass, one of two
> things must happen:
> 
> 1. In-band autoneg must complete (as required by LINK_INBAND_ENABLE).
>    This happens when we have managed = "in-band-status" in the device tree.
>    - From the AQR115 perspective, SGMII AN completes if it receives a base page
>      with the ACK bit set. Since SGMII and Clause 37 are compatible in this
>      regard (the ACK bit is in the same position, bit 14), the Lynx PCS
>      fulfills what the AQR115 expects.
>    - From the Lynx PCS perspective, clause 37 AN also completes if it
>      receives a base page with the ACK bit set. Which again it does, but
>      the SGMII code word overlaps in strange ways (Next Page and Remote
>      Fault 1 end up being set, neither Half Duplex nor Full Duplex bits
>      are set), so the Lynx PCS may behave in unpredictable ways.
> 2. In-band autoneg fails, but the AQR115 PHY falls back to full data
>    rate anyway (as permitted by LINK_INBAND_BYPASS). This happens when
>    we do _not_ have managed = "in-band-status" in the device tree.
>    The Lynx PCS does not respond with code words having the ACK bit set,
>    and does not generate clause 37 code words of its own, instead goes
>    to data mode directly. AQR115 eventually goes to data mode too.
> 
> I expect that your setup works through #2 right now.
> 
> The symbol replication aspect is now clarified, there is a new question mark caused by
> the 0b11 speed bits also empirically passing traffic despite being a reserved value,
> and in order to gain a bit more control over things and make them more robust, we need
> to see how the PHY driver can implement aqr_gen2_inband_caps() and aqr_gen2_config_inband()
> for PHY_INTERFACE_MODE_2500BASEX, and fix up the base pages the PHY is sending
> (the current format is broken per all known standards).
> 
> Thanks a lot for testing.

It was my pleasure to help. Thank you for your patch suggestions and especially
for the detailed explanations. I now have a much better understanding of how the
PHY and MAC interact.


Best regards
Alexander Wilhelm


