[PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
@ 2024-11-22 22:48 Asmaa Mnebhi
  2024-11-26 17:32 ` Andrew Lunn
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2024-11-22 22:48 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni
  Cc: Asmaa Mnebhi, netdev, linux-kernel, David Thompson

From: asmaa <asmaa@nvidia.com>

Once the BlueField-3 MDIO clock is enabled by software, it is expected
and intended for it to keep toggling. BlueField-3 has a hardware GPIO bug
where constant toggling at "high frequencies" will lead to GPIO
degradation.

The workaround suggested by the hardware team is to lower down the clock
frequency. That will increase the "life expectation" of the GPIO.
The lowest possible frequency we can achieve is 1.09Mhz by setting
mdio_period = 0xFF.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
---
 .../ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c
index 654190263535..d6dd36ab599e 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_mdio.c
@@ -96,6 +96,7 @@ static struct mlxbf_gige_mdio_gw mlxbf_gige_mdio_gw_t[] = {
 #define MLXBF_GIGE_MDIO_FREQ_REFERENCE 156250000ULL
 #define MLXBF_GIGE_MDIO_COREPLL_CONST  16384ULL
 #define MLXBF_GIGE_MDC_CLK_NS          400
+#define MLXBF_GIGE_BF3_MDIO_PERIOD     0xFF
 #define MLXBF_GIGE_MDIO_PLL_I1CLK_REG1 0x4
 #define MLXBF_GIGE_MDIO_PLL_I1CLK_REG2 0x8
 #define MLXBF_GIGE_MDIO_CORE_F_SHIFT   0
@@ -178,9 +179,16 @@ static u8 mdio_period_map(struct mlxbf_gige *priv)
 	u8 mdio_period;
 	u64 i1clk;
 
-	i1clk = calculate_i1clk(priv);
-
-	mdio_period = div_u64((MLXBF_GIGE_MDC_CLK_NS >> 1) * i1clk, 1000000000) - 1;
+	/* The MDIO clock frequency need to be set as low as possible to avoid
+	 * a BF3 hardware GPIO degradation. The lowest frequency can be achieved
+	 * by setting MdioPeriod = 0xFF.
+	 */
+	if (priv->hw_version == MLXBF_GIGE_VERSION_BF3) {
+		mdio_period = MLXBF_GIGE_BF3_MDIO_PERIOD;
+	} else {
+		i1clk = calculate_i1clk(priv);
+		mdio_period = div_u64((MLXBF_GIGE_MDC_CLK_NS >> 1) * i1clk, 1000000000) - 1;
+	}
 
 	return mdio_period;
 }
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2024-11-22 22:48 [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug Asmaa Mnebhi
@ 2024-11-26 17:32 ` Andrew Lunn
  2025-05-08 14:18   ` Asmaa Mnebhi
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Lunn @ 2024-11-26 17:32 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem, edumazet, kuba, pabeni, netdev, linux-kernel,
	David Thompson

On Fri, Nov 22, 2024 at 10:48:27PM +0000, Asmaa Mnebhi wrote:
> From: asmaa <asmaa@nvidia.com>
> 
> Once the BlueField-3 MDIO clock is enabled by software, it is expected
> and intended for it to keep toggling. BlueField-3 has a hardware GPIO bug
> where constant toggling at "high frequencies" will lead to GPIO
> degradation.
> 
> The workaround suggested by the hardware team is to lower down the clock
> frequency. That will increase the "life expectation" of the GPIO.
> The lowest possible frequency we can achieve is 1.09Mhz by setting
> mdio_period = 0xFF.

802.3 says:

  22.2.2.13 MDC (management data clock)

  MDC is sourced by the station management entity to the PHY as the
  timing reference for transfer of information on the MDIO signal. MDC
  is an aperiodic signal that has no maximum high or low times. The
  minimum high and low times for MDC shall be 160 ns each, and the
  minimum period for MDC shall be 400 ns, regardless of the nominal
  period of TX_CLK and RX_CLK.

My reading of this is that you can stop the clock when it is not
needed. Maybe tie into the Linux runtime power management
framework. It can keep track of how long a device has been idle, and
if a timer is exceeded, make a callback to power it down.

If you have an MDIO bus with one PHY on it, the access pattern is
likely to be a small bunch of reads followed by about one second of
idle time. I would of thought that stopping the clock increases the
life expectancy of you hardware more than just slowing it down.

	Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2024-11-26 17:32 ` Andrew Lunn
@ 2025-05-08 14:18   ` Asmaa Mnebhi
  2025-05-08 14:44     ` Andrew Lunn
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2025-05-08 14:18 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> > Once the BlueField-3 MDIO clock is enabled by software, it is expected
> > and intended for it to keep toggling. BlueField-3 has a hardware GPIO
> > bug where constant toggling at "high frequencies" will lead to GPIO
> > degradation.
> >
> > The workaround suggested by the hardware team is to lower down the
> > clock frequency. That will increase the "life expectation" of the GPIO.
> > The lowest possible frequency we can achieve is 1.09Mhz by setting
> > mdio_period = 0xFF.
> 
> 802.3 says:
> 
>   22.2.2.13 MDC (management data clock)
> 
>   MDC is sourced by the station management entity to the PHY as the
>   timing reference for transfer of information on the MDIO signal. MDC
>   is an aperiodic signal that has no maximum high or low times. The
>   minimum high and low times for MDC shall be 160 ns each, and the
>   minimum period for MDC shall be 400 ns, regardless of the nominal
>   period of TX_CLK and RX_CLK.
> 
> My reading of this is that you can stop the clock when it is not needed. Maybe
> tie into the Linux runtime power management framework. It can keep track of
> how long a device has been idle, and if a timer is exceeded, make a callback to
> power it down.
> 
> If you have an MDIO bus with one PHY on it, the access pattern is likely to be a
> small bunch of reads followed by about one second of idle time. I would of
> thought that stopping the clock increases the life expectancy of you hardware
> more than just slowing it down.

Hi Andrew, 

Thank you for your answer and apologies for the very late response. My concern with completely stopping the clock is the case we are using the PHY polling mode for the link status? We would need MDIO to always be operational for polling to work, wouldn't we?

Thanks.
Asmaa


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-05-08 14:18   ` Asmaa Mnebhi
@ 2025-05-08 14:44     ` Andrew Lunn
  2025-05-08 14:53       ` Asmaa Mnebhi
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Lunn @ 2025-05-08 14:44 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> > My reading of this is that you can stop the clock when it is not needed. Maybe
> > tie into the Linux runtime power management framework. It can keep track of
> > how long a device has been idle, and if a timer is exceeded, make a callback to
> > power it down.
> > 
> > If you have an MDIO bus with one PHY on it, the access pattern is likely to be a
> > small bunch of reads followed by about one second of idle time. I would of
> > thought that stopping the clock increases the life expectancy of you hardware
> > more than just slowing it down.
> 
> Hi Andrew, 
> 

> Thank you for your answer and apologies for the very late
> response. My concern with completely stopping the clock is the case
> we are using the PHY polling mode for the link status? We would need
> MDIO to always be operational for polling to work, wouldn't we?

You should look at how power management work. For example, in the FEC
driver:

https://elixir.bootlin.com/linux/v6.14.5/source/drivers/net/ethernet/freescale/fec_main.c#L2180

static int fec_enet_mdio_read_c22(struct mii_bus *bus, int mii_id, int regnum)
{
	struct fec_enet_private *fep = bus->priv;
	struct device *dev = &fep->pdev->dev;
	int ret = 0, frame_start, frame_addr, frame_op;

	ret = pm_runtime_resume_and_get(dev);
	if (ret < 0)
		return ret;

This will use runtime PM to get the clocks ticking.


	/* C22 read */
	frame_op = FEC_MMFR_OP_READ;
	frame_start = FEC_MMFR_ST;
	frame_addr = regnum;

	/* start a read op */
	writel(frame_start | frame_op |
	       FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(frame_addr) |
	       FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);

	/* wait for end of transfer */
	ret = fec_enet_mdio_wait(fep);
	if (ret) {
		netdev_err(fep->netdev, "MDIO read timeout\n");
		goto out;
	}

	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));

This all does the MDIO bus transaction

out:
	pm_runtime_mark_last_busy(dev);
	pm_runtime_put_autosuspend(dev);

And then tell PM that we are done. In this case, i _think_ it starts a
timer, and if there is no more MDIO activity for a while, the clocks
get disabled.

The same is done for write. 

PHY polling happens once per second, using these methods, nothing
special. So the clock will get enabled on the first read, polling can
need to read a few registers, so due to the timer, the clock is left
ticking between these reads, and then after a while the clock is
disabled.

My guess is, you can have the clock disabled 80% of the time, which is
probably going to be a better way to stop the magic smoke escaping
from your hardware than slowing down the clock.

	Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-05-08 14:44     ` Andrew Lunn
@ 2025-05-08 14:53       ` Asmaa Mnebhi
  2025-06-27 19:00         ` Asmaa Mnebhi
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2025-05-08 14:53 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> > > My reading of this is that you can stop the clock when it is not
> > > needed. Maybe tie into the Linux runtime power management framework.
> > > It can keep track of how long a device has been idle, and if a timer
> > > is exceeded, make a callback to power it down.
> > >
> > > If you have an MDIO bus with one PHY on it, the access pattern is
> > > likely to be a small bunch of reads followed by about one second of
> > > idle time. I would of thought that stopping the clock increases the
> > > life expectancy of you hardware more than just slowing it down.
> >
> > Hi Andrew,
> >
> 
> > Thank you for your answer and apologies for the very late
> > response. My concern with completely stopping the clock is the case
> > we are using the PHY polling mode for the link status? We would need
> > MDIO to always be operational for polling to work, wouldn't we?
> 
> You should look at how power management work. For example, in the FEC
> driver:
> 
> https://elixir.bootlin.com/linux/v6.14.5/source/drivers/net/ethernet/freescale/f
> ec_main.c#L2180
> 
> static int fec_enet_mdio_read_c22(struct mii_bus *bus, int mii_id, int regnum)
> {
> 	struct fec_enet_private *fep = bus->priv;
> 	struct device *dev = &fep->pdev->dev;
> 	int ret = 0, frame_start, frame_addr, frame_op;
> 
> 	ret = pm_runtime_resume_and_get(dev);
> 	if (ret < 0)
> 		return ret;
> 
> This will use runtime PM to get the clocks ticking.
> 
> 
> 	/* C22 read */
> 	frame_op = FEC_MMFR_OP_READ;
> 	frame_start = FEC_MMFR_ST;
> 	frame_addr = regnum;
> 
> 	/* start a read op */
> 	writel(frame_start | frame_op |
> 	       FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(frame_addr) |
> 	       FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> 
> 	/* wait for end of transfer */
> 	ret = fec_enet_mdio_wait(fep);
> 	if (ret) {
> 		netdev_err(fep->netdev, "MDIO read timeout\n");
> 		goto out;
> 	}
> 
> 	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
> 
> This all does the MDIO bus transaction
> 
> out:
> 	pm_runtime_mark_last_busy(dev);
> 	pm_runtime_put_autosuspend(dev);
> 
> And then tell PM that we are done. In this case, i _think_ it starts a
> timer, and if there is no more MDIO activity for a while, the clocks
> get disabled.
> 
> The same is done for write.
> 
> PHY polling happens once per second, using these methods, nothing
> special. So the clock will get enabled on the first read, polling can
> need to read a few registers, so due to the timer, the clock is left
> ticking between these reads, and then after a while the clock is
> disabled.
> 
> My guess is, you can have the clock disabled 80% of the time, which is
> probably going to be a better way to stop the magic smoke escaping
> from your hardware than slowing down the clock.

Sweet! Thank you very much Andrew! I will make the changes and send a new patch soon.

Asmaa

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-05-08 14:53       ` Asmaa Mnebhi
@ 2025-06-27 19:00         ` Asmaa Mnebhi
  2025-06-28  8:40           ` Andrew Lunn
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2025-06-27 19:00 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson



> -----Original Message-----
> From: Asmaa Mnebhi <asmaa@nvidia.com>
> Sent: Thursday, May 8, 2025 10:53 AM
> To: Andrew Lunn <andrew@lunn.ch>
> Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org;
> David Thompson <davthompson@nvidia.com>
> Subject: RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO
> degradation bug
> 
> > > > My reading of this is that you can stop the clock when it is not
> > > > needed. Maybe tie into the Linux runtime power management
> framework.
> > > > It can keep track of how long a device has been idle, and if a
> > > > timer is exceeded, make a callback to power it down.
> > > >
> > > > If you have an MDIO bus with one PHY on it, the access pattern is
> > > > likely to be a small bunch of reads followed by about one second
> > > > of idle time. I would of thought that stopping the clock increases
> > > > the life expectancy of you hardware more than just slowing it down.
> > >
> > > Hi Andrew,
> > >
> >
> > > Thank you for your answer and apologies for the very late response.
> > > My concern with completely stopping the clock is the case we are
> > > using the PHY polling mode for the link status? We would need MDIO
> > > to always be operational for polling to work, wouldn't we?
> >
> > You should look at how power management work. For example, in the FEC
> > driver:
> >
> > https://elixir.bootlin.com/linux/v6.14.5/source/drivers/net/ethernet/f
> > reescale/f
> > ec_main.c#L2180
> >
> > static int fec_enet_mdio_read_c22(struct mii_bus *bus, int mii_id, int
> > regnum) {
> > 	struct fec_enet_private *fep = bus->priv;
> > 	struct device *dev = &fep->pdev->dev;
> > 	int ret = 0, frame_start, frame_addr, frame_op;
> >
> > 	ret = pm_runtime_resume_and_get(dev);
> > 	if (ret < 0)
> > 		return ret;
> >
> > This will use runtime PM to get the clocks ticking.
> >
> >
> > 	/* C22 read */
> > 	frame_op = FEC_MMFR_OP_READ;
> > 	frame_start = FEC_MMFR_ST;
> > 	frame_addr = regnum;
> >
> > 	/* start a read op */
> > 	writel(frame_start | frame_op |
> > 	       FEC_MMFR_PA(mii_id) | FEC_MMFR_RA(frame_addr) |
> > 	       FEC_MMFR_TA, fep->hwp + FEC_MII_DATA);
> >
> > 	/* wait for end of transfer */
> > 	ret = fec_enet_mdio_wait(fep);
> > 	if (ret) {
> > 		netdev_err(fep->netdev, "MDIO read timeout\n");
> > 		goto out;
> > 	}
> >
> > 	ret = FEC_MMFR_DATA(readl(fep->hwp + FEC_MII_DATA));
> >
> > This all does the MDIO bus transaction
> >
> > out:
> > 	pm_runtime_mark_last_busy(dev);
> > 	pm_runtime_put_autosuspend(dev);
> >
> > And then tell PM that we are done. In this case, i _think_ it starts a
> > timer, and if there is no more MDIO activity for a while, the clocks
> > get disabled.
> >
> > The same is done for write.
> >
> > PHY polling happens once per second, using these methods, nothing
> > special. So the clock will get enabled on the first read, polling can
> > need to read a few registers, so due to the timer, the clock is left
> > ticking between these reads, and then after a while the clock is
> > disabled.
> >
> > My guess is, you can have the clock disabled 80% of the time, which is
> > probably going to be a better way to stop the magic smoke escaping
> > from your hardware than slowing down the clock.
> 
> Sweet! Thank you very much Andrew! I will make the changes and send a new
> patch soon.
> 

Hi Andrew, 

I implemented and tested the changes you suggested and the runtime_resume/suspend work smoothly for MDIO.
However, we have another issue. I noticed that even if mdio_read/write() functions are not being called, runtime_resume/suspend() are still called regularly. After investigation, I found out that this is due to ethtool being called regularly. Ethtool automatically triggers the resume/suspend even if we do no MDIO access. A different team wrote a script which monitors "ethtool -S eth0" every 60 seconds. So every minute, we are running resume/suspend and enabling/disabling the MDIO clock. Seems counter productive. That team said that it is a requirement that they collect these statistics about the mlxbf_gige interface.
Is there any way to prevent ethtool from calling resume/suspend without changing core kernel code?
If not, what would you suggest? 

Thanks.
Asmaa

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-06-27 19:00         ` Asmaa Mnebhi
@ 2025-06-28  8:40           ` Andrew Lunn
  2025-07-02 20:56             ` Asmaa Mnebhi
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Lunn @ 2025-06-28  8:40 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> Hi Andrew, 
> 

> I implemented and tested the changes you suggested and the
> runtime_resume/suspend work smoothly for MDIO.

> However, we have another issue. I noticed that even if
> mdio_read/write() functions are not being called,
> runtime_resume/suspend() are still called regularly. After
> investigation, I found out that this is due to ethtool being called
> regularly. Ethtool automatically triggers the resume/suspend even if
> we do no MDIO access. A different team wrote a script which monitors
> "ethtool -S eth0" every 60 seconds. So every minute, we are running
> resume/suspend and enabling/disabling the MDIO clock. Seems counter
> productive. That team said that it is a requirement that they
> collect these statistics about the mlxbf_gige interface.

> Is there any way to prevent ethtool from calling resume/suspend
> without changing core kernel code?

> If not, what would you suggest? 

You need to put the MDIO bus device into its own pm_domain. Try
calling dev_pm_domain_set() to separate the MDIO bus from the MAC
driver in terms of power domains. ethtool will then power on/off the
MAC but leave the MDIO bus alone.

	Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-06-28  8:40           ` Andrew Lunn
@ 2025-07-02 20:56             ` Asmaa Mnebhi
  2025-07-02 21:26               ` Andrew Lunn
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2025-07-02 20:56 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> > However, we have another issue. I noticed that even if
> > mdio_read/write() functions are not being called,
> > runtime_resume/suspend() are still called regularly. After
> > investigation, I found out that this is due to ethtool being called
> > regularly. Ethtool automatically triggers the resume/suspend even if
> > we do no MDIO access. A different team wrote a script which monitors
> > "ethtool -S eth0" every 60 seconds. So every minute, we are running
> > resume/suspend and enabling/disabling the MDIO clock. Seems counter
> > productive. That team said that it is a requirement that they
> > collect these statistics about the mlxbf_gige interface.
> 
> > Is there any way to prevent ethtool from calling resume/suspend
> > without changing core kernel code?
> > 
> You need to put the MDIO bus device into its own pm_domain. Try
> calling dev_pm_domain_set() to separate the MDIO bus from the MAC
> driver in terms of power domains. ethtool will then power on/off the
> MAC but leave the MDIO bus alone.
> 
Using dev_pm_domain_set() has the same effect as SET_RUNTIME_PM_OPS. The dev struct is shared so ethtool is still calling the suspend/resume.

int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige *priv)
 {
        struct device *dev = &pdev->dev;
@@ -390,14 +418,27 @@ int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige *priv)
        snprintf(priv->mdiobus->id, MII_BUS_ID_SIZE, "%s",
                 dev_name(dev));

+       pm_runtime_set_autosuspend_delay(priv->mdiobus->parent, 100);
+       pm_runtime_use_autosuspend(priv->mdiobus->parent);
+       pm_runtime_set_active(priv->mdiobus->parent);
+       pm_runtime_enable(priv->mdiobus->parent);
+       dev_pm_domain_set(priv->mdiobus->parent, &mlxbf_gige_pm_domain);
+
        ret = mdiobus_register(priv->mdiobus);
-       if (ret)
+       if (ret) {
+               pm_runtime_disable(priv->mdiobus->parent);
+               pm_runtime_set_suspended(priv->mdiobus->parent);
+               pm_runtime_dont_use_autosuspend(priv->mdiobus->parent);
                dev_err(dev, "Failed to register MDIO bus\n");
+       }

        return ret;
 }

I removed the pm related apis and just added disabling/reenabling the clock directly in the mdio read/write and it works (see code snippet below). It might be slower and not as efficient as the pm runtime infrastructure since we don't have autosuspend.

@@ -224,6 +243,10 @@ static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add, int phy_reg)
        if (phy_reg & MII_ADDR_C45)
                return -EOPNOTSUPP;

+       spin_lock_irqsave(&priv->lock, flag);
+
+       mlxbf_gige_mdio_enable(priv);
+
        /* Send mdio read request */
        cmd = mlxbf_gige_mdio_create_cmd(priv->mdio_gw, 0, phy_add, phy_reg,
                                         MLXBF_GIGE_MDIO_CL22_READ);
@@ -236,6 +259,8 @@ static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add, int phy_reg)

        if (ret) {
                writel(0, priv->mdio_io + priv->mdio_gw->gw_address);
+               mlxbf_gige_mdio_disable(priv);
+               spin_unlock_irqrestore(&priv->lock, flag);
                return ret;
        }

@@ -246,6 +271,9 @@ static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add, int phy_reg)
        /* The MDIO lock is set on read. To release it, clear gw register */
        writel(0, priv->mdio_io + priv->mdio_gw->gw_address);

+       mlxbf_gige_mdio_disable(priv);
+       spin_unlock_irqrestore(&priv->lock, flag);
+
        return ret;
 }


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-07-02 20:56             ` Asmaa Mnebhi
@ 2025-07-02 21:26               ` Andrew Lunn
  2025-07-03 18:51                 ` Asmaa Mnebhi
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Lunn @ 2025-07-02 21:26 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

> > You need to put the MDIO bus device into its own pm_domain. Try
> > calling dev_pm_domain_set() to separate the MDIO bus from the MAC
> > driver in terms of power domains. ethtool will then power on/off the
> > MAC but leave the MDIO bus alone.
> > 

> Using dev_pm_domain_set() has the same effect as SET_RUNTIME_PM_OPS. The dev struct is shared so ethtool is still calling the suspend/resume.
> 
> int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige *priv)
>  {
>         struct device *dev = &pdev->dev;
> @@ -390,14 +418,27 @@ int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige *priv)
>         snprintf(priv->mdiobus->id, MII_BUS_ID_SIZE, "%s",
>                  dev_name(dev));
> 
> +       pm_runtime_set_autosuspend_delay(priv->mdiobus->parent, 100);
> +       pm_runtime_use_autosuspend(priv->mdiobus->parent);

Why parent?

	Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-07-02 21:26               ` Andrew Lunn
@ 2025-07-03 18:51                 ` Asmaa Mnebhi
  2025-07-03 20:49                   ` Andrew Lunn
  0 siblings, 1 reply; 11+ messages in thread
From: Asmaa Mnebhi @ 2025-07-03 18:51 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

 > > > You need to put the MDIO bus device into its own pm_domain. Try
> > > calling dev_pm_domain_set() to separate the MDIO bus from the MAC
> > > driver in terms of power domains. ethtool will then power on/off the
> > > MAC but leave the MDIO bus alone.
> > >
> 
> > Using dev_pm_domain_set() has the same effect as
> SET_RUNTIME_PM_OPS. The dev struct is shared so ethtool is still calling the
> suspend/resume.
> >
> > int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct
> > mlxbf_gige *priv)  {
> >         struct device *dev = &pdev->dev; @@ -390,14 +418,27 @@ int
> > mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige
> *priv)
> >         snprintf(priv->mdiobus->id, MII_BUS_ID_SIZE, "%s",
> >                  dev_name(dev));
> >
> > +       pm_runtime_set_autosuspend_delay(priv->mdiobus->parent, 100);
> > +       pm_runtime_use_autosuspend(priv->mdiobus->parent);
> 
> Why parent?

That was just an experiment. I tried priv->dev, same result but I guess that is expected because it is the MAC dev. priv->mdiobus->dev is only set in mdiobus_register which:
- sets dev struct and calls device_register
- device_register calls device_pm_init and device_add
- device_add calls device_pm_add
- device_pm_check_callbacks sets dev->power.no_pm_callbacks based on if pm_domain/pm_ops were defined or not.

So I have to call dev_pm_domain_set before mdiobus_register for it to be registered properly. But then, priv->mdiobus->dev is not set up yet so we cannot call dev_pm_domain_set.

Thank you.
Asmaa


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug
  2025-07-03 18:51                 ` Asmaa Mnebhi
@ 2025-07-03 20:49                   ` Andrew Lunn
  0 siblings, 0 replies; 11+ messages in thread
From: Andrew Lunn @ 2025-07-03 20:49 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, David Thompson

On Thu, Jul 03, 2025 at 06:51:52PM +0000, Asmaa Mnebhi wrote:
>  > > > You need to put the MDIO bus device into its own pm_domain. Try
> > > > calling dev_pm_domain_set() to separate the MDIO bus from the MAC
> > > > driver in terms of power domains. ethtool will then power on/off the
> > > > MAC but leave the MDIO bus alone.
> > > >
> > 
> > > Using dev_pm_domain_set() has the same effect as
> > SET_RUNTIME_PM_OPS. The dev struct is shared so ethtool is still calling the
> > suspend/resume.
> > >
> > > int mlxbf_gige_mdio_probe(struct platform_device *pdev, struct
> > > mlxbf_gige *priv)  {
> > >         struct device *dev = &pdev->dev; @@ -390,14 +418,27 @@ int
> > > mlxbf_gige_mdio_probe(struct platform_device *pdev, struct mlxbf_gige
> > *priv)
> > >         snprintf(priv->mdiobus->id, MII_BUS_ID_SIZE, "%s",
> > >                  dev_name(dev));
> > >
> > > +       pm_runtime_set_autosuspend_delay(priv->mdiobus->parent, 100);
> > > +       pm_runtime_use_autosuspend(priv->mdiobus->parent);
> > 
> > Why parent?
> 
> That was just an experiment. I tried priv->dev, same result but I guess that is expected because it is the MAC dev. priv->mdiobus->dev is only set in mdiobus_register which:
> - sets dev struct and calls device_register
> - device_register calls device_pm_init and device_add
> - device_add calls device_pm_add
> - device_pm_check_callbacks sets dev->power.no_pm_callbacks based on if pm_domain/pm_ops were defined or not.
> 
> So I have to call dev_pm_domain_set before mdiobus_register for it to be registered properly. But then, priv->mdiobus->dev is not set up yet so we cannot call dev_pm_domain_set.

You are the first needing this, so i'm not surprised. Please look at
how priv->mdiobus->dev can be made to work. Maybe the
device_register() needs moving into mdiobus_alloc_size()?

	Andrew

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2025-07-03 20:50 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-22 22:48 [PATCH net v1] mlxbf-gige: Support workaround for MDIO GPIO degradation bug Asmaa Mnebhi
2024-11-26 17:32 ` Andrew Lunn
2025-05-08 14:18   ` Asmaa Mnebhi
2025-05-08 14:44     ` Andrew Lunn
2025-05-08 14:53       ` Asmaa Mnebhi
2025-06-27 19:00         ` Asmaa Mnebhi
2025-06-28  8:40           ` Andrew Lunn
2025-07-02 20:56             ` Asmaa Mnebhi
2025-07-02 21:26               ` Andrew Lunn
2025-07-03 18:51                 ` Asmaa Mnebhi
2025-07-03 20:49                   ` Andrew Lunn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).