[RFC] increase MDIO i2c poll timeout gradually (including patch)

netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC] increase MDIO i2c poll timeout gradually (including patch)
@ 2025-09-19 13:52 Janpieter Sollie
  2025-09-19 17:04 ` Andrew Lunn
  0 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-19 13:52 UTC (permalink / raw)
  To: netdev

Hello everyone,

I tested a SFP module where the i2c bus is "unstable" at best.
different i2c timeouts occured, resulting in a "phy not detected" error message.
A simple (but not revealing a lot) stack dump during probe::

 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:268) mdio_i2c
 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:316) mdio_i2c
 > __mdiobus_c45_read (drivers/net/phy/mdio_bus.c:992)
 > mdiobus_c45_read (drivers/net/phy/mdio_bus.c:1133)
 > get_phy_c45_ids (drivers/net/phy/phy_device.c:947)
 > get_phy_device (drivers/net/phy/phy_device.c:1054)
 > init_module (drivers/net/phy/sfp.c:1820) sfp
 > cleanup_module (drivers/net/phy/sfp.c:1956 drivers/net/phy/sfp.c:2667 
drivers/net/phy/sfp.c:2748) sfp
 > cleanup_module (drivers/net/phy/sfp.c:2760 drivers/net/phy/sfp.c:2892) sfp
 > process_one_work (./arch/arm64/include/asm/jump_label.h:32 ./include/linux/jump_label.h:207)
 > worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391 
(discriminator 2))
 > kthread (kernel/kthread.c:389)
 > ret_from_fork (arch/arm64/kernel/entry.S:863)

I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious.
In order to lower the stress on the i2c bus, I made the following patch.
is it the best way to "not-stress-sensitive-devices"?
Will it cause a performance regression on some other SFP cages?

Eric Woudstra told me another option was to add a few tries, increasing i = 10,
If the issue isn't the device itself, but the stress on the i2c bus is too high, it may not be a 
real solution.

A good question may be: is this approach sufficient to close the gap between
"high performance" equipment having a stable i2c bus and they do not want to wait,
and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts?
Should this be fixed at another point in the initialization process (eg: not probing 
ridiculously all phy ids)?

Thanks,

Janpieter Sollie

--- a/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:08:41.285357818 +0200
+++ b/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:10:24.962796149 +0200
@@ -253,7 +253,7 @@ static int i2c_rollball_mii_poll(struct mii_bus *bus, int bus_addr, u8 *buf,
          */
         i = 10;
         do {
-               msleep(20);
+               msleep(20+(10*(10-i)));

                 ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs));
                 if (ret)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-19 13:52 [RFC] increase MDIO i2c poll timeout gradually (including patch) Janpieter Sollie
@ 2025-09-19 17:04 ` Andrew Lunn
  2025-09-20 10:00   ` Janpieter Sollie
  0 siblings, 1 reply; 15+ messages in thread
From: Andrew Lunn @ 2025-09-19 17:04 UTC (permalink / raw)
  To: Janpieter Sollie; +Cc: netdev, Russell King, Heiner Kallweit

On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote:
> Hello everyone,

Please ensure you Cc: the correct Maintainers.

./scripts/get_maintainer.pl drivers/net/phy/sfp.c 
Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT)
Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY)
Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY)
"David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT)
linux-kernel@vger.kernel.org (open list)

> I tested a SFP module where the i2c bus is "unstable" at best.

Please tell us more about the hardware.

Also, what speed do you have the I2C bus running at? Have you tried
different clock-frequency values to slow down the I2C bus? Have you
checked the pull-up resistors? I2C problems are sometimes due to too
strong pull-ups.

> A good question may be: is this approach sufficient to close the gap between
> "high performance" equipment having a stable i2c bus and they do not want to wait,
> and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts?

Does your board actually confirm to the standards? I2C busses should
be able to run at 100KHz, as defined by the standard. Also, the SFP
standards define modules should work at 100KHz. And counting every
milliwatt makes no sense when you are supposed to be able to deliver
3.3V at 300mA, i.e. 1 Watt, to the module.

> Should this be fixed at another point in the initialization process (eg: not
> probing ridiculously all phy ids)?

Unfortunately, MDIO over I2C is not standardised. So we have no idea
what address the PHY will be using, so we need to look at them all. If
you have an SFF, not an SFP, it might be possible to do some
optimisation.

	Andrew

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-19 17:04 ` Andrew Lunn
@ 2025-09-20 10:00   ` Janpieter Sollie
  2025-09-20 11:18     ` Russell King (Oracle)
  0 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-20 10:00 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: netdev, Russell King, Heiner Kallweit, Andrew Lunn,
	Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni

Op 19/09/2025 om 19:04 schreef Andrew Lunn:
> On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote:
>> Hello everyone,
> Please ensure you Cc: the correct Maintainers.
>
> ./scripts/get_maintainer.pl drivers/net/phy/sfp.c
> Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT)
> Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY)
> Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY)
> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
> Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
> Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
> netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT)
> linux-kernel@vger.kernel.org (open list)
Done, sorry, this is my first post here
>
>> I tested a SFP module where the i2c bus is "unstable" at best.
> Please tell us more about the hardware.
>
> Also, what speed do you have the I2C bus running at? Have you tried
> different clock-frequency values to slow down the I2C bus? Have you
> checked the pull-up resistors? I2C problems are sometimes due to too
> strong pull-ups.
The hardware is a bananapi R4 2xSFP using a MT7988a SoC.
The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a quirk in sfp.c (added 
below)
I'm not a i2c expert at all,
but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root.
I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell?

The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know.

>
>> A good question may be: is this approach sufficient to close the gap between
>> "high performance" equipment having a stable i2c bus and they do not want to wait,
>> and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts?
> Does your board actually confirm to the standards? I2C busses should
> be able to run at 100KHz, as defined by the standard. Also, the SFP
> standards define modules should work at 100KHz. And counting every
> milliwatt makes no sense when you are supposed to be able to deliver
> 3.3V at 300mA, i.e. 1 Watt, to the module.
Sorry, I thought the issue here might be the SFP cage limited to 3W.
While it should be enough to feed the RJ45 short range module, maybe it needs more during 
initalization.
Is there a way to tell?
>
>> Should this be fixed at another point in the initialization process (eg: not
>> probing ridiculously all phy ids)?
> Unfortunately, MDIO over I2C is not standardised. So we have no idea
> what address the PHY will be using, so we need to look at them all. If
> you have an SFF, not an SFP, it might be possible to do some
> optimisation.
>
> 	Andrew
>
Well, that's a bummer.

is it possible to perform stress testing on the bus by randomly reading those phy registers and 
check
whether  the muxer is the cause of the instability, or the target device itself?

Janpieter Sollie


--- a/drivers/net/phy/sfp.c     2025-06-21 15:06:02.086163322 +0200
+++ b/drivers/net/phy/sfp.c     2025-06-21 15:59:41.836883438 +0200
@@ -422,6 +422,13 @@
         sfp->id.base.extended_cc = SFF8024_ECC_10GBASE_T_SFI;
  }

+static void sfp_fixup_oem_10gt(struct sfp *sfp)
+{
+       sfp_fixup_10gbaset_30m(sfp);
+       sfp_fixup_rollball_wait4s(sfp);
+       //sfp->module_t_wait = msecs_to_jiffies(10000);
+}
+
  static void sfp_quirk_2500basex(const struct sfp_eeprom_id *id,
                                 unsigned long *modes,
                                 unsigned long *interfaces)
@@ -516,6 +523,7 @@
         SFP_QUIRK_F("OEM", "SFP-GE-T", sfp_fixup_ignore_tx_fault),

         SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc),
+       SFP_QUIRK_F("OEM", "ZK-10G-TX", sfp_fixup_oem_10gt),
         SFP_QUIRK_M("OEM", "SFP-2.5G-T", sfp_quirk_oem_2_5g),
         SFP_QUIRK_M("OEM", "SFP-2.5G", sfp_quirk_oem_2_5g),
         SFP_QUIRK_M("OEM", "SFP-2.5G-BX10-D", sfp_quirk_2500basex),

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-20 10:00   ` Janpieter Sollie
@ 2025-09-20 11:18     ` Russell King (Oracle)
  2025-09-20 12:34       ` Janpieter Sollie
       [not found]       ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>
  0 siblings, 2 replies; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-20 11:18 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Sat, Sep 20, 2025 at 12:00:50PM +0200, Janpieter Sollie wrote:
> Op 19/09/2025 om 19:04 schreef Andrew Lunn:
> > On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote:
> > > Hello everyone,
> > Please ensure you Cc: the correct Maintainers.
> > 
> > ./scripts/get_maintainer.pl drivers/net/phy/sfp.c
> > Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT)
> > Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY)
> > Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY)
> > "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
> > Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
> > Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
> > Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
> > netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT)
> > linux-kernel@vger.kernel.org (open list)
> Done, sorry, this is my first post here
> > 
> > > I tested a SFP module where the i2c bus is "unstable" at best.
> > Please tell us more about the hardware.
> > 
> > Also, what speed do you have the I2C bus running at? Have you tried
> > different clock-frequency values to slow down the I2C bus? Have you
> > checked the pull-up resistors? I2C problems are sometimes due to too
> > strong pull-ups.
> The hardware is a bananapi R4 2xSFP using a MT7988a SoC.
> The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a
> quirk in sfp.c (added below)
> I'm not a i2c expert at all,
> but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root.
> I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell?
> 
> The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know.

As you have not include the author of the SFP support (me) in your
initial email, and have not provided a repeat of the description,
I'm afraid I have no idea what the issue is that you're encountering.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-20 11:18     ` Russell King (Oracle)
@ 2025-09-20 12:34       ` Janpieter Sollie
       [not found]       ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>
  1 sibling, 0 replies; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-20 12:34 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Op 20/09/2025 om 13:18 schreef Russell King (Oracle):
> On Sat, Sep 20, 2025 at 12:00:50PM +0200, Janpieter Sollie wrote:
>> Op 19/09/2025 om 19:04 schreef Andrew Lunn:
>>> On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote:
>>>> Hello everyone,
>>> Please ensure you Cc: the correct Maintainers.
>>>
>>> ./scripts/get_maintainer.pl drivers/net/phy/sfp.c
>>> Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT)
>>> Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY)
>>> Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY)
>>> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS)
>>> Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS)
>>> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS)
>>> Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS)
>>> netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT)
>>> linux-kernel@vger.kernel.org (open list)
>> Done, sorry, this is my first post here
>>>> I tested a SFP module where the i2c bus is "unstable" at best.
>>> Please tell us more about the hardware.
>>>
>>> Also, what speed do you have the I2C bus running at? Have you tried
>>> different clock-frequency values to slow down the I2C bus? Have you
>>> checked the pull-up resistors? I2C problems are sometimes due to too
>>> strong pull-ups.
>> The hardware is a bananapi R4 2xSFP using a MT7988a SoC.
>> The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a
>> quirk in sfp.c (added below)
>> I'm not a i2c expert at all,
>> but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root.
>> I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell?
>>
>> The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know.
> As you have not include the author of the SFP support (me) in your
> initial email, and have not provided a repeat of the description,
> I'm afraid I have no idea what the issue is that you're encountering.
>
> Thanks.
>

Yes indeed, I see it too.

Hereby the original post:

======================

Hello everyone,

I tested a SFP module where the i2c bus is "unstable" at best.
different i2c timeouts occured, resulting in a "phy not detected" error message.
A simple (but not revealing a lot) stack dump during probe::

 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:268) mdio_i2c
 > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:316) mdio_i2c
 > __mdiobus_c45_read (drivers/net/phy/mdio_bus.c:992)
 > mdiobus_c45_read (drivers/net/phy/mdio_bus.c:1133)
 > get_phy_c45_ids (drivers/net/phy/phy_device.c:947)
 > get_phy_device (drivers/net/phy/phy_device.c:1054)
 > init_module (drivers/net/phy/sfp.c:1820) sfp
 > cleanup_module (drivers/net/phy/sfp.c:1956 drivers/net/phy/sfp.c:2667 
drivers/net/phy/sfp.c:2748) sfp
 > cleanup_module (drivers/net/phy/sfp.c:2760 drivers/net/phy/sfp.c:2892) sfp
 > process_one_work (./arch/arm64/include/asm/jump_label.h:32 ./include/linux/jump_label.h:207)
 > worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391 
(discriminator 2))
 > kthread (kernel/kthread.c:389)
 > ret_from_fork (arch/arm64/kernel/entry.S:863)

I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious.
In order to lower the stress on the i2c bus, I made the following patch.
is it the best way to "not-stress-sensitive-devices"?
Will it cause a performance regression on some other SFP cages?

Eric Woudstra told me another option was to add a few tries, increasing i = 10,
If the issue isn't the device itself, but the stress on the i2c bus is too high, it may not be a 
real solution.

A good question may be: is this approach sufficient to close the gap between
"high performance" equipment having a stable i2c bus and they do not want to wait,
and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts?
Should this be fixed at another point in the initialization process (eg: not probing 
ridiculously all phy ids)?

Thanks,

Janpieter Sollie

--- a/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:08:41.285357818 +0200
+++ b/drivers/net/mdio/mdio-i2c.c       2025-09-19 14:10:24.962796149 +0200
@@ -253,7 +253,7 @@ static int i2c_rollball_mii_poll(struct mii_bus *bus, int bus_addr, u8 *buf,
          */
         i = 10;
         do {
-               msleep(20);
+               msleep(20+(10*(10-i)));

                 ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs));
                 if (ret)

^ permalink raw reply	[flat|nested] 15+ messages in thread

[parent not found: <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>]

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
       [not found]       ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>
@ 2025-09-20 13:53         ` Russell King (Oracle)
  2025-09-22  8:04           ` Janpieter Sollie
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-20 13:53 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Sat, Sep 20, 2025 at 02:09:47PM +0200, Janpieter Sollie wrote:
> I tested a SFP module where the i2c bus is "unstable" at best.
> different i2c timeouts occured, resulting in a "phy not detected" error message.

If the I2C bus is so unstable that attempting to read a register from
the PHY, which should take no more than 70ms, takes in excess of 200ms
(which is what it takes for the loop to time out) for just one
register, it seems to me that you're chasing a dead horse.

However, looking at the rest of your message, I don't think you have
a problem with the I2C bus at all.

> I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious.

Sorry, but I don't follow the same reasoning. If they aren't hard-coded,
but are "knobs" in e.g. sysfs, then how do we teach people how to "tune"
them? When should they "tune" them? No, the less knobs there are, the
better for the user, provided timeouts are sensible - and I think
waiting 200ms for a register to be read is already an excessively long
timeout.

In any case, if we have a module that takes longer, that's new, and the
timeouts need to be adjusted - not on a per-module basis, and not by
users having to tinker with stuff in sysfs.

> In order to lower the stress on the i2c bus, I made the following patch.
> is it the best way to "not-stress-sensitive-devices"?
> Will it cause a performance regression on some other SFP cages?
> 
> Eric Woudstra told me another option was to add a few tries, increasing i = 10,
> If the issue isn't the device itself, but the stress on the i2c bus is too
> high, it may not be a real solution.

Why are you concerned about "stress" on the I2C bus? What kind of
stress? The bus is 5V or 3.3V signalling, running at 100kHz (so slow)
with resistive pull-ups. Apart from the bus transitions (which cause
CMOS to take a pulse of power) the energy from that will be nothing
compared to the energy required to run the CPU, which is operating
much faster with many more CMOS transistors switching.

> A good question may be: is this approach sufficient to close the gap between
> "high performance" equipment having a stable i2c bus and they do not want to wait,

None of this has been written for "high performance" equipment. It was
developed on a SolidRun clearfog platform (Armada 388 based) which is
hardly "high performance". It's been used with bit-banged I2C as well
on Macchiatobin platforms.

I'm guessing that the problem is not an I2C bus problem (which would
cause i2c_transfer_rollball() to return an error, and the loop to quit),
but yet another cheap and nasty SFP module that takes much longer
than 70ms to respond under the "Rollball" protocol.

So, what we need you to do is to work out how long it takes this module
to respond, and whether it always takes a long time to respond. Please
add some debugging to i2c_rollball_mii_poll() to measure the amount of
time it takes for the module to respond - and please measure it for
several transactions.

You can use jiffies, and convert to msecs using jiffies_to_msecs(),
or you could use ktime_get_ns().

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-20 13:53         ` Russell King (Oracle)
@ 2025-09-22  8:04           ` Janpieter Sollie
  2025-09-22 10:54             ` Janpieter Sollie
  0 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-22  8:04 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Op 20/09/2025 om 15:53 schreef Russell King (Oracle):
> So, what we need you to do is to work out how long it takes this module
> to respond, and whether it always takes a long time to respond. Please
> add some debugging to i2c_rollball_mii_poll() to measure the amount of
> time it takes for the module to respond - and please measure it for
> several transactions.
>
> You can use jiffies, and convert to msecs using jiffies_to_msecs(),
> or you could use ktime_get_ns().
>
> Thanks.
>

All right, so I changed the modification to a more debug-friendly function (view below).
I also changed the incremental wait() function from (20+10*(10-i)) to (20+5*(10-i)) to be more 
accurate.
When bringing up with "ip link set sfp-lan up", it starts printing
It keeps polling (which, I guess, is what you expect, I guess the hwmon subsystem, link 
detection, ... )
Mostly, the output is 300-400us.
When the module is up and running, and maintenance polls happen, I saw something new:
It starts responding with a much lower value: something like 30 us. At iteration 10. occasionally.
i = 10 means "immediately", i decrements from 10 to 0.
I got no clarification for that.
The thing I'm mostly sure of, is that no commands return an error, and no mii_polls time out,
but 300-400us is way too much for a loop which expects to return in 200us
...
156724.601882] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348074665 ns 
in iteration 4
[156724.941879] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328077236 
ns in iteration 4
[156725.301889] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348093665 
ns in iteration 4
[156725.661896] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348075511 
ns in iteration 4
[156726.081885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 408061414 
ns in iteration 3
[156726.441997] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348189434 
ns in iteration 4
[156726.781885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327952775 
ns in iteration 4
[156727.831885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 30037797 ns 
in iteration 10
[156728.241880] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398062738 
ns in iteration 3
[156728.591889] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338068066 
ns in iteration 4
[156728.941888] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338062605 
ns in iteration 4
[156729.351924] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398101891 
ns in iteration 3
[156729.692014] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328135697 
ns in iteration 4
[156730.041895] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 337947143 
ns in iteration 4
[156730.391997] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338162758 
ns in iteration 4
[156730.741994] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 337980912 
ns in iteration 4
[156731.831897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 30043259 ns 
in iteration 10
[156732.241897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398065122 
ns in iteration 3
[156732.581982] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328157082 
ns in iteration 4
[156732.921978] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327986467 
ns in iteration 4

...

This was an example of the output when using the patch below.
No poll command returns an error state. No poll commands time out.
all mii_poll commands return at i = 3 or i = 4,
when the sfp module is completely up, i = 10 occurs from time to time.

Thanks,

Janpieter Sollie

--- a/drivers/net/mdio/mdio-i2c.c       2025-09-22 08:41:12.084752517 +0200
+++ b/drivers/net/mdio/mdio-i2c.c       2025-09-22 08:47:20.230592266 +0200
@@ -232,6 +232,7 @@
         struct i2c_msg msgs[2];
         u8 cmd_addr, tmp, *res;
         int i, ret;
+       long uptime;

         cmd_addr = ROLLBALL_CMD_ADDR;

@@ -252,14 +253,19 @@
          * SFPs. Sleep 20ms between iterations and try 10 times.
          */
         i = 10;
+       uptime = ktime_get_ns();
         do {
-               msleep(20);
+               msleep(20 + 5*(10-i));

                 ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs));
-               if (ret)
+               if (ret) {
+                       dev_dbg(&bus->dev, "poll cmd return %d after %d ns in iteration %d\n", 
ret, (ktime_get_ns() - uptime), i);
                         return ret;
+               }

-               if (*res == ROLLBALL_CMD_DONE)
+               if (*res == ROLLBALL_CMD_DONE) {
+                       dev_dbg(&bus->dev, "poll cmd success after %d ns in iteration %d\n", 
(ktime_get_ns() - uptime), i);
                         return 0;
+               }
         } while (i-- > 0);

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22  8:04           ` Janpieter Sollie
@ 2025-09-22 10:54             ` Janpieter Sollie
  2025-09-22 12:38               ` Russell King (Oracle)
  0 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-22 10:54 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Op 22/09/2025 om 10:04 schreef Janpieter Sollie:
> Op 20/09/2025 om 15:53 schreef Russell King (Oracle):
>> So, what we need you to do is to work out how long it takes this module
>> to respond, and whether it always takes a long time to respond. Please
>> add some debugging to i2c_rollball_mii_poll() to measure the amount of
>> time it takes for the module to respond - and please measure it for
>> several transactions.
>>
>> You can use jiffies, and convert to msecs using jiffies_to_msecs(),
>> or you could use ktime_get_ns().
>>
>> Thanks.
>>
>
> All right, so I changed the modification to a more debug-friendly function (view below).
> I also changed the incremental wait() function from (20+10*(10-i)) to (20+5*(10-i)) to be more 
> accurate.
>
> [156732.241897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398065122 
> ns in iteration 3
> [156732.581982] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328157082 
> ns in iteration 4
> [156732.921978] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327986467 
> ns in iteration 4
>
> ...
>

Something I noticed when going through a lot more iterations:

FYI: The kernel has been compiled with a 100hz timer, tickless idle and no kernel preemption.
I wanted to count how much the system actually runs those i2c calls, substracting msleep,
which is 20, 245 and 300, respectively.

So, I separated i = 10, i = 4 and i = 3 a bit.
 > 192 numbers for i = 10
 > 2309 numbers for i = 4
 > 1129 numbers for i = 3

and calculating max and min, substracting the msleep:

 > 1 function call max: 20131327
 > 1 function call min: 9990574
 > diff at iteration 10: 10140753
 > avg at iteration 10:  10723993

 > 7 function calls min: 82868351
 > 7 function calls max:  123074939
 > diff at iteration 4: 40206588
 > avg at iteration 4: 86375811

 > 8 function calls min: 97889217
 > 8 function calls max: 128086531
 > diff at iteration 3: 30197314
 > avg at iteration 3: 99901858

this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the 
i2c_transfer_rollball().
seems a lot to me when an i2c call takes 11-12 usecs avg per call
are you sure these numbers point to a stable i2c bus?

thanks,

Janpieter Sollie

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 10:54             ` Janpieter Sollie
@ 2025-09-22 12:38               ` Russell King (Oracle)
  2025-09-22 13:16                 ` Russell King (Oracle)
                                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-22 12:38 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote:
> this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device
> is running the i2c_transfer_rollball().
> seems a lot to me when an i2c call takes 11-12 usecs avg per call
> are you sure these numbers point to a stable i2c bus?

I guess you've never dealt with I2C buses before. As has already been
stated, the clock rate for I2C used with SFP modules (which is, if you
like, I2C v1) is 100kHz. that's 10us per bit.

An I2C transaction consists of one start bit, 8 bits for the address
and r/w bit, one bit for the ack, 8 bits for the each byte of data
and their individual ACK bits, and finally a stop bit. If a restart
condition is used, the stop and start between the messages can be
combined into a restart condition, saving one bit.

That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits
or clocks.

The polling consists of two transactions on the bus:

- a write of one byte - giving 20 clock cycles.
- a read of six bytes - giving 65 clock cycles.

So that's 85 clock cycles, or 84 if using restart. At 10us per cycle,
that's 840us _minimum_.

If i2c_transfer() for that write and read are taking on the order of
12us, that suggest the bus is being clocked at around 7MHz, which is
certainly way too fast, a bug in the I2C driver, an issue with the
I2C hardware, or maybe an error in calculating how long a call takes.

And... it's your interpretation of your results.

Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds
(us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry
to teach you to suck eggs, but based on your reply it seems necessary
to point this out.

You quoted an average of 99901858ns - 99.9ms for the i=3 case.
You quoted an average of 86375811ns - 86.4ms for the i=4 case.

Given that the difference in msleep() delay is 5ms, and we're
talking about 50 or 55ms here, for the i=3 case that's 45ms
for i2c_transfer(). For the i=4 case, that's 36.4ms.

However, msleep() is not accurate - and may even be bucket-based so I
wouldn't rely on the requested msleep() interval being what you end
up with - which seems to be suggested by the difference of almost 10ms
in the apparent time that i2c_transfer() takes. 10ms, not 10us. So,
unless you actually obtain timestamps before and after the
i2c_transfer() call and calculate their difference, I would not read
too much into that.

In any case, figures in the realms of milliseconds are certainly in the
realm of possibility for a 100kHz bus - as I say, one instance of a
transaction _should_ be no less than 840 microseconds, so if your
calculations come out less than that, you should not be claiming
"bad bus" or something like that, but at first revalidating your
analysis or interpretation of the figures.

Also, because of scheduling delays, and some I2C drivers will sleep
waiting for the transaction to finish, even that measurement I
suggest can not be said to relate to the actual time it takes for
the transactions on the bus, unless you're running a hard-realtime OS.

However, it seems you're very keen to blame the I2C bus hardware...

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 12:38               ` Russell King (Oracle)
@ 2025-09-22 13:16                 ` Russell King (Oracle)
  2025-09-22 13:50                   ` Russell King (Oracle)
  2025-09-22 14:30                 ` Janpieter Sollie
  2025-09-22 14:46                 ` Janpieter Sollie
  2 siblings, 1 reply; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-22 13:16 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, Sep 22, 2025 at 01:38:01PM +0100, Russell King (Oracle) wrote:
> On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote:
> > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device
> > is running the i2c_transfer_rollball().
> > seems a lot to me when an i2c call takes 11-12 usecs avg per call
> > are you sure these numbers point to a stable i2c bus?
> 
> I guess you've never dealt with I2C buses before. As has already been
> stated, the clock rate for I2C used with SFP modules (which is, if you
> like, I2C v1) is 100kHz. that's 10us per bit.
> 
> An I2C transaction consists of one start bit, 8 bits for the address
> and r/w bit, one bit for the ack, 8 bits for the each byte of data
> and their individual ACK bits, and finally a stop bit. If a restart
> condition is used, the stop and start between the messages can be
> combined into a restart condition, saving one bit.
> 
> That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits
> or clocks.
> 
> The polling consists of two transactions on the bus:
> 
> - a write of one byte - giving 20 clock cycles.
> - a read of six bytes - giving 65 clock cycles.
> 
> So that's 85 clock cycles, or 84 if using restart. At 10us per cycle,
> that's 840us _minimum_.
> 
> If i2c_transfer() for that write and read are taking on the order of
> 12us, that suggest the bus is being clocked at around 7MHz, which is
> certainly way too fast, a bug in the I2C driver, an issue with the
> I2C hardware, or maybe an error in calculating how long a call takes.
> 
> And... it's your interpretation of your results.
> 
> Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds
> (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry
> to teach you to suck eggs, but based on your reply it seems necessary
> to point this out.
> 
> You quoted an average of 99901858ns - 99.9ms for the i=3 case.
> You quoted an average of 86375811ns - 86.4ms for the i=4 case.
> 
> Given that the difference in msleep() delay is 5ms, and we're
> talking about 50 or 55ms here, for the i=3 case that's 45ms
> for i2c_transfer(). For the i=4 case, that's 36.4ms.
> 
> However, msleep() is not accurate - and may even be bucket-based so I
> wouldn't rely on the requested msleep() interval being what you end
> up with - which seems to be suggested by the difference of almost 10ms
> in the apparent time that i2c_transfer() takes. 10ms, not 10us. So,
> unless you actually obtain timestamps before and after the
> i2c_transfer() call and calculate their difference, I would not read
> too much into that.
> 
> In any case, figures in the realms of milliseconds are certainly in the
> realm of possibility for a 100kHz bus - as I say, one instance of a
> transaction _should_ be no less than 840 microseconds, so if your
> calculations come out less than that, you should not be claiming
> "bad bus" or something like that, but at first revalidating your
> analysis or interpretation of the figures.
> 
> Also, because of scheduling delays, and some I2C drivers will sleep
> waiting for the transaction to finish, even that measurement I
> suggest can not be said to relate to the actual time it takes for
> the transactions on the bus, unless you're running a hard-realtime OS.
> 
> However, it seems you're very keen to blame the I2C bus hardware...

I'll also point out that if an error occurs on the I2C bus, the earliest
that could be detected is after the address byte (lack of ACK to the
address - non-responsive slave) which is 10 clocks, or 100us.

If an error occurs, then i2c_transfer() should either return an error,
or indicate that the transaction has not been completed.

What I have missed in my analysis is that it's not calling
i2c_transfer() but i2c_transfer_rollball() which does many more bus
transactions. So, my figures of 840us minimum is likely a grossly
under the expected time.

However, my fundamental point still stands - that being 10us for
i2c_transfer_rollball() is an insanely short period for the work that
needs to be done no matter _what_ happens on the hardware bus.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 13:16                 ` Russell King (Oracle)
@ 2025-09-22 13:50                   ` Russell King (Oracle)
  0 siblings, 0 replies; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-22 13:50 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, Sep 22, 2025 at 02:16:03PM +0100, Russell King (Oracle) wrote:
> On Mon, Sep 22, 2025 at 01:38:01PM +0100, Russell King (Oracle) wrote:
> > On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote:
> > > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device
> > > is running the i2c_transfer_rollball().
> > > seems a lot to me when an i2c call takes 11-12 usecs avg per call
> > > are you sure these numbers point to a stable i2c bus?
> > 
> > I guess you've never dealt with I2C buses before. As has already been
> > stated, the clock rate for I2C used with SFP modules (which is, if you
> > like, I2C v1) is 100kHz. that's 10us per bit.
> > 
> > An I2C transaction consists of one start bit, 8 bits for the address
> > and r/w bit, one bit for the ack, 8 bits for the each byte of data
> > and their individual ACK bits, and finally a stop bit. If a restart
> > condition is used, the stop and start between the messages can be
> > combined into a restart condition, saving one bit.
> > 
> > That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits
> > or clocks.
> > 
> > The polling consists of two transactions on the bus:
> > 
> > - a write of one byte - giving 20 clock cycles.
> > - a read of six bytes - giving 65 clock cycles.
> > 
> > So that's 85 clock cycles, or 84 if using restart. At 10us per cycle,
> > that's 840us _minimum_.
> > 
> > If i2c_transfer() for that write and read are taking on the order of
> > 12us, that suggest the bus is being clocked at around 7MHz, which is
> > certainly way too fast, a bug in the I2C driver, an issue with the
> > I2C hardware, or maybe an error in calculating how long a call takes.
> > 
> > And... it's your interpretation of your results.
> > 
> > Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds
> > (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry
> > to teach you to suck eggs, but based on your reply it seems necessary
> > to point this out.
> > 
> > You quoted an average of 99901858ns - 99.9ms for the i=3 case.
> > You quoted an average of 86375811ns - 86.4ms for the i=4 case.
> > 
> > Given that the difference in msleep() delay is 5ms, and we're
> > talking about 50 or 55ms here, for the i=3 case that's 45ms
> > for i2c_transfer(). For the i=4 case, that's 36.4ms.
> > 
> > However, msleep() is not accurate - and may even be bucket-based so I
> > wouldn't rely on the requested msleep() interval being what you end
> > up with - which seems to be suggested by the difference of almost 10ms
> > in the apparent time that i2c_transfer() takes. 10ms, not 10us. So,
> > unless you actually obtain timestamps before and after the
> > i2c_transfer() call and calculate their difference, I would not read
> > too much into that.
> > 
> > In any case, figures in the realms of milliseconds are certainly in the
> > realm of possibility for a 100kHz bus - as I say, one instance of a
> > transaction _should_ be no less than 840 microseconds, so if your
> > calculations come out less than that, you should not be claiming
> > "bad bus" or something like that, but at first revalidating your
> > analysis or interpretation of the figures.
> > 
> > Also, because of scheduling delays, and some I2C drivers will sleep
> > waiting for the transaction to finish, even that measurement I
> > suggest can not be said to relate to the actual time it takes for
> > the transactions on the bus, unless you're running a hard-realtime OS.
> > 
> > However, it seems you're very keen to blame the I2C bus hardware...
> 
> I'll also point out that if an error occurs on the I2C bus, the earliest
> that could be detected is after the address byte (lack of ACK to the
> address - non-responsive slave) which is 10 clocks, or 100us.
> 
> If an error occurs, then i2c_transfer() should either return an error,
> or indicate that the transaction has not been completed.
> 
> What I have missed in my analysis is that it's not calling
> i2c_transfer() but i2c_transfer_rollball() which does many more bus
> transactions. So, my figures of 840us minimum is likely a grossly
> under the expected time.
> 
> However, my fundamental point still stands - that being 10us for
> i2c_transfer_rollball() is an insanely short period for the work that
> needs to be done no matter _what_ happens on the hardware bus.

Having now been through i2c_transfer_rollball(), the additional
transactions on the bus will take between 97 and 98 cycles on top
of what I mentioned, or 970us - 980us at 100kHz.

In total, that works out at a minimum of 1.81ms for a successful
access to the hardware, assuming every I2C transaction occurs
back-to-back to the previous transaction.

Thanks.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 12:38               ` Russell King (Oracle)
  2025-09-22 13:16                 ` Russell King (Oracle)
@ 2025-09-22 14:30                 ` Janpieter Sollie
  2025-09-22 15:25                   ` Russell King (Oracle)
  2025-09-22 14:46                 ` Janpieter Sollie
  2 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-22 14:30 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Op 22/09/2025 om 14:38 schreef Russell King (Oracle):
> On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote:
>> this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device
>> is running the i2c_transfer_rollball().
>> seems a lot to me when an i2c call takes 11-12 usecs avg per call
>> are you sure these numbers point to a stable i2c bus?
> I guess you've never dealt with I2C buses before.
No, but I've read many kernel code dealing with it, desperately trying to figure out the error.
But yes, this is only the software part.
> As has already been
> stated, the clock rate for I2C used with SFP modules (which is, if you
> like, I2C v1) is 100kHz. that's 10us per bit.
>
> An I2C transaction consists of one start bit, 8 bits for the address
> and r/w bit, one bit for the ack, 8 bits for the each byte of data
> and their individual ACK bits, and finally a stop bit. If a restart
> condition is used, the stop and start between the messages can be
> combined into a restart condition, saving one bit.
>
> That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits
> or clocks.
>
> The polling consists of two transactions on the bus:
>
> - a write of one byte - giving 20 clock cycles.
> - a read of six bytes - giving 65 clock cycles.
>
> So that's 85 clock cycles, or 84 if using restart. At 10us per cycle,
> that's 840us _minimum_.
and here's my first error:
 > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the 
i2c_transfer_rollball().
 > seems a lot to me when an i2c call takes 11-12 usecs avg per call
I wrote usecs, but was obviously in a wrong universe, it should have been msecs.  Sorry
>
> If i2c_transfer() for that write and read are taking on the order of
> 12us, that suggest the bus is being clocked at around 7MHz, which is
> certainly way too fast, a bug in the I2C driver, an issue with the
> I2C hardware, or maybe an error in calculating how long a call takes.
Yes, I'm afraid of the last one as well ...
Honestly, even after re-reading it (twice),
I'm still not sure if / when I made an error, so my calculations are below
>
> And... it's your interpretation of your results.
>
> Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds
> (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry
> to teach you to suck eggs, but based on your reply it seems necessary
> to point this out.
you're welcome, It's always useful to get corrected no mather what.
I'm simply trying to find a solution for hardware I built myself,
which is more or less "problem I created myself".
And I _WILL_ make dumb errors. I'm sorry for that.
but hey, at least I'm not crying like a little kid 'please help me to fix my internet' ....
Yes, I know the definition of ns -> us -> ms -> sec,
the commands used (example for i = 4 here, a total msleep of 245):

$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
head -n 1) - 245000000)) -> for min
$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
tail -n 1) - 245000000)) -> for max
$ echo diff at iteration 4: $((123074939 - 82868351)) -> for diff
$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
awk '{x+=$0}END{print int(x/NR)}' -) - 245000000)) -> for average
avg: 86375811
so that's 86375811ns, 86375us, and 86ms, or ~12ms for each iteration

>
> You quoted an average of 99901858ns - 99.9ms for the i=3 case.
> You quoted an average of 86375811ns - 86.4ms for the i=4 case.
>
> Given that the difference in msleep() delay is 5ms, and we're
> talking about 50 or 55ms here, for the i=3 case that's 45ms
> for i2c_transfer(). For the i=4 case, that's 36.4ms.
Ouch ... that's a _LOT_ more than I thought.That explains the difference.  Sorry.
>
> However, msleep() is not accurate - and may even be bucket-based so I
> wouldn't rely on the requested msleep() interval being what you end
> up with - which seems to be suggested by the difference of almost 10ms
> in the apparent time that i2c_transfer() takes. 10ms, not 10us. So,
> unless you actually obtain timestamps before and after the
> i2c_transfer() call and calculate their difference, I would not read
> too much into that.
>
> In any case, figures in the realms of milliseconds are certainly in the
> realm of possibility for a 100kHz bus - as I say, one instance of a
> transaction _should_ be no less than 840 microseconds, so if your
> calculations come out less than that, you should not be claiming
> "bad bus" or something like that, but at first revalidating your
> analysis or interpretation of the figures.
>
> Also, because of scheduling delays, and some I2C drivers will sleep
> waiting for the transaction to finish, even that measurement I
> suggest can not be said to relate to the actual time it takes for
> the transactions on the bus, unless you're running a hard-realtime OS.
you are giving a lot of reasons why it's unreliable ...
and I certainly won't dare to calculate useless statistics anymore.
> However, it seems you're very keen to blame the I2C bus hardware...
>
Based on my mails, I can certainly see why you're thinking this way.
I have no idea what goes wrong anywhere between me making a modification in the mdio.c file -> 
i2c code -> ... -> SFP phy.
I'm curious what goes wrong, notice the 3 dots in between,
I know there's a pca9545 muxer in in there further complicating it, but that's about it.

Long story short: should I somehow try to test the reliability of something else?

Thanks,

Janpieter Sollie

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 14:30                 ` Janpieter Sollie
@ 2025-09-22 15:25                   ` Russell King (Oracle)
  2025-09-23  7:02                     ` Janpieter Sollie
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King (Oracle) @ 2025-09-22 15:25 UTC (permalink / raw)
  To: Janpieter Sollie
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

On Mon, Sep 22, 2025 at 04:30:56PM +0200, Janpieter Sollie wrote:
> Based on my mails, I can certainly see why you're thinking this way.
> I have no idea what goes wrong anywhere between me making a modification in
> the mdio.c file -> i2c code -> ... -> SFP phy.
> I'm curious what goes wrong, notice the 3 dots in between,
> I know there's a pca9545 muxer in in there further complicating it, but that's about it.
> 
> Long story short: should I somehow try to test the reliability of something else?

What you have in these setups is:

1. The I2C bus from the host to the SFP module pins. On the SFP module
   is an EEPROM at address 0x50 which contains some useful, some not so
   useful identification of the module.

2. Sometimes there is a PHY at 0x56, which is normally a Marvell
   88E1111 which was designed for use on SFPs, and has not only the
   conventional MDIO bus connectivity, but also supports I2C as well.

3. Some baseT modules, the PHY is not accessible.

4. Others have a microcontroller on them - so far identified some with
   an Arm Cortex-M controller, but others have an 8051-based controller
   to implement the "Rollball" protocol.

So, in the case of Rollball protocol modules, one is at the mercy of
the microcontroller receiving the I2C transactions, then accessing the
PHY over MDIO, and then responding appropriately. Given that there are
two different microcontrollers used for this task, I wouldn't be
surprised if there were numerous different firmwares running on them
of varying quality and efficiency.

I would suggest your module is taking excessively long to respond for
_some_ accesses. Maybe the controller isn't merely converting the
Rollball protocol to MDIO, but is doing other PHY manipulation as well,
e.g. emulating some functionality.

It may be interesting to work out whether it is a specific register or
set of registers that need longer access, and augment our knowledge
about what is going on with this stuff.

Ultimately yes, we likely have no option but to increase the timeout,
and to do that I suggest simply increasing the number of loops -
having the approx. 20ms delay between each attempt doesn't stress
anything.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 15:25                   ` Russell King (Oracle)
@ 2025-09-23  7:02                     ` Janpieter Sollie
  0 siblings, 0 replies; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-23  7:02 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

Op 22/09/2025 om 17:25 schreef Russell King (Oracle):
> On Mon, Sep 22, 2025 at 04:30:56PM +0200, Janpieter Sollie wrote:
>> Based on my mails, I can certainly see why you're thinking this way.
>> I have no idea what goes wrong anywhere between me making a modification in
>> the mdio.c file -> i2c code -> ... -> SFP phy.
>> I'm curious what goes wrong, notice the 3 dots in between,
>> I know there's a pca9545 muxer in in there further complicating it, but that's about it.
>>
>> Long story short: should I somehow try to test the reliability of something else?
> What you have in these setups is:
>
> 1. The I2C bus from the host to the SFP module pins. On the SFP module
>     is an EEPROM at address 0x50 which contains some useful, some not so
>     useful identification of the module.
>
> 2. Sometimes there is a PHY at 0x56, which is normally a Marvell
>     88E1111 which was designed for use on SFPs, and has not only the
>     conventional MDIO bus connectivity, but also supports I2C as well.
The page is indeed present in i2cdetect ...
>
> 3. Some baseT modules, the PHY is not accessible.
>
I saw this on another SFP+ RJ45 module, I have 2 of them (unused),
I'm thinking of breaking the metal cage of one to see what's inside
(out if curiosity). What I do know is that there's no page 56.
>
> It may be interesting to work out whether it is a specific register or
> set of registers that need longer access, and augment our knowledge
> about what is going on with this stuff.
Do you have a suggestion for this? I mean, runninig 'time i2cget xx xx'
in a for loop for all pages (50, 51 and 56) over all registers 10 times or so takes a huge 
amount of time.
>
> Ultimately yes, we likely have no option but to increase the timeout,
> and to do that I suggest simply increasing the number of loops -
> having the approx. 20ms delay between each attempt doesn't stress
> anything.
>
Let's try to avoid that at all costs: if I'm the first one reporting a problem like this,
it may not be useful to let the whole subsystem sit back and relax for all modules,
it may hide the case where there's really something wrong.

Thanks

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch)
  2025-09-22 12:38               ` Russell King (Oracle)
  2025-09-22 13:16                 ` Russell King (Oracle)
  2025-09-22 14:30                 ` Janpieter Sollie
@ 2025-09-22 14:46                 ` Janpieter Sollie
  2 siblings, 0 replies; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-22 14:46 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni

(rewriting as plain text, mail client accidentally converted to html)
Op 22/09/2025 om 14:38 schreef Russell King (Oracle):
> On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote:
>> this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device
>> is running the i2c_transfer_rollball().
>> seems a lot to me when an i2c call takes 11-12 usecs avg per call
>> are you sure these numbers point to a stable i2c bus?
> I guess you've never dealt with I2C buses before.
No, but I've read many kernel code dealing with it, desperately trying to figure out the error.
But yes, this is only the software part.
> As has already been
> stated, the clock rate for I2C used with SFP modules (which is, if you
> like, I2C v1) is 100kHz. that's 10us per bit.
>
> An I2C transaction consists of one start bit, 8 bits for the address
> and r/w bit, one bit for the ack, 8 bits for the each byte of data
> and their individual ACK bits, and finally a stop bit. If a restart
> condition is used, the stop and start between the messages can be
> combined into a restart condition, saving one bit.
>
> That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits
> or clocks.
>
> The polling consists of two transactions on the bus:
>
> - a write of one byte - giving 20 clock cycles.
> - a read of six bytes - giving 65 clock cycles.
>
> So that's 85 clock cycles, or 84 if using restart. At 10us per cycle,
> that's 840us _minimum_.
and here's my first error:
 > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the 
i2c_transfer_rollball().
 > seems a lot to me when an i2c call takes 11-12 usecs avg per call
I wrote usecs, but was obviously in a wrong universe, it should have been msecs.  Sorry
>
> If i2c_transfer() for that write and read are taking on the order of
> 12us, that suggest the bus is being clocked at around 7MHz, which is
> certainly way too fast, a bug in the I2C driver, an issue with the
> I2C hardware, or maybe an error in calculating how long a call takes.
Yes, I'm afraid of the last one as well ...
Honestly, even after re-reading it (twice),
I'm still not sure if / when I made an error, so my calculations are below
>
> And... it's your interpretation of your results.
>
> Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds
> (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry
> to teach you to suck eggs, but based on your reply it seems necessary
> to point this out.
you're welcome, It's always useful to get corrected no mather what.
I'm simply trying to find a solution for hardware I built myself,
which is more or less "problem I created myself".
And I _WILL_ make dumb errors. I'm sorry for that.
but hey, at least I'm not crying like a little kid 'please help me to fix my internet' ....
Yes, I know the definition of ns -> us -> ms -> sec,
the commands used (example for i = 4 here, a total msleep of 245):

$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
head -n 1) - 245000000)) -> for min
$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
tail -n 1) - 245000000)) -> for max
$ echo diff at iteration 4: $((123074939 - 82868351)) -> for diff
$ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | 
awk '{x+=$0}END{print int(x/NR)}' -) - 245000000)) -> for average
avg: 86375811
so that's 86375811ns, 86375us, and 86ms, or ~12ms for each iteration
>
> You quoted an average of 99901858ns - 99.9ms for the i=3 case.
> You quoted an average of 86375811ns - 86.4ms for the i=4 case.
>
> Given that the difference in msleep() delay is 5ms, and we're
> talking about 50 or 55ms here, for the i=3 case that's 45ms
> for i2c_transfer(). For the i=4 case, that's 36.4ms.
Ouch ... that's a _LOT_ more than I thought.That explains the difference.  Sorry.
>
> However, msleep() is not accurate - and may even be bucket-based so I
> wouldn't rely on the requested msleep() interval being what you end
> up with - which seems to be suggested by the difference of almost 10ms
> in the apparent time that i2c_transfer() takes. 10ms, not 10us. So,
> unless you actually obtain timestamps before and after the
> i2c_transfer() call and calculate their difference, I would not read
> too much into that.
>
> In any case, figures in the realms of milliseconds are certainly in the
> realm of possibility for a 100kHz bus - as I say, one instance of a
> transaction _should_ be no less than 840 microseconds, so if your
> calculations come out less than that, you should not be claiming
> "bad bus" or something like that, but at first revalidating your
> analysis or interpretation of the figures.
>
> Also, because of scheduling delays, and some I2C drivers will sleep
> waiting for the transaction to finish, even that measurement I
> suggest can not be said to relate to the actual time it takes for
> the transactions on the bus, unless you're running a hard-realtime OS.
you are giving a lot of reasons why it's unreliable ...
and I certainly won't dare to calculate useless statistics anymore.
>
> However, it seems you're very keen to blame the I2C bus hardware...
>
Based on my mails, I can certainly see why you're thinking this way.
I have no idea what goes wrong anywhere between me making a modification in the mdio.c file -> 
i2c code -> ... -> SFP phy.
I'm curious what goes wrong, notice the 3 dots in between,
I know there's a pca9545 muxer in in there further complicating it, but that's about it.

Long story short: should I somehow try to test the reliability of something else?

Thanks,

Janpieter Sollie

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2025-09-23  7:04 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-19 13:52 [RFC] increase MDIO i2c poll timeout gradually (including patch) Janpieter Sollie
2025-09-19 17:04 ` Andrew Lunn
2025-09-20 10:00   ` Janpieter Sollie
2025-09-20 11:18     ` Russell King (Oracle)
2025-09-20 12:34       ` Janpieter Sollie
     [not found]       ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>
2025-09-20 13:53         ` Russell King (Oracle)
2025-09-22  8:04           ` Janpieter Sollie
2025-09-22 10:54             ` Janpieter Sollie
2025-09-22 12:38               ` Russell King (Oracle)
2025-09-22 13:16                 ` Russell King (Oracle)
2025-09-22 13:50                   ` Russell King (Oracle)
2025-09-22 14:30                 ` Janpieter Sollie
2025-09-22 15:25                   ` Russell King (Oracle)
2025-09-23  7:02                     ` Janpieter Sollie
2025-09-22 14:46                 ` Janpieter Sollie

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).