* [RFC] increase MDIO i2c poll timeout gradually (including patch)
@ 2025-09-19 13:52 Janpieter Sollie
2025-09-19 17:04 ` Andrew Lunn
0 siblings, 1 reply; 15+ messages in thread
From: Janpieter Sollie @ 2025-09-19 13:52 UTC (permalink / raw)
To: netdev
Hello everyone,
I tested a SFP module where the i2c bus is "unstable" at best.
different i2c timeouts occured, resulting in a "phy not detected" error message.
A simple (but not revealing a lot) stack dump during probe::
> mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:268) mdio_i2c
> mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:316) mdio_i2c
> __mdiobus_c45_read (drivers/net/phy/mdio_bus.c:992)
> mdiobus_c45_read (drivers/net/phy/mdio_bus.c:1133)
> get_phy_c45_ids (drivers/net/phy/phy_device.c:947)
> get_phy_device (drivers/net/phy/phy_device.c:1054)
> init_module (drivers/net/phy/sfp.c:1820) sfp
> cleanup_module (drivers/net/phy/sfp.c:1956 drivers/net/phy/sfp.c:2667
drivers/net/phy/sfp.c:2748) sfp
> cleanup_module (drivers/net/phy/sfp.c:2760 drivers/net/phy/sfp.c:2892) sfp
> process_one_work (./arch/arm64/include/asm/jump_label.h:32 ./include/linux/jump_label.h:207)
> worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391
(discriminator 2))
> kthread (kernel/kthread.c:389)
> ret_from_fork (arch/arm64/kernel/entry.S:863)
I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious.
In order to lower the stress on the i2c bus, I made the following patch.
is it the best way to "not-stress-sensitive-devices"?
Will it cause a performance regression on some other SFP cages?
Eric Woudstra told me another option was to add a few tries, increasing i = 10,
If the issue isn't the device itself, but the stress on the i2c bus is too high, it may not be a
real solution.
A good question may be: is this approach sufficient to close the gap between
"high performance" equipment having a stable i2c bus and they do not want to wait,
and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts?
Should this be fixed at another point in the initialization process (eg: not probing
ridiculously all phy ids)?
Thanks,
Janpieter Sollie
--- a/drivers/net/mdio/mdio-i2c.c 2025-09-19 14:08:41.285357818 +0200
+++ b/drivers/net/mdio/mdio-i2c.c 2025-09-19 14:10:24.962796149 +0200
@@ -253,7 +253,7 @@ static int i2c_rollball_mii_poll(struct mii_bus *bus, int bus_addr, u8 *buf,
*/
i = 10;
do {
- msleep(20);
+ msleep(20+(10*(10-i)));
ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs));
if (ret)
^ permalink raw reply [flat|nested] 15+ messages in thread* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-19 13:52 [RFC] increase MDIO i2c poll timeout gradually (including patch) Janpieter Sollie @ 2025-09-19 17:04 ` Andrew Lunn 2025-09-20 10:00 ` Janpieter Sollie 0 siblings, 1 reply; 15+ messages in thread From: Andrew Lunn @ 2025-09-19 17:04 UTC (permalink / raw) To: Janpieter Sollie; +Cc: netdev, Russell King, Heiner Kallweit On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote: > Hello everyone, Please ensure you Cc: the correct Maintainers. ./scripts/get_maintainer.pl drivers/net/phy/sfp.c Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT) Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY) Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY) "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS) Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS) Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS) Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS) netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT) linux-kernel@vger.kernel.org (open list) > I tested a SFP module where the i2c bus is "unstable" at best. Please tell us more about the hardware. Also, what speed do you have the I2C bus running at? Have you tried different clock-frequency values to slow down the I2C bus? Have you checked the pull-up resistors? I2C problems are sometimes due to too strong pull-ups. > A good question may be: is this approach sufficient to close the gap between > "high performance" equipment having a stable i2c bus and they do not want to wait, > and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts? Does your board actually confirm to the standards? I2C busses should be able to run at 100KHz, as defined by the standard. Also, the SFP standards define modules should work at 100KHz. And counting every milliwatt makes no sense when you are supposed to be able to deliver 3.3V at 300mA, i.e. 1 Watt, to the module. > Should this be fixed at another point in the initialization process (eg: not > probing ridiculously all phy ids)? Unfortunately, MDIO over I2C is not standardised. So we have no idea what address the PHY will be using, so we need to look at them all. If you have an SFF, not an SFP, it might be possible to do some optimisation. Andrew ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-19 17:04 ` Andrew Lunn @ 2025-09-20 10:00 ` Janpieter Sollie 2025-09-20 11:18 ` Russell King (Oracle) 0 siblings, 1 reply; 15+ messages in thread From: Janpieter Sollie @ 2025-09-20 10:00 UTC (permalink / raw) To: Andrew Lunn Cc: netdev, Russell King, Heiner Kallweit, Andrew Lunn, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 19/09/2025 om 19:04 schreef Andrew Lunn: > On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote: >> Hello everyone, > Please ensure you Cc: the correct Maintainers. > > ./scripts/get_maintainer.pl drivers/net/phy/sfp.c > Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT) > Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY) > Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY) > "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS) > Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS) > Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS) > Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS) > netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT) > linux-kernel@vger.kernel.org (open list) Done, sorry, this is my first post here > >> I tested a SFP module where the i2c bus is "unstable" at best. > Please tell us more about the hardware. > > Also, what speed do you have the I2C bus running at? Have you tried > different clock-frequency values to slow down the I2C bus? Have you > checked the pull-up resistors? I2C problems are sometimes due to too > strong pull-ups. The hardware is a bananapi R4 2xSFP using a MT7988a SoC. The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a quirk in sfp.c (added below) I'm not a i2c expert at all, but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root. I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell? The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know. > >> A good question may be: is this approach sufficient to close the gap between >> "high performance" equipment having a stable i2c bus and they do not want to wait, >> and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts? > Does your board actually confirm to the standards? I2C busses should > be able to run at 100KHz, as defined by the standard. Also, the SFP > standards define modules should work at 100KHz. And counting every > milliwatt makes no sense when you are supposed to be able to deliver > 3.3V at 300mA, i.e. 1 Watt, to the module. Sorry, I thought the issue here might be the SFP cage limited to 3W. While it should be enough to feed the RJ45 short range module, maybe it needs more during initalization. Is there a way to tell? > >> Should this be fixed at another point in the initialization process (eg: not >> probing ridiculously all phy ids)? > Unfortunately, MDIO over I2C is not standardised. So we have no idea > what address the PHY will be using, so we need to look at them all. If > you have an SFF, not an SFP, it might be possible to do some > optimisation. > > Andrew > Well, that's a bummer. is it possible to perform stress testing on the bus by randomly reading those phy registers and check whether the muxer is the cause of the instability, or the target device itself? Janpieter Sollie --- a/drivers/net/phy/sfp.c 2025-06-21 15:06:02.086163322 +0200 +++ b/drivers/net/phy/sfp.c 2025-06-21 15:59:41.836883438 +0200 @@ -422,6 +422,13 @@ sfp->id.base.extended_cc = SFF8024_ECC_10GBASE_T_SFI; } +static void sfp_fixup_oem_10gt(struct sfp *sfp) +{ + sfp_fixup_10gbaset_30m(sfp); + sfp_fixup_rollball_wait4s(sfp); + //sfp->module_t_wait = msecs_to_jiffies(10000); +} + static void sfp_quirk_2500basex(const struct sfp_eeprom_id *id, unsigned long *modes, unsigned long *interfaces) @@ -516,6 +523,7 @@ SFP_QUIRK_F("OEM", "SFP-GE-T", sfp_fixup_ignore_tx_fault), SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc), + SFP_QUIRK_F("OEM", "ZK-10G-TX", sfp_fixup_oem_10gt), SFP_QUIRK_M("OEM", "SFP-2.5G-T", sfp_quirk_oem_2_5g), SFP_QUIRK_M("OEM", "SFP-2.5G", sfp_quirk_oem_2_5g), SFP_QUIRK_M("OEM", "SFP-2.5G-BX10-D", sfp_quirk_2500basex), ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-20 10:00 ` Janpieter Sollie @ 2025-09-20 11:18 ` Russell King (Oracle) 2025-09-20 12:34 ` Janpieter Sollie [not found] ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de> 0 siblings, 2 replies; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-20 11:18 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Sat, Sep 20, 2025 at 12:00:50PM +0200, Janpieter Sollie wrote: > Op 19/09/2025 om 19:04 schreef Andrew Lunn: > > On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote: > > > Hello everyone, > > Please ensure you Cc: the correct Maintainers. > > > > ./scripts/get_maintainer.pl drivers/net/phy/sfp.c > > Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT) > > Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY) > > Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY) > > "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS) > > Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS) > > Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS) > > Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS) > > netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT) > > linux-kernel@vger.kernel.org (open list) > Done, sorry, this is my first post here > > > > > I tested a SFP module where the i2c bus is "unstable" at best. > > Please tell us more about the hardware. > > > > Also, what speed do you have the I2C bus running at? Have you tried > > different clock-frequency values to slow down the I2C bus? Have you > > checked the pull-up resistors? I2C problems are sometimes due to too > > strong pull-ups. > The hardware is a bananapi R4 2xSFP using a MT7988a SoC. > The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a > quirk in sfp.c (added below) > I'm not a i2c expert at all, > but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root. > I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell? > > The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know. As you have not include the author of the SFP support (me) in your initial email, and have not provided a repeat of the description, I'm afraid I have no idea what the issue is that you're encountering. Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-20 11:18 ` Russell King (Oracle) @ 2025-09-20 12:34 ` Janpieter Sollie [not found] ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de> 1 sibling, 0 replies; 15+ messages in thread From: Janpieter Sollie @ 2025-09-20 12:34 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 20/09/2025 om 13:18 schreef Russell King (Oracle): > On Sat, Sep 20, 2025 at 12:00:50PM +0200, Janpieter Sollie wrote: >> Op 19/09/2025 om 19:04 schreef Andrew Lunn: >>> On Fri, Sep 19, 2025 at 03:52:55PM +0200, Janpieter Sollie wrote: >>>> Hello everyone, >>> Please ensure you Cc: the correct Maintainers. >>> >>> ./scripts/get_maintainer.pl drivers/net/phy/sfp.c >>> Russell King <linux@armlinux.org.uk> (maintainer:SFF/SFP/SFP+ MODULE SUPPORT) >>> Andrew Lunn <andrew@lunn.ch> (maintainer:ETHERNET PHY LIBRARY) >>> Heiner Kallweit <hkallweit1@gmail.com> (maintainer:ETHERNET PHY LIBRARY) >>> "David S. Miller" <davem@davemloft.net> (maintainer:NETWORKING DRIVERS) >>> Eric Dumazet <edumazet@google.com> (maintainer:NETWORKING DRIVERS) >>> Jakub Kicinski <kuba@kernel.org> (maintainer:NETWORKING DRIVERS) >>> Paolo Abeni <pabeni@redhat.com> (maintainer:NETWORKING DRIVERS) >>> netdev@vger.kernel.org (open list:SFF/SFP/SFP+ MODULE SUPPORT) >>> linux-kernel@vger.kernel.org (open list) >> Done, sorry, this is my first post here >>>> I tested a SFP module where the i2c bus is "unstable" at best. >>> Please tell us more about the hardware. >>> >>> Also, what speed do you have the I2C bus running at? Have you tried >>> different clock-frequency values to slow down the I2C bus? Have you >>> checked the pull-up resistors? I2C problems are sometimes due to too >>> strong pull-ups. >> The hardware is a bananapi R4 2xSFP using a MT7988a SoC. >> The SFP+ module is a RJ45 rollball module using a AQR113C phy, but needs a >> quirk in sfp.c (added below) >> I'm not a i2c expert at all, >> but about the i2c bus speed, the SFP cage seems to be behind a muxer, not a i2c root. >> I could not find anything about i2c bus speed in /proc or /sys, maybe it's impossible to tell? >> >> The dtsi or dtso files do not mention anything about bus speeds, so I honestly do not know. > As you have not include the author of the SFP support (me) in your > initial email, and have not provided a repeat of the description, > I'm afraid I have no idea what the issue is that you're encountering. > > Thanks. > Yes indeed, I see it too. Hereby the original post: ====================== Hello everyone, I tested a SFP module where the i2c bus is "unstable" at best. different i2c timeouts occured, resulting in a "phy not detected" error message. A simple (but not revealing a lot) stack dump during probe:: > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:268) mdio_i2c > mdio_i2c_alloc (drivers/net/mdio/mdio-i2c.c:316) mdio_i2c > __mdiobus_c45_read (drivers/net/phy/mdio_bus.c:992) > mdiobus_c45_read (drivers/net/phy/mdio_bus.c:1133) > get_phy_c45_ids (drivers/net/phy/phy_device.c:947) > get_phy_device (drivers/net/phy/phy_device.c:1054) > init_module (drivers/net/phy/sfp.c:1820) sfp > cleanup_module (drivers/net/phy/sfp.c:1956 drivers/net/phy/sfp.c:2667 drivers/net/phy/sfp.c:2748) sfp > cleanup_module (drivers/net/phy/sfp.c:2760 drivers/net/phy/sfp.c:2892) sfp > process_one_work (./arch/arm64/include/asm/jump_label.h:32 ./include/linux/jump_label.h:207) > worker_thread (kernel/workqueue.c:3304 (discriminator 2) kernel/workqueue.c:3391 (discriminator 2)) > kthread (kernel/kthread.c:389) > ret_from_fork (arch/arm64/kernel/entry.S:863) I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious. In order to lower the stress on the i2c bus, I made the following patch. is it the best way to "not-stress-sensitive-devices"? Will it cause a performance regression on some other SFP cages? Eric Woudstra told me another option was to add a few tries, increasing i = 10, If the issue isn't the device itself, but the stress on the i2c bus is too high, it may not be a real solution. A good question may be: is this approach sufficient to close the gap between "high performance" equipment having a stable i2c bus and they do not want to wait, and embedded equipment (the device I tested on was a BPI-R4) where every milliwatt counts? Should this be fixed at another point in the initialization process (eg: not probing ridiculously all phy ids)? Thanks, Janpieter Sollie --- a/drivers/net/mdio/mdio-i2c.c 2025-09-19 14:08:41.285357818 +0200 +++ b/drivers/net/mdio/mdio-i2c.c 2025-09-19 14:10:24.962796149 +0200 @@ -253,7 +253,7 @@ static int i2c_rollball_mii_poll(struct mii_bus *bus, int bus_addr, u8 *buf, */ i = 10; do { - msleep(20); + msleep(20+(10*(10-i))); ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs)); if (ret) ^ permalink raw reply [flat|nested] 15+ messages in thread
[parent not found: <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>]
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) [not found] ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de> @ 2025-09-20 13:53 ` Russell King (Oracle) 2025-09-22 8:04 ` Janpieter Sollie 0 siblings, 1 reply; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-20 13:53 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Sat, Sep 20, 2025 at 02:09:47PM +0200, Janpieter Sollie wrote: > I tested a SFP module where the i2c bus is "unstable" at best. > different i2c timeouts occured, resulting in a "phy not detected" error message. If the I2C bus is so unstable that attempting to read a register from the PHY, which should take no more than 70ms, takes in excess of 200ms (which is what it takes for the loop to time out) for just one register, it seems to me that you're chasing a dead horse. However, looking at the rest of your message, I don't think you have a problem with the I2C bus at all. > I noticed a few hard-coded numbers in i2c_rollball_mii_pol(), which is always suspicious. Sorry, but I don't follow the same reasoning. If they aren't hard-coded, but are "knobs" in e.g. sysfs, then how do we teach people how to "tune" them? When should they "tune" them? No, the less knobs there are, the better for the user, provided timeouts are sensible - and I think waiting 200ms for a register to be read is already an excessively long timeout. In any case, if we have a module that takes longer, that's new, and the timeouts need to be adjusted - not on a per-module basis, and not by users having to tinker with stuff in sysfs. > In order to lower the stress on the i2c bus, I made the following patch. > is it the best way to "not-stress-sensitive-devices"? > Will it cause a performance regression on some other SFP cages? > > Eric Woudstra told me another option was to add a few tries, increasing i = 10, > If the issue isn't the device itself, but the stress on the i2c bus is too > high, it may not be a real solution. Why are you concerned about "stress" on the I2C bus? What kind of stress? The bus is 5V or 3.3V signalling, running at 100kHz (so slow) with resistive pull-ups. Apart from the bus transitions (which cause CMOS to take a pulse of power) the energy from that will be nothing compared to the energy required to run the CPU, which is operating much faster with many more CMOS transistors switching. > A good question may be: is this approach sufficient to close the gap between > "high performance" equipment having a stable i2c bus and they do not want to wait, None of this has been written for "high performance" equipment. It was developed on a SolidRun clearfog platform (Armada 388 based) which is hardly "high performance". It's been used with bit-banged I2C as well on Macchiatobin platforms. I'm guessing that the problem is not an I2C bus problem (which would cause i2c_transfer_rollball() to return an error, and the loop to quit), but yet another cheap and nasty SFP module that takes much longer than 70ms to respond under the "Rollball" protocol. So, what we need you to do is to work out how long it takes this module to respond, and whether it always takes a long time to respond. Please add some debugging to i2c_rollball_mii_poll() to measure the amount of time it takes for the module to respond - and please measure it for several transactions. You can use jiffies, and convert to msecs using jiffies_to_msecs(), or you could use ktime_get_ns(). Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-20 13:53 ` Russell King (Oracle) @ 2025-09-22 8:04 ` Janpieter Sollie 2025-09-22 10:54 ` Janpieter Sollie 0 siblings, 1 reply; 15+ messages in thread From: Janpieter Sollie @ 2025-09-22 8:04 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 20/09/2025 om 15:53 schreef Russell King (Oracle): > So, what we need you to do is to work out how long it takes this module > to respond, and whether it always takes a long time to respond. Please > add some debugging to i2c_rollball_mii_poll() to measure the amount of > time it takes for the module to respond - and please measure it for > several transactions. > > You can use jiffies, and convert to msecs using jiffies_to_msecs(), > or you could use ktime_get_ns(). > > Thanks. > All right, so I changed the modification to a more debug-friendly function (view below). I also changed the incremental wait() function from (20+10*(10-i)) to (20+5*(10-i)) to be more accurate. When bringing up with "ip link set sfp-lan up", it starts printing It keeps polling (which, I guess, is what you expect, I guess the hwmon subsystem, link detection, ... ) Mostly, the output is 300-400us. When the module is up and running, and maintenance polls happen, I saw something new: It starts responding with a much lower value: something like 30 us. At iteration 10. occasionally. i = 10 means "immediately", i decrements from 10 to 0. I got no clarification for that. The thing I'm mostly sure of, is that no commands return an error, and no mii_polls time out, but 300-400us is way too much for a loop which expects to return in 200us ... 156724.601882] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348074665 ns in iteration 4 [156724.941879] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328077236 ns in iteration 4 [156725.301889] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348093665 ns in iteration 4 [156725.661896] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348075511 ns in iteration 4 [156726.081885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 408061414 ns in iteration 3 [156726.441997] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 348189434 ns in iteration 4 [156726.781885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327952775 ns in iteration 4 [156727.831885] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 30037797 ns in iteration 10 [156728.241880] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398062738 ns in iteration 3 [156728.591889] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338068066 ns in iteration 4 [156728.941888] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338062605 ns in iteration 4 [156729.351924] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398101891 ns in iteration 3 [156729.692014] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328135697 ns in iteration 4 [156730.041895] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 337947143 ns in iteration 4 [156730.391997] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 338162758 ns in iteration 4 [156730.741994] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 337980912 ns in iteration 4 [156731.831897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 30043259 ns in iteration 10 [156732.241897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398065122 ns in iteration 3 [156732.581982] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328157082 ns in iteration 4 [156732.921978] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327986467 ns in iteration 4 ... This was an example of the output when using the patch below. No poll command returns an error state. No poll commands time out. all mii_poll commands return at i = 3 or i = 4, when the sfp module is completely up, i = 10 occurs from time to time. Thanks, Janpieter Sollie --- a/drivers/net/mdio/mdio-i2c.c 2025-09-22 08:41:12.084752517 +0200 +++ b/drivers/net/mdio/mdio-i2c.c 2025-09-22 08:47:20.230592266 +0200 @@ -232,6 +232,7 @@ struct i2c_msg msgs[2]; u8 cmd_addr, tmp, *res; int i, ret; + long uptime; cmd_addr = ROLLBALL_CMD_ADDR; @@ -252,14 +253,19 @@ * SFPs. Sleep 20ms between iterations and try 10 times. */ i = 10; + uptime = ktime_get_ns(); do { - msleep(20); + msleep(20 + 5*(10-i)); ret = i2c_transfer_rollball(i2c, msgs, ARRAY_SIZE(msgs)); - if (ret) + if (ret) { + dev_dbg(&bus->dev, "poll cmd return %d after %d ns in iteration %d\n", ret, (ktime_get_ns() - uptime), i); return ret; + } - if (*res == ROLLBALL_CMD_DONE) + if (*res == ROLLBALL_CMD_DONE) { + dev_dbg(&bus->dev, "poll cmd success after %d ns in iteration %d\n", (ktime_get_ns() - uptime), i); return 0; + } } while (i-- > 0); ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 8:04 ` Janpieter Sollie @ 2025-09-22 10:54 ` Janpieter Sollie 2025-09-22 12:38 ` Russell King (Oracle) 0 siblings, 1 reply; 15+ messages in thread From: Janpieter Sollie @ 2025-09-22 10:54 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 22/09/2025 om 10:04 schreef Janpieter Sollie: > Op 20/09/2025 om 15:53 schreef Russell King (Oracle): >> So, what we need you to do is to work out how long it takes this module >> to respond, and whether it always takes a long time to respond. Please >> add some debugging to i2c_rollball_mii_poll() to measure the amount of >> time it takes for the module to respond - and please measure it for >> several transactions. >> >> You can use jiffies, and convert to msecs using jiffies_to_msecs(), >> or you could use ktime_get_ns(). >> >> Thanks. >> > > All right, so I changed the modification to a more debug-friendly function (view below). > I also changed the incremental wait() function from (20+10*(10-i)) to (20+5*(10-i)) to be more > accurate. > > [156732.241897] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 398065122 > ns in iteration 3 > [156732.581982] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 328157082 > ns in iteration 4 > [156732.921978] i2c_rollball_mii_poll:267: mdio_bus i2c:sfp2: poll cmd success after 327986467 > ns in iteration 4 > > ... > Something I noticed when going through a lot more iterations: FYI: The kernel has been compiled with a 100hz timer, tickless idle and no kernel preemption. I wanted to count how much the system actually runs those i2c calls, substracting msleep, which is 20, 245 and 300, respectively. So, I separated i = 10, i = 4 and i = 3 a bit. > 192 numbers for i = 10 > 2309 numbers for i = 4 > 1129 numbers for i = 3 and calculating max and min, substracting the msleep: > 1 function call max: 20131327 > 1 function call min: 9990574 > diff at iteration 10: 10140753 > avg at iteration 10: 10723993 > 7 function calls min: 82868351 > 7 function calls max: 123074939 > diff at iteration 4: 40206588 > avg at iteration 4: 86375811 > 8 function calls min: 97889217 > 8 function calls max: 128086531 > diff at iteration 3: 30197314 > avg at iteration 3: 99901858 this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the i2c_transfer_rollball(). seems a lot to me when an i2c call takes 11-12 usecs avg per call are you sure these numbers point to a stable i2c bus? thanks, Janpieter Sollie ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 10:54 ` Janpieter Sollie @ 2025-09-22 12:38 ` Russell King (Oracle) 2025-09-22 13:16 ` Russell King (Oracle) ` (2 more replies) 0 siblings, 3 replies; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-22 12:38 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote: > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device > is running the i2c_transfer_rollball(). > seems a lot to me when an i2c call takes 11-12 usecs avg per call > are you sure these numbers point to a stable i2c bus? I guess you've never dealt with I2C buses before. As has already been stated, the clock rate for I2C used with SFP modules (which is, if you like, I2C v1) is 100kHz. that's 10us per bit. An I2C transaction consists of one start bit, 8 bits for the address and r/w bit, one bit for the ack, 8 bits for the each byte of data and their individual ACK bits, and finally a stop bit. If a restart condition is used, the stop and start between the messages can be combined into a restart condition, saving one bit. That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits or clocks. The polling consists of two transactions on the bus: - a write of one byte - giving 20 clock cycles. - a read of six bytes - giving 65 clock cycles. So that's 85 clock cycles, or 84 if using restart. At 10us per cycle, that's 840us _minimum_. If i2c_transfer() for that write and read are taking on the order of 12us, that suggest the bus is being clocked at around 7MHz, which is certainly way too fast, a bug in the I2C driver, an issue with the I2C hardware, or maybe an error in calculating how long a call takes. And... it's your interpretation of your results. Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry to teach you to suck eggs, but based on your reply it seems necessary to point this out. You quoted an average of 99901858ns - 99.9ms for the i=3 case. You quoted an average of 86375811ns - 86.4ms for the i=4 case. Given that the difference in msleep() delay is 5ms, and we're talking about 50 or 55ms here, for the i=3 case that's 45ms for i2c_transfer(). For the i=4 case, that's 36.4ms. However, msleep() is not accurate - and may even be bucket-based so I wouldn't rely on the requested msleep() interval being what you end up with - which seems to be suggested by the difference of almost 10ms in the apparent time that i2c_transfer() takes. 10ms, not 10us. So, unless you actually obtain timestamps before and after the i2c_transfer() call and calculate their difference, I would not read too much into that. In any case, figures in the realms of milliseconds are certainly in the realm of possibility for a 100kHz bus - as I say, one instance of a transaction _should_ be no less than 840 microseconds, so if your calculations come out less than that, you should not be claiming "bad bus" or something like that, but at first revalidating your analysis or interpretation of the figures. Also, because of scheduling delays, and some I2C drivers will sleep waiting for the transaction to finish, even that measurement I suggest can not be said to relate to the actual time it takes for the transactions on the bus, unless you're running a hard-realtime OS. However, it seems you're very keen to blame the I2C bus hardware... -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 12:38 ` Russell King (Oracle) @ 2025-09-22 13:16 ` Russell King (Oracle) 2025-09-22 13:50 ` Russell King (Oracle) 2025-09-22 14:30 ` Janpieter Sollie 2025-09-22 14:46 ` Janpieter Sollie 2 siblings, 1 reply; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-22 13:16 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Mon, Sep 22, 2025 at 01:38:01PM +0100, Russell King (Oracle) wrote: > On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote: > > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device > > is running the i2c_transfer_rollball(). > > seems a lot to me when an i2c call takes 11-12 usecs avg per call > > are you sure these numbers point to a stable i2c bus? > > I guess you've never dealt with I2C buses before. As has already been > stated, the clock rate for I2C used with SFP modules (which is, if you > like, I2C v1) is 100kHz. that's 10us per bit. > > An I2C transaction consists of one start bit, 8 bits for the address > and r/w bit, one bit for the ack, 8 bits for the each byte of data > and their individual ACK bits, and finally a stop bit. If a restart > condition is used, the stop and start between the messages can be > combined into a restart condition, saving one bit. > > That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits > or clocks. > > The polling consists of two transactions on the bus: > > - a write of one byte - giving 20 clock cycles. > - a read of six bytes - giving 65 clock cycles. > > So that's 85 clock cycles, or 84 if using restart. At 10us per cycle, > that's 840us _minimum_. > > If i2c_transfer() for that write and read are taking on the order of > 12us, that suggest the bus is being clocked at around 7MHz, which is > certainly way too fast, a bug in the I2C driver, an issue with the > I2C hardware, or maybe an error in calculating how long a call takes. > > And... it's your interpretation of your results. > > Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds > (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry > to teach you to suck eggs, but based on your reply it seems necessary > to point this out. > > You quoted an average of 99901858ns - 99.9ms for the i=3 case. > You quoted an average of 86375811ns - 86.4ms for the i=4 case. > > Given that the difference in msleep() delay is 5ms, and we're > talking about 50 or 55ms here, for the i=3 case that's 45ms > for i2c_transfer(). For the i=4 case, that's 36.4ms. > > However, msleep() is not accurate - and may even be bucket-based so I > wouldn't rely on the requested msleep() interval being what you end > up with - which seems to be suggested by the difference of almost 10ms > in the apparent time that i2c_transfer() takes. 10ms, not 10us. So, > unless you actually obtain timestamps before and after the > i2c_transfer() call and calculate their difference, I would not read > too much into that. > > In any case, figures in the realms of milliseconds are certainly in the > realm of possibility for a 100kHz bus - as I say, one instance of a > transaction _should_ be no less than 840 microseconds, so if your > calculations come out less than that, you should not be claiming > "bad bus" or something like that, but at first revalidating your > analysis or interpretation of the figures. > > Also, because of scheduling delays, and some I2C drivers will sleep > waiting for the transaction to finish, even that measurement I > suggest can not be said to relate to the actual time it takes for > the transactions on the bus, unless you're running a hard-realtime OS. > > However, it seems you're very keen to blame the I2C bus hardware... I'll also point out that if an error occurs on the I2C bus, the earliest that could be detected is after the address byte (lack of ACK to the address - non-responsive slave) which is 10 clocks, or 100us. If an error occurs, then i2c_transfer() should either return an error, or indicate that the transaction has not been completed. What I have missed in my analysis is that it's not calling i2c_transfer() but i2c_transfer_rollball() which does many more bus transactions. So, my figures of 840us minimum is likely a grossly under the expected time. However, my fundamental point still stands - that being 10us for i2c_transfer_rollball() is an insanely short period for the work that needs to be done no matter _what_ happens on the hardware bus. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 13:16 ` Russell King (Oracle) @ 2025-09-22 13:50 ` Russell King (Oracle) 0 siblings, 0 replies; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-22 13:50 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Mon, Sep 22, 2025 at 02:16:03PM +0100, Russell King (Oracle) wrote: > On Mon, Sep 22, 2025 at 01:38:01PM +0100, Russell King (Oracle) wrote: > > On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote: > > > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device > > > is running the i2c_transfer_rollball(). > > > seems a lot to me when an i2c call takes 11-12 usecs avg per call > > > are you sure these numbers point to a stable i2c bus? > > > > I guess you've never dealt with I2C buses before. As has already been > > stated, the clock rate for I2C used with SFP modules (which is, if you > > like, I2C v1) is 100kHz. that's 10us per bit. > > > > An I2C transaction consists of one start bit, 8 bits for the address > > and r/w bit, one bit for the ack, 8 bits for the each byte of data > > and their individual ACK bits, and finally a stop bit. If a restart > > condition is used, the stop and start between the messages can be > > combined into a restart condition, saving one bit. > > > > That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits > > or clocks. > > > > The polling consists of two transactions on the bus: > > > > - a write of one byte - giving 20 clock cycles. > > - a read of six bytes - giving 65 clock cycles. > > > > So that's 85 clock cycles, or 84 if using restart. At 10us per cycle, > > that's 840us _minimum_. > > > > If i2c_transfer() for that write and read are taking on the order of > > 12us, that suggest the bus is being clocked at around 7MHz, which is > > certainly way too fast, a bug in the I2C driver, an issue with the > > I2C hardware, or maybe an error in calculating how long a call takes. > > > > And... it's your interpretation of your results. > > > > Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds > > (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry > > to teach you to suck eggs, but based on your reply it seems necessary > > to point this out. > > > > You quoted an average of 99901858ns - 99.9ms for the i=3 case. > > You quoted an average of 86375811ns - 86.4ms for the i=4 case. > > > > Given that the difference in msleep() delay is 5ms, and we're > > talking about 50 or 55ms here, for the i=3 case that's 45ms > > for i2c_transfer(). For the i=4 case, that's 36.4ms. > > > > However, msleep() is not accurate - and may even be bucket-based so I > > wouldn't rely on the requested msleep() interval being what you end > > up with - which seems to be suggested by the difference of almost 10ms > > in the apparent time that i2c_transfer() takes. 10ms, not 10us. So, > > unless you actually obtain timestamps before and after the > > i2c_transfer() call and calculate their difference, I would not read > > too much into that. > > > > In any case, figures in the realms of milliseconds are certainly in the > > realm of possibility for a 100kHz bus - as I say, one instance of a > > transaction _should_ be no less than 840 microseconds, so if your > > calculations come out less than that, you should not be claiming > > "bad bus" or something like that, but at first revalidating your > > analysis or interpretation of the figures. > > > > Also, because of scheduling delays, and some I2C drivers will sleep > > waiting for the transaction to finish, even that measurement I > > suggest can not be said to relate to the actual time it takes for > > the transactions on the bus, unless you're running a hard-realtime OS. > > > > However, it seems you're very keen to blame the I2C bus hardware... > > I'll also point out that if an error occurs on the I2C bus, the earliest > that could be detected is after the address byte (lack of ACK to the > address - non-responsive slave) which is 10 clocks, or 100us. > > If an error occurs, then i2c_transfer() should either return an error, > or indicate that the transaction has not been completed. > > What I have missed in my analysis is that it's not calling > i2c_transfer() but i2c_transfer_rollball() which does many more bus > transactions. So, my figures of 840us minimum is likely a grossly > under the expected time. > > However, my fundamental point still stands - that being 10us for > i2c_transfer_rollball() is an insanely short period for the work that > needs to be done no matter _what_ happens on the hardware bus. Having now been through i2c_transfer_rollball(), the additional transactions on the bus will take between 97 and 98 cycles on top of what I mentioned, or 970us - 980us at 100kHz. In total, that works out at a minimum of 1.81ms for a successful access to the hardware, assuming every I2C transaction occurs back-to-back to the previous transaction. Thanks. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 12:38 ` Russell King (Oracle) 2025-09-22 13:16 ` Russell King (Oracle) @ 2025-09-22 14:30 ` Janpieter Sollie 2025-09-22 15:25 ` Russell King (Oracle) 2025-09-22 14:46 ` Janpieter Sollie 2 siblings, 1 reply; 15+ messages in thread From: Janpieter Sollie @ 2025-09-22 14:30 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 22/09/2025 om 14:38 schreef Russell King (Oracle): > On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote: >> this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device >> is running the i2c_transfer_rollball(). >> seems a lot to me when an i2c call takes 11-12 usecs avg per call >> are you sure these numbers point to a stable i2c bus? > I guess you've never dealt with I2C buses before. No, but I've read many kernel code dealing with it, desperately trying to figure out the error. But yes, this is only the software part. > As has already been > stated, the clock rate for I2C used with SFP modules (which is, if you > like, I2C v1) is 100kHz. that's 10us per bit. > > An I2C transaction consists of one start bit, 8 bits for the address > and r/w bit, one bit for the ack, 8 bits for the each byte of data > and their individual ACK bits, and finally a stop bit. If a restart > condition is used, the stop and start between the messages can be > combined into a restart condition, saving one bit. > > That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits > or clocks. > > The polling consists of two transactions on the bus: > > - a write of one byte - giving 20 clock cycles. > - a read of six bytes - giving 65 clock cycles. > > So that's 85 clock cycles, or 84 if using restart. At 10us per cycle, > that's 840us _minimum_. and here's my first error: > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the i2c_transfer_rollball(). > seems a lot to me when an i2c call takes 11-12 usecs avg per call I wrote usecs, but was obviously in a wrong universe, it should have been msecs. Sorry > > If i2c_transfer() for that write and read are taking on the order of > 12us, that suggest the bus is being clocked at around 7MHz, which is > certainly way too fast, a bug in the I2C driver, an issue with the > I2C hardware, or maybe an error in calculating how long a call takes. Yes, I'm afraid of the last one as well ... Honestly, even after re-reading it (twice), I'm still not sure if / when I made an error, so my calculations are below > > And... it's your interpretation of your results. > > Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds > (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry > to teach you to suck eggs, but based on your reply it seems necessary > to point this out. you're welcome, It's always useful to get corrected no mather what. I'm simply trying to find a solution for hardware I built myself, which is more or less "problem I created myself". And I _WILL_ make dumb errors. I'm sorry for that. but hey, at least I'm not crying like a little kid 'please help me to fix my internet' .... Yes, I know the definition of ns -> us -> ms -> sec, the commands used (example for i = 4 here, a total msleep of 245): $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | head -n 1) - 245000000)) -> for min $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | tail -n 1) - 245000000)) -> for max $ echo diff at iteration 4: $((123074939 - 82868351)) -> for diff $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | awk '{x+=$0}END{print int(x/NR)}' -) - 245000000)) -> for average avg: 86375811 so that's 86375811ns, 86375us, and 86ms, or ~12ms for each iteration > > You quoted an average of 99901858ns - 99.9ms for the i=3 case. > You quoted an average of 86375811ns - 86.4ms for the i=4 case. > > Given that the difference in msleep() delay is 5ms, and we're > talking about 50 or 55ms here, for the i=3 case that's 45ms > for i2c_transfer(). For the i=4 case, that's 36.4ms. Ouch ... that's a _LOT_ more than I thought.That explains the difference. Sorry. > > However, msleep() is not accurate - and may even be bucket-based so I > wouldn't rely on the requested msleep() interval being what you end > up with - which seems to be suggested by the difference of almost 10ms > in the apparent time that i2c_transfer() takes. 10ms, not 10us. So, > unless you actually obtain timestamps before and after the > i2c_transfer() call and calculate their difference, I would not read > too much into that. > > In any case, figures in the realms of milliseconds are certainly in the > realm of possibility for a 100kHz bus - as I say, one instance of a > transaction _should_ be no less than 840 microseconds, so if your > calculations come out less than that, you should not be claiming > "bad bus" or something like that, but at first revalidating your > analysis or interpretation of the figures. > > Also, because of scheduling delays, and some I2C drivers will sleep > waiting for the transaction to finish, even that measurement I > suggest can not be said to relate to the actual time it takes for > the transactions on the bus, unless you're running a hard-realtime OS. you are giving a lot of reasons why it's unreliable ... and I certainly won't dare to calculate useless statistics anymore. > However, it seems you're very keen to blame the I2C bus hardware... > Based on my mails, I can certainly see why you're thinking this way. I have no idea what goes wrong anywhere between me making a modification in the mdio.c file -> i2c code -> ... -> SFP phy. I'm curious what goes wrong, notice the 3 dots in between, I know there's a pca9545 muxer in in there further complicating it, but that's about it. Long story short: should I somehow try to test the reliability of something else? Thanks, Janpieter Sollie ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 14:30 ` Janpieter Sollie @ 2025-09-22 15:25 ` Russell King (Oracle) 2025-09-23 7:02 ` Janpieter Sollie 0 siblings, 1 reply; 15+ messages in thread From: Russell King (Oracle) @ 2025-09-22 15:25 UTC (permalink / raw) To: Janpieter Sollie Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni On Mon, Sep 22, 2025 at 04:30:56PM +0200, Janpieter Sollie wrote: > Based on my mails, I can certainly see why you're thinking this way. > I have no idea what goes wrong anywhere between me making a modification in > the mdio.c file -> i2c code -> ... -> SFP phy. > I'm curious what goes wrong, notice the 3 dots in between, > I know there's a pca9545 muxer in in there further complicating it, but that's about it. > > Long story short: should I somehow try to test the reliability of something else? What you have in these setups is: 1. The I2C bus from the host to the SFP module pins. On the SFP module is an EEPROM at address 0x50 which contains some useful, some not so useful identification of the module. 2. Sometimes there is a PHY at 0x56, which is normally a Marvell 88E1111 which was designed for use on SFPs, and has not only the conventional MDIO bus connectivity, but also supports I2C as well. 3. Some baseT modules, the PHY is not accessible. 4. Others have a microcontroller on them - so far identified some with an Arm Cortex-M controller, but others have an 8051-based controller to implement the "Rollball" protocol. So, in the case of Rollball protocol modules, one is at the mercy of the microcontroller receiving the I2C transactions, then accessing the PHY over MDIO, and then responding appropriately. Given that there are two different microcontrollers used for this task, I wouldn't be surprised if there were numerous different firmwares running on them of varying quality and efficiency. I would suggest your module is taking excessively long to respond for _some_ accesses. Maybe the controller isn't merely converting the Rollball protocol to MDIO, but is doing other PHY manipulation as well, e.g. emulating some functionality. It may be interesting to work out whether it is a specific register or set of registers that need longer access, and augment our knowledge about what is going on with this stuff. Ultimately yes, we likely have no option but to increase the timeout, and to do that I suggest simply increasing the number of loops - having the approx. 20ms delay between each attempt doesn't stress anything. -- RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 15:25 ` Russell King (Oracle) @ 2025-09-23 7:02 ` Janpieter Sollie 0 siblings, 0 replies; 15+ messages in thread From: Janpieter Sollie @ 2025-09-23 7:02 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni Op 22/09/2025 om 17:25 schreef Russell King (Oracle): > On Mon, Sep 22, 2025 at 04:30:56PM +0200, Janpieter Sollie wrote: >> Based on my mails, I can certainly see why you're thinking this way. >> I have no idea what goes wrong anywhere between me making a modification in >> the mdio.c file -> i2c code -> ... -> SFP phy. >> I'm curious what goes wrong, notice the 3 dots in between, >> I know there's a pca9545 muxer in in there further complicating it, but that's about it. >> >> Long story short: should I somehow try to test the reliability of something else? > What you have in these setups is: > > 1. The I2C bus from the host to the SFP module pins. On the SFP module > is an EEPROM at address 0x50 which contains some useful, some not so > useful identification of the module. > > 2. Sometimes there is a PHY at 0x56, which is normally a Marvell > 88E1111 which was designed for use on SFPs, and has not only the > conventional MDIO bus connectivity, but also supports I2C as well. The page is indeed present in i2cdetect ... > > 3. Some baseT modules, the PHY is not accessible. > I saw this on another SFP+ RJ45 module, I have 2 of them (unused), I'm thinking of breaking the metal cage of one to see what's inside (out if curiosity). What I do know is that there's no page 56. > > It may be interesting to work out whether it is a specific register or > set of registers that need longer access, and augment our knowledge > about what is going on with this stuff. Do you have a suggestion for this? I mean, runninig 'time i2cget xx xx' in a for loop for all pages (50, 51 and 56) over all registers 10 times or so takes a huge amount of time. > > Ultimately yes, we likely have no option but to increase the timeout, > and to do that I suggest simply increasing the number of loops - > having the approx. 20ms delay between each attempt doesn't stress > anything. > Let's try to avoid that at all costs: if I'm the first one reporting a problem like this, it may not be useful to let the whole subsystem sit back and relax for all modules, it may hide the case where there's really something wrong. Thanks ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC] increase MDIO i2c poll timeout gradually (including patch) 2025-09-22 12:38 ` Russell King (Oracle) 2025-09-22 13:16 ` Russell King (Oracle) 2025-09-22 14:30 ` Janpieter Sollie @ 2025-09-22 14:46 ` Janpieter Sollie 2 siblings, 0 replies; 15+ messages in thread From: Janpieter Sollie @ 2025-09-22 14:46 UTC (permalink / raw) To: Russell King (Oracle) Cc: Andrew Lunn, netdev, Heiner Kallweit, David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni (rewriting as plain text, mail client accidentally converted to html) Op 22/09/2025 om 14:38 schreef Russell King (Oracle): > On Mon, Sep 22, 2025 at 12:54:20PM +0200, Janpieter Sollie wrote: >> this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device >> is running the i2c_transfer_rollball(). >> seems a lot to me when an i2c call takes 11-12 usecs avg per call >> are you sure these numbers point to a stable i2c bus? > I guess you've never dealt with I2C buses before. No, but I've read many kernel code dealing with it, desperately trying to figure out the error. But yes, this is only the software part. > As has already been > stated, the clock rate for I2C used with SFP modules (which is, if you > like, I2C v1) is 100kHz. that's 10us per bit. > > An I2C transaction consists of one start bit, 8 bits for the address > and r/w bit, one bit for the ack, 8 bits for the each byte of data > and their individual ACK bits, and finally a stop bit. If a restart > condition is used, the stop and start between the messages can be > combined into a restart condition, saving one bit. > > That works out at 1 + 8 + 1 + N*(8 + 1) + 1 bits, or 11 + 9 * N bits > or clocks. > > The polling consists of two transactions on the bus: > > - a write of one byte - giving 20 clock cycles. > - a read of six bytes - giving 65 clock cycles. > > So that's 85 clock cycles, or 84 if using restart. At 10us per cycle, > that's 840us _minimum_. and here's my first error: > this is a diff of 10usecs (i=10), 40usecs (i=4) and 30usecs (i=3) my device is running the i2c_transfer_rollball(). > seems a lot to me when an i2c call takes 11-12 usecs avg per call I wrote usecs, but was obviously in a wrong universe, it should have been msecs. Sorry > > If i2c_transfer() for that write and read are taking on the order of > 12us, that suggest the bus is being clocked at around 7MHz, which is > certainly way too fast, a bug in the I2C driver, an issue with the > I2C hardware, or maybe an error in calculating how long a call takes. Yes, I'm afraid of the last one as well ... Honestly, even after re-reading it (twice), I'm still not sure if / when I made an error, so my calculations are below > > And... it's your interpretation of your results. > > Remember, these are nanoseconds (ns), nanoseconds are 1000 microseconds > (us) and there are 1000000 nanoseconds in a millisecond (ms). Sorry > to teach you to suck eggs, but based on your reply it seems necessary > to point this out. you're welcome, It's always useful to get corrected no mather what. I'm simply trying to find a solution for hardware I built myself, which is more or less "problem I created myself". And I _WILL_ make dumb errors. I'm sorry for that. but hey, at least I'm not crying like a little kid 'please help me to fix my internet' .... Yes, I know the definition of ns -> us -> ms -> sec, the commands used (example for i = 4 here, a total msleep of 245): $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | head -n 1) - 245000000)) -> for min $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | tail -n 1) - 245000000)) -> for max $ echo diff at iteration 4: $((123074939 - 82868351)) -> for diff $ echo $(($(sed -n 's/.*\ \([0-9]\+\)\ ns\ in\ iteration\ 4/\1/p' < tempoutput.txt | sort -n | awk '{x+=$0}END{print int(x/NR)}' -) - 245000000)) -> for average avg: 86375811 so that's 86375811ns, 86375us, and 86ms, or ~12ms for each iteration > > You quoted an average of 99901858ns - 99.9ms for the i=3 case. > You quoted an average of 86375811ns - 86.4ms for the i=4 case. > > Given that the difference in msleep() delay is 5ms, and we're > talking about 50 or 55ms here, for the i=3 case that's 45ms > for i2c_transfer(). For the i=4 case, that's 36.4ms. Ouch ... that's a _LOT_ more than I thought.That explains the difference. Sorry. > > However, msleep() is not accurate - and may even be bucket-based so I > wouldn't rely on the requested msleep() interval being what you end > up with - which seems to be suggested by the difference of almost 10ms > in the apparent time that i2c_transfer() takes. 10ms, not 10us. So, > unless you actually obtain timestamps before and after the > i2c_transfer() call and calculate their difference, I would not read > too much into that. > > In any case, figures in the realms of milliseconds are certainly in the > realm of possibility for a 100kHz bus - as I say, one instance of a > transaction _should_ be no less than 840 microseconds, so if your > calculations come out less than that, you should not be claiming > "bad bus" or something like that, but at first revalidating your > analysis or interpretation of the figures. > > Also, because of scheduling delays, and some I2C drivers will sleep > waiting for the transaction to finish, even that measurement I > suggest can not be said to relate to the actual time it takes for > the transactions on the bus, unless you're running a hard-realtime OS. you are giving a lot of reasons why it's unreliable ... and I certainly won't dare to calculate useless statistics anymore. > > However, it seems you're very keen to blame the I2C bus hardware... > Based on my mails, I can certainly see why you're thinking this way. I have no idea what goes wrong anywhere between me making a modification in the mdio.c file -> i2c code -> ... -> SFP phy. I'm curious what goes wrong, notice the 3 dots in between, I know there's a pca9545 muxer in in there further complicating it, but that's about it. Long story short: should I somehow try to test the reliability of something else? Thanks, Janpieter Sollie ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-09-23 7:04 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-19 13:52 [RFC] increase MDIO i2c poll timeout gradually (including patch) Janpieter Sollie
2025-09-19 17:04 ` Andrew Lunn
2025-09-20 10:00 ` Janpieter Sollie
2025-09-20 11:18 ` Russell King (Oracle)
2025-09-20 12:34 ` Janpieter Sollie
[not found] ` <6d444507-1c97-4904-8edb-e8cc1aa4399e@kabelmail.de>
2025-09-20 13:53 ` Russell King (Oracle)
2025-09-22 8:04 ` Janpieter Sollie
2025-09-22 10:54 ` Janpieter Sollie
2025-09-22 12:38 ` Russell King (Oracle)
2025-09-22 13:16 ` Russell King (Oracle)
2025-09-22 13:50 ` Russell King (Oracle)
2025-09-22 14:30 ` Janpieter Sollie
2025-09-22 15:25 ` Russell King (Oracle)
2025-09-23 7:02 ` Janpieter Sollie
2025-09-22 14:46 ` Janpieter Sollie
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).