* [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
@ 2024-01-01 21:31 Ezra Buehler
2024-01-01 22:44 ` Heiner Kallweit
0 siblings, 1 reply; 17+ messages in thread
From: Ezra Buehler @ 2024-01-01 21:31 UTC (permalink / raw)
To: Andrew Lunn, Heiner Kallweit, Russell King, Tristram Ha
Cc: Michael Walle, Jesse Brandeburg, netdev
Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
Gateway will no longer boot.
Prior to the mentioned change, probe_capabilities would be set to
MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
least with our setup) considerably slow down kernel startup and
ultimately result in a board reset.
AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
"Clause 45 protection" feature (e.g. LAN8830) and others like the
LAN8804 will explicitly state the following in the datasheet:
This device may respond to Clause 45 accesses and so must not be
mixed with Clause 45 devices on the same MDIO bus.
Fixes: 1a136ca2e089 ("net: mdio: scan bus based on bus capabilities for C22 and C45")
Signed-off-by: Ezra Buehler <ezra.buehler@husqvarnagroup.com>
---
This change is modeled after commit 348659337485 ("net: mdio: Add
workaround for Micrel PHYs which are not C45 compatible"). However,
I find the name SMSC_OUI somewhat misleading as the value is not the
full OUI (0x00800f) but, just the OUI part of phy_id, which is quite
different.
drivers/net/phy/mdio_bus.c | 3 ++-
include/linux/smscphy.h | 2 ++
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 25dcaa49ab8b..63f1c42fbc8d 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -31,6 +31,7 @@
#include <linux/reset.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
+#include <linux/smscphy.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <linux/uaccess.h>
@@ -632,7 +633,7 @@ static bool mdiobus_prevent_c45_scan(struct mii_bus *bus)
continue;
oui = phydev->phy_id >> 10;
- if (oui == MICREL_OUI)
+ if (oui == MICREL_OUI || oui == SMSC_OUI)
return true;
}
return false;
diff --git a/include/linux/smscphy.h b/include/linux/smscphy.h
index 1a6a851d2cf8..069d6d226abd 100644
--- a/include/linux/smscphy.h
+++ b/include/linux/smscphy.h
@@ -2,6 +2,8 @@
#ifndef __LINUX_SMSCPHY_H__
#define __LINUX_SMSCPHY_H__
+#define SMSC_OUI 0x01f0
+
#define MII_LAN83C185_ISF 29 /* Interrupt Source Flags */
#define MII_LAN83C185_IM 30 /* Interrupt Mask */
#define MII_LAN83C185_CTRL_STATUS 17 /* Mode/Status Register */
--
2.39.2
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-01 21:31 [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs Ezra Buehler
@ 2024-01-01 22:44 ` Heiner Kallweit
2024-01-02 8:50 ` Michael Walle
` (2 more replies)
0 siblings, 3 replies; 17+ messages in thread
From: Heiner Kallweit @ 2024-01-01 22:44 UTC (permalink / raw)
To: ezra, Andrew Lunn, Russell King, Tristram Ha
Cc: Michael Walle, Jesse Brandeburg, netdev
On 01.01.2024 22:31, Ezra Buehler wrote:
> Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
> capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
> Gateway will no longer boot.
>
> Prior to the mentioned change, probe_capabilities would be set to
> MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
> Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> least with our setup) considerably slow down kernel startup and
> ultimately result in a board reset.
>
> AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
> "Clause 45 protection" feature (e.g. LAN8830) and others like the
> LAN8804 will explicitly state the following in the datasheet:
>
> This device may respond to Clause 45 accesses and so must not be
> mixed with Clause 45 devices on the same MDIO bus.
>
I'm not convinced that some heuristic based on vendors is a
sustainable approach. Also I'd like to avoid (as far as possible)
that core code includes vendor driver headers. Maybe we could use
a new PHY driver flag. Approaches I could think of:
Approach 1:
Add a PHY driver flag to state: PHY is not c45-access-safe
Then c45 scanning would be omitted if at least one c22 PHY
with this flag was found.
Approach 2:
Add a PHY driver flag to state: PHY is c45-access-safe
Then c45 scanning would only be done if all found c22 devices
Not sure which options have been discussed before. Any feedback
welcome.
Related: How common are setups where c22 and c45 devices are attached
to a single MDIO bus?
> Fixes: 1a136ca2e089 ("net: mdio: scan bus based on bus capabilities for C22 and C45")
> Signed-off-by: Ezra Buehler <ezra.buehler@husqvarnagroup.com>
> ---
>
> This change is modeled after commit 348659337485 ("net: mdio: Add
> workaround for Micrel PHYs which are not C45 compatible"). However,
> I find the name SMSC_OUI somewhat misleading as the value is not the
> full OUI (0x00800f) but, just the OUI part of phy_id, which is quite
> different.
>
> drivers/net/phy/mdio_bus.c | 3 ++-
> include/linux/smscphy.h | 2 ++
> 2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 25dcaa49ab8b..63f1c42fbc8d 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -31,6 +31,7 @@
> #include <linux/reset.h>
> #include <linux/skbuff.h>
> #include <linux/slab.h>
> +#include <linux/smscphy.h>
> #include <linux/spinlock.h>
> #include <linux/string.h>
> #include <linux/uaccess.h>
> @@ -632,7 +633,7 @@ static bool mdiobus_prevent_c45_scan(struct mii_bus *bus)
> continue;
> oui = phydev->phy_id >> 10;
>
> - if (oui == MICREL_OUI)
> + if (oui == MICREL_OUI || oui == SMSC_OUI)
> return true;
> }
> return false;
> diff --git a/include/linux/smscphy.h b/include/linux/smscphy.h
> index 1a6a851d2cf8..069d6d226abd 100644
> --- a/include/linux/smscphy.h
> +++ b/include/linux/smscphy.h
> @@ -2,6 +2,8 @@
> #ifndef __LINUX_SMSCPHY_H__
> #define __LINUX_SMSCPHY_H__
>
> +#define SMSC_OUI 0x01f0
> +
> #define MII_LAN83C185_ISF 29 /* Interrupt Source Flags */
> #define MII_LAN83C185_IM 30 /* Interrupt Mask */
> #define MII_LAN83C185_CTRL_STATUS 17 /* Mode/Status Register */
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-01 22:44 ` Heiner Kallweit
@ 2024-01-02 8:50 ` Michael Walle
2024-01-02 9:14 ` Heiner Kallweit
2024-01-02 12:02 ` Russell King (Oracle)
2024-01-02 13:42 ` Andrew Lunn
2 siblings, 1 reply; 17+ messages in thread
From: Michael Walle @ 2024-01-02 8:50 UTC (permalink / raw)
To: Heiner Kallweit
Cc: ezra, Andrew Lunn, Russell King, Tristram Ha, Jesse Brandeburg,
netdev
Hi,
>> Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
>> capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
>> Gateway will no longer boot.
>>
>> Prior to the mentioned change, probe_capabilities would be set to
>> MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
>> Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
>> least with our setup) considerably slow down kernel startup and
>> ultimately result in a board reset.
>>
>> AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
>> "Clause 45 protection" feature (e.g. LAN8830) and others like the
>> LAN8804 will explicitly state the following in the datasheet:
>>
>> This device may respond to Clause 45 accesses and so must not be
>> mixed with Clause 45 devices on the same MDIO bus.
If implemented correctly, c22 phys should never respond to c45
accesses. Correct? So the "Clause 45 protection" sounds like the
normal behavior here and the "may respond to c45 accesses" looks
like it's broken.
> I'm not convinced that some heuristic based on vendors is a
> sustainable approach. Also I'd like to avoid (as far as possible)
> that core code includes vendor driver headers. Maybe we could use
> a new PHY driver flag. Approaches I could think of:
>
> Approach 1:
> Add a PHY driver flag to state: PHY is not c45-access-safe
> Then c45 scanning would be omitted if at least one c22 PHY
> with this flag was found.
>
> Approach 2:
> Add a PHY driver flag to state: PHY is c45-access-safe
> Then c45 scanning would only be done if all found c22 devices
>
> Not sure which options have been discussed before. Any feedback
> welcome.
I had a similar idea and IIRC Andrew said this would be a layering
violation. But I can't find the thread anymore.
> Related: How common are setups where c22 and c45 devices are attached
> to a single MDIO bus?
At least we have boards which has c22 and c45 PHYs on one bus. And
on one board, we even have a Micrel/Microchip PHY on this bus, which
forces us to use c22-over-c45 for the c45 PHY. I really need to repost
my
c45-over-c22 series, although there was no consensus there
unfortunately.
-michael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 8:50 ` Michael Walle
@ 2024-01-02 9:14 ` Heiner Kallweit
0 siblings, 0 replies; 17+ messages in thread
From: Heiner Kallweit @ 2024-01-02 9:14 UTC (permalink / raw)
To: Michael Walle
Cc: ezra, Andrew Lunn, Russell King, Tristram Ha, Jesse Brandeburg,
netdev
On 02.01.2024 09:50, Michael Walle wrote:
> Hi,
>
>>> Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
>>> capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
>>> Gateway will no longer boot.
>>>
>>> Prior to the mentioned change, probe_capabilities would be set to
>>> MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
>>> Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
>>> least with our setup) considerably slow down kernel startup and
>>> ultimately result in a board reset.
>>>
>>> AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
>>> "Clause 45 protection" feature (e.g. LAN8830) and others like the
>>> LAN8804 will explicitly state the following in the datasheet:
>>>
>>> This device may respond to Clause 45 accesses and so must not be
>>> mixed with Clause 45 devices on the same MDIO bus.
>
> If implemented correctly, c22 phys should never respond to c45
> accesses. Correct? So the "Clause 45 protection" sounds like the
> normal behavior here and the "may respond to c45 accesses" looks
> like it's broken.
>
>> I'm not convinced that some heuristic based on vendors is a
>> sustainable approach. Also I'd like to avoid (as far as possible)
>> that core code includes vendor driver headers. Maybe we could use
>> a new PHY driver flag. Approaches I could think of:
>>
>> Approach 1:
>> Add a PHY driver flag to state: PHY is not c45-access-safe
>> Then c45 scanning would be omitted if at least one c22 PHY
>> with this flag was found.
>>
>> Approach 2:
>> Add a PHY driver flag to state: PHY is c45-access-safe
>> Then c45 scanning would only be done if all found c22 devices
>>
>> Not sure which options have been discussed before. Any feedback
>> welcome.
>
> I had a similar idea and IIRC Andrew said this would be a layering
> violation. But I can't find the thread anymore.
>
Due to async probing we may have the case that the driver isn't bound
yet. Right. Maybe there are more reasons.
Then, as a compromise, maybe we can replace the OUI checks with a blacklist,
where the entries consist of PHY ID and mask. This would cover the case of
matching all PHY's from a particular vendor, but would also allow to be more
granular.
>> Related: How common are setups where c22 and c45 devices are attached
>> to a single MDIO bus?
>
> At least we have boards which has c22 and c45 PHYs on one bus. And
> on one board, we even have a Micrel/Microchip PHY on this bus, which
> forces us to use c22-over-c45 for the c45 PHY. I really need to repost my
> c45-over-c22 series, although there was no consensus there unfortunately.
>
> -michael
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-01 22:44 ` Heiner Kallweit
2024-01-02 8:50 ` Michael Walle
@ 2024-01-02 12:02 ` Russell King (Oracle)
2024-01-02 13:42 ` Andrew Lunn
2 siblings, 0 replies; 17+ messages in thread
From: Russell King (Oracle) @ 2024-01-02 12:02 UTC (permalink / raw)
To: Heiner Kallweit
Cc: ezra, Andrew Lunn, Tristram Ha, Michael Walle, Jesse Brandeburg,
netdev
On Mon, Jan 01, 2024 at 11:44:38PM +0100, Heiner Kallweit wrote:
> On 01.01.2024 22:31, Ezra Buehler wrote:
> > Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
> > capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
> > Gateway will no longer boot.
> >
> > Prior to the mentioned change, probe_capabilities would be set to
> > MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
> > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > least with our setup) considerably slow down kernel startup and
> > ultimately result in a board reset.
> >
> > AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
> > "Clause 45 protection" feature (e.g. LAN8830) and others like the
> > LAN8804 will explicitly state the following in the datasheet:
> >
> > This device may respond to Clause 45 accesses and so must not be
> > mixed with Clause 45 devices on the same MDIO bus.
> >
>
> I'm not convinced that some heuristic based on vendors is a
> sustainable approach. Also I'd like to avoid (as far as possible)
> that core code includes vendor driver headers. Maybe we could use
> a new PHY driver flag. Approaches I could think of:
>
> Approach 1:
> Add a PHY driver flag to state: PHY is not c45-access-safe
> Then c45 scanning would be omitted if at least one c22 PHY
> with this flag was found.
>
> Approach 2:
> Add a PHY driver flag to state: PHY is c45-access-safe
> Then c45 scanning would only be done if all found c22 devices
Anything based on PHY driver flags isn't going to work - the scan
happens _before_ we know what is on the bus and _before_ we have
any devices to even think about probing drivers (which could even
be in a module on a filesystem that has yet to be mounted.)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-01 22:44 ` Heiner Kallweit
2024-01-02 8:50 ` Michael Walle
2024-01-02 12:02 ` Russell King (Oracle)
@ 2024-01-02 13:42 ` Andrew Lunn
2024-01-02 14:00 ` Russell King (Oracle)
` (2 more replies)
2 siblings, 3 replies; 17+ messages in thread
From: Andrew Lunn @ 2024-01-02 13:42 UTC (permalink / raw)
To: Heiner Kallweit
Cc: ezra, Russell King, Tristram Ha, Michael Walle, Jesse Brandeburg,
netdev
On Mon, Jan 01, 2024 at 11:44:38PM +0100, Heiner Kallweit wrote:
> On 01.01.2024 22:31, Ezra Buehler wrote:
> > Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
> > capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
> > Gateway will no longer boot.
> >
> > Prior to the mentioned change, probe_capabilities would be set to
> > MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
> > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > least with our setup) considerably slow down kernel startup and
> > ultimately result in a board reset.
> >
> > AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
> > "Clause 45 protection" feature (e.g. LAN8830) and others like the
> > LAN8804 will explicitly state the following in the datasheet:
> >
> > This device may respond to Clause 45 accesses and so must not be
> > mixed with Clause 45 devices on the same MDIO bus.
> >
>
> I'm not convinced that some heuristic based on vendors is a
> sustainable approach. Also I'd like to avoid (as far as possible)
> that core code includes vendor driver headers. Maybe we could use
> a new PHY driver flag. Approaches I could think of:
We already have a core hack for these broken PHYs:
/*
* There are some C22 PHYs which do bad things when where is a C45
* transaction on the bus, like accepting a read themselves, and
* stomping over the true devices reply, to performing a write to
* themselves which was intended for another device. Now that C22
* devices have been found, see if any of them are bad for C45, and if we
* should skip the C45 scan.
*/
static bool mdiobus_prevent_c45_scan(struct mii_bus *bus)
{
int i;
for (i = 0; i < PHY_MAX_ADDR; i++) {
struct phy_device *phydev;
u32 oui;
phydev = mdiobus_get_phy(bus, i);
if (!phydev)
continue;
oui = phydev->phy_id >> 10;
if (oui == MICREL_OUI)
return true;
}
return false;
}
So it seems we need to extend this with another OUI.
Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 13:42 ` Andrew Lunn
@ 2024-01-02 14:00 ` Russell King (Oracle)
2024-01-02 14:04 ` Ezra Buehler
2024-01-02 14:06 ` Heiner Kallweit
2 siblings, 0 replies; 17+ messages in thread
From: Russell King (Oracle) @ 2024-01-02 14:00 UTC (permalink / raw)
To: Andrew Lunn
Cc: Heiner Kallweit, ezra, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
On Tue, Jan 02, 2024 at 02:42:17PM +0100, Andrew Lunn wrote:
> On Mon, Jan 01, 2024 at 11:44:38PM +0100, Heiner Kallweit wrote:
> > On 01.01.2024 22:31, Ezra Buehler wrote:
> > > Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
> > > capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
> > > Gateway will no longer boot.
> > >
> > > Prior to the mentioned change, probe_capabilities would be set to
> > > MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
> > > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > > least with our setup) considerably slow down kernel startup and
> > > ultimately result in a board reset.
> > >
> > > AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
> > > "Clause 45 protection" feature (e.g. LAN8830) and others like the
> > > LAN8804 will explicitly state the following in the datasheet:
> > >
> > > This device may respond to Clause 45 accesses and so must not be
> > > mixed with Clause 45 devices on the same MDIO bus.
> > >
> >
> > I'm not convinced that some heuristic based on vendors is a
> > sustainable approach. Also I'd like to avoid (as far as possible)
> > that core code includes vendor driver headers. Maybe we could use
> > a new PHY driver flag. Approaches I could think of:
>
> We already have a core hack for these broken PHYs:
Yes, and it is this very function that the patch at the start of this
thread is changing!
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 13:42 ` Andrew Lunn
2024-01-02 14:00 ` Russell King (Oracle)
@ 2024-01-02 14:04 ` Ezra Buehler
2024-01-02 14:06 ` Heiner Kallweit
2 siblings, 0 replies; 17+ messages in thread
From: Ezra Buehler @ 2024-01-02 14:04 UTC (permalink / raw)
To: Andrew Lunn
Cc: Heiner Kallweit, Russell King, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
Hi Andrew,
On Tue, Jan 2, 2024 at 2:42 PM Andrew Lunn <andrew@lunn.ch> wrote:
> So it seems we need to extend this with another OUI.
That is actually what I am proposing:
https://lore.kernel.org/netdev/77fa1435-58e3-4fe1-b860-288ed143e7bc@gmail.com/T/
Sorry, that the patch did not reach you directly. Unfortunately, I cannot
send patches (yet) through the company's mail server. I am smarter now,
won't happen again.
Cheers,
Ezra.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 13:42 ` Andrew Lunn
2024-01-02 14:00 ` Russell King (Oracle)
2024-01-02 14:04 ` Ezra Buehler
@ 2024-01-02 14:06 ` Heiner Kallweit
2024-01-02 15:50 ` Andrew Lunn
2 siblings, 1 reply; 17+ messages in thread
From: Heiner Kallweit @ 2024-01-02 14:06 UTC (permalink / raw)
To: Andrew Lunn
Cc: ezra, Russell King, Tristram Ha, Michael Walle, Jesse Brandeburg,
netdev
On 02.01.2024 14:42, Andrew Lunn wrote:
> On Mon, Jan 01, 2024 at 11:44:38PM +0100, Heiner Kallweit wrote:
>> On 01.01.2024 22:31, Ezra Buehler wrote:
>>> Since commit 1a136ca2e089 ("net: mdio: scan bus based on bus
>>> capabilities for C22 and C45") our AT91SAM9G25-based GARDENA smart
>>> Gateway will no longer boot.
>>>
>>> Prior to the mentioned change, probe_capabilities would be set to
>>> MDIOBUS_NO_CAP (0) and therefore, no Clause 45 scan was performed.
>>> Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
>>> least with our setup) considerably slow down kernel startup and
>>> ultimately result in a board reset.
>>>
>>> AFAICT all SMSC/Microchip PHYs are Clause 22 devices. Some have a
>>> "Clause 45 protection" feature (e.g. LAN8830) and others like the
>>> LAN8804 will explicitly state the following in the datasheet:
>>>
>>> This device may respond to Clause 45 accesses and so must not be
>>> mixed with Clause 45 devices on the same MDIO bus.
>>>
>>
>> I'm not convinced that some heuristic based on vendors is a
>> sustainable approach. Also I'd like to avoid (as far as possible)
>> that core code includes vendor driver headers. Maybe we could use
>> a new PHY driver flag. Approaches I could think of:
>
> We already have a core hack for these broken PHYs:
>
Excluding all PHY's from a vendor for me is a quite big hammer.
I think we should make this more granular.
And mdio-bus.c including micrel_phy.h also isn't too nice.
Maybe we should move all OUI definitions in drivers to a
core header. Because the OUI seems to be all we need from
these headers.
> /*
> * There are some C22 PHYs which do bad things when where is a C45
> * transaction on the bus, like accepting a read themselves, and
> * stomping over the true devices reply, to performing a write to
> * themselves which was intended for another device. Now that C22
> * devices have been found, see if any of them are bad for C45, and if we
> * should skip the C45 scan.
> */
> static bool mdiobus_prevent_c45_scan(struct mii_bus *bus)
> {
> int i;
>
> for (i = 0; i < PHY_MAX_ADDR; i++) {
> struct phy_device *phydev;
> u32 oui;
>
> phydev = mdiobus_get_phy(bus, i);
> if (!phydev)
> continue;
> oui = phydev->phy_id >> 10;
>
> if (oui == MICREL_OUI)
> return true;
> }
> return false;
> }
>
> So it seems we need to extend this with another OUI.
>
> Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 14:06 ` Heiner Kallweit
@ 2024-01-02 15:50 ` Andrew Lunn
2024-01-02 18:08 ` Ezra Buehler
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-01-02 15:50 UTC (permalink / raw)
To: Heiner Kallweit
Cc: ezra, Russell King, Tristram Ha, Michael Walle, Jesse Brandeburg,
netdev
> Excluding all PHY's from a vendor for me is a quite big hammer.
Maybe it serves them right for getting this wrong?
Micrel is now part of Microchip, so in effect, this is the same broken
IP just with a different name and OUI. We have not seen any other
vendor get this wrong.
I do however disagree with this statement in the original patch:
> AFAICT all SMSC/Microchip PHYs are Clause 22 devices.
drivers/net/phy/smsc.c has a number of phy_write_mmd()/phy_read_mmd()
in it. But that device has a different OUI.
> I think we should make this more granular.
> And mdio-bus.c including micrel_phy.h also isn't too nice.
> Maybe we should move all OUI definitions in drivers to a
> core header. Because the OUI seems to be all we need from
> these headers.
That does seem a big change to make for 'one' broken PHY IP.
However, the commit message says:
> Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> least with our setup) considerably slow down kernel startup and
> ultimately result in a board reset.
So we need to clarify the real issue here. Does the C45 scan work
correctly, but the board watchdog timer is too short and fires? We
should not be extended this workaround when its a bad watchdog
configuration issue...
Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 15:50 ` Andrew Lunn
@ 2024-01-02 18:08 ` Ezra Buehler
2024-01-02 18:57 ` Russell King (Oracle)
2024-01-02 19:49 ` Andrew Lunn
0 siblings, 2 replies; 17+ messages in thread
From: Ezra Buehler @ 2024-01-02 18:08 UTC (permalink / raw)
To: Andrew Lunn
Cc: Heiner Kallweit, Russell King, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
Hi Andrew,
On Tue, Jan 2, 2024 at 4:50 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> I do however disagree with this statement in the original patch:
>
> > AFAICT all SMSC/Microchip PHYs are Clause 22 devices.
Excuse my ignorance, I am by no means an expert here. I guess what I
wanted to say was:
By skimming over some datasheets for similar SMSC/Microchip PHYs, I
could not find any evidence that they support Clause 45 scanning
(other than not responding).
> drivers/net/phy/smsc.c has a number of phy_write_mmd()/phy_read_mmd()
> in it. But that device has a different OUI.
I guess I am confused here, AFAICT all PHYs in smsc.c have the same OUI
(phy_id >> 10).
> However, the commit message says:
>
> > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > least with our setup) considerably slow down kernel startup and
> > ultimately result in a board reset.
>
> So we need to clarify the real issue here. Does the C45 scan work
> correctly, but the board watchdog timer is too short and fires? We
> should not be extended this workaround when its a bad watchdog
> configuration issue...
Changing the watchdog configuration is not an option here. We are
talking about a slowdown of several seconds here, that is not acceptable
on its own.
Cheers,
Ezra.
On Tue, Jan 2, 2024 at 4:50 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > Excluding all PHY's from a vendor for me is a quite big hammer.
>
> Maybe it serves them right for getting this wrong?
>
> Micrel is now part of Microchip, so in effect, this is the same broken
> IP just with a different name and OUI. We have not seen any other
> vendor get this wrong.
>
> I do however disagree with this statement in the original patch:
>
> > AFAICT all SMSC/Microchip PHYs are Clause 22 devices.
>
> drivers/net/phy/smsc.c has a number of phy_write_mmd()/phy_read_mmd()
> in it. But that device has a different OUI.
>
> > I think we should make this more granular.
> > And mdio-bus.c including micrel_phy.h also isn't too nice.
> > Maybe we should move all OUI definitions in drivers to a
> > core header. Because the OUI seems to be all we need from
> > these headers.
>
> That does seem a big change to make for 'one' broken PHY IP.
>
> However, the commit message says:
>
> > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > least with our setup) considerably slow down kernel startup and
> > ultimately result in a board reset.
>
> So we need to clarify the real issue here. Does the C45 scan work
> correctly, but the board watchdog timer is too short and fires? We
> should not be extended this workaround when its a bad watchdog
> configuration issue...
>
> Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 18:08 ` Ezra Buehler
@ 2024-01-02 18:57 ` Russell King (Oracle)
2024-01-06 12:41 ` Ezra Buehler
2024-01-02 19:49 ` Andrew Lunn
1 sibling, 1 reply; 17+ messages in thread
From: Russell King (Oracle) @ 2024-01-02 18:57 UTC (permalink / raw)
To: Ezra Buehler
Cc: Andrew Lunn, Heiner Kallweit, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
On Tue, Jan 02, 2024 at 07:08:32PM +0100, Ezra Buehler wrote:
> Hi Andrew,
>
> On Tue, Jan 2, 2024 at 4:50 PM Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > I do however disagree with this statement in the original patch:
> >
> > > AFAICT all SMSC/Microchip PHYs are Clause 22 devices.
>
> Excuse my ignorance, I am by no means an expert here. I guess what I
> wanted to say was:
>
> By skimming over some datasheets for similar SMSC/Microchip PHYs, I
> could not find any evidence that they support Clause 45 scanning
> (other than not responding).
>
> > drivers/net/phy/smsc.c has a number of phy_write_mmd()/phy_read_mmd()
> > in it. But that device has a different OUI.
>
> I guess I am confused here, AFAICT all PHYs in smsc.c have the same OUI
> (phy_id >> 10).
>
> > However, the commit message says:
> >
> > > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > > least with our setup) considerably slow down kernel startup and
> > > ultimately result in a board reset.
> >
> > So we need to clarify the real issue here. Does the C45 scan work
> > correctly, but the board watchdog timer is too short and fires? We
> > should not be extended this workaround when its a bad watchdog
> > configuration issue...
>
> Changing the watchdog configuration is not an option here. We are
> talking about a slowdown of several seconds here, that is not acceptable
> on its own.
Any ideas why the scan is taking so long?
For each PHY address (of which there are 32) we scan each MMD between
1 and 31 inclusive. We attempt to read the devices-in-package
registers. That means 32 * 32 * 2 calls to the mdiobus_c45_read(),
which is 2048 calls. Each is two MDIO transactions on the bus, so
that's 4096. Each is 64 bits long including preamble, and at 2.5MHz
(the "old" maximum) it should take about 100ms to scan the each
MMD on each PHY address to determine whether a device is present.
I'm guessing the MDIO driver you are using is probably using a software
timeout which is extending the latency of each bus frame considerably.
Maybe that is where one should be looking if the timing is not
acceptable?
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 18:08 ` Ezra Buehler
2024-01-02 18:57 ` Russell King (Oracle)
@ 2024-01-02 19:49 ` Andrew Lunn
2024-01-02 20:05 ` Ezra Buehler
1 sibling, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-01-02 19:49 UTC (permalink / raw)
To: Ezra Buehler
Cc: Heiner Kallweit, Russell King, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
> By skimming over some datasheets for similar SMSC/Microchip PHYs, I
> could not find any evidence that they support Clause 45 scanning
> (other than not responding).
Do you find any reference to Clause 22 scanning being supported in the
datasheets?
What we expect is that C22 registers 2 and 3 contain ID values. If
there is no device at the address on the bus, nothing should respond,
and the pull up on the bus should result in a read of 0xffff. If we
get a value other than 0xffff, we know there is a device there. The
same is basically true for C45, but because each address has 32 MMD
spaces, its a bit more complex. But still, if there is no device
there, it should return 0xffff when reading an ID register. This is
all part of IEEE 802.3, so there is no real need to specific this in
the datasheet, other than to say its conformance to 802.3, or list
where it does not conform.
>
> > drivers/net/phy/smsc.c has a number of phy_write_mmd()/phy_read_mmd()
> > in it. But that device has a different OUI.
>
> I guess I am confused here, AFAICT all PHYs in smsc.c have the same OUI
> (phy_id >> 10).
My error. I forgot about the odd shift. So smsc.c does use the 01f0
OUI.
> > However, the commit message says:
> >
> > > Running a Clause 45 scan on an SMSC/Microchip LAN8720A PHY will (at
> > > least with our setup) considerably slow down kernel startup and
> > > ultimately result in a board reset.
> >
> > So we need to clarify the real issue here. Does the C45 scan work
> > correctly, but the board watchdog timer is too short and fires? We
> > should not be extended this workaround when its a bad watchdog
> > configuration issue...
>
> Changing the watchdog configuration is not an option here. We are
> talking about a slowdown of several seconds here, that is not acceptable
> on its own.
I'm with Russell here, we should understand why its so slow. And by
fixing that, you might find access in general gets better.
Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 19:49 ` Andrew Lunn
@ 2024-01-02 20:05 ` Ezra Buehler
0 siblings, 0 replies; 17+ messages in thread
From: Ezra Buehler @ 2024-01-02 20:05 UTC (permalink / raw)
To: Andrew Lunn
Cc: Heiner Kallweit, Russell King, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
> On 2 Jan 2024, at 20:49, Andrew Lunn <andrew@lunn.ch> wrote:
>
>> By skimming over some datasheets for similar SMSC/Microchip PHYs, I
>> could not find any evidence that they support Clause 45 scanning
>> (other than not responding).
>
> Do you find any reference to Clause 22 scanning being supported in the
> datasheets?
E.g. the one for the LAN8720A actually states:
This interface supports registers 0 through 6 as required by Clause 22 of the 802.3 standard, as well as “vendor- specific” registers 16 to 31 allowed by the specification. Non-supported registers (such as 7 to 15) will be read as hexadecimal “FFFF”.
> I'm with Russell here, we should understand why its so slow. And by
> fixing that, you might find access in general gets better.
I agree. I will dig deeper.
Cheers,
Ezra.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-02 18:57 ` Russell King (Oracle)
@ 2024-01-06 12:41 ` Ezra Buehler
2024-01-06 15:20 ` Andrew Lunn
0 siblings, 1 reply; 17+ messages in thread
From: Ezra Buehler @ 2024-01-06 12:41 UTC (permalink / raw)
To: Russell King (Oracle)
Cc: Andrew Lunn, Heiner Kallweit, Tristram Ha, Michael Walle,
Jesse Brandeburg, netdev
On Tue, Jan 2, 2024 at 7:58 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> Any ideas why the scan is taking so long?
>
> For each PHY address (of which there are 32) we scan each MMD between
> 1 and 31 inclusive. We attempt to read the devices-in-package
> registers. That means 32 * 32 * 2 calls to the mdiobus_c45_read(),
> which is 2048 calls. Each is two MDIO transactions on the bus, so
> that's 4096. Each is 64 bits long including preamble, and at 2.5MHz
> (the "old" maximum) it should take about 100ms to scan the each
> MMD on each PHY address to determine whether a device is present.
>
> I'm guessing the MDIO driver you are using is probably using a software
> timeout which is extending the latency of each bus frame considerably.
> Maybe that is where one should be looking if the timing is not
> acceptable?
When profiling with ftrace, I see that executing macb_mdio_read_c45()
takes about 20ms. The function calls macb_mdio_wait_for_idle() 3 times,
where the latter 2 invocations take about 10ms each. The
macb_mdio_wait_for_idle() function invokes the read_poll_timeout() macro
with a timeout of 1 second, which we obviously do not hit.
To me it looks like we are simply waiting for the hardware to perform
the transaction, i.e. write out the address and read the register (which
is read as 0xFFFF).
I've checked the MDIO clock, it is at 2MHz.
So, any suggestions for what to look into next?
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-06 12:41 ` Ezra Buehler
@ 2024-01-06 15:20 ` Andrew Lunn
2024-01-07 13:51 ` Ezra Buehler
0 siblings, 1 reply; 17+ messages in thread
From: Andrew Lunn @ 2024-01-06 15:20 UTC (permalink / raw)
To: Ezra Buehler
Cc: Russell King (Oracle), Heiner Kallweit, Tristram Ha,
Michael Walle, Jesse Brandeburg, netdev
On Sat, Jan 06, 2024 at 01:41:01PM +0100, Ezra Buehler wrote:
> On Tue, Jan 2, 2024 at 7:58 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > Any ideas why the scan is taking so long?
> >
> > For each PHY address (of which there are 32) we scan each MMD between
> > 1 and 31 inclusive. We attempt to read the devices-in-package
> > registers. That means 32 * 32 * 2 calls to the mdiobus_c45_read(),
> > which is 2048 calls. Each is two MDIO transactions on the bus, so
> > that's 4096. Each is 64 bits long including preamble, and at 2.5MHz
> > (the "old" maximum) it should take about 100ms to scan the each
> > MMD on each PHY address to determine whether a device is present.
> >
> > I'm guessing the MDIO driver you are using is probably using a software
> > timeout which is extending the latency of each bus frame considerably.
> > Maybe that is where one should be looking if the timing is not
> > acceptable?
>
> When profiling with ftrace, I see that executing macb_mdio_read_c45()
> takes about 20ms. The function calls macb_mdio_wait_for_idle() 3 times,
> where the latter 2 invocations take about 10ms each. The
> macb_mdio_wait_for_idle() function invokes the read_poll_timeout() macro
> with a timeout of 1 second, which we obviously do not hit.
>
> To me it looks like we are simply waiting for the hardware to perform
> the transaction, i.e. write out the address and read the register (which
> is read as 0xFFFF).
>
> I've checked the MDIO clock, it is at 2MHz.
>
> So, any suggestions for what to look into next?
Does a C22 read/write call to macb_mdio_wait_for_idle() take as long?
Could you hack a copy of readx_poll_timeout() and real_poll_timeout()
into the driver, and extend it to count how many times it goes around
the loop. Is the usleep_range() actually sleeping for 10ms because you
don't have any high resolution clocks, and a 100Hz tick? If so, you
might want to swap to 1000Hz tick, or NO_HZ, or enable a high
resolution clock, so that usleep_range() can actually sleep for short
times.
Andrew
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs
2024-01-06 15:20 ` Andrew Lunn
@ 2024-01-07 13:51 ` Ezra Buehler
0 siblings, 0 replies; 17+ messages in thread
From: Ezra Buehler @ 2024-01-07 13:51 UTC (permalink / raw)
To: Andrew Lunn
Cc: Russell King (Oracle), Heiner Kallweit, Tristram Ha,
Michael Walle, Jesse Brandeburg, netdev
On Sat, Jan 6, 2024 at 4:20 PM Andrew Lunn <andrew@lunn.ch> wrote:
> Could you hack a copy of readx_poll_timeout() and real_poll_timeout()
> into the driver, and extend it to count how many times it goes around
> the loop. Is the usleep_range() actually sleeping for 10ms because you
> don't have any high resolution clocks, and a 100Hz tick? If so, you
> might want to swap to 1000Hz tick, or NO_HZ, or enable a high
> resolution clock, so that usleep_range() can actually sleep for short
> times.
You are spot on, simply selecting the 1000Hz tick fixes the issue for me.
So, no need to change the code I guess.
Thank you very much for the help and sorry for wasting your time.
Cheers,
Ezra.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2024-01-07 13:51 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-01 21:31 [PATCH net] net: mdio: Prevent Clause 45 scan on SMSC PHYs Ezra Buehler
2024-01-01 22:44 ` Heiner Kallweit
2024-01-02 8:50 ` Michael Walle
2024-01-02 9:14 ` Heiner Kallweit
2024-01-02 12:02 ` Russell King (Oracle)
2024-01-02 13:42 ` Andrew Lunn
2024-01-02 14:00 ` Russell King (Oracle)
2024-01-02 14:04 ` Ezra Buehler
2024-01-02 14:06 ` Heiner Kallweit
2024-01-02 15:50 ` Andrew Lunn
2024-01-02 18:08 ` Ezra Buehler
2024-01-02 18:57 ` Russell King (Oracle)
2024-01-06 12:41 ` Ezra Buehler
2024-01-06 15:20 ` Andrew Lunn
2024-01-07 13:51 ` Ezra Buehler
2024-01-02 19:49 ` Andrew Lunn
2024-01-02 20:05 ` Ezra Buehler
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).