netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c
@ 2024-07-08  7:50 Bartosz Golaszewski
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete() Bartosz Golaszewski
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-08  7:50 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Resending rebased on top of current net-next.

This series addesses two issues with the aqr115c PHY on Qualcomm
sa8775p-ride-r3 board and adds support for this PHY to the aquantia driver.

While the manufacturer calls the 2.5G PHY mode OCSGMII, we reuse the existing
2500BASEX mode in the kernel to avoid extending the uAPI.

It took me a while to resend because I noticed an issue with the PHY coming
out of suspend with no possible interfaces listed and tracked it to the
GLOBAL_CFG registers for different modes returning 0. A workaround has been
added to the series. Unfortunately the HPG doesn't mention a proper way of
doing it or even mention any such issue at all.

Changes since v2:
- add a patch that addresses an issue with GLOBAL_CFG registers returning 0
- reuse aqr113c_config_init() for aqr115c
- improve commit messages, give more details on the 2500BASEX mode reuse
Link to v2: https://lore.kernel.org/lkml/Zn4Nq1QvhjAUaogb@makrotopia.org/T/

Changes since v1:
- split out the PHY patches into their own series
- don't introduce new mode (OCSGMII) but use existing 2500BASEX instead
- split the wait-for-FW patch into two: one renaming and exporting the
  relevant function and the second using it before checking the FW ID
Link to v1: https://lore.kernel.org/linux-arm-kernel/20240619184550.34524-1-brgl@bgdev.pl/T/

Bartosz Golaszewski (4):
  net: phy: aquantia: rename and export aqr107_wait_reset_complete()
  net: phy: aquantia: wait for FW reset before checking the vendor ID
  net: phy: aquantia: wait for the GLOBAL_CFG to start returning real
    values
  net: phy: aquantia: add support for aqr115c

 drivers/net/phy/aquantia/aquantia.h          |  2 +
 drivers/net/phy/aquantia/aquantia_firmware.c |  4 ++
 drivers/net/phy/aquantia/aquantia_main.c     | 40 ++++++++++++++++++--
 3 files changed, 42 insertions(+), 4 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RESEND PATCH net-next v3 1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete()
  2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
@ 2024-07-08  7:50 ` Bartosz Golaszewski
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID Bartosz Golaszewski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-08  7:50 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

This function is quite generic in this driver and not limited to aqr107.
We will use it outside its current compilation unit soon so rename it
and declare it in the header.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/net/phy/aquantia/aquantia.h      | 2 ++
 drivers/net/phy/aquantia/aquantia_main.c | 6 +++---
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h
index 0bca28ec3332..2465345081f8 100644
--- a/drivers/net/phy/aquantia/aquantia.h
+++ b/drivers/net/phy/aquantia/aquantia.h
@@ -201,4 +201,6 @@ int aqr_phy_led_hw_control_set(struct phy_device *phydev, u8 index,
 int aqr_phy_led_active_low_set(struct phy_device *phydev, int index, bool enable);
 int aqr_phy_led_polarity_set(struct phy_device *phydev, int index,
 			     unsigned long modes);
+int aqr_wait_reset_complete(struct phy_device *phydev);
+
 #endif /* AQUANTIA_H */
diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index 6c14355744b7..974795bd0860 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -441,7 +441,7 @@ static int aqr107_set_tunable(struct phy_device *phydev,
  * The chip also provides a "reset completed" bit, but it's cleared after
  * read. Therefore function would time out if called again.
  */
-static int aqr107_wait_reset_complete(struct phy_device *phydev)
+int aqr_wait_reset_complete(struct phy_device *phydev)
 {
 	int val;
 
@@ -494,7 +494,7 @@ static int aqr107_config_init(struct phy_device *phydev)
 	WARN(phydev->interface == PHY_INTERFACE_MODE_XGMII,
 	     "Your devicetree is out of date, please update it. The AQR107 family doesn't support XGMII, maybe you mean USXGMII.\n");
 
-	ret = aqr107_wait_reset_complete(phydev);
+	ret = aqr_wait_reset_complete(phydev);
 	if (!ret)
 		aqr107_chip_info(phydev);
 
@@ -522,7 +522,7 @@ static int aqcs109_config_init(struct phy_device *phydev)
 	    phydev->interface != PHY_INTERFACE_MODE_2500BASEX)
 		return -ENODEV;
 
-	ret = aqr107_wait_reset_complete(phydev);
+	ret = aqr_wait_reset_complete(phydev);
 	if (!ret)
 		aqr107_chip_info(phydev);
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
  2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete() Bartosz Golaszewski
@ 2024-07-08  7:50 ` Bartosz Golaszewski
  2024-07-30  9:59   ` Jon Hunter
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values Bartosz Golaszewski
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-08  7:50 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Checking the firmware register before it complete the boot process makes
no sense, it will report 0 even if FW is available from internal memory.
Always wait for FW to boot before continuing or we'll unnecessarily try
to load it from nvmem/filesystem and fail.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
index 0c9640ef153b..524627a36c6f 100644
--- a/drivers/net/phy/aquantia/aquantia_firmware.c
+++ b/drivers/net/phy/aquantia/aquantia_firmware.c
@@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev)
 {
 	int ret;
 
+	ret = aqr_wait_reset_complete(phydev);
+	if (ret)
+		return ret;
+
 	/* Check if the firmware is not already loaded by pooling
 	 * the current version returned by the PHY. If 0 is returned,
 	 * no firmware is loaded.
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete() Bartosz Golaszewski
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID Bartosz Golaszewski
@ 2024-07-08  7:50 ` Bartosz Golaszewski
  2024-07-18 12:23   ` Jon Hunter
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 4/4] net: phy: aquantia: add support for aqr115c Bartosz Golaszewski
  2024-07-10 10:20 ` [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable " patchwork-bot+netdevbpf
  4 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-08  7:50 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

When the PHY is first coming up (or resuming from suspend), it's
possible that although the FW status shows as running, we still see
zeroes in the GLOBAL_CFG set of registers and cannot determine available
modes. Since all models support 10M, add a poll and wait the config to
become available.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index 974795bd0860..2c8ba2725a91 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
 	unsigned long *possible = phydev->possible_interfaces;
 	unsigned int serdes_mode, rate_adapt;
 	phy_interface_t interface;
-	int i, val;
+	int i, val, ret;
+
+	ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
+					VEND1_GLOBAL_CFG_10M, val, val != 0,
+					1000, 100000, false);
+	if (ret)
+		return ret;
 
 	/* Walk the media-speed configuration registers to determine which
 	 * host-side serdes modes may be used by the PHY depending on the
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RESEND PATCH net-next v3 4/4] net: phy: aquantia: add support for aqr115c
  2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
                   ` (2 preceding siblings ...)
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values Bartosz Golaszewski
@ 2024-07-08  7:50 ` Bartosz Golaszewski
  2024-07-10 10:20 ` [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable " patchwork-bot+netdevbpf
  4 siblings, 0 replies; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-08  7:50 UTC (permalink / raw)
  To: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski

From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Add support for a new model to the Aquantia driver. This PHY supports
2.5 gigabit speeds. The PHY mode is referred to by the manufacturer as
Overclocked SGMII (OCSGMII) but this actually is just 2500BASEX without
in-band signalling so reuse the existing mode to avoid changing the
uAPI.

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/net/phy/aquantia/aquantia_main.c | 26 ++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index 2c8ba2725a91..d12e35374231 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -29,6 +29,7 @@
 #define PHY_ID_AQR113	0x31c31c40
 #define PHY_ID_AQR113C	0x31c31c12
 #define PHY_ID_AQR114C	0x31c31c22
+#define PHY_ID_AQR115C	0x31c31c33
 #define PHY_ID_AQR813	0x31c31cb2
 
 #define MDIO_PHYXS_VEND_IF_STATUS		0xe812
@@ -1005,6 +1006,30 @@ static struct phy_driver aqr_driver[] = {
 	.led_hw_control_get = aqr_phy_led_hw_control_get,
 	.led_polarity_set = aqr_phy_led_polarity_set,
 },
+{
+	PHY_ID_MATCH_MODEL(PHY_ID_AQR115C),
+	.name           = "Aquantia AQR115C",
+	.probe          = aqr107_probe,
+	.get_rate_matching = aqr107_get_rate_matching,
+	.config_init    = aqr113c_config_init,
+	.config_aneg    = aqr_config_aneg,
+	.config_intr    = aqr_config_intr,
+	.handle_interrupt = aqr_handle_interrupt,
+	.read_status    = aqr107_read_status,
+	.get_tunable    = aqr107_get_tunable,
+	.set_tunable    = aqr107_set_tunable,
+	.suspend        = aqr107_suspend,
+	.resume         = aqr107_resume,
+	.get_sset_count = aqr107_get_sset_count,
+	.get_strings    = aqr107_get_strings,
+	.get_stats      = aqr107_get_stats,
+	.link_change_notify = aqr107_link_change_notify,
+	.led_brightness_set = aqr_phy_led_brightness_set,
+	.led_hw_is_supported = aqr_phy_led_hw_is_supported,
+	.led_hw_control_set = aqr_phy_led_hw_control_set,
+	.led_hw_control_get = aqr_phy_led_hw_control_get,
+	.led_polarity_set = aqr_phy_led_polarity_set,
+},
 {
 	PHY_ID_MATCH_MODEL(PHY_ID_AQR813),
 	.name		= "Aquantia AQR813",
@@ -1048,6 +1073,7 @@ static struct mdio_device_id __maybe_unused aqr_tbl[] = {
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR113) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR113C) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR114C) },
+	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR115C) },
 	{ PHY_ID_MATCH_MODEL(PHY_ID_AQR813) },
 	{ }
 };
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c
  2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
                   ` (3 preceding siblings ...)
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 4/4] net: phy: aquantia: add support for aqr115c Bartosz Golaszewski
@ 2024-07-10 10:20 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 21+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-07-10 10:20 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: andrew, hkallweit1, linux, davem, edumazet, kuba, pabeni, netdev,
	linux-kernel, bartosz.golaszewski

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Mon,  8 Jul 2024 09:50:19 +0200 you wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Resending rebased on top of current net-next.
> 
> This series addesses two issues with the aqr115c PHY on Qualcomm
> sa8775p-ride-r3 board and adds support for this PHY to the aquantia driver.
> 
> [...]

Here is the summary with links:
  - [RESEND,net-next,v3,1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete()
    https://git.kernel.org/netdev/net-next/c/663117327a39
  - [RESEND,net-next,v3,2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
    https://git.kernel.org/netdev/net-next/c/ad649a1fac37
  - [RESEND,net-next,v3,3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
    https://git.kernel.org/netdev/net-next/c/708405f3e56e
  - [RESEND,net-next,v3,4/4] net: phy: aquantia: add support for aqr115c
    https://git.kernel.org/netdev/net-next/c/0ebc581f8a4b

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values Bartosz Golaszewski
@ 2024-07-18 12:23   ` Jon Hunter
  2024-07-18 13:04     ` Bartosz Golaszewski
  0 siblings, 1 reply; 21+ messages in thread
From: Jon Hunter @ 2024-07-18 12:23 UTC (permalink / raw)
  To: Bartosz Golaszewski, Andrew Lunn, Heiner Kallweit, Russell King,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis

Hi Bartosz,

On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> When the PHY is first coming up (or resuming from suspend), it's
> possible that although the FW status shows as running, we still see
> zeroes in the GLOBAL_CFG set of registers and cannot determine available
> modes. Since all models support 10M, add a poll and wait the config to
> become available.
> 
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
>   drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index 974795bd0860..2c8ba2725a91 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
>   	unsigned long *possible = phydev->possible_interfaces;
>   	unsigned int serdes_mode, rate_adapt;
>   	phy_interface_t interface;
> -	int i, val;
> +	int i, val, ret;
> +
> +	ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> +					VEND1_GLOBAL_CFG_10M, val, val != 0,
> +					1000, 100000, false);
> +	if (ret)
> +		return ret;
>   
>   	/* Walk the media-speed configuration registers to determine which
>   	 * host-side serdes modes may be used by the PHY depending on the


With the current -next and mainline we are seeing the following issue on
our Tegra234 Jetson AGX Orin platform ...

  Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
  tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)


We have tracked it down to this change and looks like our PHY does not
support 10M ...

$ ethtool eth0
Settings for eth0:
         Supported ports: [  ]
         Supported link modes:   100baseT/Full
                                 1000baseT/Full
                                 10000baseT/Full
                                 1000baseKX/Full
                                 10000baseKX4/Full
                                 10000baseKR/Full
                                 2500baseT/Full
                                 5000baseT/Full

The following fixes this for this platform ...

diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index d12e35374231..0b2db486d8bd 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
         int i, val, ret;
  
         ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
-                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
+                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
                                         1000, 100000, false);
         if (ret)
                 return ret;


However, I am not sure if this is guaranteed to work for all?

On a related note, we had also found an issue with this PHY where
the PHY reset is not working quite correctly. What we see is that
when polling for the firmware ID in aqr107_wait_reset_complete()
is that value in the firware ID registers transitions from 0 to
0xffff and then to the firmware ID. We have been testing the
following change to fix this ...

diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h
index 2465345081f8..278e3b167c58 100644
--- a/drivers/net/phy/aquantia/aquantia.h
+++ b/drivers/net/phy/aquantia/aquantia.h
@@ -20,6 +20,7 @@
  #define VEND1_GLOBAL_FW_ID                     0x0020
  #define VEND1_GLOBAL_FW_ID_MAJOR               GENMASK(15, 8)
  #define VEND1_GLOBAL_FW_ID_MINOR               GENMASK(7, 0)
+#define VEND1_GLOBAL_FW_ID_MASK                        GENMASK(15, 0)
  
  #define VEND1_GLOBAL_MAILBOX_INTERFACE1                        0x0200
  #define VEND1_GLOBAL_MAILBOX_INTERFACE1_EXECUTE                BIT(15)
diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index 0b2db486d8bd..5023fd70050d 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -447,7 +447,9 @@ int aqr_wait_reset_complete(struct phy_device *phydev)
         int val;
  
         return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
-                                        VEND1_GLOBAL_FW_ID, val, val != 0,
+                                        VEND1_GLOBAL_FW_ID, val,
+                                        ((val & VEND1_GLOBAL_FW_ID_MASK) != 0 &&
+                                        (val & VEND1_GLOBAL_FW_ID_MASK) != VEND1_GLOBAL_FW_ID_MASK),
                                          20000, 2000000, false);
  }

I have not tried the resume use-case, but curious if this may
also help here?

Cheers
Jon

-- 
nvpublic

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 12:23   ` Jon Hunter
@ 2024-07-18 13:04     ` Bartosz Golaszewski
  2024-07-18 13:29       ` Bartosz Golaszewski
  0 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-18 13:04 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis

On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
> Hi Bartosz,
>
> On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >
> > When the PHY is first coming up (or resuming from suspend), it's
> > possible that although the FW status shows as running, we still see
> > zeroes in the GLOBAL_CFG set of registers and cannot determine available
> > modes. Since all models support 10M, add a poll and wait the config to
> > become available.
> >
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> >   drivers/net/phy/aquantia/aquantia_main.c | 8 +++++++-
> >   1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> > index 974795bd0860..2c8ba2725a91 100644
> > --- a/drivers/net/phy/aquantia/aquantia_main.c
> > +++ b/drivers/net/phy/aquantia/aquantia_main.c
> > @@ -652,7 +652,13 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
> >       unsigned long *possible = phydev->possible_interfaces;
> >       unsigned int serdes_mode, rate_adapt;
> >       phy_interface_t interface;
> > -     int i, val;
> > +     int i, val, ret;
> > +
> > +     ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> > +                                     VEND1_GLOBAL_CFG_10M, val, val != 0,
> > +                                     1000, 100000, false);
> > +     if (ret)
> > +             return ret;
> >
> >       /* Walk the media-speed configuration registers to determine which
> >        * host-side serdes modes may be used by the PHY depending on the
>
>
> With the current -next and mainline we are seeing the following issue on
> our Tegra234 Jetson AGX Orin platform ...
>
>   Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
>   tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
>
>
> We have tracked it down to this change and looks like our PHY does not
> support 10M ...
>
> $ ethtool eth0
> Settings for eth0:
>          Supported ports: [  ]
>          Supported link modes:   100baseT/Full
>                                  1000baseT/Full
>                                  10000baseT/Full
>                                  1000baseKX/Full
>                                  10000baseKX4/Full
>                                  10000baseKR/Full
>                                  2500baseT/Full
>                                  5000baseT/Full
>
> The following fixes this for this platform ...
>
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index d12e35374231..0b2db486d8bd 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
>          int i, val, ret;
>
>          ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
>                                          1000, 100000, false);
>          if (ret)
>                  return ret;
>
>
> However, I am not sure if this is guaranteed to work for all?

Ah cr*p. No, I don't think it is. We should take the first supported
mode for a given PHY I think.

>
> On a related note, we had also found an issue with this PHY where
> the PHY reset is not working quite correctly. What we see is that
> when polling for the firmware ID in aqr107_wait_reset_complete()
> is that value in the firware ID registers transitions from 0 to
> 0xffff and then to the firmware ID. We have been testing the
> following change to fix this ...
>
> diff --git a/drivers/net/phy/aquantia/aquantia.h b/drivers/net/phy/aquantia/aquantia.h
> index 2465345081f8..278e3b167c58 100644
> --- a/drivers/net/phy/aquantia/aquantia.h
> +++ b/drivers/net/phy/aquantia/aquantia.h
> @@ -20,6 +20,7 @@
>   #define VEND1_GLOBAL_FW_ID                     0x0020
>   #define VEND1_GLOBAL_FW_ID_MAJOR               GENMASK(15, 8)
>   #define VEND1_GLOBAL_FW_ID_MINOR               GENMASK(7, 0)
> +#define VEND1_GLOBAL_FW_ID_MASK                        GENMASK(15, 0)
>
>   #define VEND1_GLOBAL_MAILBOX_INTERFACE1                        0x0200
>   #define VEND1_GLOBAL_MAILBOX_INTERFACE1_EXECUTE                BIT(15)
> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> index 0b2db486d8bd..5023fd70050d 100644
> --- a/drivers/net/phy/aquantia/aquantia_main.c
> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> @@ -447,7 +447,9 @@ int aqr_wait_reset_complete(struct phy_device *phydev)
>          int val;
>
>          return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> -                                        VEND1_GLOBAL_FW_ID, val, val != 0,
> +                                        VEND1_GLOBAL_FW_ID, val,
> +                                        ((val & VEND1_GLOBAL_FW_ID_MASK) != 0 &&
> +                                        (val & VEND1_GLOBAL_FW_ID_MASK) != VEND1_GLOBAL_FW_ID_MASK),
>                                           20000, 2000000, false);
>   }
>
> I have not tried the resume use-case, but curious if this may
> also help here?
>

Interesting. Unfortunately this doesn't help with the suspend/resume
use-case if I revert my offending patch.

Bart

> Cheers
> Jon
>
> --
> nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 13:04     ` Bartosz Golaszewski
@ 2024-07-18 13:29       ` Bartosz Golaszewski
  2024-07-18 14:08         ` Jon Hunter
  0 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-18 13:29 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis

On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>
> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
> >
> >
> > With the current -next and mainline we are seeing the following issue on
> > our Tegra234 Jetson AGX Orin platform ...
> >
> >   Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
> >   tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
> >
> >
> > We have tracked it down to this change and looks like our PHY does not
> > support 10M ...
> >
> > $ ethtool eth0
> > Settings for eth0:
> >          Supported ports: [  ]
> >          Supported link modes:   100baseT/Full
> >                                  1000baseT/Full
> >                                  10000baseT/Full
> >                                  1000baseKX/Full
> >                                  10000baseKX4/Full
> >                                  10000baseKR/Full
> >                                  2500baseT/Full
> >                                  5000baseT/Full
> >
> > The following fixes this for this platform ...
> >
> > diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> > index d12e35374231..0b2db486d8bd 100644
> > --- a/drivers/net/phy/aquantia/aquantia_main.c
> > +++ b/drivers/net/phy/aquantia/aquantia_main.c
> > @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
> >          int i, val, ret;
> >
> >          ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> > -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
> > +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
> >                                          1000, 100000, false);
> >          if (ret)
> >                  return ret;
> >
> >
> > However, I am not sure if this is guaranteed to work for all?
>
> Ah cr*p. No, I don't think it is. We should take the first supported
> mode for a given PHY I think.
>

TBH I only observed the issue on AQR115C. I don't have any other model
to test with. Is it fine to fix it by implementing
aqr115_fill_interface_modes() that would first wait for this register
to return non-0 and then call aqr107_fill_interface_modes()?

Bartosz

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 13:29       ` Bartosz Golaszewski
@ 2024-07-18 14:08         ` Jon Hunter
  2024-07-18 14:13           ` Bartosz Golaszewski
  2024-07-19  3:41           ` Andrew Lunn
  0 siblings, 2 replies; 21+ messages in thread
From: Jon Hunter @ 2024-07-18 14:08 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis


On 18/07/2024 14:29, Bartosz Golaszewski wrote:
> On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>>
>> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>>>
>>>
>>> With the current -next and mainline we are seeing the following issue on
>>> our Tegra234 Jetson AGX Orin platform ...
>>>
>>>    Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
>>>    tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
>>>
>>>
>>> We have tracked it down to this change and looks like our PHY does not
>>> support 10M ...
>>>
>>> $ ethtool eth0
>>> Settings for eth0:
>>>           Supported ports: [  ]
>>>           Supported link modes:   100baseT/Full
>>>                                   1000baseT/Full
>>>                                   10000baseT/Full
>>>                                   1000baseKX/Full
>>>                                   10000baseKX4/Full
>>>                                   10000baseKR/Full
>>>                                   2500baseT/Full
>>>                                   5000baseT/Full
>>>
>>> The following fixes this for this platform ...
>>>
>>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
>>> index d12e35374231..0b2db486d8bd 100644
>>> --- a/drivers/net/phy/aquantia/aquantia_main.c
>>> +++ b/drivers/net/phy/aquantia/aquantia_main.c
>>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
>>>           int i, val, ret;
>>>
>>>           ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
>>> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
>>> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
>>>                                           1000, 100000, false);
>>>           if (ret)
>>>                   return ret;
>>>
>>>
>>> However, I am not sure if this is guaranteed to work for all?
>>
>> Ah cr*p. No, I don't think it is. We should take the first supported
>> mode for a given PHY I think.
>>
> 
> TBH I only observed the issue on AQR115C. I don't have any other model
> to test with. Is it fine to fix it by implementing
> aqr115_fill_interface_modes() that would first wait for this register
> to return non-0 and then call aqr107_fill_interface_modes()?

I am doing a bit more testing. We have seen a few issues with this PHY 
driver and so I am wondering if we also need something similar for the 
AQR113C variant too.

Interestingly, the product brief for these PHYs [0] do show that both 
the AQR113C and AQR115C both support 10M. So I wonder if it is our 
ethernet controller that is not supporting 10M? I will check on this too.

Jon

[0] 
https://www.marvell.com/content/dam/marvell/en/public-collateral/transceivers/marvell-phys-transceivers-aqrate-gen4-product-brief.pdf


-- 
nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 14:08         ` Jon Hunter
@ 2024-07-18 14:13           ` Bartosz Golaszewski
  2024-07-18 14:49             ` Jon Hunter
  2024-07-19  3:41           ` Andrew Lunn
  1 sibling, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-18 14:13 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis

On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
> On 18/07/2024 14:29, Bartosz Golaszewski wrote:
> > On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >>
> >> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
> >>>
> >>>
> >>> With the current -next and mainline we are seeing the following issue on
> >>> our Tegra234 Jetson AGX Orin platform ...
> >>>
> >>>    Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
> >>>    tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
> >>>
> >>>
> >>> We have tracked it down to this change and looks like our PHY does not
> >>> support 10M ...
> >>>
> >>> $ ethtool eth0
> >>> Settings for eth0:
> >>>           Supported ports: [  ]
> >>>           Supported link modes:   100baseT/Full
> >>>                                   1000baseT/Full
> >>>                                   10000baseT/Full
> >>>                                   1000baseKX/Full
> >>>                                   10000baseKX4/Full
> >>>                                   10000baseKR/Full
> >>>                                   2500baseT/Full
> >>>                                   5000baseT/Full
> >>>
> >>> The following fixes this for this platform ...
> >>>
> >>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> >>> index d12e35374231..0b2db486d8bd 100644
> >>> --- a/drivers/net/phy/aquantia/aquantia_main.c
> >>> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> >>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
> >>>           int i, val, ret;
> >>>
> >>>           ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> >>> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
> >>> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
> >>>                                           1000, 100000, false);
> >>>           if (ret)
> >>>                   return ret;
> >>>
> >>>
> >>> However, I am not sure if this is guaranteed to work for all?
> >>
> >> Ah cr*p. No, I don't think it is. We should take the first supported
> >> mode for a given PHY I think.
> >>
> >
> > TBH I only observed the issue on AQR115C. I don't have any other model
> > to test with. Is it fine to fix it by implementing
> > aqr115_fill_interface_modes() that would first wait for this register
> > to return non-0 and then call aqr107_fill_interface_modes()?
>
> I am doing a bit more testing. We have seen a few issues with this PHY
> driver and so I am wondering if we also need something similar for the
> AQR113C variant too.
>
> Interestingly, the product brief for these PHYs [0] do show that both
> the AQR113C and AQR115C both support 10M. So I wonder if it is our
> ethernet controller that is not supporting 10M? I will check on this too.
>

Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
should support 10M. In fact all AQR PHYs should hence my initial
change.

Bart

> Jon
>
> [0]
> https://www.marvell.com/content/dam/marvell/en/public-collateral/transceivers/marvell-phys-transceivers-aqrate-gen4-product-brief.pdf
>
>
> --
> nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 14:13           ` Bartosz Golaszewski
@ 2024-07-18 14:49             ` Jon Hunter
  2024-07-18 14:59               ` Bartosz Golaszewski
  0 siblings, 1 reply; 21+ messages in thread
From: Jon Hunter @ 2024-07-18 14:49 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis


On 18/07/2024 15:13, Bartosz Golaszewski wrote:
> On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>>
>>
>> On 18/07/2024 14:29, Bartosz Golaszewski wrote:
>>> On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
>>>>
>>>> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>>>>>
>>>>>
>>>>> With the current -next and mainline we are seeing the following issue on
>>>>> our Tegra234 Jetson AGX Orin platform ...
>>>>>
>>>>>     Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
>>>>>     tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
>>>>>
>>>>>
>>>>> We have tracked it down to this change and looks like our PHY does not
>>>>> support 10M ...
>>>>>
>>>>> $ ethtool eth0
>>>>> Settings for eth0:
>>>>>            Supported ports: [  ]
>>>>>            Supported link modes:   100baseT/Full
>>>>>                                    1000baseT/Full
>>>>>                                    10000baseT/Full
>>>>>                                    1000baseKX/Full
>>>>>                                    10000baseKX4/Full
>>>>>                                    10000baseKR/Full
>>>>>                                    2500baseT/Full
>>>>>                                    5000baseT/Full
>>>>>
>>>>> The following fixes this for this platform ...
>>>>>
>>>>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
>>>>> index d12e35374231..0b2db486d8bd 100644
>>>>> --- a/drivers/net/phy/aquantia/aquantia_main.c
>>>>> +++ b/drivers/net/phy/aquantia/aquantia_main.c
>>>>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
>>>>>            int i, val, ret;
>>>>>
>>>>>            ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
>>>>> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
>>>>> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
>>>>>                                            1000, 100000, false);
>>>>>            if (ret)
>>>>>                    return ret;
>>>>>
>>>>>
>>>>> However, I am not sure if this is guaranteed to work for all?
>>>>
>>>> Ah cr*p. No, I don't think it is. We should take the first supported
>>>> mode for a given PHY I think.
>>>>
>>>
>>> TBH I only observed the issue on AQR115C. I don't have any other model
>>> to test with. Is it fine to fix it by implementing
>>> aqr115_fill_interface_modes() that would first wait for this register
>>> to return non-0 and then call aqr107_fill_interface_modes()?
>>
>> I am doing a bit more testing. We have seen a few issues with this PHY
>> driver and so I am wondering if we also need something similar for the
>> AQR113C variant too.
>>
>> Interestingly, the product brief for these PHYs [0] do show that both
>> the AQR113C and AQR115C both support 10M. So I wonder if it is our
>> ethernet controller that is not supporting 10M? I will check on this too.
>>
> 
> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
> should support 10M. In fact all AQR PHYs should hence my initial
> change.


Yes we have an AQR113C. I agree it should support this, but for whatever 
reason this is not advertised. I do see that 10M is advertised as 
supported by the network ...

  Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                       100baseT/Half 100baseT/Full
                                       1000baseT/Full

My PC that is on the same network supports 10M, but just not this Tegra 
device. I am checking to see if this is expected for this device.

Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 14:49             ` Jon Hunter
@ 2024-07-18 14:59               ` Bartosz Golaszewski
  2024-07-18 17:42                 ` Jon Hunter
  0 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-18 14:59 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis

On Thu, Jul 18, 2024 at 4:49 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
> On 18/07/2024 15:13, Bartosz Golaszewski wrote:
> > On Thu, Jul 18, 2024 at 4:08 PM Jon Hunter <jonathanh@nvidia.com> wrote:
> >>
> >>
> >> On 18/07/2024 14:29, Bartosz Golaszewski wrote:
> >>> On Thu, Jul 18, 2024 at 3:04 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote:
> >>>>
> >>>> On Thu, Jul 18, 2024 at 2:23 PM Jon Hunter <jonathanh@nvidia.com> wrote:
> >>>>>
> >>>>>
> >>>>> With the current -next and mainline we are seeing the following issue on
> >>>>> our Tegra234 Jetson AGX Orin platform ...
> >>>>>
> >>>>>     Aquantia AQR113C stmmac-0:00: aqr107_fill_interface_modes failed: -110
> >>>>>     tegra-mgbe 6800000.ethernet eth0: __stmmac_open: Cannot attach to PHY (error: -110)
> >>>>>
> >>>>>
> >>>>> We have tracked it down to this change and looks like our PHY does not
> >>>>> support 10M ...
> >>>>>
> >>>>> $ ethtool eth0
> >>>>> Settings for eth0:
> >>>>>            Supported ports: [  ]
> >>>>>            Supported link modes:   100baseT/Full
> >>>>>                                    1000baseT/Full
> >>>>>                                    10000baseT/Full
> >>>>>                                    1000baseKX/Full
> >>>>>                                    10000baseKX4/Full
> >>>>>                                    10000baseKR/Full
> >>>>>                                    2500baseT/Full
> >>>>>                                    5000baseT/Full
> >>>>>
> >>>>> The following fixes this for this platform ...
> >>>>>
> >>>>> diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
> >>>>> index d12e35374231..0b2db486d8bd 100644
> >>>>> --- a/drivers/net/phy/aquantia/aquantia_main.c
> >>>>> +++ b/drivers/net/phy/aquantia/aquantia_main.c
> >>>>> @@ -656,7 +656,7 @@ static int aqr107_fill_interface_modes(struct phy_device *phydev)
> >>>>>            int i, val, ret;
> >>>>>
> >>>>>            ret = phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
> >>>>> -                                       VEND1_GLOBAL_CFG_10M, val, val != 0,
> >>>>> +                                       VEND1_GLOBAL_CFG_100M, val, val != 0,
> >>>>>                                            1000, 100000, false);
> >>>>>            if (ret)
> >>>>>                    return ret;
> >>>>>
> >>>>>
> >>>>> However, I am not sure if this is guaranteed to work for all?
> >>>>
> >>>> Ah cr*p. No, I don't think it is. We should take the first supported
> >>>> mode for a given PHY I think.
> >>>>
> >>>
> >>> TBH I only observed the issue on AQR115C. I don't have any other model
> >>> to test with. Is it fine to fix it by implementing
> >>> aqr115_fill_interface_modes() that would first wait for this register
> >>> to return non-0 and then call aqr107_fill_interface_modes()?
> >>
> >> I am doing a bit more testing. We have seen a few issues with this PHY
> >> driver and so I am wondering if we also need something similar for the
> >> AQR113C variant too.
> >>
> >> Interestingly, the product brief for these PHYs [0] do show that both
> >> the AQR113C and AQR115C both support 10M. So I wonder if it is our
> >> ethernet controller that is not supporting 10M? I will check on this too.
> >>
> >
> > Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
> > should support 10M. In fact all AQR PHYs should hence my initial
> > change.
>
>
> Yes we have an AQR113C. I agree it should support this, but for whatever
> reason this is not advertised. I do see that 10M is advertised as
> supported by the network ...
>
>   Link partner advertised link modes:  10baseT/Half 10baseT/Full
>                                        100baseT/Half 100baseT/Full
>                                        1000baseT/Full
>
> My PC that is on the same network supports 10M, but just not this Tegra
> device. I am checking to see if this is expected for this device.
>

I sent a patch for you to test. I think that even if it doesn't fully
fix the issue you're observing, it's worth picking it up as it reduces
the impact of the workaround I introduced.

I'll be off next week so I'm sending it quickly with the hope it will be useful.

Bart

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 14:59               ` Bartosz Golaszewski
@ 2024-07-18 17:42                 ` Jon Hunter
  2024-07-18 19:05                   ` Bartosz Golaszewski
  0 siblings, 1 reply; 21+ messages in thread
From: Jon Hunter @ 2024-07-18 17:42 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis


On 18/07/2024 15:59, Bartosz Golaszewski wrote:

...

>>>>> TBH I only observed the issue on AQR115C. I don't have any other model
>>>>> to test with. Is it fine to fix it by implementing
>>>>> aqr115_fill_interface_modes() that would first wait for this register
>>>>> to return non-0 and then call aqr107_fill_interface_modes()?
>>>>
>>>> I am doing a bit more testing. We have seen a few issues with this PHY
>>>> driver and so I am wondering if we also need something similar for the
>>>> AQR113C variant too.
>>>>
>>>> Interestingly, the product brief for these PHYs [0] do show that both
>>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our
>>>> ethernet controller that is not supporting 10M? I will check on this too.
>>>>
>>>
>>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
>>> should support 10M. In fact all AQR PHYs should hence my initial
>>> change.
>>
>>
>> Yes we have an AQR113C. I agree it should support this, but for whatever
>> reason this is not advertised. I do see that 10M is advertised as
>> supported by the network ...
>>
>>    Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>                                         100baseT/Half 100baseT/Full
>>                                         1000baseT/Full
>>
>> My PC that is on the same network supports 10M, but just not this Tegra
>> device. I am checking to see if this is expected for this device.
>>
> 
> I sent a patch for you to test. I think that even if it doesn't fully
> fix the issue you're observing, it's worth picking it up as it reduces
> the impact of the workaround I introduced.


Thanks! I will test this tonight.

> I'll be off next week so I'm sending it quickly with the hope it will be useful.


OK thanks for letting me know.

Another thought I had, which is also quite timely, is that I have 
recently been testing a patch [0] as I found that this actually resolves 
an issue where we occasionally see our device fail to get an IP address.

This was sent out over a year ago and sadly we failed to follow up :-(

Russell was concerned if this would make the function that was being 
changed fail if it did not have the link (if I am understanding the 
comments correctly). However, looking at the code now, I see that the 
aqr107_read_status() function checks if '!phydev->link' before we poll 
the TX ready status, and so I am wondering if this change is OK? From my 
testing it does work. I would be interested to know if this may also 
resolve your issue?

With this change [0] I have been able to do 500 boots on our board and 
verify that the ethernet controller is able to get an IP address every 
time. Without this change it would fail to get an IP address anywhere 
from 1-100 boots typically.

I will test your patch in the same way, but I am wondering if both are 
trying to address the same sort of issue?

Cheers
Jon

[0] 
https://lore.kernel.org/linux-tegra/20230628124326.55732-3-ruppala@nvidia.com/#t

-- 
nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 17:42                 ` Jon Hunter
@ 2024-07-18 19:05                   ` Bartosz Golaszewski
  2024-07-19  8:39                     ` Jon Hunter
  0 siblings, 1 reply; 21+ messages in thread
From: Bartosz Golaszewski @ 2024-07-18 19:05 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis

On Thu, Jul 18, 2024 at 7:42 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>
>
> On 18/07/2024 15:59, Bartosz Golaszewski wrote:
>
> ...
>
> >>>>> TBH I only observed the issue on AQR115C. I don't have any other model
> >>>>> to test with. Is it fine to fix it by implementing
> >>>>> aqr115_fill_interface_modes() that would first wait for this register
> >>>>> to return non-0 and then call aqr107_fill_interface_modes()?
> >>>>
> >>>> I am doing a bit more testing. We have seen a few issues with this PHY
> >>>> driver and so I am wondering if we also need something similar for the
> >>>> AQR113C variant too.
> >>>>
> >>>> Interestingly, the product brief for these PHYs [0] do show that both
> >>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our
> >>>> ethernet controller that is not supporting 10M? I will check on this too.
> >>>>
> >>>
> >>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
> >>> should support 10M. In fact all AQR PHYs should hence my initial
> >>> change.
> >>
> >>
> >> Yes we have an AQR113C. I agree it should support this, but for whatever
> >> reason this is not advertised. I do see that 10M is advertised as
> >> supported by the network ...
> >>
> >>    Link partner advertised link modes:  10baseT/Half 10baseT/Full
> >>                                         100baseT/Half 100baseT/Full
> >>                                         1000baseT/Full
> >>
> >> My PC that is on the same network supports 10M, but just not this Tegra
> >> device. I am checking to see if this is expected for this device.
> >>
> >
> > I sent a patch for you to test. I think that even if it doesn't fully
> > fix the issue you're observing, it's worth picking it up as it reduces
> > the impact of the workaround I introduced.
>
>
> Thanks! I will test this tonight.
>
> > I'll be off next week so I'm sending it quickly with the hope it will be useful.
>
>
> OK thanks for letting me know.
>
> Another thought I had, which is also quite timely, is that I have
> recently been testing a patch [0] as I found that this actually resolves
> an issue where we occasionally see our device fail to get an IP address.
>
> This was sent out over a year ago and sadly we failed to follow up :-(
>
> Russell was concerned if this would make the function that was being
> changed fail if it did not have the link (if I am understanding the
> comments correctly). However, looking at the code now, I see that the
> aqr107_read_status() function checks if '!phydev->link' before we poll
> the TX ready status, and so I am wondering if this change is OK? From my
> testing it does work. I would be interested to know if this may also
> resolve your issue?
>
> With this change [0] I have been able to do 500 boots on our board and
> verify that the ethernet controller is able to get an IP address every
> time. Without this change it would fail to get an IP address anywhere
> from 1-100 boots typically.
>
> I will test your patch in the same way, but I am wondering if both are
> trying to address the same sort of issue?
>

The patch you linked does not fix the suspend/resume either. :(

Bartosz

> Cheers
> Jon
>
> [0]
> https://lore.kernel.org/linux-tegra/20230628124326.55732-3-ruppala@nvidia.com/#t
>
> --
> nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 14:08         ` Jon Hunter
  2024-07-18 14:13           ` Bartosz Golaszewski
@ 2024-07-19  3:41           ` Andrew Lunn
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Lunn @ 2024-07-19  3:41 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Bartosz Golaszewski, Heiner Kallweit, Russell King,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis

> Interestingly, the product brief for these PHYs [0] do show that both the
> AQR113C and AQR115C both support 10M. So I wonder if it is our ethernet
> controller that is not supporting 10M? I will check on this too.

The PHY should enumerate what it supports, look at
genphy_read_abilities(), which reads the Basic Mode Status Register,
and there are bits in there which indicate if the PHY supports 10Half
and 10Full.

The MAC should also be indicating what it support, and ethtool will be
showing you the combination of the two.

	Andrew

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values
  2024-07-18 19:05                   ` Bartosz Golaszewski
@ 2024-07-19  8:39                     ` Jon Hunter
  0 siblings, 0 replies; 21+ messages in thread
From: Jon Hunter @ 2024-07-19  8:39 UTC (permalink / raw)
  To: Bartosz Golaszewski
  Cc: Andrew Lunn, Heiner Kallweit, Russell King, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, netdev, linux-kernel,
	Bartosz Golaszewski, linux-tegra@vger.kernel.org, Brad Griffis


On 18/07/2024 20:05, Bartosz Golaszewski wrote:
> On Thu, Jul 18, 2024 at 7:42 PM Jon Hunter <jonathanh@nvidia.com> wrote:
>>
>>
>> On 18/07/2024 15:59, Bartosz Golaszewski wrote:
>>
>> ...
>>
>>>>>>> TBH I only observed the issue on AQR115C. I don't have any other model
>>>>>>> to test with. Is it fine to fix it by implementing
>>>>>>> aqr115_fill_interface_modes() that would first wait for this register
>>>>>>> to return non-0 and then call aqr107_fill_interface_modes()?
>>>>>>
>>>>>> I am doing a bit more testing. We have seen a few issues with this PHY
>>>>>> driver and so I am wondering if we also need something similar for the
>>>>>> AQR113C variant too.
>>>>>>
>>>>>> Interestingly, the product brief for these PHYs [0] do show that both
>>>>>> the AQR113C and AQR115C both support 10M. So I wonder if it is our
>>>>>> ethernet controller that is not supporting 10M? I will check on this too.
>>>>>>
>>>>>
>>>>> Oh you have an 113c? I didn't get this. Yeah, weird, all docs say it
>>>>> should support 10M. In fact all AQR PHYs should hence my initial
>>>>> change.
>>>>
>>>>
>>>> Yes we have an AQR113C. I agree it should support this, but for whatever
>>>> reason this is not advertised. I do see that 10M is advertised as
>>>> supported by the network ...
>>>>
>>>>     Link partner advertised link modes:  10baseT/Half 10baseT/Full
>>>>                                          100baseT/Half 100baseT/Full
>>>>                                          1000baseT/Full
>>>>
>>>> My PC that is on the same network supports 10M, but just not this Tegra
>>>> device. I am checking to see if this is expected for this device.
>>>>
>>>
>>> I sent a patch for you to test. I think that even if it doesn't fully
>>> fix the issue you're observing, it's worth picking it up as it reduces
>>> the impact of the workaround I introduced.
>>
>>
>> Thanks! I will test this tonight.
>>
>>> I'll be off next week so I'm sending it quickly with the hope it will be useful.
>>
>>
>> OK thanks for letting me know.
>>
>> Another thought I had, which is also quite timely, is that I have
>> recently been testing a patch [0] as I found that this actually resolves
>> an issue where we occasionally see our device fail to get an IP address.
>>
>> This was sent out over a year ago and sadly we failed to follow up :-(
>>
>> Russell was concerned if this would make the function that was being
>> changed fail if it did not have the link (if I am understanding the
>> comments correctly). However, looking at the code now, I see that the
>> aqr107_read_status() function checks if '!phydev->link' before we poll
>> the TX ready status, and so I am wondering if this change is OK? From my
>> testing it does work. I would be interested to know if this may also
>> resolve your issue?
>>
>> With this change [0] I have been able to do 500 boots on our board and
>> verify that the ethernet controller is able to get an IP address every
>> time. Without this change it would fail to get an IP address anywhere
>> from 1-100 boots typically.
>>
>> I will test your patch in the same way, but I am wondering if both are
>> trying to address the same sort of issue?
>>
> 
> The patch you linked does not fix the suspend/resume either. :(


Thanks for testing! I have verified that the patch you sent resolves the 
issue introduced by this patch for Tegra. And likewise this patch does 
not resolve the long-standing issue (not related to this change) that we 
have been observing.

Cheers
Jon

-- 
nvpublic

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
  2024-07-08  7:50 ` [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID Bartosz Golaszewski
@ 2024-07-30  9:59   ` Jon Hunter
  2024-07-30 11:23     ` Russell King (Oracle)
  2024-08-06 11:27     ` Vladimir Oltean
  0 siblings, 2 replies; 21+ messages in thread
From: Jon Hunter @ 2024-07-30  9:59 UTC (permalink / raw)
  To: Bartosz Golaszewski, Andrew Lunn, Heiner Kallweit, Russell King,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis

Hi Bartosz,

On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Checking the firmware register before it complete the boot process makes
> no sense, it will report 0 even if FW is available from internal memory.
> Always wait for FW to boot before continuing or we'll unnecessarily try
> to load it from nvmem/filesystem and fail.
> 
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
>   drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
> index 0c9640ef153b..524627a36c6f 100644
> --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> @@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev)
>   {
>   	int ret;
>   
> +	ret = aqr_wait_reset_complete(phydev);
> +	if (ret)
> +		return ret;
> +
>   	/* Check if the firmware is not already loaded by pooling
>   	 * the current version returned by the PHY. If 0 is returned,
>   	 * no firmware is loaded.


Although this fixed another issue we were seeing with this driver, we 
have been reviewing this change and have a question about it.

According to the description for the function aqr_wait_reset_complete() 
this function is intended to give the device time to load firmware and 
check there is a valid firmware ID.

If a valid firmware ID (non-zero) is detected, then 
aqr_wait_reset_complete() will return 0 (because 
phy_read_mmd_poll_timeout() returns 0 on success and -ETIMEDOUT upon a 
timeout).

If it times out, then it would appear that with the above code we don't 
attempt to load the firmware by any other means?

Hence, I was wondering if we want this ...

diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c 
b/drivers/net/phy/aquantia/aquantia_firmware.c
index 524627a36c6f..a167f42ae36b 100644
--- a/drivers/net/phy/aquantia/aquantia_firmware.c
+++ b/drivers/net/phy/aquantia/aquantia_firmware.c
@@ -353,16 +353,12 @@ int aqr_firmware_load(struct phy_device *phydev)
  {
         int ret;

-       ret = aqr_wait_reset_complete(phydev);
-       if (ret)
-               return ret;
-
-       /* Check if the firmware is not already loaded by pooling
+       /* Check if the firmware is not already loaded by polling
          * the current version returned by the PHY. If 0 is returned,
-        * no firmware is loaded.
+        * firmware is loaded.
          */
-       ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
-       if (ret > 0)
+       ret = aqr_wait_reset_complete(phydev);
+       if (!ret)
                 goto exit;

         ret = aqr_firmware_load_nvmem(phydev);


Our Aquantia PHY has a SPI-NOR and so we don't to test the other 
firmware loading cases.

Jon

-- 
nvpublic

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
  2024-07-30  9:59   ` Jon Hunter
@ 2024-07-30 11:23     ` Russell King (Oracle)
  2024-08-06 11:36       ` Vladimir Oltean
  2024-08-06 11:27     ` Vladimir Oltean
  1 sibling, 1 reply; 21+ messages in thread
From: Russell King (Oracle) @ 2024-07-30 11:23 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Bartosz Golaszewski, Andrew Lunn, Heiner Kallweit,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis

On Tue, Jul 30, 2024 at 10:59:59AM +0100, Jon Hunter wrote:
> Hi Bartosz,
> 
> On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > 
> > Checking the firmware register before it complete the boot process makes
> > no sense, it will report 0 even if FW is available from internal memory.
> > Always wait for FW to boot before continuing or we'll unnecessarily try
> > to load it from nvmem/filesystem and fail.
> > 
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> >   drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
> > index 0c9640ef153b..524627a36c6f 100644
> > --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> > +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> > @@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev)
> >   {
> >   	int ret;
> > +	ret = aqr_wait_reset_complete(phydev);
> > +	if (ret)
> > +		return ret;
> > +
> >   	/* Check if the firmware is not already loaded by pooling
> >   	 * the current version returned by the PHY. If 0 is returned,
> >   	 * no firmware is loaded.
> 
> 
> Although this fixed another issue we were seeing with this driver, we have
> been reviewing this change and have a question about it.
> 
> According to the description for the function aqr_wait_reset_complete() this
> function is intended to give the device time to load firmware and check
> there is a valid firmware ID.
> 
> If a valid firmware ID (non-zero) is detected, then
> aqr_wait_reset_complete() will return 0 (because phy_read_mmd_poll_timeout()
> returns 0 on success and -ETIMEDOUT upon a timeout).
> 
> If it times out, then it would appear that with the above code we don't
> attempt to load the firmware by any other means?

I'm also wondering about aqr_wait_reset_complete(). It uses
phy_read_mmd_poll_timeout(), which prints an error message if it times
out (which means no firmware has been loaded.) If we're then going on to
attempt to load firmware, the error is not an error at all. So, I think
while phy_read_poll_timeout() is nice and convenient, we need something
like:

#define phy_read_poll_timeout_quiet(phydev, regnum, val, cond, sleep_us, \
                                    timeout_us, sleep_before_read) \
({ \
        int __ret, __val; \
        __ret = read_poll_timeout(__val = phy_read, val, \
                                  __val < 0 || (cond), \
                sleep_us, timeout_us, sleep_before_read, phydev, regnum); \
        if (__val < 0) \
                __ret = __val; \
        __ret; \
})

#define phy_read_poll_timeout(phydev, regnum, val, cond, sleep_us, \
                                timeout_us, sleep_before_read) \
({ \
        int __ret = phy_read_poll_timeout_quiet(phydev, regnum, val, cond, \
						sleep_us, timeout_us, \
						sleep_before_read); \
        if (__ret) \
                phydev_err(phydev, "%s failed: %d\n", __func__, __ret); \
        __ret; \
})

and aqr_wait_reset_complete() needs to use phy_read_poll_timeout_quiet().

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
  2024-07-30  9:59   ` Jon Hunter
  2024-07-30 11:23     ` Russell King (Oracle)
@ 2024-08-06 11:27     ` Vladimir Oltean
  1 sibling, 0 replies; 21+ messages in thread
From: Vladimir Oltean @ 2024-08-06 11:27 UTC (permalink / raw)
  To: Jon Hunter
  Cc: Bartosz Golaszewski, Andrew Lunn, Heiner Kallweit, Russell King,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis, Clark Wang, Wei Fang

Hi Jon,

On Tue, Jul 30, 2024 at 10:59:59AM +0100, Jon Hunter wrote:
> Hi Bartosz,
> 
> On 08/07/2024 08:50, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > 
> > Checking the firmware register before it complete the boot process makes
> > no sense, it will report 0 even if FW is available from internal memory.
> > Always wait for FW to boot before continuing or we'll unnecessarily try
> > to load it from nvmem/filesystem and fail.
> > 
> > Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > ---
> >   drivers/net/phy/aquantia/aquantia_firmware.c | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
> > index 0c9640ef153b..524627a36c6f 100644
> > --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> > +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> > @@ -353,6 +353,10 @@ int aqr_firmware_load(struct phy_device *phydev)
> >   {
> >   	int ret;
> > +	ret = aqr_wait_reset_complete(phydev);
> > +	if (ret)
> > +		return ret;
> > +
> >   	/* Check if the firmware is not already loaded by pooling
> >   	 * the current version returned by the PHY. If 0 is returned,
> >   	 * no firmware is loaded.
> 
> 
> Although this fixed another issue we were seeing with this driver, we have
> been reviewing this change and have a question about it.
> 
> According to the description for the function aqr_wait_reset_complete() this
> function is intended to give the device time to load firmware and check
> there is a valid firmware ID.
> 
> If a valid firmware ID (non-zero) is detected, then
> aqr_wait_reset_complete() will return 0 (because phy_read_mmd_poll_timeout()
> returns 0 on success and -ETIMEDOUT upon a timeout).
> 
> If it times out, then it would appear that with the above code we don't
> attempt to load the firmware by any other means?
> 
> Hence, I was wondering if we want this ...
> 
> diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c
> b/drivers/net/phy/aquantia/aquantia_firmware.c
> index 524627a36c6f..a167f42ae36b 100644
> --- a/drivers/net/phy/aquantia/aquantia_firmware.c
> +++ b/drivers/net/phy/aquantia/aquantia_firmware.c
> @@ -353,16 +353,12 @@ int aqr_firmware_load(struct phy_device *phydev)
>  {
>         int ret;
> 
> -       ret = aqr_wait_reset_complete(phydev);
> -       if (ret)
> -               return ret;
> -
> -       /* Check if the firmware is not already loaded by pooling
> +       /* Check if the firmware is not already loaded by polling
>          * the current version returned by the PHY. If 0 is returned,
> -        * no firmware is loaded.
> +        * firmware is loaded.
>          */
> -       ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
> -       if (ret > 0)
> +       ret = aqr_wait_reset_complete(phydev);
> +       if (!ret)
>                 goto exit;
> 
>         ret = aqr_firmware_load_nvmem(phydev);

I agree with your analysis and we also noticed this.

But actually, you wouldn't want to ignore other return codes from
phy_read_mmd_poll_timeout() like real errors from phy_read_mmd():
-ENODEV, -ENXIO etc.

I found that the logic is more readable with a switch/case statement as below.

diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
index 524627a36c6f..d839f64471bd 100644
--- a/drivers/net/phy/aquantia/aquantia_firmware.c
+++ b/drivers/net/phy/aquantia/aquantia_firmware.c
@@ -353,26 +353,33 @@ int aqr_firmware_load(struct phy_device *phydev)
 {
 	int ret;
 
-	ret = aqr_wait_reset_complete(phydev);
-	if (ret)
-		return ret;
-
-	/* Check if the firmware is not already loaded by pooling
-	 * the current version returned by the PHY. If 0 is returned,
-	 * no firmware is loaded.
+	/* Check if the firmware is not already loaded by polling
+	 * the current version returned by the PHY.
 	 */
-	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
-	if (ret > 0)
-		goto exit;
+	ret = aqr_wait_reset_complete(phydev);
+	switch (ret) {
+	case 0:
+		/* Some firmware is loaded => do nothing */
+		return 0;
+	case -ETIMEDOUT:
+		/* VEND1_GLOBAL_FW_ID still reads 0 after 2 seconds of polling.
+		 * We don't have full confidence that no firmware is loaded (in
+		 * theory it might just not have loaded yet), but we will
+		 * assume that, and load a new image.
+		 */
+		ret = aqr_firmware_load_nvmem(phydev);
+		if (!ret)
+			goto exit;
 
-	ret = aqr_firmware_load_nvmem(phydev);
-	if (!ret)
-		goto exit;
+		ret = aqr_firmware_load_fs(phydev);
+		if (ret)
+			return ret;
 
-	ret = aqr_firmware_load_fs(phydev);
-	if (ret)
+		break;
+	default:
+		/* PHY read error, propagate it to the caller */
 		return ret;
+	}
 
-exit:
 	return 0;
 }

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID
  2024-07-30 11:23     ` Russell King (Oracle)
@ 2024-08-06 11:36       ` Vladimir Oltean
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Oltean @ 2024-08-06 11:36 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Jon Hunter, Bartosz Golaszewski, Andrew Lunn, Heiner Kallweit,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	netdev, linux-kernel, Bartosz Golaszewski,
	linux-tegra@vger.kernel.org, Brad Griffis

Hi Russell,

On Tue, Jul 30, 2024 at 12:23:45PM +0100, Russell King (Oracle) wrote:
> > If it times out, then it would appear that with the above code we don't
> > attempt to load the firmware by any other means?
> 
> I'm also wondering about aqr_wait_reset_complete(). It uses
> phy_read_mmd_poll_timeout(), which prints an error message if it times
> out (which means no firmware has been loaded.) If we're then going on to
> attempt to load firmware, the error is not an error at all. So, I think
> while phy_read_poll_timeout() is nice and convenient, we need something
> like:
> 
> #define phy_read_poll_timeout_quiet(phydev, regnum, val, cond, sleep_us, \
>                                     timeout_us, sleep_before_read) \
> ({ \
>         int __ret, __val; \
>         __ret = read_poll_timeout(__val = phy_read, val, \
>                                   __val < 0 || (cond), \
>                 sleep_us, timeout_us, sleep_before_read, phydev, regnum); \
>         if (__val < 0) \
>                 __ret = __val; \
>         __ret; \
> })
> 
> #define phy_read_poll_timeout(phydev, regnum, val, cond, sleep_us, \
>                                 timeout_us, sleep_before_read) \
> ({ \
>         int __ret = phy_read_poll_timeout_quiet(phydev, regnum, val, cond, \
> 						sleep_us, timeout_us, \
> 						sleep_before_read); \
>         if (__ret) \
>                 phydev_err(phydev, "%s failed: %d\n", __func__, __ret); \
>         __ret; \
> })
> 
> and aqr_wait_reset_complete() needs to use phy_read_poll_timeout_quiet().

I agree that aqr_wait_reset_complete() shouldn't have built-in prints in it,
as long as failures are also expected. Maybe an alternative option would
be for aqr_wait_reset_complete() to manually roll a call to read_poll_timeout(),
considering how it would be nice for _actual_ errors (not -ETIMEDOUT)
from phy_read_mmd() to still be logged.

But it seems strange that the driver has to time out on a 2 second poll,
and then it's still not sure why VEND1_GLOBAL_FW_ID still reads 0?
Is it because there's no firmware, or because there is, but it hasn't
waited for long enough?

I haven't followed the development of AQR firmware loading. Isn't there
a faster and more reliable way of determining whether there is firmware
in the first place? It could give the driver a 2 second boot-time speedup,
plus more confidence.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-08-06 11:36 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-08  7:50 [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable support for aqr115c Bartosz Golaszewski
2024-07-08  7:50 ` [RESEND PATCH net-next v3 1/4] net: phy: aquantia: rename and export aqr107_wait_reset_complete() Bartosz Golaszewski
2024-07-08  7:50 ` [RESEND PATCH net-next v3 2/4] net: phy: aquantia: wait for FW reset before checking the vendor ID Bartosz Golaszewski
2024-07-30  9:59   ` Jon Hunter
2024-07-30 11:23     ` Russell King (Oracle)
2024-08-06 11:36       ` Vladimir Oltean
2024-08-06 11:27     ` Vladimir Oltean
2024-07-08  7:50 ` [RESEND PATCH net-next v3 3/4] net: phy: aquantia: wait for the GLOBAL_CFG to start returning real values Bartosz Golaszewski
2024-07-18 12:23   ` Jon Hunter
2024-07-18 13:04     ` Bartosz Golaszewski
2024-07-18 13:29       ` Bartosz Golaszewski
2024-07-18 14:08         ` Jon Hunter
2024-07-18 14:13           ` Bartosz Golaszewski
2024-07-18 14:49             ` Jon Hunter
2024-07-18 14:59               ` Bartosz Golaszewski
2024-07-18 17:42                 ` Jon Hunter
2024-07-18 19:05                   ` Bartosz Golaszewski
2024-07-19  8:39                     ` Jon Hunter
2024-07-19  3:41           ` Andrew Lunn
2024-07-08  7:50 ` [RESEND PATCH net-next v3 4/4] net: phy: aquantia: add support for aqr115c Bartosz Golaszewski
2024-07-10 10:20 ` [RESEND PATCH net-next v3 0/4] net: phy: aquantia: enable " patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).