* [PATCH net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
From: Vladimir Oltean @ 2024-09-13 12:12 UTC
To: netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Andrew Lunn, Heiner Kallweit, Russell King, Bartosz Golaszewski,
Christian Marangi, Clark Wang, Jon Hunter, Hans-Frieder Vogt
The author of the blamed commit apparently did not notice something
about aqr_wait_reset_complete(): it polls the exact same register -
MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID - as aqr_firmware_load().

Thus, the entire logic after the introduction of aqr_wait_reset_complete() is
now completely side-stepped, because if aqr_wait_reset_complete()
succeeds, MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID could have only been a
non-zero value. The handling of the case where the register reads as 0
is dead code, due to the previous -ETIMEDOUT having stopped execution
and returning a fatal error to the caller. We never attempt to load
new firmware if no firmware is present.

Based on static code analysis, I guess we should simply introduce a
switch/case statement based on the return code from aqr_wait_reset_complete(),
to determine whether to load firmware or not. I am not intending to
change the procedure through which the driver determines whether to load
firmware or not, as I am unaware of alternative possibilities.
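
As an illustration only, the three-way decision described above can be
sketched in userspace C. Everything named fake_* below is a hypothetical
stand-in invented for this sketch, not a driver symbol; only the shape of
the switch follows the patch.

```c
#include <errno.h>

/* Hypothetical stand-in for the PHY: each field scripts one helper's result */
struct fake_phy {
	int wait_ret;	/* result of the firmware-ID poll */
	int nvmem_ret;	/* result of the NVMEM load attempt */
	int fs_ret;	/* result of the filesystem load attempt */
	int loads;	/* number of load attempts made */
};

static int fake_wait_reset_complete(struct fake_phy *phy)
{
	return phy->wait_ret;
}

static int fake_load_nvmem(struct fake_phy *phy)
{
	phy->loads++;
	return phy->nvmem_ret;
}

static int fake_load_fs(struct fake_phy *phy)
{
	phy->loads++;
	return phy->fs_ret;
}

/* Same decision structure as the patched aqr_firmware_load() */
static int fake_firmware_load(struct fake_phy *phy)
{
	int ret = fake_wait_reset_complete(phy);

	switch (ret) {
	case 0:
		/* Firmware ID is non-zero: nothing to load */
		return 0;
	case -ETIMEDOUT:
		/* Firmware ID stayed 0: try NVMEM first, then the filesystem */
		ret = fake_load_nvmem(phy);
		if (!ret)
			return 0;
		return fake_load_fs(phy);
	default:
		/* Read error: propagate it to the caller */
		return ret;
	}
}
```

With wait_ret set to -ETIMEDOUT the sketch attempts a load, whereas the
pre-patch flow would have returned the timeout to the caller at that
point.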

At the same time, Russell King suggests that if aqr_wait_reset_complete()
is expected to return -ETIMEDOUT as part of normal operation and not
just catastrophic failure, the use of phy_read_mmd_poll_timeout() is
improper, since that has an embedded print inside. Just open-code a
call to read_poll_timeout() to avoid printing -ETIMEDOUT, but continue
printing actual read errors from the MDIO bus.

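
A rough userspace sketch of that split between a silent timeout and a
reported read error follows. The names sketch_* and the scripted register
are assumptions made up for illustration, and a read budget stands in for
the 20 ms sleep and 2 s timeout the driver uses. Note that a negative read
result also satisfies "val != 0", which is why the error has to be
checked on the value itself after the loop.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical register source: replays a scripted sequence of read
 * results, repeating the last entry once the script is exhausted.
 */
struct sketch_reg {
	const int *script;
	int len;
	int pos;
};

static int sketch_read(struct sketch_reg *reg)
{
	int val = reg->script[reg->pos];

	if (reg->pos < reg->len - 1)
		reg->pos++;
	return val;
}

/* Poll until the value is non-zero or the read budget runs out.
 * A negative value (read error) also ends the loop, since it is != 0.
 */
static int sketch_wait_fw_id(struct sketch_reg *reg, int max_reads)
{
	int val = 0;
	int i;

	for (i = 0; i < max_reads; i++) {
		val = sketch_read(reg);
		if (val != 0)
			break;
	}

	if (val < 0) {
		/* Genuine read error: report it */
		fprintf(stderr, "failed to read firmware ID: %d\n", val);
		return val;
	}

	/* A timeout is an expected outcome here, so it is returned silently */
	return val != 0 ? 0 : -ETIMEDOUT;
}
```

The caller can then treat -ETIMEDOUT as "no firmware seen" without a
misleading error line in the log.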
Fixes: ad649a1fac37 ("net: phy: aquantia: wait for FW reset before checking the vendor ID")
Reported-by: Clark Wang <xiaoning.wang@nxp.com>
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Closes: https://lore.kernel.org/netdev/8ac00a45-ac61-41b4-9f74-d18157b8b6bf@nvidia.com/
Reported-by: Hans-Frieder Vogt <hfdevel@gmx.net>
Closes: https://lore.kernel.org/netdev/c7c1a3ae-be97-4929-8d89-04c8aa870209@gmx.net/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
Only compile-tested. However, my timeout timer expired waiting for
reactions on the thread with Bartosz' original patch, and Hans-Frieder
Vogt wrote a message in his cover letter implying that the patch fixes
the issue for him. Any Tested-by: tags are welcome.
drivers/net/phy/aquantia/aquantia_firmware.c | 42 +++++++++++---------
drivers/net/phy/aquantia/aquantia_main.c | 19 +++++++--
2 files changed, 39 insertions(+), 22 deletions(-)
diff --git a/drivers/net/phy/aquantia/aquantia_firmware.c b/drivers/net/phy/aquantia/aquantia_firmware.c
index 524627a36c6f..dac6464b5fe2 100644
--- a/drivers/net/phy/aquantia/aquantia_firmware.c
+++ b/drivers/net/phy/aquantia/aquantia_firmware.c
@@ -353,26 +353,32 @@ int aqr_firmware_load(struct phy_device *phydev)
 {
 	int ret;

-	ret = aqr_wait_reset_complete(phydev);
-	if (ret)
-		return ret;
-
-	/* Check if the firmware is not already loaded by pooling
-	 * the current version returned by the PHY. If 0 is returned,
-	 * no firmware is loaded.
+	/* Check if the firmware is not already loaded by polling
+	 * the current version returned by the PHY.
 	 */
-	ret = phy_read_mmd(phydev, MDIO_MMD_VEND1, VEND1_GLOBAL_FW_ID);
-	if (ret > 0)
-		goto exit;
-
-	ret = aqr_firmware_load_nvmem(phydev);
-	if (!ret)
-		goto exit;
-
-	ret = aqr_firmware_load_fs(phydev);
-	if (ret)
+	ret = aqr_wait_reset_complete(phydev);
+	switch (ret) {
+	case 0:
+		/* Some firmware is loaded => do nothing */
+		return 0;
+	case -ETIMEDOUT:
+		/* VEND1_GLOBAL_FW_ID still reads 0 after 2 seconds of polling.
+		 * We don't have full confidence that no firmware is loaded (in
+		 * theory it might just not have loaded yet), but we will
+		 * assume that, and load a new image.
+		 */
+		ret = aqr_firmware_load_nvmem(phydev);
+		if (!ret)
+			return ret;
+
+		ret = aqr_firmware_load_fs(phydev);
+		if (ret)
+			return ret;
+		break;
+	default:
+		/* PHY read error, propagate it to the caller */
 		return ret;
+	}

-exit:
 	return 0;
 }
diff --git a/drivers/net/phy/aquantia/aquantia_main.c b/drivers/net/phy/aquantia/aquantia_main.c
index e982e9ce44a5..57b8b8f400fd 100644
--- a/drivers/net/phy/aquantia/aquantia_main.c
+++ b/drivers/net/phy/aquantia/aquantia_main.c
@@ -435,6 +435,9 @@ static int aqr107_set_tunable(struct phy_device *phydev,
 	}
 }

+#define AQR_FW_WAIT_SLEEP_US 20000
+#define AQR_FW_WAIT_TIMEOUT_US 2000000
+
 /* If we configure settings whilst firmware is still initializing the chip,
  * then these settings may be overwritten. Therefore make sure chip
  * initialization has completed. Use presence of the firmware ID as
@@ -444,11 +447,19 @@ static int aqr107_set_tunable(struct phy_device *phydev,
  */
 int aqr_wait_reset_complete(struct phy_device *phydev)
 {
-	int val;
+	int ret, val;
+
+	ret = read_poll_timeout(phy_read_mmd, val, val != 0,
+				AQR_FW_WAIT_SLEEP_US, AQR_FW_WAIT_TIMEOUT_US,
+				false, phydev, MDIO_MMD_VEND1,
+				VEND1_GLOBAL_FW_ID);
+	if (val < 0) {
+		phydev_err(phydev, "Failed to read VEND1_GLOBAL_FW_ID: %pe\n",
+			   ERR_PTR(val));
+		return val;
+	}

-	return phy_read_mmd_poll_timeout(phydev, MDIO_MMD_VEND1,
-					 VEND1_GLOBAL_FW_ID, val, val != 0,
-					 20000, 2000000, false);
+	return ret;
 }

 static void aqr107_chip_info(struct phy_device *phydev)
--
2.34.1
* Re: [PATCH net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
From: Bartosz Golaszewski @ 2024-09-13 13:18 UTC
To: Vladimir Oltean
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Andrew Lunn, Heiner Kallweit, Russell King,
Christian Marangi, Clark Wang, Jon Hunter, Hans-Frieder Vogt
Still works on sa8775p-ride v3
Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
* Re: [PATCH net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
From: Vladimir Oltean @ 2024-09-13 13:21 UTC
To: Bartosz Golaszewski
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Andrew Lunn, Heiner Kallweit, Russell King,
Christian Marangi, Clark Wang, Jon Hunter, Hans-Frieder Vogt
On Fri, Sep 13, 2024 at 03:18:42PM +0200, Bartosz Golaszewski wrote:
> Still works on sa8775p-ride v3
>
> Tested-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Thanks for testing, I appreciate it.
* Re: [PATCH net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
From: Hans-Frieder Vogt @ 2024-09-14 13:16 UTC
To: Vladimir Oltean, netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Andrew Lunn, Heiner Kallweit, Russell King, Bartosz Golaszewski,
Christian Marangi, Clark Wang, Jon Hunter
Tested-by: Hans-Frieder Vogt <hfdevel@gmx.net>
Hans
* Re: [PATCH net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
From: patchwork-bot+netdevbpf @ 2024-09-19 11:00 UTC
To: Vladimir Oltean
Cc: netdev, davem, edumazet, kuba, pabeni, andrew, hkallweit1, linux,
bartosz.golaszewski, ansuelsmth, xiaoning.wang, jonathanh,
hfdevel
Hello:
This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Fri, 13 Sep 2024 15:12:30 +0300 you wrote:
> The author of the blamed commit apparently did not notice something
> about aqr_wait_reset_complete(): it polls the exact same register -
> MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID - as aqr_firmware_load().
>
> Thus, the entire logic after the introduction of aqr_wait_reset_complete() is
> now completely side-stepped, because if aqr_wait_reset_complete()
> succeeds, MDIO_MMD_VEND1:VEND1_GLOBAL_FW_ID could have only been a
> non-zero value. The handling of the case where the register reads as 0
> is dead code, due to the previous -ETIMEDOUT having stopped execution
> and returning a fatal error to the caller. We never attempt to load
> new firmware if no firmware is present.
>
> [...]
Here is the summary with links:
- [net] net: phy: aquantia: fix -ETIMEDOUT PHY probe failure when firmware not present
https://git.kernel.org/netdev/net/c/194ef9d0de90
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html