netdev.vger.kernel.org archive mirror
* [PATCH net] net: atlantic: fix warning during hot unplug
@ 2025-02-02 22:09 Jacob Moroni
  2025-02-03  8:55 ` [EXTERNAL] " Igor Russkikh
  2025-02-03 10:02 ` Simon Horman
  0 siblings, 2 replies; 7+ messages in thread
From: Jacob Moroni @ 2025-02-02 22:09 UTC
  To: Igor Russkikh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: Jacob Moroni, netdev, linux-kernel

Firmware deinitialization performs MMIO accesses which are not
necessary if the device has already been removed. In some cases,
these accesses happen via readx_poll_timeout_atomic which ends up
timing out, resulting in a warning at hw_atl2_utils_fw.c:112:

[  104.595913] Call Trace:
[  104.595915]  <TASK>
[  104.595918]  ? show_regs+0x6c/0x80
[  104.595923]  ? __warn+0x8d/0x150
[  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595934]  ? report_bug+0x182/0x1b0
[  104.595938]  ? handle_bug+0x6e/0xb0
[  104.595940]  ? exc_invalid_op+0x18/0x80
[  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
[  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
[  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
[  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
[  104.595975]  __dev_close_many+0xad/0x160
[  104.595978]  dev_close_many+0x99/0x170
[  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
[  104.595981]  ? __call_rcu_common+0xcd/0x700
[  104.595984]  unregister_netdevice_queue+0xc6/0x110
[  104.595986]  unregister_netdev+0x1c/0x30
[  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]

Fix this by skipping firmware deinitialization altogether if the
PCI device is no longer present.

Tested with an AQC113 attached via Thunderbolt by performing
repeated unplug cycles while traffic was running via iperf.

Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index fe0e3e2a8117..e2ae95a01947 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -1428,7 +1428,7 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
 	unsigned int i = 0U;
 
 	if (!self)
-		goto err_exit;
+		return;
 
 	for (i = 0U; i < self->aq_vecs; i++) {
 		aq_vec = self->aq_vec[i];
@@ -1441,13 +1441,14 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
 	aq_ptp_ring_free(self);
 	aq_ptp_free(self);
 
-	if (likely(self->aq_fw_ops->deinit) && link_down) {
-		mutex_lock(&self->fwreq_mutex);
-		self->aq_fw_ops->deinit(self->aq_hw);
-		mutex_unlock(&self->fwreq_mutex);
+	/* May be invoked during hot unplug. */
+	if (pci_device_is_present(self->pdev)) {
+		if (likely(self->aq_fw_ops->deinit) && link_down) {
+			mutex_lock(&self->fwreq_mutex);
+			self->aq_fw_ops->deinit(self->aq_hw);
+			mutex_unlock(&self->fwreq_mutex);
+		}
 	}
-
-err_exit:;
 }
 
 void aq_nic_free_vectors(struct aq_nic_s *self)
-- 
2.43.0
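
For illustration only (not part of the patch): a minimal user-space sketch
of why the poll in the commit message times out after unplug, and why a
presence check avoids it. It assumes, as is typical for PCIe, that reads
from a removed device return all ones. struct fake_dev, fake_readl(), the
choice of bit 0 as a "busy" flag, and the retry budget are all hypothetical
and do not mirror the atlantic driver's registers or firmware interface.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a NIC; not the atlantic driver's real types. */
struct fake_dev {
	bool present;		/* false once the device has been unplugged */
	uint32_t status;	/* register value while the device is present */
	unsigned int reads;	/* number of simulated MMIO reads issued */
};

/* Simulated MMIO read: assume a removed device reads back as all ones. */
static uint32_t fake_readl(struct fake_dev *dev)
{
	dev->reads++;
	return dev->present ? dev->status : 0xFFFFFFFFu;
}

/*
 * Poll until the (invented) busy bit clears or the retry budget runs out.
 * Returns 0 on success, -1 on timeout.
 */
static int fake_poll_not_busy(struct fake_dev *dev, unsigned int max_tries)
{
	for (unsigned int i = 0; i < max_tries; i++) {
		if ((fake_readl(dev) & 0x1u) == 0)
			return 0;
	}
	return -1;
}

/* Deinit guarded by a presence check: issue no reads if the device is gone. */
static int fake_fw_deinit(struct fake_dev *dev, unsigned int max_tries)
{
	if (!dev->present)
		return 0;
	return fake_poll_not_busy(dev, max_tries);
}
```

With the device marked absent, fake_poll_not_busy() burns its whole retry
budget and fails, because the all-ones value never shows the bit as clear.
fake_fw_deinit() returns without touching the device at all.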



* RE: [EXTERNAL] [PATCH net] net: atlantic: fix warning during hot unplug
  2025-02-02 22:09 [PATCH net] net: atlantic: fix warning during hot unplug Jacob Moroni
@ 2025-02-03  8:55 ` Igor Russkikh
  2025-02-03 10:02 ` Simon Horman
  1 sibling, 0 replies; 7+ messages in thread
From: Igor Russkikh @ 2025-02-03  8:55 UTC
  To: Jacob Moroni, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org


> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:

Hi Jacob,

Makes sense, thanks!

Reviewed-by: Igor Russkikh <irusskikh@marvell.com>

Igor


* Re: [PATCH net] net: atlantic: fix warning during hot unplug
  2025-02-02 22:09 [PATCH net] net: atlantic: fix warning during hot unplug Jacob Moroni
  2025-02-03  8:55 ` [EXTERNAL] " Igor Russkikh
@ 2025-02-03 10:02 ` Simon Horman
  2025-02-03 14:34   ` Jacob S. Moroni
  2025-02-03 14:36   ` [PATCH net v2] " Jacob Moroni
  1 sibling, 2 replies; 7+ messages in thread
From: Simon Horman @ 2025-02-03 10:02 UTC
  To: Jacob Moroni
  Cc: Igor Russkikh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

On Sun, Feb 02, 2025 at 05:09:21PM -0500, Jacob Moroni wrote:
> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:
> 
> [  104.595913] Call Trace:
> [  104.595915]  <TASK>
> [  104.595918]  ? show_regs+0x6c/0x80
> [  104.595923]  ? __warn+0x8d/0x150
> [  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595934]  ? report_bug+0x182/0x1b0
> [  104.595938]  ? handle_bug+0x6e/0xb0
> [  104.595940]  ? exc_invalid_op+0x18/0x80
> [  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
> [  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
> [  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
> [  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
> [  104.595975]  __dev_close_many+0xad/0x160
> [  104.595978]  dev_close_many+0x99/0x170
> [  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
> [  104.595981]  ? __call_rcu_common+0xcd/0x700
> [  104.595984]  unregister_netdevice_queue+0xc6/0x110
> [  104.595986]  unregister_netdev+0x1c/0x30
> [  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]
> 
> Fix this by skipping firmware deinitialization altogether if the
> PCI device is no longer present.
> 
> Tested with an AQC113 attached via Thunderbolt by performing
> repeated unplug cycles while traffic was running via iperf.
> 

Hi Jacob,

As a fix for net, a Fixes tag should go here
(immediately before your signed-off-by line, no blank line in between).

I'm wondering if this one is appropriate: the problem seems
to go all the way back to here.

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")

> Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
> ---
>  drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> index fe0e3e2a8117..e2ae95a01947 100644
> --- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> +++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
> @@ -1428,7 +1428,7 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
>  	unsigned int i = 0U;
>  
>  	if (!self)
> -		goto err_exit;
> +		return;
>  
>  	for (i = 0U; i < self->aq_vecs; i++) {
>  		aq_vec = self->aq_vec[i];

This hunk, and the removal of the err_exit label, seem to be more
clean-up than addressing the bug described in the patch description.
I don't think they belong in this patch. But could be candidates for
a follow-up patch targeted at net-next.

> @@ -1441,13 +1441,14 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
>  	aq_ptp_ring_free(self);
>  	aq_ptp_free(self);
>  
> -	if (likely(self->aq_fw_ops->deinit) && link_down) {
> -		mutex_lock(&self->fwreq_mutex);
> -		self->aq_fw_ops->deinit(self->aq_hw);
> -		mutex_unlock(&self->fwreq_mutex);
> +	/* May be invoked during hot unplug. */
> +	if (pci_device_is_present(self->pdev)) {
> +		if (likely(self->aq_fw_ops->deinit) && link_down) {

Maybe not important, but I would have written this as a single if
condition rather than two.

Also, not really appropriate to change in this patch as it's not part
of the bug, but I'm not sure that likely() is appropriate here:
is this a fast path?

> +			mutex_lock(&self->fwreq_mutex);
> +			self->aq_fw_ops->deinit(self->aq_hw);
> +			mutex_unlock(&self->fwreq_mutex);
> +		}
>  	}
> -
> -err_exit:;
>  }
>  
>  void aq_nic_free_vectors(struct aq_nic_s *self)

-- 
pw-bot: changes-requested
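
For illustration only: a user-space sketch of the single-condition form
suggested in the review above. It relies on C's short-circuit evaluation
of &&, so when the presence check fails the lock is never taken and the
callback is never invoked. struct mock_nic and its fields are hypothetical
stand-ins: pci_present models the result of pci_device_is_present(),
fw_deinit models self->aq_fw_ops->deinit, and lock_depth models holding
fwreq_mutex.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock; field names are illustrative, not the driver's. */
struct mock_nic {
	bool pci_present;			/* models pci_device_is_present() */
	void (*fw_deinit)(struct mock_nic *);	/* models aq_fw_ops->deinit */
	int lock_depth;				/* models holding fwreq_mutex */
	int deinit_calls;			/* how often fw_deinit actually ran */
};

static void mock_fw_deinit(struct mock_nic *nic)
{
	nic->deinit_calls++;
}

static void mock_nic_deinit(struct mock_nic *nic, bool link_down)
{
	/* One condition; evaluation stops at the first false operand. */
	if (nic->pci_present && nic->fw_deinit && link_down) {
		nic->lock_depth++;	/* mutex_lock() in the real code */
		nic->fw_deinit(nic);
		nic->lock_depth--;	/* mutex_unlock() in the real code */
	}
}
```

Putting the presence check first means an absent device costs one test
and nothing else. The nested form in v1 behaves the same way; the
difference is readability only.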


* Re: [PATCH net] net: atlantic: fix warning during hot unplug
  2025-02-03 10:02 ` Simon Horman
@ 2025-02-03 14:34   ` Jacob S. Moroni
  2025-02-03 14:36   ` [PATCH net v2] " Jacob Moroni
  1 sibling, 0 replies; 7+ messages in thread
From: Jacob S. Moroni @ 2025-02-03 14:34 UTC
  To: Simon Horman
  Cc: Igor Russkikh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, netdev, linux-kernel

Hello,

Thanks for the feedback.

> This hunk, and the removal of the err_exit label, seem to be more
> clean-up than addressing the bug described in the patch description.
> I don't think they belong in this patch. But could be candidates for
> a follow-up patch targeted at net-next.

Makes sense. I'll send some follow-up patches to clean these up.

Thanks,
Jake


* [PATCH net v2] net: atlantic: fix warning during hot unplug
  2025-02-03 10:02 ` Simon Horman
  2025-02-03 14:34   ` Jacob S. Moroni
@ 2025-02-03 14:36   ` Jacob Moroni
  2025-02-04 10:54     ` Simon Horman
  2025-02-04 22:20     ` patchwork-bot+netdevbpf
  1 sibling, 2 replies; 7+ messages in thread
From: Jacob Moroni @ 2025-02-03 14:36 UTC
  To: Igor Russkikh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Pavel Belous, Alexander Loktionov,
	Dmitrii Tarakanov, David VomLehn, Dmitry Bezrukov
  Cc: Jacob Moroni, netdev, linux-kernel

Firmware deinitialization performs MMIO accesses which are not
necessary if the device has already been removed. In some cases,
these accesses happen via readx_poll_timeout_atomic which ends up
timing out, resulting in a warning at hw_atl2_utils_fw.c:112:

[  104.595913] Call Trace:
[  104.595915]  <TASK>
[  104.595918]  ? show_regs+0x6c/0x80
[  104.595923]  ? __warn+0x8d/0x150
[  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595934]  ? report_bug+0x182/0x1b0
[  104.595938]  ? handle_bug+0x6e/0xb0
[  104.595940]  ? exc_invalid_op+0x18/0x80
[  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
[  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
[  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
[  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
[  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
[  104.595975]  __dev_close_many+0xad/0x160
[  104.595978]  dev_close_many+0x99/0x170
[  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
[  104.595981]  ? __call_rcu_common+0xcd/0x700
[  104.595984]  unregister_netdevice_queue+0xc6/0x110
[  104.595986]  unregister_netdev+0x1c/0x30
[  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]

Fix this by skipping firmware deinitialization altogether if the
PCI device is no longer present.

Tested with an AQC113 attached via Thunderbolt by performing
repeated unplug cycles while traffic was running via iperf.

Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
Reviewed-by: Igor Russkikh <irusskikh@marvell.com>
---
 drivers/net/ethernet/aquantia/atlantic/aq_nic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index fe0e3e2a8117..71e50fc65c14 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -1441,7 +1441,9 @@ void aq_nic_deinit(struct aq_nic_s *self, bool link_down)
 	aq_ptp_ring_free(self);
 	aq_ptp_free(self);
 
-	if (likely(self->aq_fw_ops->deinit) && link_down) {
+	/* May be invoked during hot unplug. */
+	if (pci_device_is_present(self->pdev) &&
+	    likely(self->aq_fw_ops->deinit) && link_down) {
 		mutex_lock(&self->fwreq_mutex);
 		self->aq_fw_ops->deinit(self->aq_hw);
 		mutex_unlock(&self->fwreq_mutex);
-- 
2.43.0



* Re: [PATCH net v2] net: atlantic: fix warning during hot unplug
  2025-02-03 14:36   ` [PATCH net v2] " Jacob Moroni
@ 2025-02-04 10:54     ` Simon Horman
  2025-02-04 22:20     ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 7+ messages in thread
From: Simon Horman @ 2025-02-04 10:54 UTC
  To: Jacob Moroni
  Cc: Igor Russkikh, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Pavel Belous, Alexander Loktionov,
	Dmitrii Tarakanov, David VomLehn, Dmitry Bezrukov, netdev,
	linux-kernel

On Mon, Feb 03, 2025 at 09:36:05AM -0500, Jacob Moroni wrote:
> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:
> 
> [  104.595913] Call Trace:
> [  104.595915]  <TASK>
> [  104.595918]  ? show_regs+0x6c/0x80
> [  104.595923]  ? __warn+0x8d/0x150
> [  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595934]  ? report_bug+0x182/0x1b0
> [  104.595938]  ? handle_bug+0x6e/0xb0
> [  104.595940]  ? exc_invalid_op+0x18/0x80
> [  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
> [  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
> [  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
> [  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
> [  104.595975]  __dev_close_many+0xad/0x160
> [  104.595978]  dev_close_many+0x99/0x170
> [  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
> [  104.595981]  ? __call_rcu_common+0xcd/0x700
> [  104.595984]  unregister_netdevice_queue+0xc6/0x110
> [  104.595986]  unregister_netdev+0x1c/0x30
> [  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]
> 
> Fix this by skipping firmware deinitialization altogether if the
> PCI device is no longer present.
> 
> Tested with an AQC113 attached via Thunderbolt by performing
> repeated unplug cycles while traffic was running via iperf.
> 
> Fixes: 97bde5c4f909 ("net: ethernet: aquantia: Support for NIC-specific code")
> Signed-off-by: Jacob Moroni <mail@jakemoroni.com>
> Reviewed-by: Igor Russkikh <irusskikh@marvell.com>

Thanks for addressing my review of v1.

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net v2] net: atlantic: fix warning during hot unplug
  2025-02-03 14:36   ` [PATCH net v2] " Jacob Moroni
  2025-02-04 10:54     ` Simon Horman
@ 2025-02-04 22:20     ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 7+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-02-04 22:20 UTC
  To: Jacob Moroni
  Cc: irusskikh, andrew+netdev, davem, edumazet, kuba, pabeni,
	Pavel.Belous, Alexander.Loktionov, Dmitrii.Tarakanov, vomlehn,
	Dmitry.Bezrukov, netdev, linux-kernel

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon,  3 Feb 2025 09:36:05 -0500 you wrote:
> Firmware deinitialization performs MMIO accesses which are not
> necessary if the device has already been removed. In some cases,
> these accesses happen via readx_poll_timeout_atomic which ends up
> timing out, resulting in a warning at hw_atl2_utils_fw.c:112:
> 
> [  104.595913] Call Trace:
> [  104.595915]  <TASK>
> [  104.595918]  ? show_regs+0x6c/0x80
> [  104.595923]  ? __warn+0x8d/0x150
> [  104.595925]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595934]  ? report_bug+0x182/0x1b0
> [  104.595938]  ? handle_bug+0x6e/0xb0
> [  104.595940]  ? exc_invalid_op+0x18/0x80
> [  104.595942]  ? asm_exc_invalid_op+0x1b/0x20
> [  104.595944]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595952]  ? aq_a2_fw_deinit+0xcf/0xe0 [atlantic]
> [  104.595959]  aq_nic_deinit.part.0+0xbd/0xf0 [atlantic]
> [  104.595964]  aq_nic_deinit+0x17/0x30 [atlantic]
> [  104.595970]  aq_ndev_close+0x2b/0x40 [atlantic]
> [  104.595975]  __dev_close_many+0xad/0x160
> [  104.595978]  dev_close_many+0x99/0x170
> [  104.595979]  unregister_netdevice_many_notify+0x18b/0xb20
> [  104.595981]  ? __call_rcu_common+0xcd/0x700
> [  104.595984]  unregister_netdevice_queue+0xc6/0x110
> [  104.595986]  unregister_netdev+0x1c/0x30
> [  104.595988]  aq_pci_remove+0xb1/0xc0 [atlantic]
> 
> [...]

Here is the summary with links:
  - [net,v2] net: atlantic: fix warning during hot unplug
    https://git.kernel.org/netdev/net/c/028676bb189e

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


