public inbox for netdev@vger.kernel.org
* [PATCH net] net: mana: Fix double destroy_workqueue on service rescan PCI path
From: Dipayaan Roy @ 2026-02-24 12:38 UTC
  To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, longli, kotaranov, horms, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, dipayanroy

While testing corner cases in the driver, a use-after-free crash
was found on the service rescan PCI path.

When mana_serv_reset() calls mana_gd_suspend(), mana_gd_cleanup()
destroys gc->service_wq. If the subsequent mana_gd_resume() fails
with -ETIMEDOUT or -EPROTO, the code falls through to
mana_serv_rescan(), which triggers pci_stop_and_remove_bus_device().
This invokes the PCI .remove callback (mana_gd_remove), which calls
mana_gd_cleanup() a second time and attempts to destroy the
already-freed workqueue.

Fix this by NULL-checking gc->service_wq in mana_gd_cleanup() and
setting it to NULL after destruction. Also NULL-check gc->service_wq
before calling flush_workqueue() in mana_rdma_remove(), since the
pointer may now be NULL on this path.

Call trace of the crash, for reference:
[Sat Feb 21 18:53:48 2026] Call Trace:
[Sat Feb 21 18:53:48 2026]  <TASK>
[Sat Feb 21 18:53:48 2026]  mana_gd_cleanup+0x33/0x70 [mana]
[Sat Feb 21 18:53:48 2026]  mana_gd_remove+0x3a/0xc0 [mana]
[Sat Feb 21 18:53:48 2026]  pci_device_remove+0x41/0xb0
[Sat Feb 21 18:53:48 2026]  device_remove+0x46/0x70
[Sat Feb 21 18:53:48 2026]  device_release_driver_internal+0x1e3/0x250
[Sat Feb 21 18:53:48 2026]  device_release_driver+0x12/0x20
[Sat Feb 21 18:53:48 2026]  pci_stop_bus_device+0x6a/0x90
[Sat Feb 21 18:53:48 2026]  pci_stop_and_remove_bus_device+0x13/0x30
[Sat Feb 21 18:53:48 2026]  mana_do_service+0x180/0x290 [mana]
[Sat Feb 21 18:53:48 2026]  mana_serv_func+0x24/0x50 [mana]
[Sat Feb 21 18:53:48 2026]  process_one_work+0x190/0x3d0
[Sat Feb 21 18:53:48 2026]  worker_thread+0x16e/0x2e0
[Sat Feb 21 18:53:48 2026]  kthread+0xf7/0x130
[Sat Feb 21 18:53:48 2026]  ? __pfx_worker_thread+0x10/0x10
[Sat Feb 21 18:53:48 2026]  ? __pfx_kthread+0x10/0x10
[Sat Feb 21 18:53:48 2026]  ret_from_fork+0x269/0x350
[Sat Feb 21 18:53:48 2026]  ? __pfx_kthread+0x10/0x10
[Sat Feb 21 18:53:48 2026]  ret_from_fork_asm+0x1a/0x30
[Sat Feb 21 18:53:48 2026]  </TASK>

Fixes: 505cc26bcae0 ("net: mana: Add support for auxiliary device servicing events")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
 drivers/net/ethernet/microsoft/mana/gdma_main.c | 5 ++++-
 drivers/net/ethernet/microsoft/mana/mana_en.c   | 4 +++-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 0055c231acf6..3926d18f1840 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1946,7 +1946,10 @@ static void mana_gd_cleanup(struct pci_dev *pdev)
 
 	mana_gd_remove_irqs(pdev);
 
-	destroy_workqueue(gc->service_wq);
+	if (gc->service_wq) {
+		destroy_workqueue(gc->service_wq);
+		gc->service_wq = NULL;
+	}
 	dev_dbg(&pdev->dev, "mana gdma cleanup successful\n");
 }
 
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 9b5a72ada5c4..f69e42651359 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3762,7 +3762,9 @@ void mana_rdma_remove(struct gdma_dev *gd)
 	}
 
 	WRITE_ONCE(gd->rdma_teardown, true);
-	flush_workqueue(gc->service_wq);
+
+	if (gc->service_wq)
+		flush_workqueue(gc->service_wq);
 
 	if (gd->adev)
 		remove_adev(gd);
-- 
2.43.0
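
For readers outside the driver, the pattern the patch applies can be
sketched in userspace C. This is only an illustration under stated
assumptions: struct workqueue, struct ctx, wq_destroy(), ctx_cleanup(),
ctx_flush() and simulate_rescan() are hypothetical stand-ins invented
here, not the mana driver's types or the kernel workqueue API. The
sketch shows a cleanup routine made safe to call twice by clearing the
pointer after destruction, plus a flush that is skipped when the queue
is already gone.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the kernel or mana definitions. */
struct workqueue {
	int pending;
};

struct ctx {
	struct workqueue *service_wq;
};

/* Counts how often the queue is actually destroyed. */
static int destroy_calls;

static void wq_destroy(struct workqueue *wq)
{
	destroy_calls++;
	free(wq);
}

/* Idempotent cleanup: destroy once, then clear the pointer. */
static void ctx_cleanup(struct ctx *c)
{
	if (c->service_wq) {
		wq_destroy(c->service_wq);
		c->service_wq = NULL;
	}
}

/* Guarded flush: returns 1 if a flush happened, 0 if it was skipped. */
static int ctx_flush(struct ctx *c)
{
	if (!c->service_wq)
		return 0;
	c->service_wq->pending = 0;
	return 1;
}

/*
 * Models the sequence from the commit message: cleanup on suspend,
 * a failed resume that never recreates the queue, then the remove
 * path flushing and cleaning up again. Returns the number of real
 * destroy calls, or -1 if the allocation failed.
 */
int simulate_rescan(void)
{
	struct ctx c = { .service_wq = calloc(1, sizeof(struct workqueue)) };
	int flushed;

	if (!c.service_wq)
		return -1;

	destroy_calls = 0;
	ctx_cleanup(&c);        /* suspend path: queue destroyed */
	flushed = ctx_flush(&c); /* remove path: flush is skipped */
	ctx_cleanup(&c);        /* remove path: second cleanup is a no-op */

	assert(flushed == 0);
	assert(c.service_wq == NULL);
	return destroy_calls;
}
```

In this model the second ctx_cleanup() call does nothing because the
pointer was cleared by the first one, so the queue is freed exactly
once. Without the NULL check and the pointer reset, the second call
would free the same memory again.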



* Re: [PATCH net] net: mana: Fix double destroy_workqueue on service rescan PCI path
From: Simon Horman @ 2026-02-25 13:19 UTC
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, longli, kotaranov, shradhagupta, ssengar, ernis,
	shirazsaleem, linux-hyperv, netdev, linux-kernel, linux-rdma,
	dipayanroy, Leon Romanovsky

+ Leon

On Tue, Feb 24, 2026 at 04:38:36AM -0800, Dipayaan Roy wrote:
> While testing corner cases in the driver, a use-after-free crash
> was found on the service rescan PCI path.
> 
> When mana_serv_reset() calls mana_gd_suspend(), mana_gd_cleanup()
> destroys gc->service_wq. If the subsequent mana_gd_resume() fails
> with -ETIMEDOUT or -EPROTO, the code falls through to
> mana_serv_rescan() which triggers pci_stop_and_remove_bus_device().
> This invokes the PCI .remove callback (mana_gd_remove), which calls
> mana_gd_cleanup() a second time, attempting to destroy the already-
> freed workqueue. Fix this by NULL-checking gc->service_wq in
> mana_gd_cleanup() and setting it to NULL after destruction.
> 
> Call stack of issue for reference:
> [Sat Feb 21 18:53:48 2026] Call Trace:
> [Sat Feb 21 18:53:48 2026]  <TASK>
> [Sat Feb 21 18:53:48 2026]  mana_gd_cleanup+0x33/0x70 [mana]
> [Sat Feb 21 18:53:48 2026]  mana_gd_remove+0x3a/0xc0 [mana]
> [Sat Feb 21 18:53:48 2026]  pci_device_remove+0x41/0xb0
> [Sat Feb 21 18:53:48 2026]  device_remove+0x46/0x70
> [Sat Feb 21 18:53:48 2026]  device_release_driver_internal+0x1e3/0x250
> [Sat Feb 21 18:53:48 2026]  device_release_driver+0x12/0x20
> [Sat Feb 21 18:53:48 2026]  pci_stop_bus_device+0x6a/0x90
> [Sat Feb 21 18:53:48 2026]  pci_stop_and_remove_bus_device+0x13/0x30
> [Sat Feb 21 18:53:48 2026]  mana_do_service+0x180/0x290 [mana]
> [Sat Feb 21 18:53:48 2026]  mana_serv_func+0x24/0x50 [mana]
> [Sat Feb 21 18:53:48 2026]  process_one_work+0x190/0x3d0
> [Sat Feb 21 18:53:48 2026]  worker_thread+0x16e/0x2e0
> [Sat Feb 21 18:53:48 2026]  kthread+0xf7/0x130
> [Sat Feb 21 18:53:48 2026]  ? __pfx_worker_thread+0x10/0x10
> [Sat Feb 21 18:53:48 2026]  ? __pfx_kthread+0x10/0x10
> [Sat Feb 21 18:53:48 2026]  ret_from_fork+0x269/0x350
> [Sat Feb 21 18:53:48 2026]  ? __pfx_kthread+0x10/0x10
> [Sat Feb 21 18:53:48 2026]  ret_from_fork_asm+0x1a/0x30
> [Sat Feb 21 18:53:48 2026]  </TASK>
> 
> Fixes: 505cc26bcae0 ("net: mana: Add support for auxiliary device servicing events")
> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>



* Re: [PATCH net] net: mana: Fix double destroy_workqueue on service rescan PCI path
From: patchwork-bot+netdevbpf @ 2026-02-26  3:20 UTC
  To: Dipayaan Roy
  Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
	kuba, pabeni, longli, kotaranov, horms, shradhagupta, ssengar,
	ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
	linux-rdma, dipayanroy

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 24 Feb 2026 04:38:36 -0800 you wrote:
> While testing corner cases in the driver, a use-after-free crash
> was found on the service rescan PCI path.
> 
> When mana_serv_reset() calls mana_gd_suspend(), mana_gd_cleanup()
> destroys gc->service_wq. If the subsequent mana_gd_resume() fails
> with -ETIMEDOUT or -EPROTO, the code falls through to
> mana_serv_rescan() which triggers pci_stop_and_remove_bus_device().
> This invokes the PCI .remove callback (mana_gd_remove), which calls
> mana_gd_cleanup() a second time, attempting to destroy the already-
> freed workqueue. Fix this by NULL-checking gc->service_wq in
> mana_gd_cleanup() and setting it to NULL after destruction.
> 
> [...]

Here is the summary with links:
  - [net] net: mana: Fix double destroy_workqueue on service rescan PCI path
    https://git.kernel.org/netdev/net/c/f975a0955276

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


