netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v3 1/1] mlxbf_gige: Fix kernel panic at shutdown
@ 2023-07-20 20:56 Asmaa Mnebhi
  2023-07-20 21:11 ` Asmaa Mnebhi
  0 siblings, 1 reply; 3+ messages in thread
From: Asmaa Mnebhi @ 2023-07-20 20:56 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni
  Cc: Asmaa Mnebhi, netdev, sridhar.samudrala, maciej.fijalkowski,
	olteanv, davthompson

There is a race condition happening during shutdown due to pending
napi transactions. Since mlxbf_gige_poll is still running, it tries
to access a NULL pointer and as a result causes a kernel panic.
To fix this during shutdown, invoke mlxbf_gige_remove to disable and
dequeue napi.

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
---
v2-v3:
 - remove mlxbf_gige_shutdown() since it is redundant with mlxbf_gige_remove().
v1->v2:
 - fixed the tag from net-next to net
 - check that the "priv" pointer is not NULL in mlxbf_gige_remove()

 .../ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c  | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 694de9513b9f..31b84d6417de 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -475,6 +475,9 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
 {
 	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
 
+	if (!priv)
+		return 0;
+
 	unregister_netdev(priv->netdev);
 	phy_disconnect(priv->netdev->phydev);
 	mlxbf_gige_mdio_remove(priv);
@@ -483,14 +486,6 @@ static int mlxbf_gige_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static void mlxbf_gige_shutdown(struct platform_device *pdev)
-{
-	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
-
-	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
-	mlxbf_gige_clean_port(priv);
-}
-
 static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {
 	{ "MLNXBF17", 0 },
 	{},
@@ -500,7 +495,7 @@ MODULE_DEVICE_TABLE(acpi, mlxbf_gige_acpi_match);
 static struct platform_driver mlxbf_gige_driver = {
 	.probe = mlxbf_gige_probe,
 	.remove = mlxbf_gige_remove,
-	.shutdown = mlxbf_gige_shutdown,
+	.shutdown = mlxbf_gige_remove,
 	.driver = {
 		.name = KBUILD_MODNAME,
 		.acpi_match_table = ACPI_PTR(mlxbf_gige_acpi_match),
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* RE: [PATCH net v3 1/1] mlxbf_gige: Fix kernel panic at shutdown
  2023-07-20 20:56 [PATCH net v3 1/1] mlxbf_gige: Fix kernel panic at shutdown Asmaa Mnebhi
@ 2023-07-20 21:11 ` Asmaa Mnebhi
  2023-07-21  3:21   ` Jakub Kicinski
  0 siblings, 1 reply; 3+ messages in thread
From: Asmaa Mnebhi @ 2023-07-20 21:11 UTC (permalink / raw)
  To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com
  Cc: netdev@vger.kernel.org, sridhar.samudrala@intel.com,
	maciej.fijalkowski@intel.com, olteanv@gmail.com, David Thompson

>  	.probe = mlxbf_gige_probe,
>  	.remove = mlxbf_gige_remove,
> -	.shutdown = mlxbf_gige_shutdown,
> +	.shutdown = mlxbf_gige_remove,

Actually, apologies for this commit in response to Sridhar's comment on v2. This will not work. Mlxbf_gige_remove() returns void while ".shutdown" expects to return an int.
Please advise on how to proceed. Should I send a v4 with the same patch as v2?

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net v3 1/1] mlxbf_gige: Fix kernel panic at shutdown
  2023-07-20 21:11 ` Asmaa Mnebhi
@ 2023-07-21  3:21   ` Jakub Kicinski
  0 siblings, 0 replies; 3+ messages in thread
From: Jakub Kicinski @ 2023-07-21  3:21 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
	netdev@vger.kernel.org, sridhar.samudrala@intel.com,
	maciej.fijalkowski@intel.com, olteanv@gmail.com, David Thompson

On Thu, 20 Jul 2023 21:11:53 +0000 Asmaa Mnebhi wrote:
> >  	.probe = mlxbf_gige_probe,
> >  	.remove = mlxbf_gige_remove,
> > -	.shutdown = mlxbf_gige_shutdown,
> > +	.shutdown = mlxbf_gige_remove,  
> 
> Actually, apologies for this commit in response to Sridhar's comment
> on v2. This will not work. Mlxbf_gige_remove() returns void while
> ".shutdown" expects to return an int. Please advise on how to
> proceed. Should I send a v4 with the same patch as v2?

You can make it work, there is a new version of remove upstream:

https://elixir.bootlin.com/linux/v6.5-rc2/source/include/linux/platform_device.h#L220

You can make them both void.
-- 
pw-bot: cr

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-07-21  3:21 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-07-20 20:56 [PATCH net v3 1/1] mlxbf_gige: Fix kernel panic at shutdown Asmaa Mnebhi
2023-07-20 21:11 ` Asmaa Mnebhi
2023-07-21  3:21   ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).