public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs
@ 2026-04-13  5:08 Erni Sri Satya Vennela
  2026-04-13  5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

Fix four pre-existing bugs in mana_probe()/mana_remove() error handling
that can cause warnings on uninitialized work structs, masked errors,
and resource leaks when early probe steps fail.

Patches 1-2 move work struct initialization (link_change_work and
gf_stats_work) to before any error path that could trigger
mana_remove(), preventing WARN_ON in __flush_work() or debug object
warnings when sync cancellation runs on uninitialized work structs.

Patch 3 prevents add_adev() from overwriting a port probe error,
which could leave the driver in a broken state with NULL ports while
reporting success.

Patch 4 changes 'goto out' to 'break' in mana_remove()'s port loop
so that mana_destroy_eq() is always reached, preventing EQ leaks when
a NULL port is encountered.
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
Erni Sri Satya Vennela (4):
  net: mana: Init link_change_work before potential error paths in probe
  net: mana: Init gf_stats_work before potential error paths in probe
  net: mana: Don't overwrite port probe error with add_adev result
  net: mana: Fix EQ leak in mana_remove on NULL port

 drivers/net/ethernet/microsoft/mana/mana_en.c | 28 +++++++++----------
 1 file changed, 14 insertions(+), 14 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
  2026-04-13  5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
@ 2026-04-13  5:08 ` Erni Sri Satya Vennela
  2026-04-14 15:41   ` Simon Horman
  2026-04-13  5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

Move INIT_WORK(link_change_work) to right after the mana_context
allocation, before any error path that could reach mana_remove().

Previously, if mana_create_eq() or mana_query_device_cfg() failed,
mana_probe() would jump to the error path which calls mana_remove().
mana_remove() unconditionally calls disable_work_sync(link_change_work),
but the work struct had not been initialized yet. This can trigger
CONFIG_DEBUG_OBJECTS_WORK enabled.

Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 07630322545f..57f146ea6f66 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,6 +3631,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 		ac->gdma_dev = gd;
 		gd->driver_data = ac;
+
+		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	}
 
 	err = mana_create_eq(ac);
@@ -3648,8 +3650,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 	if (!resuming) {
 		ac->num_ports = num_ports;
-
-		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	} else {
 		if (ac->num_ports != num_ports) {
 			dev_err(dev, "The number of vPorts changed: %d->%d\n",
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
  2026-04-13  5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
  2026-04-13  5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-13  5:08 ` Erni Sri Satya Vennela
  2026-04-14 15:41   ` Simon Horman
  2026-04-13  5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
  2026-04-13  5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
  3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
while keeping schedule_delayed_work() at its original location.

Previously, if any function between mana_create_eq() and the
INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
in __flush_work() or debug object warnings with
CONFIG_DEBUG_OBJECTS_WORK enabled.

Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 57f146ea6f66..f6ad46736418 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3635,6 +3635,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 		INIT_WORK(&ac->link_change_work, mana_link_state_handle);
 	}
 
+	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
 	err = mana_create_eq(ac);
 	if (err) {
 		dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3709,7 +3711,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 
 	err = add_adev(gd, "eth");
 
-	INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
 	schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
 
 out:
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
  2026-04-13  5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
  2026-04-13  5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
  2026-04-13  5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-13  5:08 ` Erni Sri Satya Vennela
  2026-04-14 15:35   ` Simon Horman
  2026-04-13  5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
  3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

In mana_probe(), if mana_probe_port() fails for any port, the error
is stored in 'err' and the loop breaks. However, the subsequent
unconditional 'err = add_adev(gd, "eth")' overwrites this error.
If add_adev() succeeds, mana_probe() returns success despite ports
being left in a partially initialized state (ac->ports[i] == NULL).

Only call add_adev() when there is no prior error, so the probe
correctly fails and triggers mana_remove() cleanup.

Fixes: ced82fce77e9 ("net: mana: Probe rdma device in mana driver")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f6ad46736418..1a141c46ac27 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3680,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 	if (!resuming) {
 		for (i = 0; i < ac->num_ports; i++) {
 			err = mana_probe_port(ac, i, &ac->ports[i]);
-			/* we log the port for which the probe failed and stop
-			 * probes for subsequent ports.
-			 * Note that we keep running ports, for which the probes
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the probe failed, stop probing
+			 * subsequent ports, and skip add_adev.
+			 * Already-probed ports remain functional.
 			 */
 			if (err) {
 				dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3697,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 			enable_work(&apc->queue_reset_work);
 			err = mana_attach(ac->ports[i]);
 			rtnl_unlock();
-			/* we log the port for which the attach failed and stop
-			 * attach for subsequent ports
-			 * Note that we keep running ports, for which the attach
-			 * were successful, unless add_adev fails too
+			/* Log the port for which the attach failed, stop
+			 * attaching subsequent ports, and skip add_adev.
+			 * Already-attached ports remain functional.
 			 */
 			if (err) {
 				dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3709,7 +3707,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
 		}
 	}
 
-	err = add_adev(gd, "eth");
+	if (!err)
+		err = add_adev(gd, "eth");
 
 	schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
 
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
  2026-04-13  5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
                   ` (2 preceding siblings ...)
  2026-04-13  5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-13  5:08 ` Erni Sri Satya Vennela
  2026-04-14 15:40   ` Simon Horman
  3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13  5:08 UTC (permalink / raw)
  To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq().

This can happen when mana_probe_port() fails for port 0, leaving
ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove()
hits the NULL entry and jumps past mana_destroy_eq().

Change 'goto out' to 'break' so the for-loop exits normally and
mana_destroy_eq() is always reached. Remove the now-unreferenced out:
label.

Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
 drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1a141c46ac27..97237d137cbf 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3747,7 +3747,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 		if (!ndev) {
 			if (i == 0)
 				dev_err(dev, "No net device to remove\n");
-			goto out;
+			break;
 		}
 
 		apc = netdev_priv(ndev);
@@ -3778,7 +3778,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
 	}
 
 	mana_destroy_eq(ac);
-out:
+
 	if (ac->per_port_queue_reset_wq) {
 		destroy_workqueue(ac->per_port_queue_reset_wq);
 		ac->per_port_queue_reset_wq = NULL;
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
  2026-04-13  5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-14 15:35   ` Simon Horman
  2026-04-15  7:04     ` Erni Sri Satya Vennela
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:35 UTC (permalink / raw)
  To: ernis
  Cc: 'Simon Horman', kys, haiyangz, wei.liu, decui, longli,
	andrew+netdev, davem, edumazet, kuba, pabeni, ssengar, dipayanroy,
	gargaditya, shirazsaleem, kees, kotaranov, leon, shacharr,
	stephen, linux-hyperv, netdev, linux-kernel

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: mana: Don't overwrite port probe error with add_adev result

This commit ensures that if mana_probe_port() or mana_attach() fails, the
error is not overwritten by the result of add_adev(). By preserving the
error, it triggers mana_remove() cleanup instead of leaving ports partially
initialized.

> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9d095a6fb56c7..4d42cbfb3f15b 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>  	if (!resuming) {
>  		for (i = 0; i < ac->num_ports; i++) {
>  			err = mana_probe_port(ac, i, &ac->ports[i]);
> -			/* we log the port for which the probe failed and stop
> -			 * probes for subsequent ports.
> -			 * Note that we keep running ports, for which the probes
> -			 * were successful, unless add_adev fails too
> +			/* Log the port for which the probe failed, stop probing
> +			 * subsequent ports, and skip add_adev.
> +			 * Already-probed ports remain functional.
>  			 */

Are these comments accurate regarding ports remaining functional?

Since this patch preserves the error code, execution falls through to the end
of mana_probe() where it checks if err is set and calls mana_remove().
mana_remove() will iterate over the successfully allocated ports, completely
detach them, unregister their net_devices, and destroy event queues.

It appears no ports remain functional because the device is completely torn
down.

[ ... ]

> @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>  			enable_work(&apc->queue_reset_work);
>  			err = mana_attach(ac->ports[i]);
>  			rtnl_unlock();
> -			/* we log the port for which the attach failed and stop
> -			 * attach for subsequent ports
> -			 * Note that we keep running ports, for which the attach
> -			 * were successful, unless add_adev fails too
> +			/* Log the port for which the attach failed, stop
> +			 * attaching subsequent ports, and skip add_adev.
> +			 * Already-attached ports remain functional.
>  			 */
>  			if (err) {
>  				dev_err(dev, "Attach Failed for port %d\n", i);
> @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)

Does propagating this error during PM resume introduce a NULL pointer
dereference?

If PM resume fails here (resuming == true), the preserved error code causes
mana_probe() to call mana_remove(), which tears down the device and sets
gd->gdma_context = NULL.

However, a failed resume callback does not automatically unbind the driver.
When the device is eventually unbound later, mana_remove() is invoked again:

drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
    struct gdma_context *gc = gd->gdma_context;
    struct device *dev = gc->dev;
    ...
}

In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
which could lead to a kernel panic when dereferencing gc->dev.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
  2026-04-13  5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
@ 2026-04-14 15:40   ` Simon Horman
  2026-04-15  7:01     ` Erni Sri Satya Vennela
  0 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:40 UTC (permalink / raw)
  To: ernis
  Cc: Simon Horman, pabeni, edumazet, ssengar, dipayanroy, gargaditya,
	decui, shacharr, linux-hyperv, kotaranov, kys, shirazsaleem,
	longli, kuba, andrew+netdev, wei.liu, linux-kernel, netdev,
	haiyangz, kees, stephen, leon, davem

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: mana: Fix EQ leak in mana_remove on NULL port

In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq(). This can happen when
mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
driver unload or error cleanup, mana_remove() hits the NULL entry and
jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
for-loop exits normally and mana_destroy_eq() is always reached.

> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")

Does this Fixes tag point to the correct commit? Looking at the commit
history, the original driver commit ca9c54d2d6a5 did not have
mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
not have caused an EQ leak at that point.

The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
Add support for EQ sharing"), which added mana_destroy_eq(ac) to
mana_remove() and placed it after the 'out:' label, making it skippable by
the 'goto out' on NULL port.

Should the Fixes tag be:
  Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
  2026-04-13  5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-14 15:41   ` Simon Horman
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

On Sun, Apr 12, 2026 at 10:08:37PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_WORK(link_change_work) to right after the mana_context
> allocation, before any error path that could reach mana_remove().
> 
> Previously, if mana_create_eq() or mana_query_device_cfg() failed,
> mana_probe() would jump to the error path which calls mana_remove().
> mana_remove() unconditionally calls disable_work_sync(link_change_work),
> but the work struct had not been initialized yet. This can trigger
> CONFIG_DEBUG_OBJECTS_WORK enabled.
> 
> Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> Changes in v2:
> * Apply the patch in net instead of net-next.

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
  2026-04-13  5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-14 15:41   ` Simon Horman
  0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
  To: Erni Sri Satya Vennela
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

On Sun, Apr 12, 2026 at 10:08:38PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
> while keeping schedule_delayed_work() at its original location.
> 
> Previously, if any function between mana_create_eq() and the
> INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
> which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
> in __flush_work() or debug object warnings with
> CONFIG_DEBUG_OBJECTS_WORK enabled.
> 
> Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
  2026-04-14 15:40   ` Simon Horman
@ 2026-04-15  7:01     ` Erni Sri Satya Vennela
  0 siblings, 0 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15  7:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: pabeni, edumazet, ssengar, dipayanroy, gargaditya, decui,
	shacharr, linux-hyperv, kotaranov, kys, shirazsaleem, longli,
	kuba, andrew+netdev, wei.liu, linux-kernel, netdev, haiyangz,
	kees, stephen, leon, davem

On Tue, Apr 14, 2026 at 04:40:58PM +0100, Simon Horman wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> 
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: mana: Fix EQ leak in mana_remove on NULL port
> 
> In mana_remove(), when a NULL port is encountered in the port iteration
> loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
> queues allocated earlier by mana_create_eq(). This can happen when
> mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
> driver unload or error cleanup, mana_remove() hits the NULL entry and
> jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
> for-loop exits normally and mana_destroy_eq() is always reached.
> 
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
> 
> Does this Fixes tag point to the correct commit? Looking at the commit
> history, the original driver commit ca9c54d2d6a5 did not have
> mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
> not have caused an EQ leak at that point.
> 
> The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
> Add support for EQ sharing"), which added mana_destroy_eq(ac) to
> mana_remove() and placed it after the 'out:' label, making it skippable by
> the 'goto out' on NULL port.
> 
> Should the Fixes tag be:
>   Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")
Thankyou for the correction, Simon. 
I will make this change in the next version of the patchset.

- Vennela

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
  2026-04-14 15:35   ` Simon Horman
@ 2026-04-15  7:04     ` Erni Sri Satya Vennela
  0 siblings, 0 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15  7:04 UTC (permalink / raw)
  To: Simon Horman
  Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
	edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
	shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
	linux-hyperv, netdev, linux-kernel

On Tue, Apr 14, 2026 at 04:35:03PM +0100, Simon Horman wrote:
> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> net: mana: Don't overwrite port probe error with add_adev result
> 
> This commit ensures that if mana_probe_port() or mana_attach() fails, the
> error is not overwritten by the result of add_adev(). By preserving the
> error, it triggers mana_remove() cleanup instead of leaving ports partially
> initialized.
> 
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index 9d095a6fb56c7..4d42cbfb3f15b 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> >  	if (!resuming) {
> >  		for (i = 0; i < ac->num_ports; i++) {
> >  			err = mana_probe_port(ac, i, &ac->ports[i]);
> > -			/* we log the port for which the probe failed and stop
> > -			 * probes for subsequent ports.
> > -			 * Note that we keep running ports, for which the probes
> > -			 * were successful, unless add_adev fails too
> > +			/* Log the port for which the probe failed, stop probing
> > +			 * subsequent ports, and skip add_adev.
> > +			 * Already-probed ports remain functional.
> >  			 */
> 
> Are these comments accurate regarding ports remaining functional?
> 
> Since this patch preserves the error code, execution falls through to the end
> of mana_probe() where it checks if err is set and calls mana_remove().
> mana_remove() will iterate over the successfully allocated ports, completely
> detach them, unregister their net_devices, and destroy event queues.
> 
> It appears no ports remain functional because the device is completely torn
> down.
>
I will update the patch with the updated comment message. 
> [ ... ]
> 
> > @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> >  			enable_work(&apc->queue_reset_work);
> >  			err = mana_attach(ac->ports[i]);
> >  			rtnl_unlock();
> > -			/* we log the port for which the attach failed and stop
> > -			 * attach for subsequent ports
> > -			 * Note that we keep running ports, for which the attach
> > -			 * were successful, unless add_adev fails too
> > +			/* Log the port for which the attach failed, stop
> > +			 * attaching subsequent ports, and skip add_adev.
> > +			 * Already-attached ports remain functional.
> >  			 */
> >  			if (err) {
> >  				dev_err(dev, "Attach Failed for port %d\n", i);
> > @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> 
> Does propagating this error during PM resume introduce a NULL pointer
> dereference?
> 
> If PM resume fails here (resuming == true), the preserved error code causes
> mana_probe() to call mana_remove(), which tears down the device and sets
> gd->gdma_context = NULL.
> 
> However, a failed resume callback does not automatically unbind the driver.
> When the device is eventually unbound later, mana_remove() is invoked again:
> 
> drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
>     struct gdma_context *gc = gd->gdma_context;
>     struct device *dev = gc->dev;
>     ...
> }
> 
> In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
> which could lead to a kernel panic when dereferencing gc->dev.

Thankyou for pointing it out, Simon.
Since this is a pre-existing bug, I will create a different patch for
this change and make it as part of this patchset.

- Vennela

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-04-15  7:04 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13  5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13  5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
2026-04-14 15:41   ` Simon Horman
2026-04-13  5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
2026-04-14 15:41   ` Simon Horman
2026-04-13  5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
2026-04-14 15:35   ` Simon Horman
2026-04-15  7:04     ` Erni Sri Satya Vennela
2026-04-13  5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
2026-04-14 15:40   ` Simon Horman
2026-04-15  7:01     ` Erni Sri Satya Vennela

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox