* [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs
@ 2026-04-13 5:08 Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Fix four pre-existing bugs in mana_probe()/mana_remove() error handling
that can cause warnings on uninitialized work structs, masked errors,
and resource leaks when early probe steps fail.
Patches 1-2 move work struct initialization (link_change_work and
gf_stats_work) to before any error path that could trigger
mana_remove(), preventing WARN_ON in __flush_work() or debug object
warnings when sync cancellation runs on uninitialized work structs.
Patch 3 prevents add_adev() from overwriting a port probe error,
which could leave the driver in a broken state with NULL ports while
reporting success.
Patch 4 changes 'goto out' to 'break' in mana_remove()'s port loop
so that mana_destroy_eq() is always reached, preventing EQ leaks when
a NULL port is encountered.
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
Erni Sri Satya Vennela (4):
net: mana: Init link_change_work before potential error paths in probe
net: mana: Init gf_stats_work before potential error paths in probe
net: mana: Don't overwrite port probe error with add_adev result
net: mana: Fix EQ leak in mana_remove on NULL port
drivers/net/ethernet/microsoft/mana/mana_en.c | 28 +++++++++----------
1 file changed, 14 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Move INIT_WORK(link_change_work) to right after the mana_context
allocation, before any error path that could reach mana_remove().
Previously, if mana_create_eq() or mana_query_device_cfg() failed,
mana_probe() would jump to the error path which calls mana_remove().
mana_remove() unconditionally calls disable_work_sync(link_change_work),
but the work struct had not been initialized yet. This can trigger
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 07630322545f..57f146ea6f66 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,6 +3631,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
ac->gdma_dev = gd;
gd->driver_data = ac;
+
+ INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
err = mana_create_eq(ac);
@@ -3648,8 +3650,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
ac->num_ports = num_ports;
-
- INIT_WORK(&ac->link_change_work, mana_link_state_handle);
} else {
if (ac->num_ports != num_ports) {
dev_err(dev, "The number of vPorts changed: %d->%d\n",
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
3 siblings, 1 reply; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
while keeping schedule_delayed_work() at its original location.
Previously, if any function between mana_create_eq() and the
INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
in __flush_work() or debug object warnings with
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 57f146ea6f66..f6ad46736418 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3635,6 +3635,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
+ INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
err = mana_create_eq(ac);
if (err) {
dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3709,7 +3711,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
err = add_adev(gd, "eth");
- INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
out:
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:35 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
3 siblings, 1 reply; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In mana_probe(), if mana_probe_port() fails for any port, the error
is stored in 'err' and the loop breaks. However, the subsequent
unconditional 'err = add_adev(gd, "eth")' overwrites this error.
If add_adev() succeeds, mana_probe() returns success despite ports
being left in a partially initialized state (ac->ports[i] == NULL).
Only call add_adev() when there is no prior error, so the probe
correctly fails and triggers mana_remove() cleanup.
Fixes: ced82fce77e9 ("net: mana: Probe rdma device in mana driver")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f6ad46736418..1a141c46ac27 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3680,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
for (i = 0; i < ac->num_ports; i++) {
err = mana_probe_port(ac, i, &ac->ports[i]);
- /* we log the port for which the probe failed and stop
- * probes for subsequent ports.
- * Note that we keep running ports, for which the probes
- * were successful, unless add_adev fails too
+ /* Log the port for which the probe failed, stop probing
+ * subsequent ports, and skip add_adev.
+ * Already-probed ports remain functional.
*/
if (err) {
dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3697,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
enable_work(&apc->queue_reset_work);
err = mana_attach(ac->ports[i]);
rtnl_unlock();
- /* we log the port for which the attach failed and stop
- * attach for subsequent ports
- * Note that we keep running ports, for which the attach
- * were successful, unless add_adev fails too
+ /* Log the port for which the attach failed, stop
+ * attaching subsequent ports, and skip add_adev.
+ * Already-attached ports remain functional.
*/
if (err) {
dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3709,7 +3707,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
}
}
- err = add_adev(gd, "eth");
+ if (!err)
+ err = add_adev(gd, "eth");
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
` (2 preceding siblings ...)
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:40 ` Simon Horman
3 siblings, 1 reply; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq().
This can happen when mana_probe_port() fails for port 0, leaving
ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove()
hits the NULL entry and jumps past mana_destroy_eq().
Change 'goto out' to 'break' so the for-loop exits normally and
mana_destroy_eq() is always reached. Remove the now-unreferenced out:
label.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1a141c46ac27..97237d137cbf 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3747,7 +3747,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
if (!ndev) {
if (i == 0)
dev_err(dev, "No net device to remove\n");
- goto out;
+ break;
}
apc = netdev_priv(ndev);
@@ -3778,7 +3778,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
}
mana_destroy_eq(ac);
-out:
+
if (ac->per_port_queue_reset_wq) {
destroy_workqueue(ac->per_port_queue_reset_wq);
ac->per_port_queue_reset_wq = NULL;
--
2.34.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-14 15:35 ` Simon Horman
2026-04-15 7:04 ` Erni Sri Satya Vennela
0 siblings, 1 reply; 12+ messages in thread
From: Simon Horman @ 2026-04-14 15:35 UTC (permalink / raw)
To: ernis
Cc: 'Simon Horman', kys, haiyangz, wei.liu, decui, longli,
andrew+netdev, davem, edumazet, kuba, pabeni, ssengar, dipayanroy,
gargaditya, shirazsaleem, kees, kotaranov, leon, shacharr,
stephen, linux-hyperv, netdev, linux-kernel
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: mana: Don't overwrite port probe error with add_adev result
This commit ensures that if mana_probe_port() or mana_attach() fails, the
error is not overwritten by the result of add_adev(). By preserving the
error, it triggers mana_remove() cleanup instead of leaving ports partially
initialized.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9d095a6fb56c7..4d42cbfb3f15b 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> if (!resuming) {
> for (i = 0; i < ac->num_ports; i++) {
> err = mana_probe_port(ac, i, &ac->ports[i]);
> - /* we log the port for which the probe failed and stop
> - * probes for subsequent ports.
> - * Note that we keep running ports, for which the probes
> - * were successful, unless add_adev fails too
> + /* Log the port for which the probe failed, stop probing
> + * subsequent ports, and skip add_adev.
> + * Already-probed ports remain functional.
> */
Are these comments accurate regarding ports remaining functional?
Since this patch preserves the error code, execution falls through to the end
of mana_probe() where it checks if err is set and calls mana_remove().
mana_remove() will iterate over the successfully allocated ports, completely
detach them, unregister their net_devices, and destroy event queues.
It appears no ports remain functional because the device is completely torn
down.
[ ... ]
> @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> enable_work(&apc->queue_reset_work);
> err = mana_attach(ac->ports[i]);
> rtnl_unlock();
> - /* we log the port for which the attach failed and stop
> - * attach for subsequent ports
> - * Note that we keep running ports, for which the attach
> - * were successful, unless add_adev fails too
> + /* Log the port for which the attach failed, stop
> + * attaching subsequent ports, and skip add_adev.
> + * Already-attached ports remain functional.
> */
> if (err) {
> dev_err(dev, "Attach Failed for port %d\n", i);
> @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
Does propagating this error during PM resume introduce a NULL pointer
dereference?
If PM resume fails here (resuming == true), the preserved error code causes
mana_probe() to call mana_remove(), which tears down the device and sets
gd->gdma_context = NULL.
However, a failed resume callback does not automatically unbind the driver.
When the device is eventually unbound later, mana_remove() is invoked again:
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
struct gdma_context *gc = gd->gdma_context;
struct device *dev = gc->dev;
...
}
In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
which could lead to a kernel panic when dereferencing gc->dev.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
@ 2026-04-14 15:40 ` Simon Horman
2026-04-15 7:01 ` Erni Sri Satya Vennela
0 siblings, 1 reply; 12+ messages in thread
From: Simon Horman @ 2026-04-14 15:40 UTC (permalink / raw)
To: ernis
Cc: Simon Horman, pabeni, edumazet, ssengar, dipayanroy, gargaditya,
decui, shacharr, linux-hyperv, kotaranov, kys, shirazsaleem,
longli, kuba, andrew+netdev, wei.liu, linux-kernel, netdev,
haiyangz, kees, stephen, leon, davem
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: mana: Fix EQ leak in mana_remove on NULL port
In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq(). This can happen when
mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
driver unload or error cleanup, mana_remove() hits the NULL entry and
jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
for-loop exits normally and mana_destroy_eq() is always reached.
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Does this Fixes tag point to the correct commit? Looking at the commit
history, the original driver commit ca9c54d2d6a5 did not have
mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
not have caused an EQ leak at that point.
The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
Add support for EQ sharing"), which added mana_destroy_eq(ac) to
mana_remove() and placed it after the 'out:' label, making it skippable by
the 'goto out' on NULL port.
Should the Fixes tag be:
Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-14 15:41 ` Simon Horman
0 siblings, 0 replies; 12+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Sun, Apr 12, 2026 at 10:08:37PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_WORK(link_change_work) to right after the mana_context
> allocation, before any error path that could reach mana_remove().
>
> Previously, if mana_create_eq() or mana_query_device_cfg() failed,
> mana_probe() would jump to the error path which calls mana_remove().
> mana_remove() unconditionally calls disable_work_sync(link_change_work),
> but the work struct had not been initialized yet. This can trigger
> CONFIG_DEBUG_OBJECTS_WORK enabled.
>
> Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> Changes in v2:
> * Apply the patch in net instead of net-next.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-14 15:41 ` Simon Horman
0 siblings, 0 replies; 12+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Sun, Apr 12, 2026 at 10:08:38PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
> while keeping schedule_delayed_work() at its original location.
>
> Previously, if any function between mana_create_eq() and the
> INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
> which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
> in __flush_work() or debug object warnings with
> CONFIG_DEBUG_OBJECTS_WORK enabled.
>
> Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-14 15:40 ` Simon Horman
@ 2026-04-15 7:01 ` Erni Sri Satya Vennela
0 siblings, 0 replies; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15 7:01 UTC (permalink / raw)
To: Simon Horman
Cc: pabeni, edumazet, ssengar, dipayanroy, gargaditya, decui,
shacharr, linux-hyperv, kotaranov, kys, shirazsaleem, longli,
kuba, andrew+netdev, wei.liu, linux-kernel, netdev, haiyangz,
kees, stephen, leon, davem
On Tue, Apr 14, 2026 at 04:40:58PM +0100, Simon Horman wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: mana: Fix EQ leak in mana_remove on NULL port
>
> In mana_remove(), when a NULL port is encountered in the port iteration
> loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
> queues allocated earlier by mana_create_eq(). This can happen when
> mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
> driver unload or error cleanup, mana_remove() hits the NULL entry and
> jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
> for-loop exits normally and mana_destroy_eq() is always reached.
>
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
>
> Does this Fixes tag point to the correct commit? Looking at the commit
> history, the original driver commit ca9c54d2d6a5 did not have
> mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
> not have caused an EQ leak at that point.
>
> The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
> Add support for EQ sharing"), which added mana_destroy_eq(ac) to
> mana_remove() and placed it after the 'out:' label, making it skippable by
> the 'goto out' on NULL port.
>
> Should the Fixes tag be:
> Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")
Thankyou for the correction, Simon.
I will make this change in the next version of the patchset.
- Vennela
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-14 15:35 ` Simon Horman
@ 2026-04-15 7:04 ` Erni Sri Satya Vennela
2026-04-15 12:37 ` Simon Horman
0 siblings, 1 reply; 12+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15 7:04 UTC (permalink / raw)
To: Simon Horman
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Tue, Apr 14, 2026 at 04:35:03PM +0100, Simon Horman wrote:
> From: 'Simon Horman' <horms@kernel.org>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> net: mana: Don't overwrite port probe error with add_adev result
>
> This commit ensures that if mana_probe_port() or mana_attach() fails, the
> error is not overwritten by the result of add_adev(). By preserving the
> error, it triggers mana_remove() cleanup instead of leaving ports partially
> initialized.
>
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index 9d095a6fb56c7..4d42cbfb3f15b 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > if (!resuming) {
> > for (i = 0; i < ac->num_ports; i++) {
> > err = mana_probe_port(ac, i, &ac->ports[i]);
> > - /* we log the port for which the probe failed and stop
> > - * probes for subsequent ports.
> > - * Note that we keep running ports, for which the probes
> > - * were successful, unless add_adev fails too
> > + /* Log the port for which the probe failed, stop probing
> > + * subsequent ports, and skip add_adev.
> > + * Already-probed ports remain functional.
> > */
>
> Are these comments accurate regarding ports remaining functional?
>
> Since this patch preserves the error code, execution falls through to the end
> of mana_probe() where it checks if err is set and calls mana_remove().
> mana_remove() will iterate over the successfully allocated ports, completely
> detach them, unregister their net_devices, and destroy event queues.
>
> It appears no ports remain functional because the device is completely torn
> down.
>
I will update the patch with the updated comment message.
> [ ... ]
>
> > @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > enable_work(&apc->queue_reset_work);
> > err = mana_attach(ac->ports[i]);
> > rtnl_unlock();
> > - /* we log the port for which the attach failed and stop
> > - * attach for subsequent ports
> > - * Note that we keep running ports, for which the attach
> > - * were successful, unless add_adev fails too
> > + /* Log the port for which the attach failed, stop
> > + * attaching subsequent ports, and skip add_adev.
> > + * Already-attached ports remain functional.
> > */
> > if (err) {
> > dev_err(dev, "Attach Failed for port %d\n", i);
> > @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>
> Does propagating this error during PM resume introduce a NULL pointer
> dereference?
>
> If PM resume fails here (resuming == true), the preserved error code causes
> mana_probe() to call mana_remove(), which tears down the device and sets
> gd->gdma_context = NULL.
>
> However, a failed resume callback does not automatically unbind the driver.
> When the device is eventually unbound later, mana_remove() is invoked again:
>
> drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
> struct gdma_context *gc = gd->gdma_context;
> struct device *dev = gc->dev;
> ...
> }
>
> In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
> which could lead to a kernel panic when dereferencing gc->dev.
Thankyou for pointing it out, Simon.
Since this is a pre-existing bug, I will create a different patch for
this change and make it as part of this patchset.
- Vennela
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-15 7:04 ` Erni Sri Satya Vennela
@ 2026-04-15 12:37 ` Simon Horman
0 siblings, 0 replies; 12+ messages in thread
From: Simon Horman @ 2026-04-15 12:37 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Wed, Apr 15, 2026 at 12:04:51AM -0700, Erni Sri Satya Vennela wrote:
> On Tue, Apr 14, 2026 at 04:35:03PM +0100, Simon Horman wrote:
> > From: 'Simon Horman' <horms@kernel.org>
> >
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > Full review at: https://sashiko.dev
> > ---
> > net: mana: Don't overwrite port probe error with add_adev result
> >
> > This commit ensures that if mana_probe_port() or mana_attach() fails, the
> > error is not overwritten by the result of add_adev(). By preserving the
> > error, it triggers mana_remove() cleanup instead of leaving ports partially
> > initialized.
> >
> > > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > index 9d095a6fb56c7..4d42cbfb3f15b 100644
> > > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > > @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > > if (!resuming) {
> > > for (i = 0; i < ac->num_ports; i++) {
> > > err = mana_probe_port(ac, i, &ac->ports[i]);
> > > - /* we log the port for which the probe failed and stop
> > > - * probes for subsequent ports.
> > > - * Note that we keep running ports, for which the probes
> > > - * were successful, unless add_adev fails too
> > > + /* Log the port for which the probe failed, stop probing
> > > + * subsequent ports, and skip add_adev.
> > > + * Already-probed ports remain functional.
> > > */
> >
> > Are these comments accurate regarding ports remaining functional?
> >
> > Since this patch preserves the error code, execution falls through to the end
> > of mana_probe() where it checks if err is set and calls mana_remove().
> > mana_remove() will iterate over the successfully allocated ports, completely
> > detach them, unregister their net_devices, and destroy event queues.
> >
> > It appears no ports remain functional because the device is completely torn
> > down.
> >
> I will update the patch with the updated comment message.
Thanks.
> > [ ... ]
> >
> > > @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > > enable_work(&apc->queue_reset_work);
> > > err = mana_attach(ac->ports[i]);
> > > rtnl_unlock();
> > > - /* we log the port for which the attach failed and stop
> > > - * attach for subsequent ports
> > > - * Note that we keep running ports, for which the attach
> > > - * were successful, unless add_adev fails too
> > > + /* Log the port for which the attach failed, stop
> > > + * attaching subsequent ports, and skip add_adev.
> > > + * Already-attached ports remain functional.
> > > */
> > > if (err) {
> > > dev_err(dev, "Attach Failed for port %d\n", i);
> > > @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> >
> > Does propagating this error during PM resume introduce a NULL pointer
> > dereference?
> >
> > If PM resume fails here (resuming == true), the preserved error code causes
> > mana_probe() to call mana_remove(), which tears down the device and sets
> > gd->gdma_context = NULL.
> >
> > However, a failed resume callback does not automatically unbind the driver.
> > When the device is eventually unbound later, mana_remove() is invoked again:
> >
> > drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
> > struct gdma_context *gc = gd->gdma_context;
> > struct device *dev = gc->dev;
> > ...
> > }
> >
> > In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
> > which could lead to a kernel panic when dereferencing gc->dev.
>
> Thankyou for pointing it out, Simon.
> Since this is a pre-existing bug, I will create a different patch for
> this change and make it as part of this patchset.
Likewise, thanks.
FTR, it it is a pre-existing bug then I don't think it needs
to block progress of your patchset. Even if fixing things
sooner than later is a good maxim.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-04-15 12:37 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
2026-04-14 15:35 ` Simon Horman
2026-04-15 7:04 ` Erni Sri Satya Vennela
2026-04-15 12:37 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
2026-04-14 15:40 ` Simon Horman
2026-04-15 7:01 ` Erni Sri Satya Vennela
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.