* [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs
@ 2026-04-13 5:08 Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Fix four pre-existing bugs in mana_probe()/mana_remove() error handling
that can cause warnings on uninitialized work structs, masked errors,
and resource leaks when early probe steps fail.
Patches 1-2 move work struct initialization (link_change_work and
gf_stats_work) to before any error path that could trigger
mana_remove(), preventing WARN_ON in __flush_work() or debug object
warnings when sync cancellation runs on uninitialized work structs.
Patch 3 prevents add_adev() from overwriting a port probe error,
which could leave the driver in a broken state with NULL ports while
reporting success.
Patch 4 changes 'goto out' to 'break' in mana_remove()'s port loop
so that mana_destroy_eq() is always reached, preventing EQ leaks when
a NULL port is encountered.
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
Erni Sri Satya Vennela (4):
net: mana: Init link_change_work before potential error paths in probe
net: mana: Init gf_stats_work before potential error paths in probe
net: mana: Don't overwrite port probe error with add_adev result
net: mana: Fix EQ leak in mana_remove on NULL port
drivers/net/ethernet/microsoft/mana/mana_en.c | 28 +++++++++----------
1 file changed, 14 insertions(+), 14 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Move INIT_WORK(link_change_work) to right after the mana_context
allocation, before any error path that could reach mana_remove().
Previously, if mana_create_eq() or mana_query_device_cfg() failed,
mana_probe() would jump to the error path which calls mana_remove().
mana_remove() unconditionally calls disable_work_sync(link_change_work),
but the work struct had not been initialized yet. This can trigger
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 07630322545f..57f146ea6f66 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3631,6 +3631,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
ac->gdma_dev = gd;
gd->driver_data = ac;
+
+ INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
err = mana_create_eq(ac);
@@ -3648,8 +3650,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
ac->num_ports = num_ports;
-
- INIT_WORK(&ac->link_change_work, mana_link_state_handle);
} else {
if (ac->num_ports != num_ports) {
dev_err(dev, "The number of vPorts changed: %d->%d\n",
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
while keeping schedule_delayed_work() at its original location.
Previously, if any function between mana_create_eq() and the
INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
in __flush_work() or debug object warnings with
CONFIG_DEBUG_OBJECTS_WORK enabled.
Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 57f146ea6f66..f6ad46736418 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3635,6 +3635,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
INIT_WORK(&ac->link_change_work, mana_link_state_handle);
}
+ INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
+
err = mana_create_eq(ac);
if (err) {
dev_err(dev, "Failed to create EQs: %d\n", err);
@@ -3709,7 +3711,6 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
err = add_adev(gd, "eth");
- INIT_DELAYED_WORK(&ac->gf_stats_work, mana_gf_stats_work_handler);
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
out:
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:35 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In mana_probe(), if mana_probe_port() fails for any port, the error
is stored in 'err' and the loop breaks. However, the subsequent
unconditional 'err = add_adev(gd, "eth")' overwrites this error.
If add_adev() succeeds, mana_probe() returns success despite ports
being left in a partially initialized state (ac->ports[i] == NULL).
Only call add_adev() when there is no prior error, so the probe
correctly fails and triggers mana_remove() cleanup.
Fixes: ced82fce77e9 ("net: mana: Probe rdma device in mana driver")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f6ad46736418..1a141c46ac27 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3680,10 +3680,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
if (!resuming) {
for (i = 0; i < ac->num_ports; i++) {
err = mana_probe_port(ac, i, &ac->ports[i]);
- /* we log the port for which the probe failed and stop
- * probes for subsequent ports.
- * Note that we keep running ports, for which the probes
- * were successful, unless add_adev fails too
+ /* Log the port for which the probe failed, stop probing
+ * subsequent ports, and skip add_adev.
+ * Already-probed ports remain functional.
*/
if (err) {
dev_err(dev, "Probe Failed for port %d\n", i);
@@ -3697,10 +3696,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
enable_work(&apc->queue_reset_work);
err = mana_attach(ac->ports[i]);
rtnl_unlock();
- /* we log the port for which the attach failed and stop
- * attach for subsequent ports
- * Note that we keep running ports, for which the attach
- * were successful, unless add_adev fails too
+ /* Log the port for which the attach failed, stop
+ * attaching subsequent ports, and skip add_adev.
+ * Already-attached ports remain functional.
*/
if (err) {
dev_err(dev, "Attach Failed for port %d\n", i);
@@ -3709,7 +3707,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
}
}
- err = add_adev(gd, "eth");
+ if (!err)
+ err = add_adev(gd, "eth");
schedule_delayed_work(&ac->gf_stats_work, MANA_GF_STATS_PERIOD);
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
` (2 preceding siblings ...)
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-13 5:08 ` Erni Sri Satya Vennela
2026-04-14 15:40 ` Simon Horman
3 siblings, 1 reply; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-13 5:08 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ernis, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq().
This can happen when mana_probe_port() fails for port 0, leaving
ac->ports[0] as NULL. On driver unload or error cleanup, mana_remove()
hits the NULL entry and jumps past mana_destroy_eq().
Change 'goto out' to 'break' so the for-loop exits normally and
mana_destroy_eq() is always reached. Remove the now-unreferenced out:
label.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
---
Changes in v2:
* Apply the patch in net instead of net-next.
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index 1a141c46ac27..97237d137cbf 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -3747,7 +3747,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
if (!ndev) {
if (i == 0)
dev_err(dev, "No net device to remove\n");
- goto out;
+ break;
}
apc = netdev_priv(ndev);
@@ -3778,7 +3778,7 @@ void mana_remove(struct gdma_dev *gd, bool suspending)
}
mana_destroy_eq(ac);
-out:
+
if (ac->per_port_queue_reset_wq) {
destroy_workqueue(ac->per_port_queue_reset_wq);
ac->per_port_queue_reset_wq = NULL;
--
2.34.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
@ 2026-04-14 15:35 ` Simon Horman
2026-04-15 7:04 ` Erni Sri Satya Vennela
0 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:35 UTC (permalink / raw)
To: ernis
Cc: 'Simon Horman', kys, haiyangz, wei.liu, decui, longli,
andrew+netdev, davem, edumazet, kuba, pabeni, ssengar, dipayanroy,
gargaditya, shirazsaleem, kees, kotaranov, leon, shacharr,
stephen, linux-hyperv, netdev, linux-kernel
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
net: mana: Don't overwrite port probe error with add_adev result
This commit ensures that if mana_probe_port() or mana_attach() fails, the
error is not overwritten by the result of add_adev(). By preserving the
error, it triggers mana_remove() cleanup instead of leaving ports partially
initialized.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9d095a6fb56c7..4d42cbfb3f15b 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> if (!resuming) {
> for (i = 0; i < ac->num_ports; i++) {
> err = mana_probe_port(ac, i, &ac->ports[i]);
> - /* we log the port for which the probe failed and stop
> - * probes for subsequent ports.
> - * Note that we keep running ports, for which the probes
> - * were successful, unless add_adev fails too
> + /* Log the port for which the probe failed, stop probing
> + * subsequent ports, and skip add_adev.
> + * Already-probed ports remain functional.
> */
Are these comments accurate regarding ports remaining functional?
Since this patch preserves the error code, execution falls through to the end
of mana_probe() where it checks if err is set and calls mana_remove().
mana_remove() will iterate over the successfully allocated ports, completely
detach them, unregister their net_devices, and destroy event queues.
It appears no ports remain functional because the device is completely torn
down.
[ ... ]
> @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> enable_work(&apc->queue_reset_work);
> err = mana_attach(ac->ports[i]);
> rtnl_unlock();
> - /* we log the port for which the attach failed and stop
> - * attach for subsequent ports
> - * Note that we keep running ports, for which the attach
> - * were successful, unless add_adev fails too
> + /* Log the port for which the attach failed, stop
> + * attaching subsequent ports, and skip add_adev.
> + * Already-attached ports remain functional.
> */
> if (err) {
> dev_err(dev, "Attach Failed for port %d\n", i);
> @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
Does propagating this error during PM resume introduce a NULL pointer
dereference?
If PM resume fails here (resuming == true), the preserved error code causes
mana_probe() to call mana_remove(), which tears down the device and sets
gd->gdma_context = NULL.
However, a failed resume callback does not automatically unbind the driver.
When the device is eventually unbound later, mana_remove() is invoked again:
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
struct gdma_context *gc = gd->gdma_context;
struct device *dev = gc->dev;
...
}
In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
which could lead to a kernel panic when dereferencing gc->dev.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
@ 2026-04-14 15:40 ` Simon Horman
2026-04-15 7:01 ` Erni Sri Satya Vennela
0 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:40 UTC (permalink / raw)
To: ernis
Cc: Simon Horman, pabeni, edumazet, ssengar, dipayanroy, gargaditya,
decui, shacharr, linux-hyperv, kotaranov, kys, shirazsaleem,
longli, kuba, andrew+netdev, wei.liu, linux-kernel, netdev,
haiyangz, kees, stephen, leon, davem
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
net: mana: Fix EQ leak in mana_remove on NULL port
In mana_remove(), when a NULL port is encountered in the port iteration
loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
queues allocated earlier by mana_create_eq(). This can happen when
mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
driver unload or error cleanup, mana_remove() hits the NULL entry and
jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
for-loop exits normally and mana_destroy_eq() is always reached.
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Does this Fixes tag point to the correct commit? Looking at the commit
history, the original driver commit ca9c54d2d6a5 did not have
mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
not have caused an EQ leak at that point.
The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
Add support for EQ sharing"), which added mana_destroy_eq(ac) to
mana_remove() and placed it after the 'out:' label, making it skippable by
the 'goto out' on NULL port.
Should the Fixes tag be:
Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
@ 2026-04-14 15:41 ` Simon Horman
0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Sun, Apr 12, 2026 at 10:08:37PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_WORK(link_change_work) to right after the mana_context
> allocation, before any error path that could reach mana_remove().
>
> Previously, if mana_create_eq() or mana_query_device_cfg() failed,
> mana_probe() would jump to the error path which calls mana_remove().
> mana_remove() unconditionally calls disable_work_sync(link_change_work),
> but the work struct had not been initialized yet. This can trigger
> CONFIG_DEBUG_OBJECTS_WORK enabled.
>
> Fixes: 54133f9b4b53 ("net: mana: Support HW link state events")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
> ---
> Changes in v2:
> * Apply the patch in net instead of net-next.
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 2/4] net: mana: Init gf_stats_work before potential error paths in probe
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
@ 2026-04-14 15:41 ` Simon Horman
0 siblings, 0 replies; 11+ messages in thread
From: Simon Horman @ 2026-04-14 15:41 UTC (permalink / raw)
To: Erni Sri Satya Vennela
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Sun, Apr 12, 2026 at 10:08:38PM -0700, Erni Sri Satya Vennela wrote:
> Move INIT_DELAYED_WORK(gf_stats_work) to before mana_create_eq(),
> while keeping schedule_delayed_work() at its original location.
>
> Previously, if any function between mana_create_eq() and the
> INIT_DELAYED_WORK call failed, mana_probe() would call mana_remove()
> which unconditionally calls cancel_delayed_work_sync(gf_stats_work)
> in __flush_work() or debug object warnings with
> CONFIG_DEBUG_OBJECTS_WORK enabled.
>
> Fixes: be4f1d67ec56 ("net: mana: Add standard counter rx_missed_errors")
> Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port
2026-04-14 15:40 ` Simon Horman
@ 2026-04-15 7:01 ` Erni Sri Satya Vennela
0 siblings, 0 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15 7:01 UTC (permalink / raw)
To: Simon Horman
Cc: pabeni, edumazet, ssengar, dipayanroy, gargaditya, decui,
shacharr, linux-hyperv, kotaranov, kys, shirazsaleem, longli,
kuba, andrew+netdev, wei.liu, linux-kernel, netdev, haiyangz,
kees, stephen, leon, davem
On Tue, Apr 14, 2026 at 04:40:58PM +0100, Simon Horman wrote:
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
>
> For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
> ---
> net: mana: Fix EQ leak in mana_remove on NULL port
>
> In mana_remove(), when a NULL port is encountered in the port iteration
> loop, 'goto out' skips the mana_destroy_eq(ac) call, leaking the event
> queues allocated earlier by mana_create_eq(). This can happen when
> mana_probe_port() fails for port 0, leaving ac->ports[0] as NULL. On
> driver unload or error cleanup, mana_remove() hits the NULL entry and
> jumps past mana_destroy_eq(). Change 'goto out' to 'break' so the
> for-loop exits normally and mana_destroy_eq() is always reached.
>
> > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
>
> Does this Fixes tag point to the correct commit? Looking at the commit
> history, the original driver commit ca9c54d2d6a5 did not have
> mana_destroy_eq() in mana_remove() at all, so the 'goto out' pattern could
> not have caused an EQ leak at that point.
>
> The bug appears to have been introduced by commit 1e2d0824a9c3 ("net: mana:
> Add support for EQ sharing"), which added mana_destroy_eq(ac) to
> mana_remove() and placed it after the 'out:' label, making it skippable by
> the 'goto out' on NULL port.
>
> Should the Fixes tag be:
> Fixes: 1e2d0824a9c3 ("net: mana: Add support for EQ sharing")
Thankyou for the correction, Simon.
I will make this change in the next version of the patchset.
- Vennela
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result
2026-04-14 15:35 ` Simon Horman
@ 2026-04-15 7:04 ` Erni Sri Satya Vennela
0 siblings, 0 replies; 11+ messages in thread
From: Erni Sri Satya Vennela @ 2026-04-15 7:04 UTC (permalink / raw)
To: Simon Horman
Cc: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, ssengar, dipayanroy, gargaditya,
shirazsaleem, kees, kotaranov, leon, shacharr, stephen,
linux-hyperv, netdev, linux-kernel
On Tue, Apr 14, 2026 at 04:35:03PM +0100, Simon Horman wrote:
> From: 'Simon Horman' <horms@kernel.org>
>
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> net: mana: Don't overwrite port probe error with add_adev result
>
> This commit ensures that if mana_probe_port() or mana_attach() fails, the
> error is not overwritten by the result of add_adev(). By preserving the
> error, it triggers mana_remove() cleanup instead of leaving ports partially
> initialized.
>
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index 9d095a6fb56c7..4d42cbfb3f15b 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > @@ -3634,10 +3634,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > if (!resuming) {
> > for (i = 0; i < ac->num_ports; i++) {
> > err = mana_probe_port(ac, i, &ac->ports[i]);
> > - /* we log the port for which the probe failed and stop
> > - * probes for subsequent ports.
> > - * Note that we keep running ports, for which the probes
> > - * were successful, unless add_adev fails too
> > + /* Log the port for which the probe failed, stop probing
> > + * subsequent ports, and skip add_adev.
> > + * Already-probed ports remain functional.
> > */
>
> Are these comments accurate regarding ports remaining functional?
>
> Since this patch preserves the error code, execution falls through to the end
> of mana_probe() where it checks if err is set and calls mana_remove().
> mana_remove() will iterate over the successfully allocated ports, completely
> detach them, unregister their net_devices, and destroy event queues.
>
> It appears no ports remain functional because the device is completely torn
> down.
>
I will update the patch with the updated comment message.
> [ ... ]
>
> > @@ -3651,10 +3650,9 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
> > enable_work(&apc->queue_reset_work);
> > err = mana_attach(ac->ports[i]);
> > rtnl_unlock();
> > - /* we log the port for which the attach failed and stop
> > - * attach for subsequent ports
> > - * Note that we keep running ports, for which the attach
> > - * were successful, unless add_adev fails too
> > + /* Log the port for which the attach failed, stop
> > + * attaching subsequent ports, and skip add_adev.
> > + * Already-attached ports remain functional.
> > */
> > if (err) {
> > dev_err(dev, "Attach Failed for port %d\n", i);
> > @@ -3663,7 +3661,8 @@ int mana_probe(struct gdma_dev *gd, bool resuming)
>
> Does propagating this error during PM resume introduce a NULL pointer
> dereference?
>
> If PM resume fails here (resuming == true), the preserved error code causes
> mana_probe() to call mana_remove(), which tears down the device and sets
> gd->gdma_context = NULL.
>
> However, a failed resume callback does not automatically unbind the driver.
> When the device is eventually unbound later, mana_remove() is invoked again:
>
> drivers/net/ethernet/microsoft/mana/mana_en.c:mana_remove() {
> struct gdma_context *gc = gd->gdma_context;
> struct device *dev = gc->dev;
> ...
> }
>
> In the second invocation, gd->gdma_context is NULL, meaning gc is NULL,
> which could lead to a kernel panic when dereferencing gc->dev.
Thankyou for pointing it out, Simon.
Since this is a pre-existing bug, I will create a different patch for
this change and make it as part of this patchset.
- Vennela
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-04-15 7:04 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-13 5:08 [PATCH net v2 0/4] net: mana: Fix probe/remove error path bugs Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 1/4] net: mana: Init link_change_work before potential error paths in probe Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 2/4] net: mana: Init gf_stats_work " Erni Sri Satya Vennela
2026-04-14 15:41 ` Simon Horman
2026-04-13 5:08 ` [PATCH net v2 3/4] net: mana: Don't overwrite port probe error with add_adev result Erni Sri Satya Vennela
2026-04-14 15:35 ` Simon Horman
2026-04-15 7:04 ` Erni Sri Satya Vennela
2026-04-13 5:08 ` [PATCH net v2 4/4] net: mana: Fix EQ leak in mana_remove on NULL port Erni Sri Satya Vennela
2026-04-14 15:40 ` Simon Horman
2026-04-15 7:01 ` Erni Sri Satya Vennela
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox