* [PATCH net v2 0/2] net: mana: fix error-path issues in queue setup
@ 2026-06-08 10:13 Aditya Garg
2026-06-08 10:13 ` [PATCH net v2 1/2] net: mana: initialize gdma queue id to INVALID_QUEUE_ID Aditya Garg
2026-06-08 10:13 ` [PATCH net v2 2/2] net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check Aditya Garg
0 siblings, 2 replies; 5+ messages in thread
From: Aditya Garg @ 2026-06-08 10:13 UTC
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, horms, shradhagupta, dipayanroy, ernis,
kees, shacharr, stephen, gargaditya, gargaditya, ssengar,
linux-hyperv, netdev, linux-kernel
Two error-path fixes in MANA queue setup, both surfaced during Sashiko
AI review of a recently upstreamed patch series.
Patch 1 initializes queue->id to INVALID_QUEUE_ID in
mana_gd_create_mana_wq_cq() so that a CQ creation failure before the
firmware id is assigned does not NULL gc->cq_table[0] and silently
break whichever real CQ owns that slot. This mirrors the existing
pattern in mana_gd_create_eq().
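The failure mode behind patch 1 can be reproduced in a few lines of userspace C. This is a minimal sketch, not driver code: `toy_queue`, `toy_context`, `toy_destroy_cq()` and `toy_failed_create()` are hypothetical stand-ins, the table size is arbitrary, and the sentinel is assumed to be an all-ones 32-bit id.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: the sentinel is an all-ones id, larger than any table index. */
#define INVALID_QUEUE_ID UINT32_MAX
#define TOY_MAX_NUM_CQS 4u /* arbitrary table size for this sketch */

struct toy_queue {
	uint32_t id;
};

struct toy_context {
	struct toy_queue *cq_table[TOY_MAX_NUM_CQS];
};

/* Stand-in for the destroy path: ids outside the table are ignored. */
static void toy_destroy_cq(struct toy_context *gc, const struct toy_queue *q)
{
	if (q->id >= TOY_MAX_NUM_CQS)
		return;
	gc->cq_table[q->id] = NULL;
}

/*
 * Simulate a creation that fails before an id is assigned.
 * Returns 1 if slot 0 survived the cleanup, 0 if it was cleared,
 * and -1 if the allocation itself failed.
 */
static int toy_failed_create(struct toy_context *gc, int init_sentinel)
{
	struct toy_queue *q = calloc(1, sizeof(*q)); /* id starts as 0 */

	if (!q)
		return -1;
	if (init_sentinel)
		q->id = INVALID_QUEUE_ID;

	toy_destroy_cq(gc, q); /* error-path cleanup */
	free(q);
	return gc->cq_table[0] != NULL;
}
```

With a live queue installed in slot 0, calling `toy_failed_create(&gc, 0)` clears that slot, while `toy_failed_create(&gc, 1)` leaves it alone because the range check rejects the sentinel.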
Patch 2 guards mana_destroy_txq()'s call to mana_destroy_wq_obj() with
an INVALID_MANA_HANDLE check, mirroring mana_destroy_rxq(). Without
it, TX setup failures lead to a firmware-rejected destroy of (u64)-1
and a spurious error in dmesg.
Changes in v2:
- Rebased onto net.
Aditya Garg (2):
net: mana: initialize gdma queue id to INVALID_QUEUE_ID
net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check
drivers/net/ethernet/microsoft/mana/gdma_main.c | 2 ++
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
2 files changed, 4 insertions(+), 1 deletion(-)
--
2.43.0
* [PATCH net v2 1/2] net: mana: initialize gdma queue id to INVALID_QUEUE_ID
From: Aditya Garg @ 2026-06-08 10:13 UTC
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, horms, shradhagupta, dipayanroy, ernis,
kees, shacharr, stephen, gargaditya, gargaditya, ssengar,
linux-hyperv, netdev, linux-kernel
mana_gd_create_mana_wq_cq() leaves queue->id as 0 (from kzalloc_obj())
until mana_create_wq_obj() assigns the firmware-returned id. If creation
fails before that, cleanup calls mana_gd_destroy_cq() with id 0, NULLing
gc->cq_table[0] and silently breaking whichever real CQ owns that slot.
Initialize queue->id to INVALID_QUEUE_ID right after allocation, matching
mana_gd_create_eq(). The existing (id >= max_num_cqs) guard then
short-circuits cleanly.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/ethernet/microsoft/mana/gdma_main.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index d8e816882f02..ac71ca8450bf 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1192,6 +1192,8 @@ int mana_gd_create_mana_wq_cq(struct gdma_dev *gd,
 	if (!queue)
 		return -ENOMEM;
 
+	queue->id = INVALID_QUEUE_ID;
+
 	gmi = &queue->mem_info;
 	err = mana_gd_alloc_memory(gc, spec->queue_size, gmi);
 	if (err) {
--
2.43.0
* [PATCH net v2 2/2] net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check
From: Aditya Garg @ 2026-06-08 10:13 UTC
To: kys, haiyangz, wei.liu, decui, longli, andrew+netdev, davem,
edumazet, kuba, pabeni, horms, shradhagupta, dipayanroy, ernis,
kees, shacharr, stephen, gargaditya, gargaditya, ssengar,
linux-hyperv, netdev, linux-kernel
mana_create_txq() has several error paths (after mana_alloc_queues() or
mana_create_wq_obj() failure) where tx_qp[i].tx_object stays as the
INVALID_MANA_HANDLE sentinel set at allocation. mana_destroy_txq() then
unconditionally calls mana_destroy_wq_obj() with (u64)-1, which the
firmware rejects, leaving a spurious error in the log.
Mirror the RX-side pattern in mana_destroy_rxq() and skip the destroy
when the handle is still INVALID_MANA_HANDLE.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Aditya Garg <gargaditya@linux.microsoft.com>
Reviewed-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index c9b1df1ed109..d7de4c4d25bb 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2334,7 +2334,8 @@ static void mana_destroy_txq(struct mana_port_context *apc)
 			netif_napi_del_locked(napi);
 			apc->tx_qp[i].txq.napi_initialized = false;
 		}
-		mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
+		if (apc->tx_qp[i].tx_object != INVALID_MANA_HANDLE)
+			mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
 
 		mana_deinit_cq(apc, &apc->tx_qp[i].tx_cq);
 
--
2.43.0
* Re: [PATCH net v2 1/2] net: mana: initialize gdma queue id to INVALID_QUEUE_ID
From: sashiko-bot @ 2026-06-09 10:14 UTC
To: Aditya Garg; +Cc: linux-hyperv
Thank you for your contribution! Sashiko AI review found 1 potential issue to consider:
Pre-existing issues:
- [High] The mana_gd_create_mana_eq function leaks a firmware DMA region object on its error path if the EQ hardware creation fails.
--
commit 4c25a3c164c64bd1c7eed40e10b6c6526c99eb4c
Author: Aditya Garg <gargaditya@linux.microsoft.com>
net: mana: initialize gdma queue id to INVALID_QUEUE_ID
mana_gd_create_mana_wq_cq() leaves queue->id as 0 (from kzalloc_obj())
until mana_create_wq_obj() assigns the firmware-returned id. If creation
fails before that, cleanup calls mana_gd_destroy_cq() with id 0, NULLing
gc->cq_table[0] and silently breaking whichever real CQ owns that slot.
Initialize queue->id to INVALID_QUEUE_ID right after allocation, matching
mana_gd_create_eq(). The existing (id >= max_num_cqs) guard then
short-circuits cleanly.
> diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> index d8e816882f02c..ac71ca8450bf4 100644
> --- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
> +++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
> @@ -1192,6 +1192,8 @@ int mana_gd_create_mana_wq_cq(struct gdma_dev *gd,
>  	if (!queue)
>  		return -ENOMEM;
>
> +	queue->id = INVALID_QUEUE_ID;
> +
>  	gmi = &queue->mem_info;
>  	err = mana_gd_alloc_memory(gc, spec->queue_size, gmi);
>  	if (err) {
[Severity: High]
This is a pre-existing issue, but I noticed mana_gd_create_mana_eq() leaks a
firmware DMA region object on its error path if the EQ hardware creation fails.
In mana_gd_create_mana_eq(), mana_gd_create_dma_region() registers a DMA region
with the hardware and returns a valid handle. If the subsequent call to
mana_gd_create_eq() fails, execution jumps to the out label:
drivers/net/ethernet/microsoft/mana/gdma_main.c:mana_gd_create_mana_eq() {
    ...
    err = mana_gd_create_eq(gd, spec, true, queue);
    if (err)
        goto out;
    ...
out:
    dev_err(gc->dev, "Failed to create queue type %d of size: %u, err: %d\n",
            spec->type, spec->queue_size, err);
    mana_gd_free_memory(gmi);
free_q:
    kfree(queue);
    return err;
}
Does this code need to call mana_gd_destroy_dma_region(gc,
gmi->dma_region_handle) in the error path?
Without it, does this leak the hardware DMA region resource and leave freed
physical pages mapped in the hardware, potentially creating a hardware
use-after-free scenario?
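The question above can be made concrete with a small userspace model. This is only a sketch of the concern as raised, not a confirmed driver bug: `toy_hw` and `toy_create_eq()` are hypothetical, the counters stand in for host buffers and device-side DMA regions, and the `destroy_region` flag models the extra cleanup call the review asks about.

```c
#include <stdbool.h>

struct toy_hw {
	int buffers_allocated;  /* host memory backing the queue */
	int regions_registered; /* DMA regions known to the device */
};

/* Models the create path with goto-based unwinding on failure. */
static int toy_create_eq(struct toy_hw *hw, bool eq_fails, bool destroy_region)
{
	int err;

	hw->buffers_allocated++;  /* models mana_gd_alloc_memory() */
	hw->regions_registered++; /* models mana_gd_create_dma_region() */

	err = eq_fails ? -1 : 0;  /* models mana_gd_create_eq() */
	if (err)
		goto out;
	return 0;

out:
	if (destroy_region)
		hw->regions_registered--; /* models the suggested region destroy */
	hw->buffers_allocated--;          /* models mana_gd_free_memory() */
	return err;
}
```

In this model, a failed create without the region destroy ends with the buffer count back at zero but one region still registered, which is the mismatch the review describes: the device would still reference memory the host has already freed.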
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260608101345.2267320-1-gargaditya@linux.microsoft.com?part=1
* Re: [PATCH net v2 2/2] net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check
From: sashiko-bot @ 2026-06-09 10:14 UTC
To: Aditya Garg; +Cc: linux-hyperv
Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
Pre-existing issues:
- [High] The patch prevents destroying uninitialized TX WQ objects that hold the `INVALID_MANA_HANDLE` sentinel, but it does not cover unreached elements in the zero-initialized array, so handle `0` can still be erroneously passed to firmware.
- [High] Hardware vport registrations and configurations are not cleanly reverted if subsequent queue allocations fail inside `mana_create_vport()`, permanently breaking the network interface.
--
commit 27abeb920ad813c7f2d581fe8f249fdc9e94666a
Author: Aditya Garg <gargaditya@linux.microsoft.com>
net: mana: guard TX wq object destroy with INVALID_MANA_HANDLE check
This commit adds a check in mana_destroy_txq() to prevent destroying
uninitialized TX WQ objects that hold the INVALID_MANA_HANDLE sentinel. This
avoids passing invalid handles to the firmware, which previously caused
rejected destroy requests and error logs. The change mirrors the existing
RX-side cleanup pattern.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index c9b1df1ed1098..d7de4c4d25bbe 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2334,7 +2334,8 @@ static void mana_destroy_txq(struct mana_port_context *apc)
>  			netif_napi_del_locked(napi);
>  			apc->tx_qp[i].txq.napi_initialized = false;
>  		}
> -		mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
> +		if (apc->tx_qp[i].tx_object != INVALID_MANA_HANDLE)
> +			mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
[Severity: High]
This is a pre-existing issue, but does this check safely handle all
uninitialized tx_object values?
In mana_create_txq(), apc->tx_qp is allocated using kzalloc_objs(), which
zero-initializes the array:
mana_create_txq() {
    apc->tx_qp = kzalloc_objs(struct mana_tx_qp, apc->num_queues);
    if (!apc->tx_qp)
        return -ENOMEM;
    for (i = 0; i < apc->num_queues; i++) {
        apc->tx_qp[i].tx_object = INVALID_MANA_HANDLE;
        ...
}
If mana_create_txq() fails at iteration i of the loop, the unvisited
elements (index j > i) still hold their zero-initialized values, meaning
tx_qp[j].tx_object == 0.
Since mana_destroy_txq() iterates over all apc->num_queues, and 0 is not
INVALID_MANA_HANDLE, the condition apc->tx_qp[i].tx_object !=
INVALID_MANA_HANDLE will be true for these unvisited queues.
Could this lead to erroneously calling mana_destroy_wq_obj() with handle 0,
potentially destroying an active WQ object used by another queue or port?
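The scenario described above can be checked with a small userspace model. This is a sketch under stated assumptions, not the driver's code: `toy_tx_qp`, `toy_create_txq()` and `toy_zero_handle_destroys()` are hypothetical, the handles `100 + i` are made up, and the `prefill` option is just one conceivable remedy that nobody in this thread has proposed.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define INVALID_MANA_HANDLE ((uint64_t)-1)

struct toy_tx_qp {
	uint64_t tx_object;
};

/*
 * Build a zeroed array and walk it like the creation loop described above,
 * failing at index fail_at before that entry gets a handle. With prefill
 * set, every entry is marked invalid before the loop starts.
 */
static struct toy_tx_qp *toy_create_txq(size_t n, size_t fail_at, int prefill)
{
	struct toy_tx_qp *qp = calloc(n, sizeof(*qp));
	size_t i;

	if (!qp)
		return NULL;
	for (i = 0; prefill && i < n; i++)
		qp[i].tx_object = INVALID_MANA_HANDLE;
	for (i = 0; i < n; i++) {
		qp[i].tx_object = INVALID_MANA_HANDLE;
		if (i == fail_at)
			break;             /* failure: later entries untouched */
		qp[i].tx_object = 100 + i; /* made-up firmware handle */
	}
	return qp;
}

/* Count destroy calls that would pass the sentinel check with handle 0. */
static size_t toy_zero_handle_destroys(const struct toy_tx_qp *qp, size_t n)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++) {
		if (qp[i].tx_object == INVALID_MANA_HANDLE)
			continue; /* skipped by the new guard */
		if (qp[i].tx_object == 0)
			hits++;
	}
	return hits;
}
```

For four queues with a failure at index 1, the model reports two destroy calls with handle 0 (indices 2 and 3) when entries are only marked inside the loop, and none when all entries are marked up front.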
>
> mana_deinit_cq(apc, &apc->tx_qp[i].tx_cq);
>
[Severity: High]
This isn't a bug introduced by this patch, but is there a resource leak in
mana_create_vport() if mana_create_txq() fails?
mana_create_vport() {
    ...
    err = mana_pf_register_hw_vport(apc);
    if (err)
        return err;
    err = mana_cfg_vport(apc, gd->pdid, gd->doorbell);
    if (err)
        return err;
    return mana_create_txq(apc, net);
}
If mana_create_txq() fails, it returns the error directly to
mana_alloc_queues() without jumping to a destroy_vport label. This bypasses
mana_destroy_vport() and skips cleaning up the hardware vport registrations.
Because the interface fails to open, apc->port_is_up remains false. Future
teardown attempts will skip mana_dealloc_queues(), leaving vport_use_count
permanently incremented.
Would this prevent the interface from ever being brought up again, as
mana_cfg_vport() enforces apc->vport_use_count == 0, requiring a driver
reload to recover?
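The use-count concern can be illustrated with a toy model. This is a sketch of the reviewer's reasoning only: `toy_port`, `toy_cfg_vport()`, `toy_uncfg_vport()` and `toy_create_vport()` are hypothetical, and returning -EBUSY when the count is nonzero is an assumption about how the real check reports the conflict.

```c
#include <errno.h>
#include <stdbool.h>

struct toy_port {
	int vport_use_count;
};

/* Models the exclusivity check described above: one user at a time. */
static int toy_cfg_vport(struct toy_port *p)
{
	if (p->vport_use_count > 0)
		return -EBUSY; /* assumed error code */
	p->vport_use_count++;
	return 0;
}

static void toy_uncfg_vport(struct toy_port *p)
{
	p->vport_use_count--;
}

/* Models vport creation; unwind selects whether a txq failure is undone. */
static int toy_create_vport(struct toy_port *p, bool txq_fails, bool unwind)
{
	int err = toy_cfg_vport(p);

	if (err)
		return err;

	err = txq_fails ? -ENOMEM : 0; /* models mana_create_txq() */
	if (err && unwind)
		toy_uncfg_vport(p);    /* hypothetical cleanup on failure */
	return err;
}
```

Without the unwind step, one failed attempt leaves the count at 1 and every later attempt is refused, which matches the "requires a driver reload" outcome the review asks about. With the unwind step, a retry succeeds.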
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260608101345.2267320-1-gargaditya@linux.microsoft.com?part=2