* [PATCH 1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq()
2026-04-30 3:57 [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Dipayaan Roy
@ 2026-04-30 3:57 ` Dipayaan Roy
2026-05-01 4:02 ` sashiko-bot
2026-04-30 3:57 ` [PATCH 2/3] net: mana: Skip WQ object destruction for uninitialized RXQ Dipayaan Roy
` (3 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Dipayaan Roy @ 2026-04-30 3:57 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
When mana_create_rxq() fails at mana_create_wq_obj() or any step before
xdp_rxq_info_reg() is called, the error path jumps to `out:` which calls
mana_destroy_rxq(). mana_destroy_rxq() unconditionally calls
xdp_rxq_info_unreg() on an xdp_rxq that was never registered,
triggering a WARN_ON in net/core/xdp.c:
mana 7870:00:00.0: HWC: Failed hw_channel req: 0xc000009a
mana 7870:00:00.0 eth7: Failed to create RXQ: err = -71
Driver BUG
WARNING: CPU: 442 PID: 491615 at ../net/core/xdp.c:150 xdp_rxq_info_unreg+0x44/0x70
Modules linked in: tcp_bbr xsk_diag udp_diag raw_diag unix_diag af_packet_diag netlink_diag nf_tables nfnetlink tcp_diag inet_diag binfmt_misc rpcsec_gss_krb5 nfsv3 nfs_acl auth_rpcgss nfsv4 dns_resolver nfs lockd ext4 grace crc16 iscsi_tcp mbcache fscache libiscsi_tcp jbd2 netfs rpcrdma af_packet sunrpc rdma_ucm ib_iser rdma_cm iw_cm iscsi_ibft ib_cm iscsi_boot_sysfs libiscsi rfkill scsi_transport_iscsi mana_ib ib_uverbs ib_core mana hyperv_drm(X) drm_shmem_helper intel_rapl_msr drm_kms_helper intel_rapl_common syscopyarea nls_iso8859_1 sysfillrect intel_uncore_frequency_common nls_cp437 vfat fat nfit sysimgblt libnvdimm hv_netvsc(X) hv_utils(X) fb_sys_fops hv_balloon(X) joydev fuse drm dm_mod configfs ip_tables x_tables xfs libcrc32c sd_mod nvme nvme_core nvme_common t10_pi crc64_rocksoft_generic crc64_rocksoft crc64 hid_generic serio_raw pci_hyperv(X) hv_storvsc(X) scsi_transport_fc hyperv_keyboard(X) hid_hyperv(X) pci_hyperv_intf(X) crc32_pclmul
crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd hv_vmbus(X) softdog sg scsi_mod efivarfs
Supported: Yes, External
CPU: 442 PID: 491615 Comm: ethtool Kdump: loaded Tainted: G X 5.14.21-150500.55.136-default #1 SLE15-SP5 a627be1b53abbfd64ad16b2685e4308c52847f42
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 07/25/2025
RIP: 0010:xdp_rxq_info_unreg+0x44/0x70
Code: e8 91 fe ff ff c7 43 0c 02 00 00 00 48 c7 03 00 00 00 00 5b c3 cc cc cc cc e9 58 3a 1c 00 48 c7 c7 f6 5f 19 97 e8 5c a4 7e ff <0f> 0b 83 7b 0c 01 74 ca 48 c7 c7 d9 5f 19 97 e8 48 a4 7e ff 0f 0b
RSP: 0018:ff3df6c8f7207818 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ff30d89f94808a80 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000002 RDI: ff30d94bdcca2908
RBP: 0000000000080000 R08: ffffffff98ed11a0 R09: ff3df6c8f72077a0
R10: dead000000000100 R11: 000000000000000a R12: 0000000000000000
R13: 0000000000002000 R14: 0000000000040000 R15: ff30d89f94800000
FS: 00007fe6d8432b80(0000) GS:ff30d94bdcc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6d81a89b1 CR3: 00000b3b6d578001 CR4: 0000000000371ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
Call Trace:
<TASK>
mana_destroy_rxq+0x5b/0x2f0 [mana 267acf7006bcb696095bba4d810643d1db3b9e94]
mana_create_rxq.isra.55+0x3db/0x720 [mana 267acf7006bcb696095bba4d810643d1db3b9e94]
? simple_lookup+0x36/0x50
? current_time+0x42/0x80
? __d_free_external+0x30/0x30
mana_alloc_queues+0x32a/0x470 [mana 267acf7006bcb696095bba4d810643d1db3b9e94]
? _raw_spin_unlock+0xa/0x30
? d_instantiate.part.29+0x2e/0x40
? _raw_spin_unlock+0xa/0x30
? debugfs_create_dir+0xe4/0x140
mana_attach+0x5c/0xf0 [mana 267acf7006bcb696095bba4d810643d1db3b9e94]
mana_set_ringparam+0xd5/0x1a0 [mana 267acf7006bcb696095bba4d810643d1db3b9e94]
ethnl_set_rings+0x292/0x320
genl_family_rcv_msg_doit.isra.15+0x11b/0x150
genl_rcv_msg+0xe3/0x1e0
? rings_prepare_data+0x80/0x80
? genl_family_rcv_msg_doit.isra.15+0x150/0x150
netlink_rcv_skb+0x50/0x100
genl_rcv+0x24/0x40
netlink_unicast+0x1b6/0x280
netlink_sendmsg+0x365/0x4d0
sock_sendmsg+0x5f/0x70
__sys_sendto+0x112/0x140
__x64_sys_sendto+0x24/0x30
do_syscall_64+0x5b/0x80
? handle_mm_fault+0xd7/0x290
? do_user_addr_fault+0x2d8/0x740
? exc_page_fault+0x67/0x150
entry_SYSCALL_64_after_hwframe+0x6b/0xd5
RIP: 0033:0x7fe6d8122f06
Code: 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 72 f3 c3 41 57 41 56 4d 89 c7 41 55 41 54 41
RSP: 002b:00007fff2b66b068 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 000055771123d2a0 RCX: 00007fe6d8122f06
RDX: 0000000000000034 RSI: 000055771123d3b0 RDI: 0000000000000003
RBP: 00007fff2b66b100 R08: 00007fe6d8203360 R09: 000000000000000c
R10: 0000000000000000 R11: 0000000000000246 R12: 000055771123d350
R13: 000055771123d340 R14: 0000000000000000 R15: 00007fff2b66b2b0
</TASK>
Guard the xdp_rxq_info_unreg() call with xdp_rxq_info_is_reg() so that
mana_destroy_rxq() is safe to call regardless of how far initialization
progressed.
Fixes: ed5356b53f07 ("net: mana: Add XDP support")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index a654b3699c4c..dfb4ba9f7664 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2520,7 +2520,9 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
napi_disable_locked(napi);
netif_napi_del_locked(napi);
}
- xdp_rxq_info_unreg(&rxq->xdp_rxq);
+
+ if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
+ xdp_rxq_info_unreg(&rxq->xdp_rxq);
mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
--
2.43.0
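The guard added by this patch can be illustrated with a small userspace
model. This is only a sketch: the struct, the state names and every
model_* function below are hypothetical stand-ins, not the kernel's
xdp_rxq_info API, and the warning counter merely imitates the "Driver BUG"
complaint seen in the trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the registration state of an xdp_rxq. */
enum model_reg_state {
	MODEL_STATE_NEW = 0,
	MODEL_STATE_REGISTERED,
	MODEL_STATE_UNREGISTERED,
};

struct rxq_info_model {
	enum model_reg_state state;
	int warnings;	/* counts would-be "Driver BUG" warnings */
};

static bool model_is_reg(const struct rxq_info_model *info)
{
	return info->state == MODEL_STATE_REGISTERED;
}

static void model_reg(struct rxq_info_model *info)
{
	info->state = MODEL_STATE_REGISTERED;
}

/* Unregistering something that was never registered is flagged. */
static void model_unreg(struct rxq_info_model *info)
{
	if (info->state != MODEL_STATE_REGISTERED)
		info->warnings++;
	info->state = MODEL_STATE_UNREGISTERED;
}

/* Teardown that is safe no matter how far initialization got. */
static void model_destroy(struct rxq_info_model *info)
{
	assert(info != NULL);
	if (model_is_reg(info))
		model_unreg(info);
}
```

In this model, calling model_destroy() on a zero-initialized object leaves
it untouched, while calling model_unreg() directly on it bumps the warning
counter, which corresponds to the unguarded call this patch removes.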
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq()
2026-04-30 3:57 ` [PATCH 1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq() Dipayaan Roy
@ 2026-05-01 4:02 ` sashiko-bot
0 siblings, 0 replies; 11+ messages in thread
From: sashiko-bot @ 2026-05-01 4:02 UTC (permalink / raw)
To: Dipayaan Roy; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Ignored return values for XDP memory model registration leads to page pool leaks
- [High] Hardware state corruption by requesting destruction of handle 0 in TX queue error path
--
commit 981d87e9d96d90f1ed0564dfd6bbd85f261ecd6d
Author: Dipayaan Roy <dipayanroy@linux.microsoft.com>
net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq()
When mana_create_rxq() fails before xdp_rxq_info_reg() is called, the error
path calls mana_destroy_rxq(). Guard xdp_rxq_info_unreg() with
xdp_rxq_info_is_reg() to prevent unregistering a never-registered xdp_rxq.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index a654b3699c4c..dfb4ba9f7664 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2520,7 +2520,9 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
> napi_disable_locked(napi);
> netif_napi_del_locked(napi);
> }
> - xdp_rxq_info_unreg(&rxq->xdp_rxq);
> +
> + if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
> + xdp_rxq_info_unreg(&rxq->xdp_rxq);
This problem wasn't introduced by this patch, but it is a pre-existing logic
error not fixed later in the series.
In mana_create_rxq(), the calls to xdp_rxq_info_reg() and
xdp_rxq_info_reg_mem_model() are wrapped in WARN_ON() but their negative
return codes are never assigned to err or handled:
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_create_rxq() {
...
WARN_ON(xdp_rxq_info_reg(&rxq->xdp_rxq, ndev, rxq_idx,
cq->napi.napi_id));
WARN_ON(xdp_rxq_info_reg_mem_model(&rxq->xdp_rxq, MEM_TYPE_PAGE_POOL,
rxq->page_pool));
napi_enable_locked(&cq->napi);
...
}
If xdp_rxq_info_reg_mem_model() fails (e.g., due to an ida_alloc failure
under memory pressure), the warning is printed but the queue creation
successfully completes.
Consequently, rxq->xdp_rxq is left with a zeroed mem_info.type
(MEM_TYPE_PAGE_ORDER0).
When an XDP program drops or redirects packets, will xdp_return_frame() read
the incorrect memory type and call put_page() instead of returning the page
to the page_pool?
Does this permanently leak the page_pool inflight count, resulting in an
indefinite system hang when page_pool_destroy() is called during interface
teardown?
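One possible shape for handling those return codes, shown as a userspace
sketch rather than a proposed patch: each step's result is assigned to err
and checked before the next step runs. The struct and the model_* helpers
are hypothetical, and the injected return values stand in for whatever
xdp_rxq_info_reg() and xdp_rxq_info_reg_mem_model() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical record of which setup steps completed. */
struct rx_setup_model {
	int reg_done;
	int mem_model_done;
	int napi_enabled;
};

/* Each helper returns the injected code and records success only on 0. */
static int model_info_reg(struct rx_setup_model *s, int injected_ret)
{
	if (injected_ret)
		return injected_ret;
	s->reg_done = 1;
	return 0;
}

static int model_reg_mem_model(struct rx_setup_model *s, int injected_ret)
{
	if (injected_ret)
		return injected_ret;
	s->mem_model_done = 1;
	return 0;
}

/* Propagate the first failure instead of only warning about it. */
static int model_finish_rxq_setup(struct rx_setup_model *s,
				  int reg_ret, int mem_ret)
{
	int err;

	assert(s != NULL);

	err = model_info_reg(s, reg_ret);
	if (err)
		return err;

	err = model_reg_mem_model(s, mem_ret);
	if (err)
		return err;

	s->napi_enabled = 1;
	return 0;
}
```

With this structure a failing memory-model registration stops the setup
before NAPI is enabled, so the caller can unwind through its normal error
path.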
> mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
This problem wasn't introduced by this patch, but it is a pre-existing flaw.
While a subsequent patch in the series ('net: mana: Skip WQ object destruction
for uninitialized RXQ') fixes a similar issue for RX queues by checking for
INVALID_MANA_HANDLE, the same issue remains unaddressed for TX queues.
In mana_create_txq(), apc->tx_qp is zero-allocated. The initialization loop
assigns tx_object = INVALID_MANA_HANDLE only when it reaches index i.
If initialization fails early (e.g., at i = 0), the error path calls
mana_destroy_txq(), which iterates over all apc->num_queues and
unconditionally calls mana_destroy_wq_obj():
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_create_txq() {
...
apc->tx_qp = kzalloc_objs(struct mana_tx_qp, apc->num_queues);
...
for (i = 0; i < apc->num_queues; i++) {
apc->tx_qp[i].tx_object = INVALID_MANA_HANDLE;
...
err = mana_create_wq_obj(apc, apc->port_handle, GDMA_SQ,
&wq_spec, &cq_spec,
&apc->tx_qp[i].tx_object);
if (err)
goto out;
...
out:
netdev_err(net, "Failed to create %d TX queues, %d\n",
apc->num_queues, err);
mana_destroy_txq(apc);
return err;
}
drivers/net/ethernet/microsoft/mana/mana_en.c:mana_destroy_txq() {
...
for (i = 0; i < apc->num_queues; i++) {
...
mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
...
}
For all unvisited queues, tx_object will be 0 (from kzalloc).
Since mana_destroy_wq_obj() does not validate the handle, could this
inadvertently ask the hardware to destroy handle 0, which might be a valid,
unrelated queue, and corrupt the hardware state?
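A userspace sketch of one way to avoid this, assuming the sentinel is set
on every entry before any queue is created and that teardown skips entries
still holding it. MODEL_INVALID_HANDLE, struct txqp_model and the model_*
functions are hypothetical; the counter stands in for destroy commands
sent to the hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel playing the role of INVALID_MANA_HANDLE. */
#define MODEL_INVALID_HANDLE UINT64_MAX

struct txqp_model {
	uint64_t tx_object;
};

static int model_tx_destroy_cmds;	/* destroy requests that were issued */

/* Allocate n entries and mark all of them invalid up front. */
static struct txqp_model *model_alloc_txqs(unsigned int n)
{
	struct txqp_model *qp = calloc(n, sizeof(*qp));
	unsigned int i;

	if (!qp)
		return NULL;
	for (i = 0; i < n; i++)
		qp[i].tx_object = MODEL_INVALID_HANDLE;
	return qp;
}

/* Only entries that hold a real handle generate a destroy request. */
static void model_destroy_txqs(struct txqp_model *qp, unsigned int n)
{
	unsigned int i;

	assert(qp != NULL);
	for (i = 0; i < n; i++) {
		if (qp[i].tx_object == MODEL_INVALID_HANDLE)
			continue;
		model_tx_destroy_cmds++;
		qp[i].tx_object = MODEL_INVALID_HANDLE;
	}
	free(qp);
}
```

If creation fails after the first queue, only that one entry holds a real
handle, so the model issues a single destroy request and none for handle 0.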
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260430035935.1859220-1-dipayanroy@linux.microsoft.com?part=1
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/3] net: mana: Skip WQ object destruction for uninitialized RXQ
2026-04-30 3:57 [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Dipayaan Roy
2026-04-30 3:57 ` [PATCH 1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq() Dipayaan Roy
@ 2026-04-30 3:57 ` Dipayaan Roy
2026-05-01 4:02 ` sashiko-bot
2026-04-30 3:57 ` [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path Dipayaan Roy
` (2 subsequent siblings)
4 siblings, 1 reply; 11+ messages in thread
From: Dipayaan Roy @ 2026-04-30 3:57 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
In mana_destroy_rxq(), mana_destroy_wq_obj() is called unconditionally
even when the WQ object was never created (rxobj is still
INVALID_MANA_HANDLE). When mana_create_rxq() fails before
mana_create_wq_obj() succeeds, the error path calls mana_destroy_rxq()
which sends a bogus destroy command to the hardware:
mana 7870:00:00.0: HWC: Failed hw_channel req: 0x1d
mana 7870:00:00.0: Failed to send mana message: -71, 0x1d
mana 7870:00:00.0 eth7: Failed to destroy WQ object: -71
Guard mana_destroy_wq_obj() with an INVALID_MANA_HANDLE check so that
mana_destroy_rxq() is safe to call at any stage of RXQ initialization.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index dfb4ba9f7664..f2a6ea162dc3 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2524,7 +2524,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
xdp_rxq_info_unreg(&rxq->xdp_rxq);
- mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
+ if (rxq->rxobj != INVALID_MANA_HANDLE)
+ mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
mana_deinit_cq(apc, &rxq->rx_cq);
--
2.43.0
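The sentinel check in this patch follows a common pattern, sketched below
in userspace C. Everything here is hypothetical: MODEL_INVALID_HANDLE
stands in for INVALID_MANA_HANDLE, and the counter stands in for destroy
commands sent to the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sentinel meaning "no WQ object was created". */
#define MODEL_INVALID_HANDLE UINT64_MAX

struct rxq_handle_model {
	uint64_t rxobj;
};

static int model_rx_destroy_cmds;	/* destroy requests that were issued */

static void model_rxq_init(struct rxq_handle_model *rxq)
{
	rxq->rxobj = MODEL_INVALID_HANDLE;
}

static void model_destroy_wq_obj(uint64_t handle)
{
	(void)handle;
	model_rx_destroy_cmds++;
}

/* Skip the destroy request when the object was never created. */
static void model_destroy_rxq_obj(struct rxq_handle_model *rxq)
{
	assert(rxq != NULL);
	if (rxq->rxobj == MODEL_INVALID_HANDLE)
		return;
	model_destroy_wq_obj(rxq->rxobj);
	rxq->rxobj = MODEL_INVALID_HANDLE;
}
```

Resetting the handle to the sentinel after a successful destroy is a
choice made in this sketch so that a repeated call is harmless; the patch
itself only adds the check.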
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/3] net: mana: Skip WQ object destruction for uninitialized RXQ
2026-04-30 3:57 ` [PATCH 2/3] net: mana: Skip WQ object destruction for uninitialized RXQ Dipayaan Roy
@ 2026-05-01 4:02 ` sashiko-bot
0 siblings, 0 replies; 11+ messages in thread
From: sashiko-bot @ 2026-05-01 4:02 UTC (permalink / raw)
To: Dipayaan Roy; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [Critical] Critical Use-After-Free and Double-Free in `mana_create_rxq()` error path.
- [High] Incomplete Fix: The identical bug fixed in `mana_destroy_rxq()` exists in `mana_destroy_txq()`.
--
commit c317e37e7bd422dc85e0c08611eda7f514d25eef
Author: Dipayaan Roy <dipayanroy@linux.microsoft.com>
net: mana: Skip WQ object destruction for uninitialized RXQ
This commit adds a check for INVALID_MANA_HANDLE in mana_destroy_rxq()
to prevent sending a bogus destroy command to the hardware when
mana_create_rxq() fails before the WQ object is created.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index dfb4ba9f7664..f2a6ea162dc3 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2524,7 +2524,8 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
> if (xdp_rxq_info_is_reg(&rxq->xdp_rxq))
> xdp_rxq_info_unreg(&rxq->xdp_rxq);
>
> - mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
> + if (rxq->rxobj != INVALID_MANA_HANDLE)
> + mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
Does this incomplete fix leave a regression in the TX queue destruction path?
In mana_create_txq(), apc->tx_qp is allocated via kzalloc_objs(), defaulting
tx_object to 0. It is set to INVALID_MANA_HANDLE only at the start of each
queue's loop iteration.
If mana_create_txq() fails early, it calls mana_destroy_txq(apc), which
unconditionally iterates over all apc->num_queues and calls:
mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
For the failing queue, it sends a destroy command for INVALID_MANA_HANDLE,
and for unreached queues, it sends 0. Should mana_destroy_txq() conditionally
check if tx_object is initialized before destroying it, similar to this fix?
While looking at this error path, I noticed a pre-existing regression that
is noted as fixed in a later commit (fbbff2b6d2993ad net: mana: remove
double CQ cleanup in mana_create_rxq error path).
In mana_create_rxq(), a local pointer cq is assigned to &rxq->rx_cq. If an
error occurs, it calls mana_destroy_rxq(), which frees rxq. Immediately
after, the error path checks if (cq) and calls mana_deinit_cq(apc, cq).
Since cq points into the freed rxq memory, does this result in a
use-after-free and double-free when mana_deinit_cq() reads cq->gdma_cq?
>
> mana_deinit_cq(apc, &rxq->rx_cq);
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260430035935.1859220-1-dipayanroy@linux.microsoft.com?part=2
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path
2026-04-30 3:57 [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Dipayaan Roy
2026-04-30 3:57 ` [PATCH 1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq() Dipayaan Roy
2026-04-30 3:57 ` [PATCH 2/3] net: mana: Skip WQ object destruction for uninitialized RXQ Dipayaan Roy
@ 2026-04-30 3:57 ` Dipayaan Roy
2026-04-30 4:14 ` Aditya Garg
2026-05-01 4:03 ` sashiko-bot
2026-05-02 16:52 ` [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Simon Horman
2026-05-05 10:20 ` patchwork-bot+netdevbpf
4 siblings, 2 replies; 11+ messages in thread
From: Dipayaan Roy @ 2026-04-30 3:57 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
In mana_create_rxq(), the error cleanup path calls mana_destroy_rxq()
followed by mana_deinit_cq(). This is incorrect for two reasons:
1. mana_destroy_rxq() already calls mana_deinit_cq() internally,
so the CQ's GDMA queue is destroyed twice.
2. mana_destroy_rxq() frees the rxq via kfree(rxq) before returning.
The subsequent mana_deinit_cq(apc, cq) then operates on freed memory
since cq points to &rxq->rx_cq, which is embedded in the
already-freed rxq structure — a use-after-free.
Remove the redundant mana_deinit_cq() call from the error path since
mana_destroy_rxq() already handles CQ cleanup. mana_deinit_cq() is
itself safe for an uninitialized CQ as it checks for a NULL gdma_cq
before proceeding.
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
---
drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
index f2a6ea162dc3..9afc786b297a 100644
--- a/drivers/net/ethernet/microsoft/mana/mana_en.c
+++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
@@ -2799,9 +2799,6 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc,
mana_destroy_rxq(apc, rxq, false);
- if (cq)
- mana_deinit_cq(apc, cq);
-
return NULL;
}
--
2.43.0
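The ownership rule behind this patch can be sketched in userspace C: the
destroy function is the only place that tears down the embedded CQ and
frees the containing structure, so the creation error path makes exactly
one call and touches nothing afterwards. All struct and function names
below are hypothetical models, not the driver's code.

```c
#include <assert.h>
#include <stdlib.h>

struct cq_model {
	void *gdma_cq;	/* NULL means the CQ was never set up */
};

struct rxq_owner_model {
	struct cq_model rx_cq;	/* embedded, freed together with the rxq */
};

static int model_cq_deinit_calls;	/* deinit calls that did real work */

/* Safe on an uninitialized CQ: returns early when gdma_cq is NULL. */
static void model_deinit_cq(struct cq_model *cq)
{
	if (!cq->gdma_cq)
		return;
	free(cq->gdma_cq);
	cq->gdma_cq = NULL;
	model_cq_deinit_calls++;
}

/* Single owner of the teardown: deinit the CQ, then free the rxq. */
static void model_destroy_rxq(struct rxq_owner_model *rxq)
{
	if (!rxq)
		return;
	model_deinit_cq(&rxq->rx_cq);
	free(rxq);
}

static struct rxq_owner_model *model_create_rxq(int inject_failure)
{
	struct rxq_owner_model *rxq = calloc(1, sizeof(*rxq));

	if (!rxq)
		return NULL;
	rxq->rx_cq.gdma_cq = malloc(16);
	if (!rxq->rx_cq.gdma_cq || inject_failure)
		goto out;
	return rxq;
out:
	/* rxq is freed here; pointers into it must not be used again. */
	model_destroy_rxq(rxq);
	return NULL;
}
```

A second model_deinit_cq() call after model_destroy_rxq() would read
memory that has already been freed, which is the use-after-free the commit
message describes.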
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path
2026-04-30 3:57 ` [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path Dipayaan Roy
@ 2026-04-30 4:14 ` Aditya Garg
2026-05-01 4:03 ` sashiko-bot
1 sibling, 0 replies; 11+ messages in thread
From: Aditya Garg @ 2026-04-30 4:14 UTC (permalink / raw)
To: Dipayaan Roy, kys, haiyangz, wei.liu, decui, andrew+netdev, davem,
edumazet, kuba, pabeni, leon, longli, kotaranov, horms,
shradhagupta, ssengar, ernis, shirazsaleem, linux-hyperv, netdev,
linux-kernel, linux-rdma, stephen, jacob.e.keller, dipayanroy,
leitao, kees, john.fastabend, hawk, bpf, daniel, ast, sdf,
yury.norov
On 30-04-2026 09:27, Dipayaan Roy wrote:
> In mana_create_rxq(), the error cleanup path calls mana_destroy_rxq()
> followed by mana_deinit_cq(). This is incorrect for two reasons:
>
> 1. mana_destroy_rxq() already calls mana_deinit_cq() internally,
> so the CQ's GDMA queue is destroyed twice.
>
> 2. mana_destroy_rxq() frees the rxq via kfree(rxq) before returning.
> The subsequent mana_deinit_cq(apc, cq) then operates on freed memory
> since cq points to &rxq->rx_cq, which is embedded in the
> already-freed rxq structure — a use-after-free.
>
> Remove the redundant mana_deinit_cq() call from the error path since
> mana_destroy_rxq() already handles CQ cleanup. mana_deinit_cq() is
> itself safe for an uninitialized CQ as it checks for a NULL gdma_cq
> before proceeding.
>
> Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
> ---
> drivers/net/ethernet/microsoft/mana/mana_en.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index f2a6ea162dc3..9afc786b297a 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2799,9 +2799,6 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc,
>
> mana_destroy_rxq(apc, rxq, false);
>
> - if (cq)
> - mana_deinit_cq(apc, cq);
> -
> return NULL;
> }
>
Reviewed-by: Aditya Garg <gargaditya@linux.microsoft.com>
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path
2026-04-30 3:57 ` [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path Dipayaan Roy
2026-04-30 4:14 ` Aditya Garg
@ 2026-05-01 4:03 ` sashiko-bot
1 sibling, 0 replies; 11+ messages in thread
From: sashiko-bot @ 2026-05-01 4:03 UTC (permalink / raw)
To: Dipayaan Roy; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] Uninitialized CQ ID in error path clobbers global `cq_table[0]`
- [High] Uninitialized `tx_object` handles in `mana_create_txq` error path destroy valid WQ object 0
--
commit fbbff2b6d2993ad6206b12c69d7d995282d02696
Author: Dipayaan Roy <dipayanroy@linux.microsoft.com>
net: mana: remove double CQ cleanup in mana_create_rxq error path
This commit removes a redundant mana_deinit_cq() call in the error path of
mana_create_rxq(). Previously, mana_destroy_rxq() was called first, which
already cleans up the CQ and frees the rxq structure. The subsequent
mana_deinit_cq() call caused a use-after-free by accessing the embedded
CQ from the already-freed structure.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index f2a6ea162dc3..9afc786b297a 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -2799,9 +2799,6 @@ static struct mana_rxq *mana_create_rxq(struct mana_port_context *apc,
>
> mana_destroy_rxq(apc, rxq, false);
This isn't a bug introduced by this commit, but while reviewing this error
path I noticed a pre-existing issue. Does this sequence clobber the global
cq_table[0]?
When mana_gd_create_mana_wq_cq() allocates a queue via kzalloc_obj(),
queue->id is initialized to 0.
The actual hardware queue ID is only assigned later when
mana_create_wq_obj() successfully completes.
If mana_create_wq_obj() fails in mana_create_rxq(), the error path calls:
mana_destroy_rxq() -> mana_deinit_cq() -> mana_gd_destroy_queue() ->
mana_gd_destroy_cq()
Inside mana_gd_destroy_cq(), it uses queue->id (which is still 0) to clear
the gc->cq_table[id] array entry:
mana_gd_destroy_cq() {
u32 id = queue->id;
...
gc->cq_table[id] = NULL;
}
This unintentionally clobbers the cq_table entry for CQ 0, which is a
valid queue ID (e.g., belonging to the hardware control channel or
another active queue).
Could this lead to dropped completions for CQ 0 and potentially hang the
device? Should queue->id be explicitly initialized to INVALID_QUEUE_ID?
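The idea of an invalid-ID sentinel can be sketched in userspace C as
follows. This is an illustration under assumptions, not the driver's code:
MODEL_INVALID_QUEUE_ID, MODEL_MAX_CQ, the table and the model_* functions
are all hypothetical, and the ownership check on the table slot is an
extra precaution added for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INVALID_QUEUE_ID UINT32_MAX	/* assumed sentinel */
#define MODEL_MAX_CQ 8				/* assumed table size */

struct queue_model {
	uint32_t id;
};

static struct queue_model *model_cq_table[MODEL_MAX_CQ];

/* Mark the queue as having no hardware ID yet. */
static void model_queue_init(struct queue_model *q)
{
	q->id = MODEL_INVALID_QUEUE_ID;
}

/* Clear the table slot only when it really belongs to this queue. */
static void model_destroy_cq(struct queue_model *q)
{
	uint32_t id;

	assert(q != NULL);
	id = q->id;
	if (id >= MODEL_MAX_CQ)
		return;		/* ID never assigned, or out of range */
	if (model_cq_table[id] != q)
		return;		/* slot is owned by another queue */
	model_cq_table[id] = NULL;
}
```

In this model, destroying a queue that never received an ID leaves slot 0
alone, whereas a zero-initialized ID would have matched slot 0.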
I also noticed a similar pre-existing issue in the mana_create_txq()
error path. Can this destroy valid WQ object 0?
In mana_create_txq(), apc->tx_qp is allocated using kzalloc_objs(), meaning
all tx_object fields initially default to 0.
The initialization loop sets tx_object = INVALID_MANA_HANDLE iteratively.
If queue creation fails midway (e.g., at i = 0), mana_create_txq() jumps to
the error path and calls mana_destroy_txq().
mana_destroy_txq() loops over all apc->num_queues and unconditionally calls
mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object).
For the queue indices that were not yet reached by the initialization loop,
tx_object remains 0. Because 0 is a valid hardware handle, this issues
spurious hardware requests to destroy WQ object 0 multiple times.
Could this inadvertently destroy an active queue belonging to another VF/PF
or the hardware control channel? Should all tx_object handles be initialized
to INVALID_MANA_HANDLE before the queue creation loop begins?
> - if (cq)
> - mana_deinit_cq(apc, cq);
> -
> return NULL;
> }
>
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260430035935.1859220-1-dipayanroy@linux.microsoft.com?part=3
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init
2026-04-30 3:57 [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Dipayaan Roy
` (2 preceding siblings ...)
2026-04-30 3:57 ` [PATCH 3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path Dipayaan Roy
@ 2026-05-02 16:52 ` Simon Horman
2026-05-03 3:38 ` Dipayaan Roy
2026-05-05 10:20 ` patchwork-bot+netdevbpf
4 siblings, 1 reply; 11+ messages in thread
From: Simon Horman @ 2026-05-02 16:52 UTC (permalink / raw)
To: Dipayaan Roy
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, shradhagupta, ssengar,
ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On Wed, Apr 29, 2026 at 08:57:51PM -0700, Dipayaan Roy wrote:
> When mana_create_rxq() fails partway through initialization (e.g. the
> hardware rejects the WQ object creation), the error path calls
> mana_destroy_rxq() to tear down a partially-initialized RXQ.
> This exposed several problems in the mana_destroy_rxq() path, which
> assumed the RXQ was always fully initialized:
>
> 1. xdp_rxq_info_unreg() was called on an unregistered xdp_rxq,
> triggering a WARN_ON ("Driver BUG") in net/core/xdp.c.
>
> 2. mana_destroy_wq_obj() was called with INVALID_MANA_HANDLE,
> sending a bogus destroy command to the hardware.
>
> 3. mana_deinit_cq() was called twice — once inside mana_destroy_rxq()
> and again in mana_create_rxq()'s error path — causing a
> use-after-free since mana_destroy_rxq() frees the rxq first.
>
> This was observed during ethtool ring parameter changes when the
> hardware returned an error creating the RXQ. This series makes
> mana_destroy_rxq() safe to call at any stage of RXQ initialization
> by guarding each teardown step, and removes the redundant cleanup
> in mana_create_rxq().
For the series:
Reviewed-by: Simon Horman <horms@kernel.org>
I don't think that you need to repost for this. But please keep in mind for
future submissions that fixes for code present in the net tree should be
targeted at that tree, like this:
Subject: [PATCH net vN/M] ...
Also, FTR, there is an AI generated review of this patch-set available
on sashiko.dev. It seems to me that the issues flagged there pre-date
this patch-set and should not block progress of it. But you may wish
to use that review as the basis of some follow-up.
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init
2026-05-02 16:52 ` [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Simon Horman
@ 2026-05-03 3:38 ` Dipayaan Roy
0 siblings, 0 replies; 11+ messages in thread
From: Dipayaan Roy @ 2026-05-03 3:38 UTC (permalink / raw)
To: Simon Horman
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, shradhagupta, ssengar,
ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
On Sat, May 02, 2026 at 05:52:58PM +0100, Simon Horman wrote:
> On Wed, Apr 29, 2026 at 08:57:51PM -0700, Dipayaan Roy wrote:
> > When mana_create_rxq() fails partway through initialization (e.g. the
> > hardware rejects the WQ object creation), the error path calls
> > mana_destroy_rxq() to tear down a partially-initialized RXQ.
> > This exposed several problems in the mana_destroy_rxq() path, which
> > assumed the RXQ was always fully initialized:
> >
> > 1. xdp_rxq_info_unreg() was called on an unregistered xdp_rxq,
> > triggering a WARN_ON ("Driver BUG") in net/core/xdp.c.
> >
> > 2. mana_destroy_wq_obj() was called with INVALID_MANA_HANDLE,
> > sending a bogus destroy command to the hardware.
> >
> > 3. mana_deinit_cq() was called twice — once inside mana_destroy_rxq()
> > and again in mana_create_rxq()'s error path — causing a
> > use-after-free since mana_destroy_rxq() frees the rxq first.
> >
> > This was observed during ethtool ring parameter changes when the
> > hardware returned an error creating the RXQ. This series makes
> > mana_destroy_rxq() safe to call at any stage of RXQ initialization
> > by guarding each teardown step, and removes the redundant cleanup
> > in mana_create_rxq().
>
> For the series:
>
> Reviewed-by: Simon Horman <horms@kernel.org>
>
> I don't think that you need to repost for this. But please keep in mind for
> future submissions that fixes for code present in the net tree should be
> targeted at that tree, like this:
>
> Subject: [PATCH net vN/M] ...
Thanks Simon,
it was a typo from my end.
>
> Also, FTR, there is an AI generated review of this patch-set available
> on sashiko.dev. It seems to me that the issues flagged there pre-date
> this patch-set and should not block progress of it. But you may wish
> to use that review as the basis of some follow-up.
Agreed, Sashiko flagged a pre-existing issue in the tx path.
I will send that as a separate patch set.
Thanks and Regards
Dipayaan Roy
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init
2026-04-30 3:57 [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Dipayaan Roy
` (3 preceding siblings ...)
2026-05-02 16:52 ` [PATCH 0/3] net: mana: Fix mana_destroy_rxq() cleanup for partial RXQ init Simon Horman
@ 2026-05-05 10:20 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-05 10:20 UTC (permalink / raw)
To: Dipayaan Roy
Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
Hello:
This series was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:
On Wed, 29 Apr 2026 20:57:51 -0700 you wrote:
> When mana_create_rxq() fails partway through initialization (e.g. the
> hardware rejects the WQ object creation), the error path calls
> mana_destroy_rxq() to tear down a partially-initialized RXQ.
> This exposed several problems in the mana_destroy_rxq() path, which
> assumed the RXQ was always fully initialized:
>
> 1. xdp_rxq_info_unreg() was called on an unregistered xdp_rxq,
> triggering a WARN_ON ("Driver BUG") in net/core/xdp.c.
>
> [...]
Here is the summary with links:
- [1/3] net: mana: check xdp_rxq registration before unreg in mana_destroy_rxq()
https://git.kernel.org/netdev/net/c/e9e334f8063a
- [2/3] net: mana: Skip WQ object destruction for uninitialized RXQ
https://git.kernel.org/netdev/net/c/2a1c69118282
- [3/3] net: mana: remove double CQ cleanup in mana_create_rxq error path
https://git.kernel.org/netdev/net/c/3985c9a56da4
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 11+ messages in thread