* [PATCH net 0/2] net: mana: Fix NULL dereferences during teardown after attach failure.
@ 2026-05-18 19:43 Dipayaan Roy
2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy
2026-05-18 19:43 ` [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached Dipayaan Roy
0 siblings, 2 replies; 6+ messages in thread
From: Dipayaan Roy @ 2026-05-18 19:43 UTC (permalink / raw)
To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet,
kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta,
ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel,
linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees,
john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov
When mana_attach() fails (e.g. during queue allocation), the error
cleanup frees apc->tx_qp and apc->rxqs and sets them to NULL. Multiple
subsequent teardown paths can then dereference these NULL pointers,
causing kernel panics.
Patch 1 adds NULL guards in the low-level teardown functions
(mana_fence_rqs, mana_destroy_vport, mana_dealloc_queues) so they are
safe to call regardless of queue initialization state. This covers all
callers: mana_remove(), mana_change_mtu() recovery, and internal error
paths in mana_alloc_queues().
Patch 2 addresses the queue reset work handler specifically, where an
unconditional mana_detach() on an already-detached port caused
redundant teardown. It checks netif_device_present() to skip the detach
and directly retry mana_attach().
Dipayaan Roy (2):
net: mana: Add NULL guards in teardown path to prevent panic on attach
failure
net: mana: Skip redundant detach in queue reset handler if already
detached
drivers/net/ethernet/microsoft/mana/mana_en.c | 77 ++++++++++++-------
1 file changed, 48 insertions(+), 29 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 6+ messages in thread* [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure 2026-05-18 19:43 [PATCH net 0/2] net: mana: Fix NULL dereferences during teardown after attach failure Dipayaan Roy @ 2026-05-18 19:43 ` Dipayaan Roy 2026-05-18 21:49 ` Harshitha Ramamurthy 2026-05-19 19:47 ` sashiko-bot 2026-05-18 19:43 ` [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached Dipayaan Roy 1 sibling, 2 replies; 6+ messages in thread From: Dipayaan Roy @ 2026-05-18 19:43 UTC (permalink / raw) To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet, kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel, linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees, john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov When queue allocation fails partway through, the error cleanup frees and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as mana_remove(), mana_change_mtu() recovery, and internal error handling in mana_alloc_queues() can subsequently call into functions that dereference these pointers without NULL checks: - mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26). - mana_destroy_vport() iterates apc->rxqs without a NULL check. - mana_fence_rqs() iterates apc->rxqs without a NULL check. - mana_dealloc_queues() iterates apc->tx_qp without a NULL check. Add NULL guards for apc->rxqs in mana_fence_rqs(), mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue draining when TX queues were never allocated or already freed. Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> --- drivers/net/ethernet/microsoft/mana/mana_en.c | 70 +++++++++++-------- 1 file changed, 41 insertions(+), 29 deletions(-) diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 9afc786b297a..0582803907a8 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1727,6 +1727,9 @@ static void mana_fence_rqs(struct mana_port_context *apc) struct mana_rxq *rxq; int err; + if (!apc->rxqs) + return; + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { rxq = apc->rxqs[rxq_idx]; err = mana_fence_rq(apc, rxq); @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc) struct mana_rxq *rxq; u32 rxq_idx; - for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { - rxq = apc->rxqs[rxq_idx]; - if (!rxq) - continue; + if (apc->rxqs) { - mana_destroy_rxq(apc, rxq, true); - apc->rxqs[rxq_idx] = NULL; + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { + rxq = apc->rxqs[rxq_idx]; + if (!rxq) + continue; + + mana_destroy_rxq(apc, rxq, true); + apc->rxqs[rxq_idx] = NULL; + } } mana_destroy_txq(apc); @@ -3269,7 +3275,8 @@ static int mana_dealloc_queues(struct net_device *ndev) if (apc->port_is_up) return -EINVAL; - mana_chn_setxdp(apc, NULL); + if (apc->rxqs) + mana_chn_setxdp(apc, NULL); if (gd->gdma_context->is_pf && !apc->ac->bm_hostmode) mana_pf_deregister_filter(apc); @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev) * number of queues. */ - for (i = 0; i < apc->num_queues; i++) { - txq = &apc->tx_qp[i].txq; - tsleep = 1000; - while (atomic_read(&txq->pending_sends) > 0 && - time_before(jiffies, timeout)) { - usleep_range(tsleep, tsleep + 1000); - tsleep <<= 1; - } - if (atomic_read(&txq->pending_sends)) { - err = pcie_flr(to_pci_dev(gd->gdma_context->dev)); - if (err) { - netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", - err, atomic_read(&txq->pending_sends), - txq->gdma_txq_id); + if (apc->tx_qp) { + for (i = 0; i < apc->num_queues; i++) { + txq = &apc->tx_qp[i].txq; + tsleep = 1000; + while (atomic_read(&txq->pending_sends) > 0 && + time_before(jiffies, timeout)) { + usleep_range(tsleep, tsleep + 1000); + tsleep <<= 1; + } + if (atomic_read(&txq->pending_sends)) { + err = + pcie_flr(to_pci_dev(gd->gdma_context->dev)); + if (err) { + netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", + err, + atomic_read(&txq->pending_sends), + txq->gdma_txq_id); + } + break; } - break; } - } - for (i = 0; i < apc->num_queues; i++) { - txq = &apc->tx_qp[i].txq; - while ((skb = skb_dequeue(&txq->pending_skbs))) { - mana_unmap_skb(skb, apc); - dev_kfree_skb_any(skb); + for (i = 0; i < apc->num_queues; i++) { + txq = &apc->tx_qp[i].txq; + while ((skb = skb_dequeue(&txq->pending_skbs))) { + mana_unmap_skb(skb, apc); + dev_kfree_skb_any(skb); + } + atomic_set(&txq->pending_sends, 0); } - atomic_set(&txq->pending_sends, 0); } + /* We're 100% sure the queues can no longer be woken up, because * we're sure now mana_poll_tx_cq() can't be running. */ -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure 2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy @ 2026-05-18 21:49 ` Harshitha Ramamurthy 2026-05-19 19:47 ` sashiko-bot 1 sibling, 0 replies; 6+ messages in thread From: Harshitha Ramamurthy @ 2026-05-18 21:49 UTC (permalink / raw) To: Dipayaan Roy Cc: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet, kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel, linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees, john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov On Mon, May 18, 2026 at 12:52 PM Dipayaan Roy <dipayanroy@linux.microsoft.com> wrote: > > When queue allocation fails partway through, the error cleanup frees > and NULLs apc->tx_qp and apc->rxqs. Multiple teardown paths such as > mana_remove(), mana_change_mtu() recovery, and internal error handling > in mana_alloc_queues() can subsequently call into functions that > dereference these pointers without NULL checks: > > - mana_chn_setxdp() dereferences apc->rxqs[0], causing a NULL pointer > dereference panic (CR2: 0000000000000000 at mana_chn_setxdp+0x26). > - mana_destroy_vport() iterates apc->rxqs without a NULL check. > - mana_fence_rqs() iterates apc->rxqs without a NULL check. > - mana_dealloc_queues() iterates apc->tx_qp without a NULL check. > > Add NULL guards for apc->rxqs in mana_fence_rqs(), > mana_destroy_vport(), and before the mana_chn_setxdp() call. Add a > NULL guard for apc->tx_qp in mana_dealloc_queues() to skip TX queue > draining when TX queues were never allocated or already freed. > > Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") > Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> > Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> > --- > drivers/net/ethernet/microsoft/mana/mana_en.c | 70 +++++++++++-------- > 1 file changed, 41 insertions(+), 29 deletions(-) > > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c > index 9afc786b297a..0582803907a8 100644 > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c > @@ -1727,6 +1727,9 @@ static void mana_fence_rqs(struct mana_port_context *apc) > struct mana_rxq *rxq; > int err; > > + if (!apc->rxqs) > + return; > + > for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { > rxq = apc->rxqs[rxq_idx]; > err = mana_fence_rq(apc, rxq); > @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc) > struct mana_rxq *rxq; > u32 rxq_idx; > > - for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { > - rxq = apc->rxqs[rxq_idx]; > - if (!rxq) > - continue; > + if (apc->rxqs) { > > - mana_destroy_rxq(apc, rxq, true); > - apc->rxqs[rxq_idx] = NULL; > + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { > + rxq = apc->rxqs[rxq_idx]; > + if (!rxq) > + continue; > + > + mana_destroy_rxq(apc, rxq, true); > + apc->rxqs[rxq_idx] = NULL; > + } > } > > mana_destroy_txq(apc); > @@ -3269,7 +3275,8 @@ static int mana_dealloc_queues(struct net_device *ndev) > if (apc->port_is_up) > return -EINVAL; > > - mana_chn_setxdp(apc, NULL); > + if (apc->rxqs) > + mana_chn_setxdp(apc, NULL); Is this check required? mana_dealloc_queues() is only called by mana_detach(). And mana_detach() calls `mana_dealloc_queues()` only if the port state was previously up. apc->port_is_up seems to be set to true only on successful queue allocation. Can apc->rxqs be NULL here? > > if (gd->gdma_context->is_pf && !apc->ac->bm_hostmode) > mana_pf_deregister_filter(apc); > @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev) > * number of queues. > */ > > - for (i = 0; i < apc->num_queues; i++) { > - txq = &apc->tx_qp[i].txq; > - tsleep = 1000; > - while (atomic_read(&txq->pending_sends) > 0 && > - time_before(jiffies, timeout)) { > - usleep_range(tsleep, tsleep + 1000); > - tsleep <<= 1; > - } > - if (atomic_read(&txq->pending_sends)) { > - err = pcie_flr(to_pci_dev(gd->gdma_context->dev)); > - if (err) { > - netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", > - err, atomic_read(&txq->pending_sends), > - txq->gdma_txq_id); > + if (apc->tx_qp) { And same as above, is it possible for apc->tx_qp to be NULL here? > + for (i = 0; i < apc->num_queues; i++) { > + txq = &apc->tx_qp[i].txq; > + tsleep = 1000; > + while (atomic_read(&txq->pending_sends) > 0 && > + time_before(jiffies, timeout)) { > + usleep_range(tsleep, tsleep + 1000); > + tsleep <<= 1; > + } > + if (atomic_read(&txq->pending_sends)) { > + err = > + pcie_flr(to_pci_dev(gd->gdma_context->dev)); > + if (err) { > + netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", > + err, > + atomic_read(&txq->pending_sends), > + txq->gdma_txq_id); > + } > + break; > } > - break; > } > - } > > - for (i = 0; i < apc->num_queues; i++) { > - txq = &apc->tx_qp[i].txq; > - while ((skb = skb_dequeue(&txq->pending_skbs))) { > - mana_unmap_skb(skb, apc); > - dev_kfree_skb_any(skb); > + for (i = 0; i < apc->num_queues; i++) { > + txq = &apc->tx_qp[i].txq; > + while ((skb = skb_dequeue(&txq->pending_skbs))) { > + mana_unmap_skb(skb, apc); > + dev_kfree_skb_any(skb); > + } > + atomic_set(&txq->pending_sends, 0); > } > - atomic_set(&txq->pending_sends, 0); > } > + > /* We're 100% sure the queues can no longer be woken up, because > * we're sure now mana_poll_tx_cq() can't be running. > */ > -- > 2.43.0 > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on attach failure 2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy 2026-05-18 21:49 ` Harshitha Ramamurthy @ 2026-05-19 19:47 ` sashiko-bot 1 sibling, 0 replies; 6+ messages in thread From: sashiko-bot @ 2026-05-19 19:47 UTC (permalink / raw) To: Dipayaan Roy; +Cc: linux-hyperv, bpf Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider: - [High] Missing re-allocation of `apc->rxqs` during interface bring-up causes deterministic NULL pointer dereference - [High] Uninitialized queue handles (0 and INVALID_MANA_HANDLE) are sent to the firmware during error teardown, corrupting hardware state - [High] GDMA admin commands are issued immediately after a PCI Function Level Reset (FLR), causing severe cascading timeouts and hardware state violations -- commit 2129283d37aabfe724d7509e34346eab93d302a2 Author: Dipayaan Roy <dipayanroy@linux.microsoft.com> net: mana: Add NULL guards in teardown path to prevent panic on attach failure When queue allocation fails partway through, the error cleanup frees and NULLs apc->tx_qp and apc->rxqs. The patch adds NULL guards for these pointers in teardown paths such as mana_destroy_vport(), mana_fence_rqs(), and mana_dealloc_queues() to prevent NULL pointer dereferences. > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c > index 9afc786b297a8..9e7e4bf526bf6 100644 > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c [ ... ] > @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct mana_port_context *apc) > struct mana_rxq *rxq; > u32 rxq_idx; > > - for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { > - rxq = apc->rxqs[rxq_idx]; > - if (!rxq) > - continue; > + if (apc->rxqs) { > > - mana_destroy_rxq(apc, rxq, true); > - apc->rxqs[rxq_idx] = NULL; > + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) { > + rxq = apc->rxqs[rxq_idx]; > + if (!rxq) > + continue; > + > + mana_destroy_rxq(apc, rxq, true); > + apc->rxqs[rxq_idx] = NULL; > + } > } If the teardown path frees apc->rxqs and sets it to NULL on attach failure, how does the interface recover if the user subsequently brings it back up? When mana_open() is invoked, it directly calls mana_alloc_queues(). Inside mana_alloc_queues() -> mana_add_rx_queues(), the code does: apc->rxqs[i] = rxq; If apc->rxqs was left as NULL during the teardown, will this cause a NULL pointer dereference on the next interface bring-up? [ ... ] > @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device *ndev) > * number of queues. > */ > > - for (i = 0; i < apc->num_queues; i++) { > - txq = &apc->tx_qp[i].txq; > - tsleep = 1000; > - while (atomic_read(&txq->pending_sends) > 0 && > - time_before(jiffies, timeout)) { > - usleep_range(tsleep, tsleep + 1000); > - tsleep <<= 1; > - } > - if (atomic_read(&txq->pending_sends)) { > - err = pcie_flr(to_pci_dev(gd->gdma_context->dev)); > - if (err) { > - netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", > - err, atomic_read(&txq->pending_sends), > - txq->gdma_txq_id); > + if (apc->tx_qp) { In the error path where queue creation fails partway through, will mana_destroy_txq() be called? If mana_destroy_txq() unconditionally iterates over all apc->num_queues: for (i = 0; i < apc->num_queues; i++) { ... mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object); ... Will it pass 0 or INVALID_MANA_HANDLE to mana_destroy_wq_obj() for queues that were never fully initialized? Does mana_destroy_wq_obj() send a MANA_DESTROY_WQ_OBJ command to the hardware with these uninitialized handles, and could handle 0 corrupt the firmware state? > + for (i = 0; i < apc->num_queues; i++) { > + txq = &apc->tx_qp[i].txq; > + tsleep = 1000; > + while (atomic_read(&txq->pending_sends) > 0 && > + time_before(jiffies, timeout)) { > + usleep_range(tsleep, tsleep + 1000); > + tsleep <<= 1; > + } > + if (atomic_read(&txq->pending_sends)) { > + err = > + pcie_flr(to_pci_dev(gd->gdma_context->dev)); If pcie_flr() is triggered here, the PCI Function Level Reset wipes the internal GDMA state. Further down in this function, mana_disable_vport_rx(), mana_fence_rqs(), and mana_destroy_vport() are called. Will these functions attempt to send commands to the now-destroyed GDMA admin queue via mana_send_request() and wait for completions? Could this lead to cascading timeouts since the hardware was just reset and cannot process admin commands? > + if (err) { > + netdev_err(ndev, "flr failed %d with %d pkts pending in txq %u\n", > + err, > + atomic_read(&txq->pending_sends), > + txq->gdma_txq_id); > + } > + break; > } > - break; > } -- Sashiko AI review · https://sashiko.dev/#/patchset/20260518194654.735580-1-dipayanroy@linux.microsoft.com?part=1 ^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached 2026-05-18 19:43 [PATCH net 0/2] net: mana: Fix NULL dereferences during teardown after attach failure Dipayaan Roy 2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy @ 2026-05-18 19:43 ` Dipayaan Roy 2026-05-19 19:47 ` sashiko-bot 1 sibling, 1 reply; 6+ messages in thread From: Dipayaan Roy @ 2026-05-18 19:43 UTC (permalink / raw) To: kys, haiyangz, wei.liu, decui, andrew+netdev, davem, edumazet, kuba, pabeni, leon, longli, kotaranov, horms, shradhagupta, ssengar, ernis, shirazsaleem, linux-hyperv, netdev, linux-kernel, linux-rdma, stephen, jacob.e.keller, dipayanroy, leitao, kees, john.fastabend, hawk, bpf, daniel, ast, sdf, yury.norov When mana_per_port_queue_reset_work_handler() runs, it unconditionally calls mana_detach() which attempts to tear down queues that are already freed, leading to NULL pointer dereferences on apc->tx_qp and apc->rxqs. Check netif_device_present() in the reset handler and skip mana_detach() when the port is already in detached state. This avoids the redundant teardown and goes directly to mana_attach() to retry bringing the port back up. Fixes: 3b194343c250 ("net: mana: Implement ndo_tx_timeout and serialize queue resets per port.") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com> --- drivers/net/ethernet/microsoft/mana/mana_en.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 0582803907a8..907efadf6fd6 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -316,12 +316,19 @@ static void mana_per_port_queue_reset_work_handler(struct work_struct *work) goto out; } + /* If already detached (indicates detach succeeded but attach failed + * previously). Now skip mana detach and just retry mana_attach. + */ + if (!netif_device_present(ndev)) + goto attach; + err = mana_detach(ndev, false); if (err) { netdev_err(ndev, "mana_detach failed: %d\n", err); goto dealloc_pre_rxbufs; } +attach: err = mana_attach(ndev); if (err) netdev_err(ndev, "mana_attach failed: %d\n", err); -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached 2026-05-18 19:43 ` [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached Dipayaan Roy @ 2026-05-19 19:47 ` sashiko-bot 0 siblings, 0 replies; 6+ messages in thread From: sashiko-bot @ 2026-05-19 19:47 UTC (permalink / raw) To: Dipayaan Roy; +Cc: bpf, linux-hyperv Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider: - [High] The patch correctly prevents `mana_detach()` from running on an already-detached device in the reset handler, but identical bugs remain in `mana_change_mtu()`, `mana_set_channels()`, `mana_set_ringparam()`, and `mana_bpf()`. -- commit bac73a4812067e75feeec962b27ab9647e30ed9c Author: Dipayaan Roy <dipayanroy@linux.microsoft.com> net: mana: Skip redundant detach in queue reset handler if already detached This commit adds a check for netif_device_present() in the queue reset handler. If the device is already detached, it skips the redundant mana_detach() call and proceeds directly to mana_attach(). > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c > index 9e7e4bf526bf6..740f3c34db451 100644 > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c > @@ -316,12 +316,19 @@ static void mana_per_port_queue_reset_work_handler(struct work_struct *work) [ ... ] > + /* If already detached (indicates detach succeeded but attach failed > + * previously). Now skip mana detach and just retry mana_attach. > + */ > + if (!netif_device_present(ndev)) > + goto attach; > + > err = mana_detach(ndev, false); The fix here correctly prevents mana_detach() from running on an already detached device. However, do identical issues remain in mana_change_mtu(), mana_set_channels(), mana_set_ringparam(), and mana_bpf()? If mana_attach() previously failed (for example, due to out of memory), the port is left with apc->port_is_up = false, but it preserves apc->port_st_save = true, and calls netif_device_detach(ndev). If an admin tries to recover the port by retrying a configuration command like changing the MTU or attaching XDP, these functions appear to unconditionally call mana_detach(ndev, false) without the !netif_device_present(ndev) check added here. Inside mana_detach(), apc->port_st_save = apc->port_is_up executes. Since port_is_up is already false, port_st_save is overwritten to false, permanently erasing the intent that the port should be up. The subsequent call to mana_attach() would then see port_st_save == false, skip mana_alloc_queues(), set port_is_up = false, and successfully call netif_device_attach(ndev). Would this result in the configuration command returning 0 to userspace while the port queues were never allocated, leaving the interface silently broken until manually toggled down and up? -- Sashiko AI review · https://sashiko.dev/#/patchset/20260518194654.735580-1-dipayanroy@linux.microsoft.com?part=2 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-05-19 19:47 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-18 19:43 [PATCH net 0/2] net: mana: Fix NULL dereferences during teardown after attach failure Dipayaan Roy 2026-05-18 19:43 ` [PATCH net 1/2] net: mana: Add NULL guards in teardown path to prevent panic on " Dipayaan Roy 2026-05-18 21:49 ` Harshitha Ramamurthy 2026-05-19 19:47 ` sashiko-bot 2026-05-18 19:43 ` [PATCH net 2/2] net: mana: Skip redundant detach in queue reset handler if already detached Dipayaan Roy 2026-05-19 19:47 ` sashiko-bot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox