Netdev List
 help / color / mirror / Atom feed
* [PATCH] bnx2: fix refcount leak in bnx2_change_ring_size()
@ 2026-06-09  7:47 Wentao Liang
  2026-06-11 15:13 ` Simon Horman
  0 siblings, 1 reply; 2+ messages in thread
From: Wentao Liang @ 2026-06-09  7:47 UTC (permalink / raw)
  To: rmody, GR-Linux-NIC-Dev, andrew+netdev, davem, edumazet, kuba,
	pabeni
  Cc: netdev, linux-kernel, Wentao Liang, stable

The error path in bnx2_change_ring_size() when reopening the
device fails does not balance the intr_sem semaphore.
bnx2_netif_stop(true) increments the semaphore at the start of the
function if the netif is running. On success, bnx2_netif_start(true)
decrements it. However, when bnx2_alloc_mem(), bnx2_request_irq(),
or bnx2_init_nic() fail, the error handling calls bnx2_napi_enable()
and dev_close() without performing the corresponding decrement. This
leaves the semaphore permanently imbalanced, eventually causing
stall or deadlock on subsequent internal synchronization.

Fix this by calling bnx2_netif_start(true) in the error path to
properly release the semaphore count.

Cc: stable@vger.kernel.org
Fixes: 6fefb65e78f0 ("bnx2: Close device if MTU change or ring size change fails.")
Signed-off-by: Wentao Liang <vulab@iscas.ac.cn>
---
 drivers/net/ethernet/broadcom/bnx2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index f5722e929833..7f8ade29b7e5 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -7360,6 +7360,7 @@ bnx2_change_ring_size(struct bnx2 *bp, u32 rx, u32 tx, bool reset_irq)
 		if (rc) {
 			bnx2_napi_enable(bp);
 			dev_close(bp->dev);
+			atomic_dec(&bp->intr_sem);
 			return rc;
 		}
 #ifdef BCM_CNIC
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH] bnx2: fix refcount leak in bnx2_change_ring_size()
  2026-06-09  7:47 [PATCH] bnx2: fix refcount leak in bnx2_change_ring_size() Wentao Liang
@ 2026-06-11 15:13 ` Simon Horman
  0 siblings, 0 replies; 2+ messages in thread
From: Simon Horman @ 2026-06-11 15:13 UTC (permalink / raw)
  To: vulab
  Cc: Simon Horman, rmody, GR-Linux-NIC-Dev, andrew+netdev, davem,
	edumazet, kuba, pabeni, netdev, linux-kernel, stable

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
bnx2: fix refcount leak in bnx2_change_ring_size()

This patch attempts to balance the intr_sem semaphore on the failure path
of bnx2_change_ring_size() by adding an atomic_dec() after the
bnx2_napi_enable()/dev_close() sequence.

> diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
> index f5722e929833..7f8ade29b7e5 100644
> --- a/drivers/net/ethernet/broadcom/bnx2.c
> +++ b/drivers/net/ethernet/broadcom/bnx2.c
> @@ -7360,6 +7360,7 @@ bnx2_change_ring_size(struct bnx2 *bp, u32 rx, u32 tx, bool reset_irq)
>  		if (rc) {
>  			bnx2_napi_enable(bp);
>  			dev_close(bp->dev);
> +			atomic_dec(&bp->intr_sem);
>  			return rc;
>  		}

[Medium]
Does the commit message match the change?

The message states:

    Fix this by calling bnx2_netif_start(true) in the error path to
    properly release the semaphore count.

but the diff inserts a raw atomic_dec(&bp->intr_sem), not a call to
bnx2_netif_start().  These are not equivalent.  bnx2_netif_start() does
an atomic_dec_and_test() and, when it transitions to zero with the
device still running, additionally calls netif_tx_wake_all_queues(),
bnx2_napi_enable(), bnx2_enable_int(), and bnx2_cnic_start():

    static void
    bnx2_netif_start(struct bnx2 *bp, bool start_cnic)
    {
            if (atomic_dec_and_test(&bp->intr_sem)) {
                    if (netif_running(bp->dev)) {
                            netif_tx_wake_all_queues(bp->dev);
                            ...
                            bnx2_napi_enable(bp);
                            bnx2_enable_int(bp);
                            if (start_cnic)
                                    bnx2_cnic_start(bp);
                    }
            }
    }

At the patch site dev_close() has not yet run, so netif_running() is
still true; calling bnx2_netif_start() here would re-enable interrupts
and NAPI on a device that is about to be torn down.

Could the commit message be updated to describe the actual change
(an atomic_dec), since this affects stable backporters reading the log?

The message also says:

    bnx2_netif_stop(true) increments the semaphore at the start of the
    function if the netif is running.

Looking at bnx2_disable_int_sync() (called from bnx2_netif_stop()):

    static void
    bnx2_disable_int_sync(struct bnx2 *bp)
    {
            int i;

            atomic_inc(&bp->intr_sem);
            if (!netif_running(bp->dev))
                    return;
            ...
    }

the increment happens unconditionally before the netif_running() check,
and bnx2_netif_stop() invokes bnx2_disable_int_sync() near the end (not
the start).  Should the commit message be reworded to reflect that?

[Medium]
Does the single atomic_dec actually balance intr_sem on this error path?

Tracing from the bnx2_open() baseline of intr_sem=0:

  1. bnx2_netif_stop(bp, true) -> bnx2_disable_int_sync() does
     atomic_inc (0 -> 1).

  2. On error, bnx2_napi_enable() does not touch intr_sem.

  3. dev_close(bp->dev) dispatches via ndo_stop to bnx2_close():

        static int
        bnx2_close(struct net_device *dev)
        {
                struct bnx2 *bp = netdev_priv(dev);

                bnx2_disable_int_sync(bp);
                ...
        }

     which performs another atomic_inc (1 -> 2).

  4. The new atomic_dec brings it from 2 to 1.

The semaphore is therefore left at +1, not 0.  Compare bnx2_reset_task(),
which uses an explicit atomic_set(&bp->intr_sem, 1) before calling
bnx2_netif_start() to forcibly reset state on a similar error path.

Should this site either set intr_sem back to 0, or perform two
atomic_dec calls (one for the bnx2_netif_stop() increment and one for
the bnx2_close() increment), to fully balance the count as the changelog
claims?

In practice the residual +1 will be overwritten by the next
bnx2_open() via atomic_set(&bp->intr_sem, 0), but the patch as written
does not match its stated goal of fully releasing the count.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2026-06-11 15:14 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09  7:47 [PATCH] bnx2: fix refcount leak in bnx2_change_ring_size() Wentao Liang
2026-06-11 15:13 ` Simon Horman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox