Linux RDMA and InfiniBand development
* [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
@ 2026-05-20 10:45 Tao Cui
  2026-05-20 10:45 ` [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind() Tao Cui
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Tao Cui @ 2026-05-20 10:45 UTC
  To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui

This small series fixes two bugs in the RDMA counter subsystem,
both related to error cleanup paths in drivers/infiniband/core/counters.c.

Patch 1 fixes a num_counters leak in alloc_and_bind(): when
__rdma_counter_bind_qp() fails, the counter is freed without
decrementing port_counter->num_counters.  This leak accumulates
across repeated failures, permanently preventing the port from
switching back to AUTO mode (-EBUSY) and leaving the mode stuck
in MANUAL when it was originally NONE.

Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
the loop iterates with 'i' but indexes into port_data[] with 'port',
causing double-frees on the failed port and leaking hstats of
previously initialized ports.

Tao Cui (2):
  RDMA/counter: Fix num_counters leak on bind_qp failure in
    alloc_and_bind()
  RDMA/counter: Fix incorrect port index in rdma_counter_init() error
    cleanup

 drivers/infiniband/core/counters.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

-- 
2.43.0



* [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind()
  2026-05-20 10:45 [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths Tao Cui
@ 2026-05-20 10:45 ` Tao Cui
  2026-05-20 10:45 ` [PATCH rdma-next 2/2] RDMA/counter: Fix incorrect port index in rdma_counter_init() error cleanup Tao Cui
  2026-05-25 15:42 ` [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths Jason Gunthorpe
  2 siblings, 0 replies; 4+ messages in thread
From: Tao Cui @ 2026-05-20 10:45 UTC
  To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui

When __rdma_counter_bind_qp() fails in alloc_and_bind(), the error path
jumps to err_mode which frees the counter without decrementing
port_counter->num_counters. The only place that decrements it is
rdma_counter_free(), which is never reached because the counter was
never successfully bound.

This leak accumulates across repeated failures, permanently preventing
the port from switching to AUTO mode (-EBUSY in __counter_set_mode())
and blocking the MANUAL→NONE auto-revert in rdma_counter_free(). When
the mode was NONE before the call, the MANUAL mode set by
__counter_set_mode() also leaks since the revert logic is never
reached.

Add an err_bind label between the num_counters increment and the
existing err_mode label. It decrements num_counters and mirrors the
MANUAL→NONE revert from rdma_counter_free(), ensuring the port state
is fully restored on bind failure.

Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
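The unwinding order described in the commit message can be modelled in
user space. The following is a minimal sketch, not the kernel code: every
sketch_* name is hypothetical, locking is omitted, and the -EBUSY rule in
sketch_set_mode() is an assumption taken from the description above rather
than a copy of __counter_set_mode().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the port counter state; not kernel types. */
enum sketch_mode { SKETCH_MODE_NONE, SKETCH_MODE_AUTO, SKETCH_MODE_MANUAL };

struct sketch_port_counter {
	enum sketch_mode mode;
	unsigned int num_counters;
};

/* Assumption: switching to AUTO is refused while counters are accounted. */
static int sketch_set_mode(struct sketch_port_counter *pc,
			   enum sketch_mode new_mode)
{
	if (new_mode == SKETCH_MODE_AUTO && pc->num_counters)
		return -EBUSY;
	pc->mode = new_mode;
	return 0;
}

/*
 * bind_fails simulates a failing bind step; fixed selects the unwinding
 * path that includes the new err_bind label.
 */
static int sketch_alloc_and_bind(struct sketch_port_counter *pc,
				 bool bind_fails, bool fixed)
{
	int ret;

	ret = sketch_set_mode(pc, SKETCH_MODE_MANUAL);
	if (ret)
		return ret;
	pc->num_counters++;

	ret = bind_fails ? -EINVAL : 0;
	if (ret) {
		if (fixed)
			goto err_bind;
		goto err_mode;
	}
	return 0;

err_bind:
	/* Undo the increment and revert MANUAL to NONE when nothing is left. */
	pc->num_counters--;
	if (!pc->num_counters && pc->mode == SKETCH_MODE_MANUAL)
		sketch_set_mode(pc, SKETCH_MODE_NONE);
err_mode:
	return ret;
}

#ifdef SKETCH_DEMO_MAIN
int main(void)
{
	struct sketch_port_counter pc = { SKETCH_MODE_NONE, 0 };

	/* Old path: the failed bind leaves the count and mode behind. */
	assert(sketch_alloc_and_bind(&pc, true, false) == -EINVAL);
	assert(sketch_set_mode(&pc, SKETCH_MODE_AUTO) == -EBUSY);
	return 0;
}
#endif
```

Building with -DSKETCH_DEMO_MAIN runs the old path once. With fixed set to
true, the same failure should leave num_counters at 0 and the mode at NONE,
so a later switch to AUTO is accepted in this model.
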
 drivers/infiniband/core/counters.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index c3aa6d7fc66b..5ddd607d5fbe 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -198,12 +198,20 @@ static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
 
 	ret = __rdma_counter_bind_qp(counter, qp, port);
 	if (ret)
-		goto err_mode;
+		goto err_bind;
 
 	rdma_restrack_parent_name(&counter->res, &qp->res);
 	rdma_restrack_add(&counter->res);
 	return counter;
 
+err_bind:
+	mutex_lock(&port_counter->lock);
+	port_counter->num_counters--;
+	if (!port_counter->num_counters &&
+	    port_counter->mode.mode == RDMA_COUNTER_MODE_MANUAL)
+		__counter_set_mode(port_counter, RDMA_COUNTER_MODE_NONE, 0,
+				   false);
+	mutex_unlock(&port_counter->lock);
 err_mode:
 	rdma_free_hw_stats_struct(counter->stats);
 err_stats:
-- 
2.43.0



* [PATCH rdma-next 2/2] RDMA/counter: Fix incorrect port index in rdma_counter_init() error cleanup
  2026-05-20 10:45 [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths Tao Cui
  2026-05-20 10:45 ` [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind() Tao Cui
@ 2026-05-20 10:45 ` Tao Cui
  2026-05-25 15:42 ` [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths Jason Gunthorpe
  2 siblings, 0 replies; 4+ messages in thread
From: Tao Cui @ 2026-05-20 10:45 UTC
  To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui

The error cleanup loop in rdma_counter_init() iterates with variable
'i' but accesses dev->port_data[port] instead of dev->port_data[i].
This causes the failed port's hstats to be freed multiple times while
leaking hstats of previously initialized ports.

Fixes: 56594ae1d250 ("RDMA/core: Annotate destroy of mutex to ensure that it is released as unlocked")
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
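The index mismatch can be reproduced with a small user-space model. This is
a sketch under stated assumptions: the sketch_* names and the fixed port
count are hypothetical, ports are numbered from 1, and the model only counts
how often the cleanup loop touches each slot. It does not model what
rdma_free_hw_stats_struct() or mutex_destroy() do with the slot.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NPORTS 4	/* hypothetical port count, ports numbered 1..4 */

struct sketch_port {
	int allocs;	/* times the init loop set this port up */
	int cleanups;	/* times the cleanup loop touched this port */
};

/*
 * Simulate init failing at fail_port, then run the cleanup loop.
 * ports must have SKETCH_NPORTS + 1 zeroed entries (index 0 is unused).
 * fixed selects the corrected index (i) over the original one (port).
 */
static void sketch_init_fail(struct sketch_port *ports, int fail_port,
			     bool fixed)
{
	int port, i;

	for (port = 1; port <= SKETCH_NPORTS; port++) {
		if (port == fail_port)
			goto fail;
		ports[port].allocs++;
	}
	return;

fail:
	for (i = port; i >= 1; i--) {
		struct sketch_port *p = fixed ? &ports[i] : &ports[port];

		p->cleanups++;
	}
}

#ifdef SKETCH_DEMO_MAIN
int main(void)
{
	struct sketch_port ports[SKETCH_NPORTS + 1] = { { 0, 0 } };

	/* Original index: only the failed port is ever visited. */
	sketch_init_fail(ports, 3, false);
	assert(ports[3].cleanups == 3);
	assert(ports[1].cleanups == 0 && ports[2].cleanups == 0);
	return 0;
}
#endif
```

With a failure at port 3, the original index visits port 3 three times and
never reaches ports 1 and 2. With fixed set to true, each of ports 1 to 3 is
visited exactly once, which matches the one-line change in the diff below.
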
 drivers/infiniband/core/counters.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 5ddd607d5fbe..a9e189194c13 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -669,7 +669,7 @@ void rdma_counter_init(struct ib_device *dev)
 
 fail:
 	for (i = port; i >= rdma_start_port(dev); i--) {
-		port_counter = &dev->port_data[port].port_counter;
+		port_counter = &dev->port_data[i].port_counter;
 		rdma_free_hw_stats_struct(port_counter->hstats);
 		port_counter->hstats = NULL;
 		mutex_destroy(&port_counter->lock);
-- 
2.43.0



* Re: [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
  2026-05-20 10:45 [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths Tao Cui
  2026-05-20 10:45 ` [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind() Tao Cui
  2026-05-20 10:45 ` [PATCH rdma-next 2/2] RDMA/counter: Fix incorrect port index in rdma_counter_init() error cleanup Tao Cui
@ 2026-05-25 15:42 ` Jason Gunthorpe
  2 siblings, 0 replies; 4+ messages in thread
From: Jason Gunthorpe @ 2026-05-25 15:42 UTC
  To: Tao Cui; +Cc: leon, linux-rdma, linux-kernel

On Wed, May 20, 2026 at 06:45:44PM +0800, Tao Cui wrote:
> This small series fixes two bugs in the RDMA counter subsystem,
> both related to error cleanup paths in drivers/infiniband/core/counters.c.
> 
> Patch 1 fixes a num_counters leak in alloc_and_bind(): when
> __rdma_counter_bind_qp() fails, the counter is freed without
> decrementing port_counter->num_counters.  This leak accumulates
> across repeated failures, permanently preventing the port from
> switching back to AUTO mode (-EBUSY) and leaving the mode stuck
> in MANUAL when it was originally NONE.
> 
> Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
> the loop iterates with 'i' but indexes into port_data[] with 'port',
> causing double-frees on the failed port and leaking hstats of
> previously initialized ports.
> 
> Tao Cui (2):
>   RDMA/counter: Fix num_counters leak on bind_qp failure in
>     alloc_and_bind()
>   RDMA/counter: Fix incorrect port index in rdma_counter_init() error
>     cleanup

Applied to for-next

Thanks,
Jason
