* [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
@ 2026-05-20 10:45 Tao Cui
2026-05-20 10:45 ` [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind() Tao Cui
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
This small series fixes two bugs in the RDMA counter subsystem,
both related to error cleanup paths in drivers/infiniband/core/counters.c.
Patch 1 fixes a num_counters leak in alloc_and_bind(): when
__rdma_counter_bind_qp() fails, the counter is freed without
decrementing port_counter->num_counters. This leak accumulates
across repeated failures, permanently preventing the port from
switching back to AUTO mode (-EBUSY) and leaving the mode stuck
in MANUAL when it was originally NONE.
Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
the loop iterates with 'i' but indexes into port_data[] with 'port',
causing double-frees on the failed port and leaking hstats of
previously initialized ports.
Tao Cui (2):
RDMA/counter: Fix num_counters leak on bind_qp failure in
alloc_and_bind()
RDMA/counter: Fix incorrect port index in rdma_counter_init() error
cleanup
drivers/infiniband/core/counters.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind()
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
When __rdma_counter_bind_qp() fails in alloc_and_bind(), the error path
jumps to err_mode which frees the counter without decrementing
port_counter->num_counters. The only place that decrements is
rdma_counter_free(), which is unreachable since the counter was never
successfully bound.
This leak accumulates across repeated failures, permanently preventing
the port from switching to AUTO mode (-EBUSY in __counter_set_mode())
and blocking the MANUAL→NONE auto-revert in rdma_counter_free(). When
the mode was NONE before the call, the MANUAL mode set by
__counter_set_mode() also leaks since the revert logic is never
reached.
Add an err_bind label between the num_counters increment and the
existing err_mode label. It decrements num_counters and mirrors the
MANUAL→NONE revert from rdma_counter_free(), ensuring the port state
is fully restored on bind failure.
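
The state change described above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the `model_*` names and the simplified struct are hypothetical stand-ins, not the kernel definitions, and the model covers only the MANUAL-mode path in which __rdma_counter_bind_qp() fails after num_counters has been incremented.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum model_mode { MODEL_MODE_NONE, MODEL_MODE_AUTO, MODEL_MODE_MANUAL };

struct model_port_counter {
	enum model_mode mode;
	unsigned int num_counters;
};

/*
 * Model of alloc_and_bind() in MANUAL mode when the bind step fails.
 * 'fixed' selects whether the err_bind unwinding from this patch runs.
 */
static void model_alloc_and_bind_fail(struct model_port_counter *pc, bool fixed)
{
	/* Setting MANUAL mode moves a NONE port to MANUAL. */
	if (pc->mode == MODEL_MODE_NONE)
		pc->mode = MODEL_MODE_MANUAL;
	pc->num_counters++;

	/* The bind step fails at this point. */
	if (fixed) {
		/* err_bind: undo the increment and the implicit mode change. */
		pc->num_counters--;
		if (!pc->num_counters && pc->mode == MODEL_MODE_MANUAL)
			pc->mode = MODEL_MODE_NONE;
	}
	/* err_mode and later labels only free memory; port state is untouched. */
}

/* Model of the -EBUSY check: AUTO is refused while counters are accounted. */
static bool model_can_switch_to_auto(const struct model_port_counter *pc)
{
	return pc->num_counters == 0;
}
```

Starting from a port in NONE mode with no counters, one failed call with `fixed == false` leaves the model at num_counters == 1 in MANUAL mode, so `model_can_switch_to_auto()` returns false. With `fixed == true` the port is back at num_counters == 0 in NONE mode.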
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
drivers/infiniband/core/counters.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index c3aa6d7fc66b..5ddd607d5fbe 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -198,12 +198,20 @@ static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
 
 	ret = __rdma_counter_bind_qp(counter, qp, port);
 	if (ret)
-		goto err_mode;
+		goto err_bind;
 
 	rdma_restrack_parent_name(&counter->res, &qp->res);
 	rdma_restrack_add(&counter->res);
 	return counter;
 
+err_bind:
+	mutex_lock(&port_counter->lock);
+	port_counter->num_counters--;
+	if (!port_counter->num_counters &&
+	    port_counter->mode.mode == RDMA_COUNTER_MODE_MANUAL)
+		__counter_set_mode(port_counter, RDMA_COUNTER_MODE_NONE, 0,
+				   false);
+	mutex_unlock(&port_counter->lock);
 err_mode:
 	rdma_free_hw_stats_struct(counter->stats);
 err_stats:
--
2.43.0
* [PATCH rdma-next 2/2] RDMA/counter: Fix incorrect port index in rdma_counter_init() error cleanup
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
The error cleanup loop in rdma_counter_init() iterates with variable
'i' but accesses dev->port_data[port] instead of dev->port_data[i].
This causes the failed port's hstats to be freed multiple times while
leaking hstats of previously initialized ports.
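
The effect of the wrong index can be illustrated with a small userspace model. This is a sketch, not kernel code: `struct model_port` and the `model_*` helpers are hypothetical names, and the loop mirrors only the shape of the "fail:" unwinding (walk from the failed port down to the first port).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the per-port state; not the kernel struct. */
struct model_port {
	bool hstats_allocated;      /* true while this port's hstats are live */
	unsigned int cleanup_calls; /* how often the unwind loop touched the port */
};

/*
 * Model of the "fail:" loop: unwind from the failed port down to the first
 * port. With use_loop_index == false the loop indexes with the failed port
 * (pre-patch behaviour); with true it indexes with i (patched behaviour).
 */
static void model_cleanup(struct model_port *ports, int first, int failed,
			  bool use_loop_index)
{
	for (int i = failed; i >= first; i--) {
		struct model_port *p = &ports[use_loop_index ? i : failed];

		p->hstats_allocated = false; /* free hstats and clear the pointer */
		p->cleanup_calls++;          /* the per-port teardown runs here too */
	}
}

/* Count ports in [first, last] whose hstats were never released. */
static int model_leaked(const struct model_port *ports, int first, int last)
{
	int leaked = 0;

	for (int i = first; i <= last; i++)
		if (ports[i].hstats_allocated)
			leaked++;
	return leaked;
}
```

For ports 1 to 3 with the failure on port 3, the pre-patch variant runs the cleanup three times on port 3 and never reaches ports 1 and 2, so `model_leaked()` reports two leaked ports. The patched variant touches each port exactly once and leaks nothing.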
Fixes: 56594ae1d250 ("RDMA/core: Annotate destroy of mutex to ensure that it is released as unlocked")
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
drivers/infiniband/core/counters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 5ddd607d5fbe..a9e189194c13 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -669,7 +669,7 @@ void rdma_counter_init(struct ib_device *dev)
 
 fail:
 	for (i = port; i >= rdma_start_port(dev); i--) {
-		port_counter = &dev->port_data[port].port_counter;
+		port_counter = &dev->port_data[i].port_counter;
 		rdma_free_hw_stats_struct(port_counter->hstats);
 		port_counter->hstats = NULL;
 		mutex_destroy(&port_counter->lock);
--
2.43.0
* Re: [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
From: Jason Gunthorpe @ 2026-05-25 15:42 UTC
To: Tao Cui; +Cc: leon, linux-rdma, linux-kernel
On Wed, May 20, 2026 at 06:45:44PM +0800, Tao Cui wrote:
> This small series fixes two bugs in the RDMA counter subsystem,
> both related to error cleanup paths in drivers/infiniband/core/counters.c.
>
> Patch 1 fixes a num_counters leak in alloc_and_bind(): when
> __rdma_counter_bind_qp() fails, the counter is freed without
> decrementing port_counter->num_counters. This leak accumulates
> across repeated failures, permanently preventing the port from
> switching back to AUTO mode (-EBUSY) and leaving the mode stuck
> in MANUAL when it was originally NONE.
>
> Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
> the loop iterates with 'i' but indexes into port_data[] with 'port',
> causing double-frees on the failed port and leaking hstats of
> previously initialized ports.
>
> Tao Cui (2):
> RDMA/counter: Fix num_counters leak on bind_qp failure in
> alloc_and_bind()
> RDMA/counter: Fix incorrect port index in rdma_counter_init() error
> cleanup
Applied to for-next
Thanks,
Jason