* [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
This small series fixes two bugs in the RDMA counter subsystem,
both related to error cleanup paths in drivers/infiniband/core/counters.c.
Patch 1 fixes a num_counters leak in alloc_and_bind(): when
__rdma_counter_bind_qp() fails, the counter is freed without
decrementing port_counter->num_counters. This leak accumulates
across repeated failures, permanently preventing the port from
switching back to AUTO mode (-EBUSY) and leaving the mode stuck
in MANUAL when it was originally NONE.
Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
the loop iterates with 'i' but indexes into port_data[] with 'port',
causing double-frees on the failed port and leaking hstats of
previously initialized ports.
Tao Cui (2):
RDMA/counter: Fix num_counters leak on bind_qp failure in
alloc_and_bind()
RDMA/counter: Fix incorrect port index in rdma_counter_init() error
cleanup
drivers/infiniband/core/counters.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--
2.43.0
* [PATCH rdma-next 1/2] RDMA/counter: Fix num_counters leak on bind_qp failure in alloc_and_bind()
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
When __rdma_counter_bind_qp() fails in alloc_and_bind(), the error path
jumps to err_mode which frees the counter without decrementing
port_counter->num_counters. The only place that decrements is
rdma_counter_free(), which is unreachable since the counter was never
successfully bound.
This leak accumulates across repeated failures, permanently preventing
the port from switching to AUTO mode (-EBUSY in __counter_set_mode())
and blocking the MANUAL→NONE auto-revert in rdma_counter_free(). When
the mode was NONE before the call, the MANUAL mode set by
__counter_set_mode() also leaks since the revert logic is never
reached.
Add an err_bind label between the num_counters increment and the
existing err_mode label. It decrements num_counters and mirrors the
MANUAL→NONE revert from rdma_counter_free(), ensuring the port state
is fully restored on bind failure.
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
drivers/infiniband/core/counters.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index c3aa6d7fc66b..5ddd607d5fbe 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -198,12 +198,20 @@ static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
 	ret = __rdma_counter_bind_qp(counter, qp, port);
 	if (ret)
-		goto err_mode;
+		goto err_bind;
 	rdma_restrack_parent_name(&counter->res, &qp->res);
 	rdma_restrack_add(&counter->res);
 	return counter;
+err_bind:
+	mutex_lock(&port_counter->lock);
+	port_counter->num_counters--;
+	if (!port_counter->num_counters &&
+	    port_counter->mode.mode == RDMA_COUNTER_MODE_MANUAL)
+		__counter_set_mode(port_counter, RDMA_COUNTER_MODE_NONE, 0,
+				   false);
+	mutex_unlock(&port_counter->lock);
 err_mode:
 	rdma_free_hw_stats_struct(counter->stats);
 err_stats:
--
2.43.0
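The unwind described in this patch can be sketched as a small userspace model. This is only an illustration under stated assumptions: the `model_*` names and the `unwind_bind` switch are hypothetical, the locking is left out, and the mode change is a plain assignment standing in for __counter_set_mode(). It is not the kernel code, but it shows why skipping the rollback leaves both num_counters and the mode in the wrong state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel's definitions. */
enum model_mode { MODEL_MODE_NONE, MODEL_MODE_AUTO, MODEL_MODE_MANUAL };

struct model_port_counter {
	enum model_mode mode;
	unsigned int num_counters;
};

/*
 * Model of the tail of alloc_and_bind(): switch the port to MANUAL, take a
 * num_counters reference, then attempt the bind. When unwind_bind is false,
 * the failure path skips the rollback, like the old "goto err_mode".
 */
static int model_alloc_and_bind(struct model_port_counter *pc,
				bool bind_fails, bool unwind_bind)
{
	pc->mode = MODEL_MODE_MANUAL;	/* stands in for __counter_set_mode() */
	pc->num_counters++;

	if (bind_fails)
		goto err_bind;
	return 0;

err_bind:
	if (unwind_bind) {
		pc->num_counters--;
		/* Mirror the MANUAL -> NONE revert once no counters remain. */
		if (!pc->num_counters && pc->mode == MODEL_MODE_MANUAL)
			pc->mode = MODEL_MODE_NONE;
	}
	return -1;
}
```

Calling model_alloc_and_bind() with bind_fails set and unwind_bind cleared leaves num_counters at 1 and the mode at MANUAL on a port that started in NONE. With unwind_bind set, the port returns to its initial state.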
* [PATCH rdma-next 2/2] RDMA/counter: Fix incorrect port index in rdma_counter_init() error cleanup
From: Tao Cui @ 2026-05-20 10:45 UTC
To: leon, jgg, linux-rdma; +Cc: linux-kernel, Tao Cui
The error cleanup loop in rdma_counter_init() iterates with variable
'i' but accesses dev->port_data[port] instead of dev->port_data[i].
This causes the failed port's hstats to be freed multiple times while
leaking hstats of previously initialized ports.
Fixes: 56594ae1d250 ("RDMA/core: Annotate destroy of mutex to ensure that it is released as unlocked")
Signed-off-by: Tao Cui <cuitao@kylinos.cn>
---
drivers/infiniband/core/counters.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/infiniband/core/counters.c b/drivers/infiniband/core/counters.c
index 5ddd607d5fbe..a9e189194c13 100644
--- a/drivers/infiniband/core/counters.c
+++ b/drivers/infiniband/core/counters.c
@@ -669,7 +669,7 @@ void rdma_counter_init(struct ib_device *dev)
 fail:
 	for (i = port; i >= rdma_start_port(dev); i--) {
-		port_counter = &dev->port_data[port].port_counter;
+		port_counter = &dev->port_data[i].port_counter;
 		rdma_free_hw_stats_struct(port_counter->hstats);
 		port_counter->hstats = NULL;
 		mutex_destroy(&port_counter->lock);
--
2.43.0
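The effect of the index mismatch can be seen in a small userspace model of the cleanup loop. This is a sketch under assumptions: `model_cleanup`, `MODEL_MAX_PORTS`, and the `cleaned[]` array are hypothetical names introduced here, and the model only counts how often each port's cleanup body runs. It does not free anything.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_PORTS 8	/* hypothetical bound for this model only */

/*
 * Walk back from the failed port to the first port, as the "fail:" loop
 * does. cleaned[p] counts how many times the cleanup body touched port p.
 * With use_loop_index false, the body indexes with 'port' (the old code);
 * with it true, the body indexes with 'i' (the fixed code).
 */
static void model_cleanup(unsigned int start, unsigned int port,
			  bool use_loop_index,
			  unsigned int cleaned[MODEL_MAX_PORTS])
{
	for (int i = (int)port; i >= (int)start; i--) {
		unsigned int idx = use_loop_index ? (unsigned int)i : port;

		cleaned[idx]++;
	}
}
```

With ports 1 to 3 and a failure at port 3, the old indexing runs the cleanup body three times on port 3 and never on ports 1 and 2. The fixed indexing runs it exactly once per port.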
* Re: [PATCH rdma-next 0/2] RDMA/counter: Two bug fixes in counter error paths
From: Jason Gunthorpe @ 2026-05-25 15:42 UTC
To: Tao Cui; +Cc: leon, linux-rdma, linux-kernel
On Wed, May 20, 2026 at 06:45:44PM +0800, Tao Cui wrote:
> This small series fixes two bugs in the RDMA counter subsystem,
> both related to error cleanup paths in drivers/infiniband/core/counters.c.
>
> Patch 1 fixes a num_counters leak in alloc_and_bind(): when
> __rdma_counter_bind_qp() fails, the counter is freed without
> decrementing port_counter->num_counters. This leak accumulates
> across repeated failures, permanently preventing the port from
> switching back to AUTO mode (-EBUSY) and leaving the mode stuck
> in MANUAL when it was originally NONE.
>
> Patch 2 fixes a variable mismatch in rdma_counter_init()'s cleanup loop:
> the loop iterates with 'i' but indexes into port_data[] with 'port',
> causing double-frees on the failed port and leaking hstats of
> previously initialized ports.
>
> Tao Cui (2):
> RDMA/counter: Fix num_counters leak on bind_qp failure in
> alloc_and_bind()
> RDMA/counter: Fix incorrect port index in rdma_counter_init() error
> cleanup
Applied to for-next
Thanks,
Jason