linux-kernel.vger.kernel.org archive mirror
* [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
@ 2025-08-06 12:39 Philipp Reisner
  2025-08-07  1:09 ` Zhu Yanjun
  2025-08-19  2:37 ` Zhu Yanjun
  0 siblings, 2 replies; 13+ messages in thread
From: Philipp Reisner @ 2025-08-06 12:39 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, linux-kernel,
	Philipp Reisner

Allow the comp_handler callback implementation to call ib_poll_cq().
A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.

The Mellanox and Intel drivers allow a comp_handler callback
implementation to call ib_poll_cq().

Avoid the deadlock by calling the comp_handler callback without
holding cq->cq_lock.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
---
 drivers/infiniband/sw/rxe/rxe_cq.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index fffd144d509e..1195e109f89b 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	int full;
 	void *addr;
 	unsigned long flags;
+	u8 notify;
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
@@ -110,14 +111,15 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 
 	queue_advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
 
-	if ((cq->notify & IB_CQ_NEXT_COMP) ||
-	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
-		cq->notify = 0;
-		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
-	}
+	notify = cq->notify;
+	cq->notify = 0;
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
+	if ((notify & IB_CQ_NEXT_COMP) ||
+	    (notify & IB_CQ_SOLICITED && solicited))
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+
 	return 0;
 }
 
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-06 12:39 [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock Philipp Reisner
@ 2025-08-07  1:09 ` Zhu Yanjun
  2025-08-11  5:26   ` Philipp Reisner
  2025-08-19  2:37 ` Zhu Yanjun
  1 sibling, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-07  1:09 UTC (permalink / raw)
  To: Philipp Reisner, Zhu Yanjun
  Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, linux-kernel

On 2025/8/6 5:39, Philipp Reisner wrote:
> Allow the comp_handler callback implementation to call ib_poll_cq().
> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
> 
> The Mellanox and Intel drivers allow a comp_handler callback
> implementation to call ib_poll_cq().
> 
> Avoid the deadlock by calling the comp_handler callback without
> holding cq->cq_lock.
> 
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>

ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
Test resize CQ, start with specific value and then increase and decrease
----------------------------------------------------------------------
Traceback (most recent call last):
   File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
     u.poll_cq(self.client.cq)
   File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
     wcs = _poll_cq(cq, count, data)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
     raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs remaining)

After I applied your patch in kernel v6.16, I got the above errors.

Zhu Yanjun

> ---
>   drivers/infiniband/sw/rxe/rxe_cq.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index fffd144d509e..1195e109f89b 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	int full;
>   	void *addr;
>   	unsigned long flags;
> +	u8 notify;
>   
>   	spin_lock_irqsave(&cq->cq_lock, flags);
>   
> @@ -110,14 +111,15 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   
>   	queue_advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>   
> -	if ((cq->notify & IB_CQ_NEXT_COMP) ||
> -	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
> -		cq->notify = 0;
> -		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> -	}
> +	notify = cq->notify;
> +	cq->notify = 0;
>   
>   	spin_unlock_irqrestore(&cq->cq_lock, flags);
>   
> +	if ((notify & IB_CQ_NEXT_COMP) ||
> +	    (notify & IB_CQ_SOLICITED && solicited))
> +		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +
>   	return 0;
>   }
>   



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-07  1:09 ` Zhu Yanjun
@ 2025-08-11  5:26   ` Philipp Reisner
  2025-08-11 13:48     ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: Philipp Reisner @ 2025-08-11  5:26 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>
> On 2025/8/6 5:39, Philipp Reisner wrote:
> > Allow the comp_handler callback implementation to call ib_poll_cq().
> > A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
> > And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
> >
> > The Mellanox and Intel drivers allow a comp_handler callback
> > implementation to call ib_poll_cq().
> >
> > Avoid the deadlock by calling the comp_handler callback without
> > holding cq->cq_lock.
> >
> > Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>
> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
> Test resize CQ, start with specific value and then increase and decrease
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>    File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>      u.poll_cq(self.client.cq)
>    File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>      wcs = _poll_cq(cq, count, data)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^
>    File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>      raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
> remaining)
>
> After I applied your patch in kernel v6.16, I got the above errors.
>
> Zhu Yanjun
>

Hello Zhu,

When I run the test_resize_cq test in a loop (100 runs each) on the
original code and with my patch, I get about the same failure rate.

without my patch success=87 failure=13
without my patch success=82 failure=18
without my patch success=81 failure=19
with my patch    success=89 failure=11
with my patch    success=90 failure=10
with my patch    success=82 failure=18

The patch I am proposing does not change the failure rate of this test.

Best regards,
 Philipp

#!/bin/bash

success=0
failure=0

for (( i = 0; i < 100; i++ )); do
	if rdma-core/build/bin/run_tests.py -k test_resize_cq; then
		success=$((success+1))
	else
		failure=$((failure+1))
	fi
done
echo success=$success failure=$failure


* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-11  5:26   ` Philipp Reisner
@ 2025-08-11 13:48     ` Zhu Yanjun
  2025-08-12 15:54       ` Daisuke Matsuda
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-11 13:48 UTC (permalink / raw)
  To: Philipp Reisner, Daisuke Matsuda
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On 2025/8/10 22:26, Philipp Reisner wrote:
> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>
>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
>>>
>>> The Mellanox and Intel drivers allow a comp_handler callback
>>> implementation to call ib_poll_cq().
>>>
>>> Avoid the deadlock by calling the comp_handler callback without
>>> holding cq->cq_lock.
>>>
>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>
>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>> Test resize CQ, start with specific value and then increase and decrease
>> ----------------------------------------------------------------------
>> Traceback (most recent call last):
>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>>       u.poll_cq(self.client.cq)
>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>       wcs = _poll_cq(cq, count, data)
>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>> remaining)
>>
>> After I applied your patch in kernel v6.16, I got the above errors.
>>
>> Zhu Yanjun
>>
> 
> Hello Zhu,
> 
> When I run the test_resize_cq test in a loop (100 runs each) on the
> original code and with my patch, I get about the same failure rate.

Add Daisuke Matsuda

If I remember correctly, when Daisuke and I discussed the ODP patches, we 
both ran tests with rxe, and in our test results this test_resize_cq 
error did not occur.

Yanjun.Zhu

> 
> without my patch success=87 failure=13
> without my patch success=82 failure=18
> without my patch success=81 failure=19
> with my patch    success=89 failure=11
> with my patch    success=90 failure=10
> with my patch    success=82 failure=18
> 
> The patch I am proposing does not change the failure rate of this test.
> 
> Best regards,
>   Philipp
> 
> #!/bin/bash
> 
> success=0
> failure=0
> 
> for (( i = 0; i < 100; i++ )) do
>        if rdma-core/build/bin/run_tests.py -k test_resize_cq; then
>    success=$((success+1))
>        else
>    failure=$((failure+1))
>        fi
> done
> echo success=$success failure=$failure



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-11 13:48     ` Zhu Yanjun
@ 2025-08-12 15:54       ` Daisuke Matsuda
  2025-08-14  5:33         ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: Daisuke Matsuda @ 2025-08-12 15:54 UTC (permalink / raw)
  To: Zhu Yanjun, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On 2025/08/11 22:48, Zhu Yanjun wrote:
> On 2025/8/10 22:26, Philipp Reisner wrote:
>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>
>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
>>>>
>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>> implementation to call ib_poll_cq().
>>>>
>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>> holding cq->cq_lock.
>>>>
>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>
>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>> Test resize CQ, start with specific value and then increase and decrease
>>> ----------------------------------------------------------------------
>>> Traceback (most recent call last):
>>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>>>       u.poll_cq(self.client.cq)
>>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>       wcs = _poll_cq(cq, count, data)
>>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>>> remaining)
>>>
>>>> After I applied your patch in kernel v6.16, I got the above errors.
>>>
>>> Zhu Yanjun
>>>
>>
>> Hello Zhu,
>>
>> When I run the test_resize_cq test in a loop (100 runs each) on the
>> original code and with my patch, I get about the same failure rate.
> 
> Add Daisuke Matsuda
> 
> If I remember it correctly, when Daisuke and I discussed ODP patches, we both made tests with rxe, from our tests results, it seems that this test_resize_cq error does not occur.

Hi Zhu and Philipp,

As far as I know, this error has been present for some time.
It might be possible to investigate further by capturing a memory dump while the polling is stuck, but I have not had time to do that yet.
At least, I can confirm that this is not a regression caused by Philipp's patch.

Thanks,
Daisuke



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-12 15:54       ` Daisuke Matsuda
@ 2025-08-14  5:33         ` Zhu Yanjun
  2025-08-14 14:07           ` Daisuke Matsuda
  0 siblings, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-14  5:33 UTC (permalink / raw)
  To: Daisuke Matsuda, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On 2025/8/12 8:54, Daisuke Matsuda wrote:
> On 2025/08/11 22:48, Zhu Yanjun wrote:
>> On 2025/8/10 22:26, Philipp Reisner wrote:
>>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>
>>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock 
>>>>> deadlock.
>>>>>
>>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>>> implementation to call ib_poll_cq().
>>>>>
>>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>> holding cq->cq_lock.
>>>>>
>>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>>
>>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>>> Test resize CQ, start with specific value and then increase and 
>>>> decrease
>>>> ----------------------------------------------------------------------
>>>> Traceback (most recent call last):
>>>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in 
>>>> test_resize_cq
>>>>       u.poll_cq(self.client.cq)
>>>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>>       wcs = _poll_cq(cq, count, data)
>>>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs 
>>>> remaining)')
>>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>>>> remaining)
>>>>
>>>> After I applied your patch in kernel v6.16, I got the above errors.
>>>>
>>>> Zhu Yanjun
>>>>
>>>
>>> Hello Zhu,
>>>
>>> When I run the test_resize_cq test in a loop (100 runs each) on the
>>> original code and with my patch, I get about the same failure rate.
>>
>> Add Daisuke Matsuda
>>
>> If I remember it correctly, when Daisuke and I discussed ODP patches, 
>> we both made tests with rxe, from our tests results, it seems that 
>> this test_resize_cq error does not occur.
> 
> Hi Zhu and Philipp,
> 
> As far as I know, this error has been present for some time.
> It might be possible to investigate further by capturing a memory dump 
> while the polling is stuck, but I have not had time to do that yet.
> At least, I can confirm that this is not a regression caused by 
> Philipp's patch.

Hi, Daisuke

Thanks a lot. I’m now able to consistently reproduce this problem. I 
have created a commit here: 
https://github.com/zhuyj/linux/commit/8db3abc00bf49cac6ea1d5718d28c6516c94fb4e.

After applying this commit, I ran test_resize_cq 10,000 times, and the 
problem did not occur.

I’m not sure if there’s a better way to fix this issue. If anyone has a 
better solution, please share it.

Thanks a lot.
Zhu Yanjun

> 
> Thanks,
> Daisuke
> 



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-14  5:33         ` Zhu Yanjun
@ 2025-08-14 14:07           ` Daisuke Matsuda
       [not found]             ` <3cb43241-20d7-4ac9-b055-373fd058b3a3@linux.dev>
  2025-08-15 18:29             ` Yanjun.Zhu
  0 siblings, 2 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2025-08-14 14:07 UTC (permalink / raw)
  To: Zhu Yanjun, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On 2025/08/14 14:33, Zhu Yanjun wrote:
> On 2025/8/12 8:54, Daisuke Matsuda wrote:
>> On 2025/08/11 22:48, Zhu Yanjun wrote:
>>> On 2025/8/10 22:26, Philipp Reisner wrote:
>>>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>>
>>>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
>>>>>>
>>>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>>>> implementation to call ib_poll_cq().
>>>>>>
>>>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>>> holding cq->cq_lock.
>>>>>>
>>>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>>>
>>>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>>>> Test resize CQ, start with specific value and then increase and decrease
>>>>> ----------------------------------------------------------------------
>>>>> Traceback (most recent call last):
>>>>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>>>>>       u.poll_cq(self.client.cq)
>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>>>       wcs = _poll_cq(cq, count, data)
>>>>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
>>>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>>>>> remaining)
>>>>>
>>>>> After I applied your patch in kernel v6.16, I got the above errors.
>>>>>
>>>>> Zhu Yanjun
>>>>>
>>>>
>>>> Hello Zhu,
>>>>
>>>> When I run the test_resize_cq test in a loop (100 runs each) on the
>>>> original code and with my patch, I get about the same failure rate.
>>>
>>> Add Daisuke Matsuda
>>>
>>> If I remember it correctly, when Daisuke and I discussed ODP patches, we both made tests with rxe, from our tests results, it seems that this test_resize_cq error does not occur.
>>
>> Hi Zhu and Philipp,
>>
>> As far as I know, this error has been present for some time.
>> It might be possible to investigate further by capturing a memory dump while the polling is stuck, but I have not had time to do that yet.
>> At least, I can confirm that this is not a regression caused by Philipp's patch.
> 
> Hi, Daisuke
> 
> Thanks a lot. I’m now able to consistently reproduce this problem. I have created a commit here: https://github.com/zhuyj/linux/commit/8db3abc00bf49cac6ea1d5718d28c6516c94fb4e.
> 
> After applying this commit, I ran test_resize_cq 10,000 times, and the problem did not occur.
> 
> I’m not sure if there’s a better way to fix this issue. If anyone has a better solution, please share it.

Hi Zhu,

Thank you very much for the investigation.

I agree that the issue can be worked around by adding a delay in the rxe completer path.
However, since the issue is easily reproducible, introducing an explicit sleep might
add unnecessary overhead. I think a short busy-wait would be a more desirable alternative.

The intermediate change below does make the issue disappear on my node, but I don't think
this is a complete solution. In particular, it appears that ibcq->event_handler() —
typically ib_uverbs_cq_event_handler() — is not re-entrant, so simply spinning like this
could be risky.

===
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index a5b2b62f596b..a10a173e53cf 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -454,7 +454,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
         queue_advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);

         if (post)
-               rxe_cq_post(qp->scq, &cqe, 0);
+               while (rxe_cq_post(qp->scq, &cqe, 0) == -EBUSY);

         if (wqe->wr.opcode == IB_WR_SEND ||
             wqe->wr.opcode == IB_WR_SEND_WITH_IMM ||
===

If you agree with this direction, I can take some time in the next week or so to make a
formal patch. Of course, you are welcome to take over this idea if you prefer.

Thanks,
Daisuke

> 
> Thanks a lot.
> Zhu Yanjun
> 
>>
>> Thanks,
>> Daisuke
>>
> 



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
       [not found]               ` <2e645d1c-f853-4cee-9590-6f01820d027b@linux.dev>
@ 2025-08-15  4:25                 ` Zhu Yanjun
  0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-15  4:25 UTC (permalink / raw)
  To: Daisuke Matsuda, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

>>>
>>> ===
>>> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c 
>>> b/drivers/infiniband/sw/rxe/rxe_comp.c
>>> index a5b2b62f596b..a10a173e53cf 100644
>>> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
>>> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
>>> @@ -454,7 +454,7 @@ static void do_complete(struct rxe_qp *qp, 
>>> struct rxe_send_wqe *wqe)
>>>         queue_advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>>>
>>>         if (post)
>>> -               rxe_cq_post(qp->scq, &cqe, 0);
>>> +               while (rxe_cq_post(qp->scq, &cqe, 0) == -EBUSY);

The mailing list just notified me that this mail was not sent 
successfully, so I am resending it now.


Did you run tests with your commit on your local hosts?


I applied your commit and ran the tests 10,000 times on my local host.

Sometimes problems occurred. I am not sure whether they are related to 
this commit.

I will investigate this problem further.

Yanjun.Zhu


* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-14 14:07           ` Daisuke Matsuda
       [not found]             ` <3cb43241-20d7-4ac9-b055-373fd058b3a3@linux.dev>
@ 2025-08-15 18:29             ` Yanjun.Zhu
  2025-08-16 15:57               ` Daisuke Matsuda
  1 sibling, 1 reply; 13+ messages in thread
From: Yanjun.Zhu @ 2025-08-15 18:29 UTC (permalink / raw)
  To: Daisuke Matsuda, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel


On 8/14/25 7:07 AM, Daisuke Matsuda wrote:
> On 2025/08/14 14:33, Zhu Yanjun wrote:
>> On 2025/8/12 8:54, Daisuke Matsuda wrote:
>>> On 2025/08/11 22:48, Zhu Yanjun wrote:
>>>> On 2025/8/10 22:26, Philipp Reisner wrote:
>>>>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> 
>>>>> wrote:
>>>>>>
>>>>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>>>>> Allow the comp_handler callback implementation to call 
>>>>>>> ib_poll_cq().
>>>>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe 
>>>>>>> driver.
>>>>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock 
>>>>>>> deadlock.
>>>>>>>
>>>>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>>>>> implementation to call ib_poll_cq().
>>>>>>>
>>>>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>>>> holding cq->cq_lock.
>>>>>>>
>>>>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>>>>
>>>>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>>>>> Test resize CQ, start with specific value and then increase and 
>>>>>> decrease
>>>>>> ---------------------------------------------------------------------- 
>>>>>>
>>>>>> Traceback (most recent call last):
>>>>>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in 
>>>>>> test_resize_cq
>>>>>>       u.poll_cq(self.client.cq)
>>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>>>>       wcs = _poll_cq(cq, count, data)
>>>>>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>>>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs 
>>>>>> remaining)')
>>>>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>>>>>> remaining)
>>>>>>
>>>>>> After I applied your patch in kernel v6.16, I got the above errors.
>>>>>>
>>>>>> Zhu Yanjun
>>>>>>
>>>>>
>>>>> Hello Zhu,
>>>>>
>>>>> When I run the test_resize_cq test in a loop (100 runs each) on the
>>>>> original code and with my patch, I get about the same failure rate.
>>>>
>>>> Add Daisuke Matsuda
>>>>
>>>> If I remember it correctly, when Daisuke and I discussed ODP 
>>>> patches, we both made tests with rxe, from our tests results, it 
>>>> seems that this test_resize_cq error does not occur.
>>>
>>> Hi Zhu and Philipp,
>>>
>>> As far as I know, this error has been present for some time.
>>> It might be possible to investigate further by capturing a memory 
>>> dump while the polling is stuck, but I have not had time to do that 
>>> yet.
>>> At least, I can confirm that this is not a regression caused by 
>>> Philipp's patch.
>>
>> Hi, Daisuke
>>
>> Thanks a lot. I’m now able to consistently reproduce this problem. I 
>> have created a commit here: 
>> https://github.com/zhuyj/linux/commit/8db3abc00bf49cac6ea1d5718d28c6516c94fb4e.
>>
>> After applying this commit, I ran test_resize_cq 10,000 times, and 
>> the problem did not occur.
>>
>> I’m not sure if there’s a better way to fix this issue. If anyone has 
>> a better solution, please share it.
>
> Hi Zhu,
>
> Thank you very much for the investigation.
>
> I agree that the issue can be worked around by adding a delay in the 
> rxe completer path.
> However, since the issue is easily reproducible, introducing an 
> explicit sleep might
> add unnecessary overhead. I think a short busy-wait would be a more 
> desirable alternative.
>
> The intermediate change below does make the issue disappear on my 
> node, but I don't think
> this is a complete solution. In particular, it appears that 
> ibcq->event_handler() —
> typically ib_uverbs_cq_event_handler() — is not re-entrant, so simply 
> spinning like this
> could be risky.
>
> ===
> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c 
> b/drivers/infiniband/sw/rxe/rxe_comp.c
> index a5b2b62f596b..a10a173e53cf 100644
> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
> @@ -454,7 +454,7 @@ static void do_complete(struct rxe_qp *qp, struct 
> rxe_send_wqe *wqe)
>         queue_advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>
>         if (post)
> -               rxe_cq_post(qp->scq, &cqe, 0);
> +               while (rxe_cq_post(qp->scq, &cqe, 0) == -EBUSY);
>
>         if (wqe->wr.opcode == IB_WR_SEND ||
>             wqe->wr.opcode == IB_WR_SEND_WITH_IMM ||
> ===
>
> If you agree with this direction, I can take some time in the next 
> week or so to make a
> formal patch. Of course, you are welcome to take over this idea if you 
> prefer.


Thanks for building on top of my earlier proposal and for sharing your 
findings. I appreciate the effort you’ve put into exploring this approach.

That said, there are a few concerns we should address before moving forward:

Busy-wait duration – It’s difficult to guarantee that the busy-wait will 
remain short. If it lasts too long, we may hit the “CPU is locked for 
too long” warnings, which could impact system responsiveness.

Placement of busy-wait – The current implementation adds the busy-wait 
after -EBUSY is returned, rather than at the exact location where it 
occurs. This may hide the actual contention source and could introduce 
side effects in other paths.

Reentrancy risk – Since ibcq->event_handler() (usually 
ib_uverbs_cq_event_handler()) is not re-entrant, spinning inside it 
could be risky and lead to subtle bugs.

I think the idea of avoiding a full sleep makes sense, but perhaps we 
could look into alternative approaches — for example, an adaptive delay 
mechanism or handling the -EBUSY condition closer to where it originates.

If you’re interested in pursuing this further, I’d be happy to review an 
updated patch. I believe this direction still has potential if we 
address the above points.

Thanks again for your contribution, and I look forward to your next version.
Best regards,

Yanjun.Zhu

>
> Thanks,
> Daisuke
>
>>
>> Thanks a lot.
>> Zhu Yanjun
>>
>>>
>>> Thanks,
>>> Daisuke
>>>
>>
>


* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-15 18:29             ` Yanjun.Zhu
@ 2025-08-16 15:57               ` Daisuke Matsuda
  0 siblings, 0 replies; 13+ messages in thread
From: Daisuke Matsuda @ 2025-08-16 15:57 UTC (permalink / raw)
  To: Yanjun.Zhu, Philipp Reisner
  Cc: Zhu Yanjun, Jason Gunthorpe, Leon Romanovsky, linux-rdma,
	linux-kernel

On 2025/08/16 3:29, Yanjun.Zhu wrote:
> 
> On 8/14/25 7:07 AM, Daisuke Matsuda wrote:
>> On 2025/08/14 14:33, Zhu Yanjun wrote:
>>> On 2025/8/12 8:54, Daisuke Matsuda wrote:
>>>> On 2025/08/11 22:48, Zhu Yanjun wrote:
>>>>> On 2025/8/10 22:26, Philipp Reisner wrote:
>>>>>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>>>>
>>>>>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>>>>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>>>>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>>>>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
>>>>>>>>
>>>>>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>>>>>> implementation to call ib_poll_cq().
>>>>>>>>
>>>>>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>>>>> holding cq->cq_lock.
>>>>>>>>
>>>>>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>>>>>
>>>>>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>>>>>> Test resize CQ, start with specific value and then increase and decrease
>>>>>>> ----------------------------------------------------------------------
>>>>>>> Traceback (most recent call last):
>>>>>>>     File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>>>>>>>       u.poll_cq(self.client.cq)
>>>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>>>>>       wcs = _poll_cq(cq, count, data)
>>>>>>>             ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>>>     File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>>>>>       raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
>>>>>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs
>>>>>>> remaining)
>>>>>>>
>>>>>>> After I applied your patch in kernel v6.16, I got the above errors.
>>>>>>>
>>>>>>> Zhu Yanjun
>>>>>>>
>>>>>>
>>>>>> Hello Zhu,
>>>>>>
>>>>>> When I run the test_resize_cq test in a loop (100 runs each) on the
>>>>>> original code and with my patch, I get about the same failure rate.
>>>>>
>>>>> Add Daisuke Matsuda
>>>>>
>>>>> If I remember correctly, when Daisuke and I discussed the ODP patches we both ran tests with rxe, and from our test results this test_resize_cq error did not occur.
>>>>
>>>> Hi Zhu and Philipp,
>>>>
>>>> As far as I know, this error has been present for some time.
>>>> It might be possible to investigate further by capturing a memory dump while the polling is stuck, but I have not had time to do that yet.
>>>> At least, I can confirm that this is not a regression caused by Philipp's patch.
>>>
>>> Hi, Daisuke
>>>
>>> Thanks a lot. I’m now able to consistently reproduce this problem. I have created a commit here: https://github.com/zhuyj/linux/commit/8db3abc00bf49cac6ea1d5718d28c6516c94fb4e.
>>>
>>> After applying this commit, I ran test_resize_cq 10,000 times, and the problem did not occur.
>>>
>>> I’m not sure if there’s a better way to fix this issue. If anyone has a better solution, please share it.
>>
>> Hi Zhu,
>>
>> Thank you very much for the investigation.
>>
>> I agree that the issue can be worked around by adding a delay in the rxe completer path.
>> However, since the issue is easily reproducible, introducing an explicit sleep might
>> add unnecessary overhead. I think a short busy-wait would be a more desirable alternative.
>>
>> The intermediate change below does make the issue disappear on my node, but I don't think
>> this is a complete solution. In particular, it appears that ibcq->event_handler() —
>> typically ib_uverbs_cq_event_handler() — is not re-entrant, so simply spinning like this
>> could be risky.
>>
>> ===
>> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
>> index a5b2b62f596b..a10a173e53cf 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
>> @@ -454,7 +454,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
>>         queue_advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
>>
>>         if (post)
>> -               rxe_cq_post(qp->scq, &cqe, 0);
>> +               while (rxe_cq_post(qp->scq, &cqe, 0) == -EBUSY);
>>
>>         if (wqe->wr.opcode == IB_WR_SEND ||
>>             wqe->wr.opcode == IB_WR_SEND_WITH_IMM ||
>> ===
>>
>> If you agree with this direction, I can take some time in the next week or so to make a
>> formal patch. Of course, you are welcome to take over this idea if you prefer.
> 
> 
> Thanks for building on top of my earlier proposal and for sharing your findings. I appreciate the effort you’ve put into exploring this approach.
> 
> That said, there are a few concerns we should address before moving forward:
> 
> Busy-wait duration – It’s difficult to guarantee that the busy-wait will remain short. If it lasts too long, we may hit the “CPU is locked for too long” warnings, which could impact system responsiveness.
> 
> Placement of busy-wait – The current implementation adds the busy-wait after -EBUSY is returned, rather than at the exact location where it occurs. This may hide the actual contention source and could introduce side effects in other paths.
> 
> Reentrancy risk – Since ibcq->event_handler() (usually ib_uverbs_cq_event_handler()) is not re-entrant, spinning inside it could be risky and lead to subtle bugs.
> 
> I think the idea of avoiding a full sleep makes sense, but perhaps we could look into alternative approaches — for example, an adaptive delay mechanism or handling the -EBUSY condition closer to where it originates.
> 
> If you’re interested in pursuing this further, I’d be happy to review an updated patch. I believe this direction still has potential if we address the above points.

Other parts with a similar queueing implementation do not present any problem,
so adding "an adaptive delay mechanism" only inside rxe_cq_post() looks like the best solution.

I will look into it and submit a patch.

Thanks,
Daisuke


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-06 12:39 [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock Philipp Reisner
  2025-08-07  1:09 ` Zhu Yanjun
@ 2025-08-19  2:37 ` Zhu Yanjun
  2025-08-19 17:24   ` Philipp Reisner
  1 sibling, 1 reply; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-19  2:37 UTC (permalink / raw)
  To: Philipp Reisner, Zhu Yanjun
  Cc: Jason Gunthorpe, Leon Romanovsky, linux-rdma, linux-kernel

在 2025/8/6 5:39, Philipp Reisner 写道:
> Allow the comp_handler callback implementation to call ib_poll_cq().
> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
> 
> The Mellanox and Intel drivers allow a comp_handler callback
> implementation to call ib_poll_cq().
> 
> Avoid the deadlock by calling the comp_handler callback without
> holding cq->cq_lock.
> 
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_cq.c | 12 +++++++-----
>   1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index fffd144d509e..1195e109f89b 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	int full;
>   	void *addr;
>   	unsigned long flags;
> +	u8 notify;
>   
>   	spin_lock_irqsave(&cq->cq_lock, flags);
>   
> @@ -110,14 +111,15 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   
>   	queue_advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>   
> -	if ((cq->notify & IB_CQ_NEXT_COMP) ||
> -	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
> -		cq->notify = 0;
> -		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> -	}
> +	notify = cq->notify;
> +	cq->notify = 0;

Thanks a lot. In the original source code, cq->notify is set to 0 in the 
following test cases:

if ((cq->notify & IB_CQ_NEXT_COMP) ||
	    (cq->notify & IB_CQ_SOLICITED && solicited))

but in your commit cq->notify is cleared unconditionally, outside that test.
I am not sure whether this will introduce any risk.

I am fine with the rest.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

>   
>   	spin_unlock_irqrestore(&cq->cq_lock, flags);
>   
> +	if ((notify & IB_CQ_NEXT_COMP) ||
> +	    (notify & IB_CQ_SOLICITED && solicited))
> +		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +
>   	return 0;
>   }
>   



* [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-19  2:37 ` Zhu Yanjun
@ 2025-08-19 17:24   ` Philipp Reisner
  2025-08-22  2:54     ` Zhu Yanjun
  0 siblings, 1 reply; 13+ messages in thread
From: Philipp Reisner @ 2025-08-19 17:24 UTC (permalink / raw)
  To: yanjun.zhu
  Cc: jgg, leon, linux-kernel, linux-rdma, philipp.reisner, zyjzyj2000

Allow the comp_handler callback implementation to call ib_poll_cq().
A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.

The Mellanox and Intel drivers allow a comp_handler callback
implementation to call ib_poll_cq().

Avoid the deadlock by calling the comp_handler callback without
holding cq->cq_lock.

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
---
 drivers/infiniband/sw/rxe/rxe_cq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index fffd144d509e..95652001665d 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	int full;
 	void *addr;
 	unsigned long flags;
+	bool invoke_handler = false;
 
 	spin_lock_irqsave(&cq->cq_lock, flags);
 
@@ -113,11 +114,14 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	if ((cq->notify & IB_CQ_NEXT_COMP) ||
 	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
 		cq->notify = 0;
-		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+		invoke_handler = true;
 	}
 
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
+	if (invoke_handler)
+		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
+
 	return 0;
 }
 
-- 
2.50.1



* Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
  2025-08-19 17:24   ` Philipp Reisner
@ 2025-08-22  2:54     ` Zhu Yanjun
  0 siblings, 0 replies; 13+ messages in thread
From: Zhu Yanjun @ 2025-08-22  2:54 UTC (permalink / raw)
  To: Philipp Reisner; +Cc: jgg, leon, linux-kernel, linux-rdma, zyjzyj2000

在 2025/8/19 10:24, Philipp Reisner 写道:
> Allow the comp_handler callback implementation to call ib_poll_cq().
> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
> 
> The Mellanox and Intel drivers allow a comp_handler callback
> implementation to call ib_poll_cq().
> 
> Avoid the deadlock by calling the comp_handler callback without
> holding cq->cq_lock.
> 
> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
> ---

The new patch should be sent in a new mail thread; it is not appropriate 
to reply to the old thread.

Additionally:

The subject line does not include a version number.

The commit log of the new patch does not contain a changelog.

Other than these issues, I am fine with this commit.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>

Zhu Yanjun

>   drivers/infiniband/sw/rxe/rxe_cq.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
> index fffd144d509e..95652001665d 100644
> --- a/drivers/infiniband/sw/rxe/rxe_cq.c
> +++ b/drivers/infiniband/sw/rxe/rxe_cq.c
> @@ -88,6 +88,7 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	int full;
>   	void *addr;
>   	unsigned long flags;
> +	bool invoke_handler = false;
>   
>   	spin_lock_irqsave(&cq->cq_lock, flags);
>   
> @@ -113,11 +114,14 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
>   	if ((cq->notify & IB_CQ_NEXT_COMP) ||
>   	    (cq->notify & IB_CQ_SOLICITED && solicited)) {
>   		cq->notify = 0;
> -		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +		invoke_handler = true;
>   	}
>   
>   	spin_unlock_irqrestore(&cq->cq_lock, flags);
>   
> +	if (invoke_handler)
> +		cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
> +
>   	return 0;
>   }
>   



end of thread, other threads:[~2025-08-22  2:55 UTC | newest]

Thread overview: 13+ messages
-- links below jump to the message on this page --
2025-08-06 12:39 [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock Philipp Reisner
2025-08-07  1:09 ` Zhu Yanjun
2025-08-11  5:26   ` Philipp Reisner
2025-08-11 13:48     ` Zhu Yanjun
2025-08-12 15:54       ` Daisuke Matsuda
2025-08-14  5:33         ` Zhu Yanjun
2025-08-14 14:07           ` Daisuke Matsuda
     [not found]             ` <3cb43241-20d7-4ac9-b055-373fd058b3a3@linux.dev>
     [not found]               ` <2e645d1c-f853-4cee-9590-6f01820d027b@linux.dev>
2025-08-15  4:25                 ` Zhu Yanjun
2025-08-15 18:29             ` Yanjun.Zhu
2025-08-16 15:57               ` Daisuke Matsuda
2025-08-19  2:37 ` Zhu Yanjun
2025-08-19 17:24   ` Philipp Reisner
2025-08-22  2:54     ` Zhu Yanjun
