From: Daisuke Matsuda <dskmtsd@gmail.com>
To: Zhu Yanjun <yanjun.zhu@linux.dev>,
Philipp Reisner <philipp.reisner@linbit.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>, Jason Gunthorpe <jgg@ziepe.ca>,
Leon Romanovsky <leon@kernel.org>,
linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rdma_rxe: call comp_handler without holding cq->cq_lock
Date: Thu, 14 Aug 2025 23:07:33 +0900
Message-ID: <620f8611-1e95-4ebd-9db2-eb7231cfb3f2@gmail.com>
In-Reply-To: <885bb38c-4108-4fa2-a6d2-1e60d5e84af9@linux.dev>
On 2025/08/14 14:33, Zhu Yanjun wrote:
> On 2025/8/12 8:54, Daisuke Matsuda wrote:
>> On 2025/08/11 22:48, Zhu Yanjun wrote:
>>> On 2025/8/10 22:26, Philipp Reisner wrote:
>>>> On Thu, Aug 7, 2025 at 3:09 AM Zhu Yanjun <yanjun.zhu@linux.dev> wrote:
>>>>>
>>>>> On 2025/8/6 5:39, Philipp Reisner wrote:
>>>>>> Allow the comp_handler callback implementation to call ib_poll_cq().
>>>>>> A call to ib_poll_cq() calls rxe_poll_cq() with the rdma_rxe driver.
>>>>>> And rxe_poll_cq() locks cq->cq_lock. That leads to a spinlock deadlock.
>>>>>>
>>>>>> The Mellanox and Intel drivers allow a comp_handler callback
>>>>>> implementation to call ib_poll_cq().
>>>>>>
>>>>>> Avoid the deadlock by calling the comp_handler callback without
>>>>>> holding cq->cq_lock.
>>>>>>
>>>>>> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
>>>>>
>>>>> ERROR: test_resize_cq (tests.test_cq.CQTest.test_resize_cq)
>>>>> Test resize CQ, start with specific value and then increase and decrease
>>>>> ----------------------------------------------------------------------
>>>>> Traceback (most recent call last):
>>>>>   File "/root/deb/rdma-core/tests/test_cq.py", line 135, in test_resize_cq
>>>>>     u.poll_cq(self.client.cq)
>>>>>   File "/root/deb/rdma-core/tests/utils.py", line 687, in poll_cq
>>>>>     wcs = _poll_cq(cq, count, data)
>>>>>           ^^^^^^^^^^^^^^^^^^^^^^^^^
>>>>>   File "/root/deb/rdma-core/tests/utils.py", line 669, in _poll_cq
>>>>>     raise PyverbsError(f'Got timeout on polling ({count} CQEs remaining)')
>>>>> pyverbs.pyverbs_error.PyverbsError: Got timeout on polling (1 CQEs remaining)
>>>>>
>>>>> After I applied your patch on kernel v6.16, I got the errors above.
>>>>>
>>>>> Zhu Yanjun
>>>>>
>>>>
>>>> Hello Zhu,
>>>>
>>>> When I run the test_resize_cq test in a loop (100 runs each) on the
>>>> original code and with my patch, I get about the same failure rate.
>>>
>>> Add Daisuke Matsuda
>>>
>>> If I remember correctly, when Daisuke and I discussed the ODP patches, we both ran tests with rxe. Judging from our test results, this test_resize_cq error did not occur.
>>
>> Hi Zhu and Philipp,
>>
>> As far as I know, this error has been present for some time.
>> It might be possible to investigate further by capturing a memory dump while the polling is stuck, but I have not had time to do that yet.
>> At least, I can confirm that this is not a regression caused by Philipp's patch.
>
> Hi, Daisuke
>
> Thanks a lot. I’m now able to consistently reproduce this problem. I have created a commit here: https://github.com/zhuyj/linux/commit/8db3abc00bf49cac6ea1d5718d28c6516c94fb4e.
>
> After applying this commit, I ran test_resize_cq 10,000 times, and the problem did not occur.
>
> I’m not sure if there’s a better way to fix this issue. If anyone has a better solution, please share it.
Hi Zhu,

Thank you very much for the investigation.

I agree that the issue can be worked around by adding a delay in the rxe completer path.
However, since the issue is easily reproducible, introducing an explicit sleep might
add unnecessary overhead. I think a short busy-wait would be a more desirable alternative.

The intermediate change below does make the issue disappear on my node, but I don't think
this is a complete solution. In particular, it appears that ibcq->event_handler()
(typically ib_uverbs_cq_event_handler()) is not re-entrant, so simply spinning like this
could be risky.
===
diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index a5b2b62f596b..a10a173e53cf 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -454,7 +454,7 @@ static void do_complete(struct rxe_qp *qp, struct rxe_send_wqe *wqe)
 	queue_advance_consumer(qp->sq.queue, QUEUE_TYPE_FROM_CLIENT);
 
 	if (post)
-		rxe_cq_post(qp->scq, &cqe, 0);
+		while (rxe_cq_post(qp->scq, &cqe, 0) == -EBUSY);
 
 	if (wqe->wr.opcode == IB_WR_SEND ||
 	    wqe->wr.opcode == IB_WR_SEND_WITH_IMM ||
===
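For illustration only, here is a user-space sketch of how the retry in the diff above could be bounded instead of spinning without limit. It is not kernel code and not a proposed patch: toy_cq, toy_cq_post(), toy_cq_poll_one() and toy_cq_post_bounded() are hypothetical names, the fixed-depth array only imitates a full queue returning -EBUSY, and the consumer callback stands in for a poller making progress elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a completion queue; not the rxe data structure. */
#define TOY_CQ_DEPTH 2

struct toy_cq {
	int entries[TOY_CQ_DEPTH];
	unsigned int count;
};

/* Returns -EBUSY when the queue is full, the condition the loop keys on. */
static int toy_cq_post(struct toy_cq *cq, int cqe)
{
	if (cq->count == TOY_CQ_DEPTH)
		return -EBUSY;
	cq->entries[cq->count++] = cqe;
	return 0;
}

/* Models a consumer polling one entry; returns 1 if an entry was removed. */
static int toy_cq_poll_one(struct toy_cq *cq)
{
	if (cq->count == 0)
		return 0;
	cq->count--;
	return 1;
}

/*
 * Retry while the queue reports -EBUSY, but stop after max_tries attempts
 * so that a consumer which never drains cannot stall the caller forever.
 * The optional consumer callback is an assumption of this sketch: it
 * simulates progress that a real poller would make concurrently.
 */
static int toy_cq_post_bounded(struct toy_cq *cq, int cqe, int max_tries,
			       int (*consumer)(struct toy_cq *))
{
	int ret = -EBUSY;

	assert(max_tries > 0);

	for (int i = 0; i < max_tries; i++) {
		ret = toy_cq_post(cq, cqe);
		if (ret != -EBUSY)
			break;
		if (consumer != NULL)
			consumer(cq);
	}
	return ret;
}
```

With a full queue and no consumer, toy_cq_post_bounded() gives up and returns -EBUSY after max_tries attempts. With toy_cq_poll_one as the consumer, the second attempt succeeds. What the caller should do once the bound is hit (drop, defer, or report an error) is deliberately left open here.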
If you agree with this direction, I can take some time in the next week or so to make a
formal patch. Of course, you are welcome to take over this idea if you prefer.
Thanks,
Daisuke
>
> Thanks a lot.
> Zhu Yanjun
>
>>
>> Thanks,
>> Daisuke
>>
>