From: Zhu Yanjun <yanjun.zhu@linux.dev>
To: Honggang LI <honggangli@163.com>, Greg Sword <gregsword0@gmail.com>
Cc: zyjzyj2000@gmail.com, jgg@ziepe.ca, leon@kernel.org,
	rpearsonhpe@gmail.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH] RDMA/rxe: Restore tasklet call for rxe_cq.c
Date: Thu, 11 Jul 2024 15:46:24 +0200
Message-ID: <ebcebbc3-24c0-4a44-a08a-dc1ef2d1458b@linux.dev>
In-Reply-To: <Zo-DSIrjIGavnuTD@fc39>

On 2024/7/11 9:01, Honggang LI wrote:
> On Thu, Jul 11, 2024 at 11:06:06AM +0800, Greg Sword wrote:
>> Subject: Re: [PATCH] RDMA/rxe: Restore tasklet call for rxe_cq.c
>> From: Greg Sword <gregsword0@gmail.com>
>> Date: Thu, 11 Jul 2024 11:06:06 +0800
>>
>> On Thu, Jul 11, 2024 at 9:41 AM Honggang LI <honggangli@163.com> wrote:
>>>
>>> If ib_req_notify_cq() was called in complete handler, deadlock occurs
>>> in receive path.
>>>
>>> rxe_req_notify_cq+0x21/0x70 [rdma_rxe]
>>> krping_cq_event_handler+0x26f/0x2c0 [rdma_krping]
>>
>> What is rdma_krping? What is the deadlock?
> 
> https://github.com/larrystevenwise/krping.git
> 
>> Please explain the deadlock in details.

I read the discussion carefully. My comments are as follows:
1). This problem does not come from RXE. It seems to be related to the
krping module, so the root cause is not in RXE and it is not good to fix
this problem in RXE.

2). In the upstream kernel, tasklets are marked obsolete and have some
design flaws. So replacing the workqueue with a tasklet in RXE would not
keep up with the upstream kernel.

https://patchwork.kernel.org/project/linux-rdma/cover/20240621050525.3720069-1-allen.lkml@gmail.com/
This series is part of the ongoing work to replace tasklets with BH
workqueues. As such, it is not good to replace the workqueue with a tasklet.

Based on the above, I cannot agree with this patch for now. These are just
my 2 cents.

I am not sure whether others have better suggestions about this commit.

Zhu Yanjun
> 
> int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
> {
>         struct ib_event ev;
>         int full;
>         void *addr;
>         unsigned long flags;
>
>         spin_lock_irqsave(&cq->cq_lock, flags);  // Lock!
>         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>         full = queue_full(cq->queue, QUEUE_TYPE_TO_CLIENT);
>         if (unlikely(full)) {
>                 rxe_err_cq(cq, "queue full\n");
>                 spin_unlock_irqrestore(&cq->cq_lock, flags);
>                 if (cq->ibcq.event_handler) {
>                         ev.device = cq->ibcq.device;
>                         ev.element.cq = &cq->ibcq;
>                         ev.event = IB_EVENT_CQ_ERR;
>                         cq->ibcq.event_handler(&ev, cq->ibcq.cq_context);
>                 }
>
>                 return -EBUSY;
>         }
>
>         addr = queue_producer_addr(cq->queue, QUEUE_TYPE_TO_CLIENT);
>         memcpy(addr, cqe, sizeof(*cqe));
>
>         queue_advance_producer(cq->queue, QUEUE_TYPE_TO_CLIENT);
>
>         if ((cq->notify & IB_CQ_NEXT_COMP) ||
>             (cq->notify & IB_CQ_SOLICITED && solicited)) {
>                 cq->notify = 0;
>                 cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>                 call the complete handler   krping_cq_event_handler()
>         }
>
>         spin_unlock_irqrestore(&cq->cq_lock, flags);
> 
> 
> 
> static void krping_cq_event_handler(struct ib_cq *cq, void *ctx)
> {
>          struct krping_cb *cb = ctx;
>          struct ib_wc wc;
>          const struct ib_recv_wr *bad_wr;
>          int ret;
> 
>          BUG_ON(cb->cq != cq);
>          if (cb->state == ERROR) {
>                  printk(KERN_ERR PFX "cq completion in ERROR state\n");
>                  return;
>          }
>          if (cb->frtest) {
>                  printk(KERN_ERR PFX "cq completion event in frtest!\n");
>                  return;
>          }
>          if (!cb->wlat && !cb->rlat && !cb->bw)
>                  ib_req_notify_cq(cb->cq, IB_CQ_NEXT_COMP);
> 		^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>          while ((ret = ib_poll_cq(cb->cq, 1, &wc)) == 1) {
>                  if (wc.status) {
> 
> static int rxe_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
> {
>          struct rxe_cq *cq = to_rcq(ibcq);
>          int ret = 0;
>          int empty;
>          unsigned long irq_flags;
> 
>          spin_lock_irqsave(&cq->cq_lock, irq_flags);
> 	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Deadlock
> 

