public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Yanjun Zhu <yanjun.zhu@linux.dev>
To: Bob Pearson <rpearsonhpe@gmail.com>,
	jgg@nvidia.com, zyjzyj2000@gmail.com, linux-rdma@vger.kernel.org
Subject: Re: [PATCH for-rc] RDMA/rxe: Fix rnr retry behavior
Date: Fri, 13 May 2022 21:38:52 +0800	[thread overview]
Message-ID: <36f6e476-9762-6d39-e167-abb8dcc9f2bb@linux.dev> (raw)
In-Reply-To: <e587f531-0650-1548-1fe0-04d0152a5082@linux.dev>



在 2022/5/13 10:40, Yanjun Zhu 写道:
> 在 2022/5/13 3:49, Bob Pearson 写道:
>> Currently the completer tasklet when it sets the retransmit timer or the
>> nak timer sets the same flag (qp->req.need_retry) so that if either
>> timer fires it will attempt to perform a retry flow on the send queue.
>> This has the effect of responding to an RNR NAK at the first retransmit
>> timer event which does not allow for the requested rnr timeout.
>>
>> This patch adds a new flag (qp->req.need_rnr_timeout) which, if set,
>> prevents a retry flow until the rnr nak timer fires.
>>
>> This patch fixes rnr retry errors which can be observed by running the
>> pyverbs test suite 50-100X. With this patch applied they do not occur.
> 
> Do you mean that running run_tests.py for 50-100times can reproduce this 
> bug? I want to reproduce this problem.

Running run_tests.py for 50-100times, I can not reproduce this problem

Zhu Yanjun

> 
> Thanks a lot.
> Zhu Yanjun
> 
>>
>> Link: 
>> https://lore.kernel.org/linux-rdma/a8287823-1408-4273-bc22-99a0678db640@gmail.com/ 
>>
>> Fixes: 8700e3e7c485 ("Soft RoCE (RXE) - The software RoCE driver")
>> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_comp.c  | 4 +---
>>   drivers/infiniband/sw/rxe/rxe_qp.c    | 1 +
>>   drivers/infiniband/sw/rxe/rxe_req.c   | 6 ++++--
>>   drivers/infiniband/sw/rxe/rxe_verbs.h | 1 +
>>   4 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c 
>> b/drivers/infiniband/sw/rxe/rxe_comp.c
>> index 138b3e7d3a5f..bc668cb211b1 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
>> @@ -733,9 +733,7 @@ int rxe_completer(void *arg)
>>                   if (qp->comp.rnr_retry != 7)
>>                       qp->comp.rnr_retry--;
>> -                qp->req.need_retry = 1;
>> -                pr_debug("qp#%d set rnr nak timer\n",
>> -                     qp_num(qp));
>> +                qp->req.need_rnr_timeout = 1;
>>                   mod_timer(&qp->rnr_nak_timer,
>>                         jiffies + rnrnak_jiffies(aeth_syn(pkt)
>>                           & ~AETH_TYPE_MASK));
>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c 
>> b/drivers/infiniband/sw/rxe/rxe_qp.c
>> index 62acf890af6c..1c962468714e 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>> @@ -513,6 +513,7 @@ static void rxe_qp_reset(struct rxe_qp *qp)
>>       atomic_set(&qp->ssn, 0);
>>       qp->req.opcode = -1;
>>       qp->req.need_retry = 0;
>> +    qp->req.need_rnr_timeout = 0;
>>       qp->req.noack_pkts = 0;
>>       qp->resp.msn = 0;
>>       qp->resp.opcode = -1;
>> diff --git a/drivers/infiniband/sw/rxe/rxe_req.c 
>> b/drivers/infiniband/sw/rxe/rxe_req.c
>> index ae5fbc79dd5c..770ae4279f73 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_req.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_req.c
>> @@ -103,7 +103,8 @@ void rnr_nak_timer(struct timer_list *t)
>>   {
>>       struct rxe_qp *qp = from_timer(qp, t, rnr_nak_timer);
>> -    pr_debug("qp#%d rnr nak timer fired\n", qp_num(qp));
>> +    qp->req.need_retry = 1;
>> +    qp->req.need_rnr_timeout = 0;
>>       rxe_run_task(&qp->req.task, 1);
>>   }
>> @@ -624,10 +625,11 @@ int rxe_requester(void *arg)
>>           qp->req.need_rd_atomic = 0;
>>           qp->req.wait_psn = 0;
>>           qp->req.need_retry = 0;
>> +        qp->req.need_rnr_timeout = 0;
>>           goto exit;
>>       }
>> -    if (unlikely(qp->req.need_retry)) {
>> +    if (unlikely(qp->req.need_retry && !qp->req.need_rnr_timeout)) {
>>           req_retry(qp);
>>           qp->req.need_retry = 0;
>>       }
>> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h 
>> b/drivers/infiniband/sw/rxe/rxe_verbs.h
>> index e7eff1ca75e9..ab3186478c3f 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_verbs.h
>> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
>> @@ -123,6 +123,7 @@ struct rxe_req_info {
>>       int            need_rd_atomic;
>>       int            wait_psn;
>>       int            need_retry;
>> +    int            need_rnr_timeout;
>>       int            noack_pkts;
>>       struct rxe_task        task;
>>   };
>>
>> base-commit: c5eb0a61238dd6faf37f58c9ce61c9980aaffd7a
> 

  reply	other threads:[~2022-05-13 13:50 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-12 19:49 [PATCH for-rc] RDMA/rxe: Fix rnr retry behavior Bob Pearson
2022-05-13  2:40 ` Yanjun Zhu
2022-05-13 13:38   ` Yanjun Zhu [this message]
2022-05-13 15:42     ` Bob Pearson
2022-05-13 18:22     ` Bob Pearson
2022-05-14  2:30       ` Yanjun Zhu
2022-05-14  3:25         ` Bob Pearson
2022-05-13 13:04 ` Tom Talpey
2022-05-13 15:33   ` Bob Pearson
2022-05-13 17:43     ` Tom Talpey
2022-05-13 18:32       ` Bob Pearson
2022-05-13 19:08         ` Tom Talpey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=36f6e476-9762-6d39-e167-abb8dcc9f2bb@linux.dev \
    --to=yanjun.zhu@linux.dev \
    --cc=jgg@nvidia.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=rpearsonhpe@gmail.com \
    --cc=zyjzyj2000@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox