From: Greg KH <gregkh@linuxfoundation.org>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org, linux-rdma@vger.kernel.org,
linux-nfs@vger.kernel.org
Subject: Re: [PATCH] xprtrdma: Fix disconnect regression
Date: Tue, 28 Aug 2018 07:19:42 +0200
Message-ID: <20180828051942.GE2107@kroah.com>
In-Reply-To: <20180827232321.12635.40263.stgit@manet.1015granger.net>
On Mon, Aug 27, 2018 at 07:29:27PM -0400, Chuck Lever wrote:
> I found that injecting disconnects with v4.18-rc resulted in
> random failures of the multi-threaded git regression test.
>
> The root cause appears to be that, after a reconnect, the
> RPC/RDMA transport is waking pending RPCs before the transport has
> posted enough Receive buffers to receive the Replies. If a Reply
> arrives before enough Receive buffers are posted, the connection
> is dropped. A few connection drops happen in quick succession as
> the client and server struggle to regain credit synchronization.
>
> This regression was introduced with commit 7c8d9e7c8863 ("xprtrdma:
> Move Receive posting to Receive handler"). The client is supposed to
> post a single Receive when a connection is established because
> it's not supposed to send more than one RPC Call before it gets
> a fresh credit grant in the first RPC Reply [RFC 8166, Section
> 3.3.3].
>
> Unfortunately there appears to be a longstanding bug in the Linux
> client's credit accounting mechanism. On connect, it simply dumps
> all pending RPC Calls onto the new connection. It's possible it has
> done this ever since the RPC/RDMA transport was added to the kernel
> ten years ago.
>
> Servers have so far been tolerant of this bad behavior. Currently no
> server implementation ever changes its credit grant over reconnects,
> and servers always repost enough Receives before connections are
> fully established.
>
> The Linux client implementation used to post a Receive before each
> of these Calls. This has covered up the flooding send behavior.
>
> I could try to correct this old bug so that the client sends exactly
> one RPC Call and waits for a Reply. Since we are so close to the
> next merge window, I'm going to instead provide a simple patch to
> post enough Receives before a reconnect completes (based on the
> number of credits granted to the previous connection).
>
> The spurious disconnects will be gone, but the client will still
> send multiple RPC Calls immediately after a reconnect.
>
> Addressing the latter problem will wait for a merge window because
> a) I expect it to be a large change requiring lots of testing, and
> b) obviously the Linux client has interoperated successfully since
> day zero while still being broken.
>
> Fixes: 7c8d9e7c8863 ("xprtrdma: Move Receive posting to ... ")
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> ---
> net/sunrpc/xprtrdma/verbs.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> Hi stable@ -
>
> This fix has been merged into v4.19 as upstream commit 8d4fb8ff427a
> ("xprtrdma: Fix disconnect regression"). It addresses a regression
> in v4.18. I expected it to go into late v4.18-rc, which is why there
> is no "cc: stable" on the original submission.
>
> Could you please apply it to 4.18.y ? Thank you!

That commit does have a cc: stable in it; it is already in my very large
queue of patches to apply...

thanks,

greg k-h
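
Illustrative note on the quoted patch description: the failure mode and the
stopgap fix can be modelled in a few lines of userspace C. This is only a
sketch under stated assumptions, not the code in net/sunrpc/xprtrdma/verbs.c.
Every name here (toy_conn, toy_post_receives, toy_deliver_reply,
toy_reconnect) is hypothetical, and the model assumes that all Replies arrive
back to back before the Receive handler has a chance to repost any buffers.

```c
#include <assert.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_conn {
	unsigned int posted_recvs;	/* Receive buffers currently posted */
	unsigned int drops;		/* simulated connection drops */
};

/* Post 'count' additional Receive buffers on the connection. */
static void toy_post_receives(struct toy_conn *c, unsigned int count)
{
	c->posted_recvs += count;
}

/*
 * A Reply consumes one posted Receive. If none is posted, the
 * connection is dropped, as the patch description explains.
 */
static int toy_deliver_reply(struct toy_conn *c)
{
	if (c->posted_recvs == 0) {
		c->drops++;
		return -1;
	}
	c->posted_recvs--;
	return 0;
}

/*
 * Simulate one reconnect. 'credits' is the grant remembered from the
 * previous connection and 'pending' is the number of RPC Calls dumped
 * onto the new connection. With post_per_credit == 0 a single Receive
 * is posted on connect; otherwise one Receive per credit is posted
 * before the reconnect completes. Returns the number of drops seen.
 */
static unsigned int toy_reconnect(unsigned int credits, unsigned int pending,
				  int post_per_credit)
{
	struct toy_conn c = { 0, 0 };
	unsigned int sent, i;

	assert(credits > 0);
	toy_post_receives(&c, post_per_credit ? credits : 1);

	/* The client floods its pending Calls, bounded by the old grant. */
	sent = pending < credits ? pending : credits;
	for (i = 0; i < sent; i++)
		if (toy_deliver_reply(&c) < 0)
			break;	/* connection lost; stop delivering */
	return c.drops;
}
```

In this model, toy_reconnect(32, 8, 0) sees a drop on the second Reply,
while toy_reconnect(32, 8, 1) sees none even though the client still sends
all eight Calls at once. That matches the description above: the spurious
disconnects go away, but the flooding send behavior remains.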
Thread overview: 5+ messages
2018-08-27 23:29 [PATCH] xprtrdma: Fix disconnect regression Chuck Lever
2018-08-28 5:19 ` Greg KH [this message]
-- strict thread matches above, loose matches on Subject: below --
2018-07-28 14:46 Chuck Lever
2018-08-06 14:00 ` Chuck Lever
2018-08-06 15:22 ` Anna Schumaker