From: "J.Bruce Fields" <bfields@citi.umich.edu>
To: NeilBrown <neilb@suse.de>
Cc: Ben Myers <bpm@sgi.com>, Olga Kornievskaia <aglo@citi.umich.edu>,
NFS <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH] NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure.
Date: Thu, 25 Jul 2013 16:18:05 -0400 [thread overview]
Message-ID: <20130725201805.GB17962@fieldses.org> (raw)
In-Reply-To: <20130725113023.7bcbc347@notabene.brown>
On Thu, Jul 25, 2013 at 11:30:23AM +1000, NeilBrown wrote:
>
> Since we enabled auto-tuning for sunrpc TCP connections we do not
> guarantee that there is enough write-space on each connection to
> queue a reply.
>
> If memory pressure causes the window to shrink too small, the request
> throttling in sunrpc/svc will not accept any requests so no more requests
> will be handled. Even when pressure decreases the window will not
> grow again until data is sent on the connection.
> This means we get a deadlock: no requests will be handled until there
> is more space, and no space will be allocated until a request is
> handled.
>
> This can be simulated by modifying svc_tcp_has_wspace to inflate the
> number of byte required and removing the 'svc_sock_setbufsize' calls
> in svc_setup_socket.
Ah-hah!
> I found that multiplying by 16 was enough to make the requirement
> exceed the default allocation. With this modification in place:
> mount -o vers=3,proto=tcp 127.0.0.1:/home /mnt
> would block and eventually time out because the nfs server could not
> accept any requests.
So, this?:
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 305374d..36de50d 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1193,6 +1193,7 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
if (test_bit(XPT_LISTENER, &xprt->xpt_flags))
return 1;
required = atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg;
+ required *= 16;
if (sk_stream_wspace(svsk->sk_sk) >= required)
return 1;
set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
@@ -1378,14 +1379,8 @@ static struct svc_sock *svc_setup_socket(struct svc_serv *serv,
/* Initialize the socket */
if (sock->type == SOCK_DGRAM)
svc_udp_init(svsk, serv);
- else {
- /* initialise setting must have enough space to
- * receive and respond to one request.
- */
- svc_sock_setbufsize(svsk->sk_sock, 4 * serv->sv_max_mesg,
- 4 * serv->sv_max_mesg);
+ else
svc_tcp_init(svsk, serv);
- }
dprintk("svc: svc_setup_socket created %p (inet %p)\n",
svsk, svsk->sk_sk);
> This patch relaxes the request throttling to always allow at least one
> request through per connection. It does this by checking both
> sk_stream_min_wspace() and xprt->xpt_reserved
> are zero.
> The first is zero when the TCP transmit queue is empty.
> The second is zero when there are no RPC requests being processed.
> When both of these are zero the socket is idle and so one more
> request can safely be allowed through.
>
> Applying this patch allows the above mount command to succeed cleanly.
> Tracing shows that the allocated write buffer space quickly grows and
> after a few requests are handled, the extra tests are no longer needed
> to permit further requests to be processed.
>
> The main purpose of request throttling is to handle the case when one
> client is slow at collecting replies and the send queue gets full of
> replies that the client hasn't acknowledged (at the TCP level) yet.
> As we only change behaviour when the send queue is empty this main
> purpose is still preserved.
>
> Reported-by: Ben Myers <bpm@sgi.com>
> Signed-off-by: NeilBrown <neilb@suse.de>
>
> --
> As you can see I've changed the patch. While writing up the above
> description realised there was a weakness and so added the sk_stream_min_wspace
> test. That allowed me to write the final paragraph.
This is great, thanks!
Inclined to queue it up for 3.11 and stable....
--b.
>
> NeilBrown
>
>
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 305374d..7762b9f 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -1193,7 +1193,9 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
> if (test_bit(XPT_LISTENER, &xprt->xpt_flags))
> return 1;
> required = atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg;
> - if (sk_stream_wspace(svsk->sk_sk) >= required)
> + if (sk_stream_wspace(svsk->sk_sk) >= required ||
> + (sk_stream_min_wspace(svsk->sk_sk) == 0 &&
> + atomic_read(&xprt->xpt_reserved) == 0))
> return 1;
> set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> return 0;
next prev parent reply other threads:[~2013-07-25 20:18 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20130710092255.0240a36d@notabene.brown>
2013-07-10 2:27 ` Is tcp autotuning really what NFS wants? J.Bruce Fields
2013-07-10 4:32 ` NeilBrown
2013-07-10 19:07 ` J.Bruce Fields
2013-07-15 4:32 ` NeilBrown
2013-07-16 1:58 ` J.Bruce Fields
2013-07-16 4:00 ` NeilBrown
2013-07-16 14:24 ` J.Bruce Fields
2013-07-18 0:03 ` Ben Myers
2013-07-24 21:07 ` J.Bruce Fields
2013-07-25 1:30 ` [PATCH] NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure NeilBrown
2013-07-25 12:35 ` Jim Rees
2013-07-25 20:18 ` J.Bruce Fields [this message]
2013-07-25 20:33 ` NeilBrown
2013-07-26 14:19 ` J.Bruce Fields
2013-07-30 2:48 ` NeilBrown
2013-08-01 2:49 ` J.Bruce Fields
2013-07-10 17:33 ` Is tcp autotuning really what NFS wants? Dean
2013-07-10 17:39 ` Ben Greear
2013-07-15 4:35 ` NeilBrown
2013-07-15 23:32 ` Ben Greear
2013-07-16 4:46 ` NeilBrown
2013-07-10 19:59 ` Michael Richardson
2013-07-15 1:26 ` Jim Rees
2013-07-15 5:02 ` NeilBrown
2013-07-15 11:57 ` Jim Rees
2013-07-15 13:42 ` Jim Rees
2013-07-16 1:10 ` NeilBrown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130725201805.GB17962@fieldses.org \
--to=bfields@citi.umich.edu \
--cc=aglo@citi.umich.edu \
--cc=bpm@sgi.com \
--cc=linux-nfs@vger.kernel.org \
--cc=neilb@suse.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).