From: "J.Bruce Fields" <bfields@citi.umich.edu>
To: Ben Myers <bpm@sgi.com>
Cc: NeilBrown <neilb@suse.de>,
Olga Kornievskaia <aglo@citi.umich.edu>,
NFS <linux-nfs@vger.kernel.org>
Subject: Re: Is tcp autotuning really what NFS wants?
Date: Wed, 24 Jul 2013 17:07:46 -0400
Message-ID: <20130724210746.GB5777@fieldses.org>
In-Reply-To: <20130718000319.GL1681@sgi.com>
On Wed, Jul 17, 2013 at 07:03:19PM -0500, Ben Myers wrote:
> Hey Bruce,
>
> On Tue, Jul 16, 2013 at 10:24:30AM -0400, J.Bruce Fields wrote:
> > Adding Ben Myers to Cc: in case he has any testing help or advice (see
> > below):
> >
> > On Tue, Jul 16, 2013 at 02:00:21PM +1000, NeilBrown wrote:
> > > That would be because I only allowed extra requests up to half the number of
> > > idle threads, and there would be zero idle threads.
> > >
> > > So yes - I see your point.
> > >
> > > I now think that it is sufficient to just allow one request through per
> > > socket. While there is high memory pressure, that is sufficient. When the
> > > memory pressure drops, that should be enough to cause sndbuf to grow.
> > >
> > > We should be able to use xpt_reserved to check if this is the only request
> > > or not:
> >
> > A remaining possible bad case is if the number of "bad" sockets is more
> > than the number of threads, and if the problem is something that won't
> > resolve itself by just waiting a little while.
> >
> > I don't know if that's likely in practice. Maybe a network failure cuts
> > you off from a large swath (but not all) of your clients? Or maybe some
> > large proportion of your clients are just slow?
> >
> > But this is two lines and looks likely to solve the immediate
> > problem:
> >
> > > diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
> > > index 80a6640..5b832ef 100644
> > > --- a/net/sunrpc/svc_xprt.c
> > > +++ b/net/sunrpc/svc_xprt.c
> > > @@ -331,7 +331,8 @@ static bool svc_xprt_has_something_to_do(struct svc_xprt *xprt)
> > >          if (xprt->xpt_flags & ((1<<XPT_CONN)|(1<<XPT_CLOSE)))
> > >                  return true;
> > >          if (xprt->xpt_flags & ((1<<XPT_DATA)|(1<<XPT_DEFERRED)))
> > > -                return xprt->xpt_ops->xpo_has_wspace(xprt);
> > > +                return xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > > +                        atomic_read(&xprt->xpt_reserved) == 0;
> > >          return false;
> > >  }
> > >
> > > @@ -730,7 +731,8 @@ static int svc_handle_xprt(struct svc_rqst *rqstp, struct svc_xprt *xprt)
> > >                  newxpt = xprt->xpt_ops->xpo_accept(xprt);
> > >                  if (newxpt)
> > >                          svc_add_new_temp_xprt(serv, newxpt);
> > > -        } else if (xprt->xpt_ops->xpo_has_wspace(xprt)) {
> > > +        } else if (xprt->xpt_ops->xpo_has_wspace(xprt) ||
> > > +                   atomic_read(&xprt->xpt_reserved) == 0) {
> > >                  /* XPT_DATA|XPT_DEFERRED case: */
> > >                  dprintk("svc: server %p, pool %u, transport %p, inuse=%d\n",
> > >                          rqstp, rqstp->rq_pool->sp_id, xprt,
> > >
> > >
> > > Thoughts?
> > >
> > > I tried testing this, but xpo_has_wspace never fails for me, even if I remove
> > > the calls to svc_sock_setbufsize (which probably aren't wanted for TCP any
> > > more).
> >
> > Ben, do you still have a setup that can reproduce the problem? Or did you
> > ever find an easier way to reproduce?
>
> Unfortunately I no longer have that test rig handy, and I did not find an
> easier way to reproduce this. I do agree that it 'is sufficient to just allow
> one request through per socket'... probably that would have solved the problem
> in our case. Maybe it would help to limit the amount of memory in your system
> to try to induce additional memory pressure? IIRC it was a 1MB block size
> read workload where we would hit this.
That makes sense.
Or I wonder if you could cheat and artificially force the TCP code to
behave as if it were under memory pressure (e.g., by calling
tcp_enter_memory_pressure() on some trigger).
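
For concreteness, here is a rough, untested sketch of that kind of cheat.
The knob name "fake_tcp_pressure" and the helper svc_maybe_fake_tcp_pressure()
are invented for illustration; only tcp_enter_memory_pressure() is an existing
kernel helper (net/ipv4/tcp.c).

#include <linux/module.h>
#include <net/tcp.h>

/*
 * Hypothetical debug-only knob, not a real sunrpc parameter.  If it
 * were added to net/sunrpc/svcsock.c it would show up as
 * /sys/module/sunrpc/parameters/fake_tcp_pressure.
 */
static bool fake_tcp_pressure;
module_param(fake_tcp_pressure, bool, 0644);

/*
 * Call this from some per-connection hook (e.g. the data_ready
 * callback) with the connection's struct sock.
 * tcp_enter_memory_pressure() sets the global TCP memory-pressure
 * flag, which among other things keeps sndbuf autotuning from growing
 * the send buffer, i.e. the condition the original hang depended on.
 */
static void svc_maybe_fake_tcp_pressure(struct sock *sk)
{
        if (fake_tcp_pressure)
                tcp_enter_memory_pressure(sk);
}

That only forces the pressure flag, of course; whether it is enough to make
xpo_has_wspace() start failing in practice is exactly what would need testing.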
Neil, are you still looking at this? (If you're done with it, could you
resend that patch with a signed-off-by?)
--b.