From: Trond Myklebust <trond.myklebust@fys.uio.no>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Jeff Moyer <jmoyer@redhat.com>,
netdev@vger.kernel.org, Andrew Morton <akpm@linux-foundation.org>,
Jens Axboe <jens.axboe@oracle.com>,
linux-kernel@vger.kernel.org, "Rafael J. Wysocki" <rjw@sisk.pl>,
Olga Kornievskaia <aglo@citi.umich.edu>,
Jim Rees <rees@umich.edu>,
linux-nfs@vger.kernel.org
Subject: Re: 2.6.30-rc deadline scheduler performance regression for iozone over NFS
Date: Thu, 14 May 2009 14:26:09 -0400
Message-ID: <1242325569.6560.27.camel@heimdal.trondhjem.org>
In-Reply-To: <20090514175500.GB5675@fieldses.org>

On Thu, 2009-05-14 at 13:55 -0400, J. Bruce Fields wrote:
> On Wed, May 13, 2009 at 07:45:38PM -0400, Trond Myklebust wrote:
> > On Wed, 2009-05-13 at 15:29 -0400, Jeff Moyer wrote:
> > > Hi, netdev folks. The summary here is:
> > >
> > > A patch added in the 2.6.30 development cycle caused a performance
> > > regression in my NFS iozone testing. The patch in question is the
> > > following:
> > >
> > > commit 47a14ef1af48c696b214ac168f056ddc79793d0e
> > > Author: Olga Kornievskaia <aglo@citi.umich.edu>
> > > Date: Tue Oct 21 14:13:47 2008 -0400
> > >
> > > svcrpc: take advantage of tcp autotuning
> > >
> > > which is also quoted below. Using 8 nfsd threads, a single client doing
> > > 2GB of streaming read I/O goes from 107590 KB/s under 2.6.29 to 65558
> > > KB/s under 2.6.30-rc4. I also see more run to run variation under
> > > 2.6.30-rc4 using the deadline I/O scheduler on the server. That
> > > variation disappears (as does the performance regression) when reverting
> > > the above commit.
> >
> > It looks to me as if we've got a bug in the svc_tcp_has_wspace() helper
> > function. I can see no reason why we should stop processing new incoming
> > RPC requests just because the send buffer happens to be 2/3 full. If we
>
> I agree, the calculation doesn't look right. But where do you get the
> 2/3 number from?
That's the sk_stream_wspace() vs. sk_stream_min_wspace() comparison.
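
For readers following along, the 2/3 figure can be worked out from those
two helpers. The sketch below is a user-space model, not kernel code. It
assumes sk_stream_wspace() returns sk_sndbuf - sk_wmem_queued and
sk_stream_min_wspace() returns sk_wmem_queued / 2, as in kernels of this
era; the toy_* names are made up for illustration.

```c
/* User-space stand-in for the two struct sock fields involved. */
struct toy_sock {
    int sk_sndbuf;       /* total send buffer size, in bytes */
    int sk_wmem_queued;  /* bytes currently queued for transmit */
};

/* Assumed model of sk_stream_wspace(): free send-buffer space. */
static int toy_stream_wspace(const struct toy_sock *sk)
{
    return sk->sk_sndbuf - sk->sk_wmem_queued;
}

/* Assumed model of sk_stream_min_wspace(): half of what is queued. */
static int toy_stream_min_wspace(const struct toy_sock *sk)
{
    return sk->sk_wmem_queued >> 1;
}

/* Returns 1 when the old "wspace < min_wspace" test would refuse work. */
static int toy_blocked(const struct toy_sock *sk)
{
    return toy_stream_wspace(sk) < toy_stream_min_wspace(sk);
}
```

Under those assumptions, sndbuf - queued < queued / 2 rearranges to
queued > (2/3) * sndbuf, so with a 90000-byte send buffer the test starts
failing just past 60000 queued bytes.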
> ...
> > @@ -964,23 +973,14 @@ static int svc_tcp_has_wspace(struct svc_xprt *xprt)
> > struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
> > struct svc_serv *serv = svsk->sk_xprt.xpt_server;
> > int required;
> > - int wspace;
> > -
> > - /*
> > - * Set the SOCK_NOSPACE flag before checking the available
> > - * sock space.
> > - */
> > - set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > - required = atomic_read(&svsk->sk_xprt.xpt_reserved) + serv->sv_max_mesg;
> > - wspace = sk_stream_wspace(svsk->sk_sk);
> > -
> > - if (wspace < sk_stream_min_wspace(svsk->sk_sk))
> > - return 0;
> > - if (required * 2 > wspace)
> > - return 0;
> >
> > - clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > + required = (atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg) * 2;
> > + if (sk_stream_wspace(svsk->sk_sk) < required)
>
> This calculation looks the same before and after--you've just moved the
> "*2" into the calculation of "required". Am I missing something? Maybe
> you meant to write:
>
> required = atomic_read(&xprt->xpt_reserved) + serv->sv_max_mesg * 2;
>
> without the parentheses?
I wasn't trying to change that part of the calculation. I'm just
separating the part that deals with TCP congestion (i.e. the window
size) from the part that deals with remaining socket buffer space.
I do, however, agree that we should probably drop that *2.

However, there is (as usual) 'interesting behaviour' when it comes to
deferred requests. Their buffer space is already accounted for in the
'xpt_reserved' calculation, but they cannot get re-scheduled unless
svc_tcp_has_wspace() thinks it has enough free socket space for yet
another reply. Can you spell 'deadlock', children?

Trond
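
To make the deferred-request concern concrete, here is a toy model of the
space check. It is only a sketch under stated assumptions, not the kernel
code: the struct and toy_* functions are hypothetical, "reserved" stands
in for xpt_reserved, "max_mesg" for sv_max_mesg, and "wspace" for the
value of sk_stream_wspace(). toy_deferred_fits() is an illustrative guess
at what a deferred request strictly needs, not a proposed fix.

```c
/* Toy stand-in for the quantities used by svc_tcp_has_wspace(). */
struct toy_xprt {
    int reserved;  /* reply space already promised (includes deferred) */
    int max_mesg;  /* largest possible single reply */
    int wspace;    /* free send-buffer space right now */
};

/* The check as written in the quoted patch: room for everything, doubled. */
static int toy_has_wspace(const struct toy_xprt *x)
{
    int required = (x->reserved + x->max_mesg) * 2;
    return x->wspace >= required;
}

/* The same check with the "*2" dropped. */
static int toy_has_wspace_no_double(const struct toy_xprt *x)
{
    int required = x->reserved + x->max_mesg;
    return x->wspace >= required;
}

/*
 * Hypothetical: a deferred request's reply is already counted in
 * "reserved", so it would fit whenever the reserved space is free.
 */
static int toy_deferred_fits(const struct toy_xprt *x)
{
    return x->wspace >= x->reserved;
}
```

With reserved = 1000, max_mesg = 1000 and wspace = 1500, the deferred
reply would fit in this model, yet both versions of the check refuse to
schedule it because they also demand room for one more full-sized reply.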