From: Tom Tucker <tom@opengridcomputing.com>
To: Greg Banks <gnb@sgi.com>
Cc: Neil Brown <neilb@suse.de>,
"J. Bruce Fields" <bfields@fieldses.org>,
Thomas Talpey <Thomas.Talpey@netapp.com>,
Peter Leckie <pleckie@melbourne.sgi.com>,
Linux NFS Mailing List <nfs@lists.sourceforge.net>
Subject: Re: [RFC,PATCH 4/14] knfsd: has_wspace per transport
Date: Fri, 18 May 2007 08:39:00 -0500
Message-ID: <1179495540.23385.32.camel@trinity.ogc.int>
In-Reply-To: <1179495234.23385.30.camel@trinity.ogc.int>
On Fri, 2007-05-18 at 08:33 -0500, Tom Tucker wrote:
> On Fri, 2007-05-18 at 14:05 +1000, Greg Banks wrote:
> > On Thu, May 17, 2007 at 08:30:39PM +1000, Neil Brown wrote:
> > > On Thursday May 17, gnb@sgi.com wrote:
> > > > On Wed, May 16, 2007 at 05:10:53PM -0400, J. Bruce Fields wrote:
> > > > > On Thu, May 17, 2007 at 05:22:11AM +1000, Greg Banks wrote:
> > > > > > > +	set_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > > > > > > +	if (required*2 > wspace) {
> > > > > > > +		/* Don't enqueue while not enough space for reply */
> > > > > > > +		dprintk("svc: socket %p no space, %d*2 > %d, not enqueued\n",
> > > > > > > +			svsk->sk_sk, required, wspace);
> > > > > > > +		return 0;
> > > > > > > +	}
> > > > > > > +	clear_bit(SOCK_NOSPACE, &svsk->sk_sock->flags);
> > > > > > > +	return 1;
> > > > > > > +}
> > > > >
> > > > > So, this is just my ignorance--why do the set and clear of SOCK_NOSPACE
> > > > > need to be ordered in the way they are? (Why not just set once inside
> > > > > the if clause?)
> > > >
> > > > I can't see a good reason for it, but I'm trying to minimise
> > > > perturbations to the logic.
> > >
> > > Unfortunately, you actually perturbed the important bit... Or at
> > > least, the bit that I thought was important when I wrote it.
> > >
> > > Previously, sk_stream_wspace(), or sock_wspace() would be called *after*
> > > SOCK_NOSPACE was set. With your patch it is called *before*.
> > >
> > > > It is a fairly improbable race, but if the output queue flushed
> > > > completely between calling XX_wspace and setting SOCK_NOSPACE, the
> > > > sk_write_space callback might never get called.
> >
> > Woops. I'll fix that.
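
A minimal user-space sketch of the ordering Neil describes, not the
kernel code. Every name here (model_sock, model_drain, the two
has_wspace_* helpers, the hook argument) is hypothetical, and the model
assumes the write-space callback is only delivered while the NOSPACE
flag is set. The hook stands in for the output queue flushing at the
worst possible moment.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the socket state under discussion. */
struct model_sock {
    atomic_bool nospace;  /* models the SOCK_NOSPACE bit */
    atomic_int  wspace;   /* models free send-buffer space, in bytes */
    int         wakeups;  /* counts simulated sk_write_space callbacks */
};

typedef void (*model_hook)(struct model_sock *s);

static void model_init(struct model_sock *s, int wspace)
{
    atomic_init(&s->nospace, false);
    atomic_init(&s->wspace, wspace);
    s->wakeups = 0;
}

/* Simulated queue flush: frees space, signals only if the flag is set. */
static void model_drain(struct model_sock *s)
{
    atomic_store(&s->wspace, 1 << 16);
    if (atomic_load(&s->nospace))
        s->wakeups++;
}

/* Order in the quoted patch: space is sampled before the flag is set. */
static bool has_wspace_check_first(struct model_sock *s, int required,
                                   model_hook between)
{
    int wspace = atomic_load(&s->wspace);

    if (between)
        between(s);              /* the race window */
    atomic_store(&s->nospace, true);
    if (required * 2 > wspace)
        return false;            /* not enqueued; relies on a later wakeup */
    atomic_store(&s->nospace, false);
    return true;
}

/* Original order: the flag is set before space is sampled. */
static bool has_wspace_flag_first(struct model_sock *s, int required,
                                  model_hook between)
{
    int wspace;

    atomic_store(&s->nospace, true);
    if (between)
        between(s);              /* same window, now harmless */
    wspace = atomic_load(&s->wspace);
    if (required * 2 > wspace)
        return false;
    atomic_store(&s->nospace, false);
    return true;
}
```

Passing model_drain as the hook with an initially full queue shows the
difference: the check-first variant reports no space and records no
wakeup, so nothing would ever re-enqueue the socket, while the
flag-first variant both records the wakeup and sees the freed space.
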
> >
> > > And I gather by the fact that you test "->sko_has_wspace" that RDMA
> > > doesn't have such a function?
> >
> > You gather correctly.
> >
> > > Does that mean that RDMA will never
> > > reject a write due to lack of space?
> >
> > No, it means that the current RDMA send code will block waiting
> > for space to become available. That's right, nfsd threads block on
> > the network. Steel yourself, there's worse to come.
> >
>
> Uh... Not really. The queue depths are designed to match credits to
> worst case reply sizes. In the normal case, it should never have to
> wait. The wait is to catch the margins in the same way that a kmalloc
> will wait for memory to become available.
What I mean here is that the server code will wait when a kmalloc fails,
not that kmalloc itself will wait.
>
> There's actually a stat kept by the transport that counts the number of
> times it waits.
>
> There is a place that a wait is done in the "normal" case and that's for
> the completion of an RDMA_READ in the process of gathering the data for
> an RPC on receive. That wait happens _every_ time.
>
> > > That seems unlikely.
> > > I would rather assume that every transport has a sko_has_wspace
> > > function...
> >
> > Ok, but for today the RDMA one will be
> >
> > static int svc_rdma_has_wspace(struct svc_sock *svsk)
> > {
> > 	return 1;
> > }
> >
> > We might be able to pull the condition in the blocking logic out
> > of svc_rdma_send() to implement an sko_has_wspace, but there's
> > something of an impedance mismatch. The RDMA queue limit is expressed
> > in Work Requests not bytes, and the mapping between the two is not
> > precisely visible at the point when has_wspace is called. I guess
> > we'd have to use an upper bound.
> > Greg.
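
To make the upper-bound idea concrete, here is a rough sketch of what a
Work Request based has_wspace check could look like. It is not the
svc_rdma code. The struct, its fields, and both helpers are
hypothetical, and it assumes that a worst-case reply size in bytes can
be turned into a WR count by dividing by a fixed per-WR payload and
adding a fixed per-reply overhead.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of an RDMA transport's send queue accounting. */
struct model_rdma_xprt {
    unsigned int sq_depth;     /* WRs the send queue can hold */
    unsigned int sq_in_use;    /* WRs posted and not yet completed */
    size_t       bytes_per_wr; /* assumed max payload per WR, nonzero */
    unsigned int fixed_wrs;    /* assumed per-reply overhead WRs */
};

/* Upper bound on WRs needed for a reply of reply_bytes (rounds up). */
static unsigned int model_wrs_upper_bound(const struct model_rdma_xprt *x,
                                          size_t reply_bytes)
{
    size_t data_wrs = (reply_bytes + x->bytes_per_wr - 1) / x->bytes_per_wr;

    return (unsigned int)data_wrs + x->fixed_wrs;
}

/* True if the worst-case reply fits in the currently free WR slots. */
static bool model_rdma_has_wspace(const struct model_rdma_xprt *x,
                                  size_t max_reply_bytes)
{
    unsigned int free_wrs = x->sq_depth - x->sq_in_use;

    return model_wrs_upper_bound(x, max_reply_bytes) <= free_wrs;
}
```

Because the bound is pessimistic, a check like this would sometimes
defer a thread that could in fact have sent, which is the cost of not
knowing the exact bytes-to-WR mapping at the point has_wspace is called.
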
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by DB2 Express
> Download DB2 Express C - the FREE version of DB2 express and take
> control of your XML. No limits. Just data. Click to get it now.
> http://sourceforge.net/powerbar/db2/
> _______________________________________________
> NFS maillist - NFS@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs