From mboxrd@z Thu Jan  1 00:00:00 1970
From: "J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: [PATCH 00/12] Some improvements to request deferral and
	related code
Date: Mon, 10 Aug 2009 11:05:01 -0400
Message-ID: <20090810150501.GA3401@fieldses.org>
References: <20090804051145.15929.11356.stgit@notabene.brown> <20090804140428.GE14249@fieldses.org> <19067.43518.105153.247173@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: linux-nfs@vger.kernel.org
To: Neil Brown <neilb@suse.de>
Return-path: <linux-nfs-owner@vger.kernel.org>
Received: from fieldses.org ([174.143.236.118]:48381 "EHLO fieldses.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751502AbZHJPFA (ORCPT <rfc822;linux-nfs@vger.kernel.org>);
	Mon, 10 Aug 2009 11:05:00 -0400
In-Reply-To: <19067.43518.105153.247173-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
Sender: linux-nfs-owner@vger.kernel.org
List-ID: <linux-nfs.vger.kernel.org>

On Fri, Aug 07, 2009 at 02:13:50PM +1000, Neil Brown wrote:
> On Tuesday August 4, bfields@fieldses.org wrote:
> > On Tue, Aug 04, 2009 at 03:22:38PM +1000, NeilBrown wrote:
> > >  This series fixes a few little bugs and tidies up some code but does
> > >  two main important things.
> > > 
> > >  1/ 'allow thread to block....' will wait a little while if there is a
> > >  cache miss.  If an answer is available in that time, it continues on
> > >  its merry way.  If no answer arrives, the old deferral approach is
> > >  used.  It waits 5 seconds if there are spare nfsd threads, and 1
> > >  second if all threads are busy.  I have almost nothing with
> > >  which to justify these numbers.
> > 
> > I think the v4 server at least should return NFS4ERR_DELAY in this case
> > instead of doing the internal replay.  That avoids possible problems
> > with non-idempotent compound ops.
> 
> If the request has been handed to nfsd, then yes I agree.  We probably
> want some way for nfsd to mark the request as "don't replay" so that
> an error will propagate out.  Currently we map that error to EJUKEBOX
> for v3 or v4, but you are right, we want ERR_DELAY for v4.

Note that DELAY and JUKEBOX are actually both 10008--the v4 spec just
renamed it.
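
To illustrate, a minimal sketch of the two names sharing one value.  The
numbers are the ones from the v3 and v4 specs; deferral_status() is a
made-up helper for this example only, not anything in the nfsd code:

```c
#include <assert.h>

/* Status values as defined by the NFSv3 and NFSv4 specifications. */
enum {
	NFS3ERR_JUKEBOX = 10008,
	NFS4ERR_DELAY   = 10008,	/* same number, renamed in v4 */
};

/* Compile-time check that the two names really are the same code. */
static_assert(NFS3ERR_JUKEBOX == NFS4ERR_DELAY,
	      "JUKEBOX and DELAY should share a value");

/*
 * Hypothetical helper (an assumption for illustration): pick the
 * "try again later" status by protocol version.  Both branches give
 * the same number on the wire; only the name differs.
 */
static int deferral_status(int vers)
{
	return vers >= 4 ? NFS4ERR_DELAY : NFS3ERR_JUKEBOX;
}
```

So whichever name the mapping uses, a v3 or v4 client sees 10008.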

> If the request is still in the RPC code (trying to identify the
> origin or to decode the crypto) then we cannot return ERR_DELAY, but
> as none of the request will have been processed yet, there is no room
> for a problem with non-idempotent ops.
> 
> It has occurred to me that we could throw away the current request
> deferral completely:  if we don't feel comfortable delaying the thread
> for as long as it takes, we just return an error or drop the request
> (closing any connection).
> I'm not sure I'd be comfortable doing that if there were only a few
> (8?) threads though.
> Maybe if we got dynamic nfsd threads so that new ones could be created
> on demand I would feel quite happy to discard the deferral stuff and
> just use a delay.

How about just increasing the default number of threads for now?

--b.

> 
> > 
> > From the protocol point of view I don't know if there's any rule of
> > thumb about when it'd be best to return DELAY.  Perhaps it's best to
> > avoid it whenever possible, but when the delay is on the order of
> > seconds it sounds reasonable to me.
> 
> Of course you don't know how long the delay will be until it happens:-)
> 
> But I agree.  Delay internally if possible, but as soon as that seems
> to be awkward (e.g. run out of threads), return DELAY.
> 
> Thanks,
> NeilBrown