linux-nfs.vger.kernel.org archive mirror
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Joschi Brauchle <joschi.brauchle@tum.de>
Cc: "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
Subject: Re: nfsd4: utime sometimes takes 40+ seconds to return (but on SLES11SP3 with kernel 3.0.82)
Date: Tue, 10 Sep 2013 17:55:34 -0400
Message-ID: <20130910215534.GA21829@fieldses.org>
In-Reply-To: <038EA8B0-BD46-47F2-8F20-DEC8B6DA9087@tum.de>

On Tue, Sep 10, 2013 at 09:48:12PM +0000, Joschi Brauchle wrote:
> On 10.09.2013 at 22:35, "J. Bruce Fields" <bfields@fieldses.org> wrote:
> 
> > On Tue, Sep 10, 2013 at 08:49:05PM +0200, Joschi Brauchle wrote:
> >> Hello everyone,
> >> 
> >> we are administering an NFS high-availability cluster running on
> >> SLES11SP1 with kernel 2.6.32.59. Just recently, one of the cluster
> >> machines was updated to SLES11SP3 with kernel 3.0.82.
> >> 
> >> 
> >> We are now experiencing severe hangs on NFS clients when the
> >> SLES11SP3 server is running the NFS services. An strace of the
> >> hanging processes on the client side shows that they are waiting
> >> 60+ seconds for a "utime()" call to complete.
> >> 
> >> 
> >> The problem we see matches the one described in the thread
> >> "v3.5 nfsd4 regression; utime sometimes takes 40+ seconds to
> >> return". If the NFS server is running on SLES11SP3, the little test
> >> program provided in that thread hangs at the "utime()" call for 60+
> >> seconds. It hangs each time it is run! It finishes right away, with 0
> >> seconds delay, each time if SLES11SP1 is providing NFS services.
> >> 
> >> 
> >> Now, in the serverside logfiles of SLES11SP3 we see these messages
> >> (not so on SP1):
> >> --------------
> >> kernel: [99381.184976] RPC: AUTH_GSS upcall timed out.
> >> kernel: [99381.184978] Please check user daemon is running.
> >> --------------
> >> 
> >> We have always been running the NFS server without rpc.gssd on the
> >> server side, as the init script for the nfsserver also does not
> >> start rpc.gssd.
> >> 
> >> 
> >> Once we started rpc.gssd on the SLES11SP3 server, using the test
> >> utility on the client shows that the first call to "utime()"
> >> succeeds right away, the second call takes ~25s to complete. But
> >> now, any subsequent runs of the utility finish with no more delay.
> >> 
> >> 
> >> So can anyone confirm that with kernel 3.0+ the rpc.gssd daemon is
> >> also required on the server side for correct operation?
> >> 
> >> Has there been a change between kernel 2.6.32.59 and 3.0.x?
> >> 
> >> Thus, is the init script of the nfsserver in SLES11SP3 indeed
> >> failing to start rpc.gssd?
> > 
> > It should be starting rpc.gssd to allow callbacks, yes.
> > 
> > --b.
> 
> Ok, we will run rpc.gssd on the server. Thanks. 
> 
> Could you please comment on whether NFS clients hanging on utime()
> calls is to be expected when *not* running rpc.gssd? Or is this a
> problem that needs to be investigated?

I think what happens is the utime call breaks a delegation, and the
delay is because the lack of gssd prevents the server from calling back
to the client to tell it that its delegation is broken, so the
delegation has to time out.
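
To make the delay described above measurable from a client, something
like the following could be used. It is only a minimal sketch, not the
test program from the earlier thread: the helper names time_utime and
run_probe are made up for illustration, and it assumes the directory
passed in lives on the NFS mount under test. Whether a delegation is
actually handed out and broken depends on the server and client state.

```python
import os
import tempfile
import time


def time_utime(path):
    """Return the seconds taken by one os.utime() call on path."""
    start = time.monotonic()
    os.utime(path, None)  # None sets atime/mtime to the current time
    return time.monotonic() - start


def run_probe(directory, rounds=2):
    """Create a scratch file in directory and time utime() on it.

    Returns a list with one elapsed time (in seconds) per round.
    """
    fd, path = tempfile.mkstemp(dir=directory, prefix="utime-probe-")
    try:
        os.write(fd, b"probe\n")
        os.close(fd)
        return [time_utime(path) for _ in range(rounds)]
    finally:
        os.unlink(path)  # always remove the scratch file
```

Calling run_probe("/mnt/nfs") (the mount point is a placeholder) and
printing the returned list would show whether any single utime() call
stalls for tens of seconds.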

That said, the server does a null callback to the client to test whether
callbacks are working before it gives out any delegations, so I'm
surprised it wouldn't have noticed the broken callbacks then.
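
For checking whether a server is hitting the upcall timeouts quoted
earlier in this thread, a small log filter can help. This is a sketch
under assumptions: the function name find_upcall_timeouts is invented
here, and the only thing taken from the report is the message text
"RPC: AUTH_GSS upcall timed out". Where the kernel log ends up (for
example /var/log/messages) varies by system and is not stated in the
thread.

```python
import re

# Message text as it appears in the server-side kernel log quoted above.
UPCALL_TIMEOUT = re.compile(r"RPC: AUTH_GSS upcall timed out")


def find_upcall_timeouts(log_lines):
    """Return the log lines that report an AUTH_GSS upcall timeout."""
    return [
        line.rstrip("\n")
        for line in log_lines
        if UPCALL_TIMEOUT.search(line)
    ]
```

It accepts any iterable of lines, so an open log file can be passed in
directly.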

--b.

Thread overview: 8+ messages
2013-09-10 18:49 nfsd4: utime sometimes takes 40+ seconds to return (but on SLES11SP3 with kernel 3.0.82) Joschi Brauchle
2013-09-10 20:35 ` J. Bruce Fields
2013-09-10 21:48   ` Joschi Brauchle
2013-09-10 21:55     ` J. Bruce Fields [this message]
2013-09-10 22:08       ` Joschi Brauchle
2013-09-10 22:11         ` J. Bruce Fields
2013-09-13 11:32           ` Joschi Brauchle
2013-09-17 13:31             ` J. Bruce Fields
