linux-nfs.vger.kernel.org archive mirror
From: bfields@fieldses.org (J. Bruce Fields)
To: Jason L Tibbitts III <tibbs@math.uh.edu>
Cc: linux-nfs@vger.kernel.org
Subject: Re: NFS: nfs4_reclaim_open_state: Lock reclaim failed! log spew
Date: Thu, 25 Feb 2016 14:58:27 -0500
Message-ID: <20160225195827.GC23315@fieldses.org>
In-Reply-To: <ufafuwhlr72.fsf@epithumia.math.uh.edu>

On Wed, Feb 24, 2016 at 03:43:45PM -0600, Jason L Tibbitts III wrote:
> My NFS infrastructure has servers running current RHEL7.2 (mostly kernel
> 3.10.0-327.4.5.el7 with a one-line patch needed to fix a soft lockup in
> nfs4_laundromat) and clients running current Fedora 23
> (4.3.5-300.fc23.x86_64).  Everything is mounted NFS4.1 with sec=krb5p.
> 
> Occasionally a client will get into a state where it just hammers the
> server with network traffic, sometimes at full line rate, with:
> 
> NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> 
> spewed to the log about 500 times a second.  The load goes up quite a
> bit (to 5-7 or so).  The machine isn't doing anything and there isn't
> even a user logged in.  However, there are always a few user processes
> hanging around, usually kwin_x11 for whatever reason.  (My guess is
> because of a lock on ~/.Xauthority.)
> 
> When I kill those user processes, this is logged once:
> 
> NFS: nfs4_reclaim_open_state: unhandled error -10068
> 
> -10068 is NFS4ERR_RETRY_UNCACHED_REP.

The only place the server sets that error is in
fs/nfsd/nfs4state.c:nfsd4_enc_sequence_replay.

If the server's correct, then the client attempted to resend a request
that the server was not required to cache.  In which case
NFS4ERR_RETRY_UNCACHED_REP is a valid error, and the client should give
up (or retry with a new slot/seqid?).
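
For reference, the per-slot check a 4.1 server makes on each SEQUENCE
can be sketched roughly as follows.  This is a simplified illustration
of the slot rules in RFC 5661, not the actual nfsd code: the struct,
the function, and the SLOT_* names are made up for the example, and
only the two error values are taken from the protocol.

```c
#include <stdbool.h>
#include <stdint.h>

/* Status values as defined by the NFSv4.1 protocol (RFC 5661). */
enum {
	NFS4_OK                    = 0,
	NFS4ERR_SEQ_MISORDERED     = 10063,
	NFS4ERR_RETRY_UNCACHED_REP = 10068,
};

/* Hypothetical outcomes, named for this sketch only. */
enum slot_action { SLOT_EXECUTE, SLOT_REPLAY_CACHED, SLOT_ERROR };

/* Hypothetical, minimal view of one session slot. */
struct slot {
	uint32_t seqid;        /* seqid of the last request run on this slot */
	bool     reply_cached; /* did that request ask for its reply to be cached? */
};

static enum slot_action check_slot(struct slot *s, uint32_t seqid,
				   bool cachethis, int *status)
{
	*status = NFS4_OK;

	/* Next seqid in sequence: a new request, so run it. */
	if (seqid == (uint32_t)(s->seqid + 1)) {
		s->seqid = seqid;
		s->reply_cached = cachethis;
		return SLOT_EXECUTE;
	}

	/* Same seqid again: the client is retrying the last request. */
	if (seqid == s->seqid) {
		if (s->reply_cached)
			return SLOT_REPLAY_CACHED;
		/* Nothing was cached, so the retry cannot be answered. */
		*status = NFS4ERR_RETRY_UNCACHED_REP;
		return SLOT_ERROR;
	}

	/* Anything else is out of order for this slot. */
	*status = NFS4ERR_SEQ_MISORDERED;
	return SLOT_ERROR;
}
```

So in this model the error only shows up when a request is resent with
an unchanged seqid on a slot whose previous reply was not cached.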

Either way, something is wrong with the 4.1 reply caching logic on the
client or the server.

> Unfortunately I did not grab any of that traffic (I just wanted it to
> stop).  This happens to me periodically so I'll be sure to do that when
> it hits again.

OK, that'd be helpful.  Unfortunately what would probably be *most*
helpful would be the traffic that led up to this--by the time the
client and server get into this loop the interesting problem may have
already happened--but just seeing the loop may be useful too.

--b.

> One theory is that this is related to a user's kerberos ticket
> expiring.  I see some hits when I search for the line that's spewed, but
> they're either not recent or weren't reproducible.  I don't find any
> hits for that specific unhandled error.
> 
>  - J<

