Linux NFS development
From: Christian Robottom Reis <kiko@acm.org>
To: Jeff Layton <jlayton@poochiereds.net>
Cc: NFS List <linux-nfs@vger.kernel.org>
Subject: Re: Finding and breaking client locks
Date: Mon, 21 Mar 2016 21:09:11 -0300
Message-ID: <20160322000911.GA27183@chorus>
In-Reply-To: <20160321172735.7936f1f0@tlielax.poochiereds.net>

On Mon, Mar 21, 2016 at 05:27:35PM -0400, Jeff Layton wrote:
> And you're also correct that there is currently no facility for
> administratively revoking locks. That's something that would be a nice
> to have, if someone wanted to propose a sane interface and mechanism
> for it. Solaris had such a thing, IIRC, but I don't know how it was
> implemented.

I might look into that. I think the right approach (as you originally
alluded to) is to drop all locks belonging to a specific client, since
the only failure scenario I can think of that can't be worked around
is the client disappearing.

I would also like to understand whether the data structure behind
/proc/locks could be extended with additional metadata that the
in-kernel NFS code could fill in with client information. That would
make it possible to figure out which machine is the actual culprit.
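
To illustrate what /proc/locks offers today, here is a minimal Python
sketch that parses it and filters entries by inode. The LockEntry class
and both function names are made up for this example. It assumes the
usual eight-column layout (id, type, mode, access, PID, dev:inode,
start, end), with an extra "->" marker on blocked waiters. Note that
nothing in these columns identifies the remote client, which is the gap
described above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass
class LockEntry:
    """One line of /proc/locks (hypothetical helper type for this sketch)."""
    lock_id: int
    blocked: bool        # True for "->" lines, i.e. waiters on a held lock
    kind: str            # e.g. POSIX, FLOCK
    mode: str            # e.g. ADVISORY
    access: str          # e.g. READ, WRITE
    pid: int
    device: str          # "major:minor" as printed (hexadecimal)
    inode: int
    start: int
    end: Optional[int]   # None means the lock extends to EOF


def parse_locks(text: str) -> Iterator[LockEntry]:
    """Parse the text of /proc/locks, skipping lines that do not fit."""
    for line in text.splitlines():
        fields = line.split()
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            del fields[1]
        if len(fields) < 8:
            continue
        major, minor, inode = fields[5].split(":")
        yield LockEntry(
            lock_id=int(fields[0].rstrip(":")),
            blocked=blocked,
            kind=fields[1],
            mode=fields[2],
            access=fields[3],
            pid=int(fields[4]),
            device=f"{major}:{minor}",
            inode=int(inode),
            start=int(fields[6]),
            end=None if fields[7] == "EOF" else int(fields[7]),
        )


def locks_for_inode(entries: Iterable[LockEntry], inode: int) -> List[LockEntry]:
    """Return the entries that refer to the given inode number."""
    return [e for e in entries if e.inode == inode]
```

On the server, something like
locks_for_inode(parse_locks(open("/proc/locks").read()),
os.stat(path).st_ino) would list the locks on one exported file. Inode
numbers are only unique per filesystem, so a real tool would also have
to compare the device column.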

> There is one other option too -- you can send a SIGKILL to the lockd
> kernel thread and it will drop _all_ of its locks. That sort of sucks
> for all of the other clients, but it can unwedge things without
> restarting NFS.

That's quite useful to know, thanks -- I knew that messing with the
initscripts responsible for the nfs kernel services "fixed" the problem,
but killing lockd is much more convenient.
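
For the record, a rough Python sketch of that workaround. The function
names are made up, and it assumes the kernel thread shows up with
"lockd" in /proc/<pid>/comm. It defaults to a dry run that only reports
what it would signal, because an actual run needs root and drops the
locks of every client, not only the stuck one.

```python
import os
import signal
from pathlib import Path
from typing import List


def find_pids_by_comm(comm: str, proc_root: str = "/proc") -> List[int]:
    """Return PIDs whose /proc/<pid>/comm matches `comm` exactly."""
    pids = []
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue
        try:
            name = (entry / "comm").read_text().strip()
        except OSError:
            continue  # task exited in the meantime, or is unreadable
        if name == comm:
            pids.append(int(entry.name))
    return sorted(pids)


def kill_lockd(dry_run: bool = True, proc_root: str = "/proc") -> List[int]:
    """SIGKILL every task named "lockd"; with dry_run, only list them."""
    pids = find_pids_by_comm("lockd", proc_root)
    if not dry_run:
        for pid in pids:
            os.kill(pid, signal.SIGKILL)
    return pids
```

Calling kill_lockd() first and checking the returned PIDs before
rerunning with dry_run=False seems the prudent order.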

I wonder, is it normal for any locks dropped this way to be detected
and re-established on the client side?

> > In the situation which happened today my guess (because it's a mbox
> > file) is that a client ran something like mutt and the machine died
> > somewhere during shutdown. It's my guess because AIUI the lock doesn't
> > get stuck if the process is simply KILLed or crashes.
> 
> What should happen there is that the client notify the server when it
> comes back up, so it can release its locks. That can fail to occur for
> all sorts of reasons, and that leads exactly to the problem you have
> now. It's also possible for the client to just drop off the net
> indefinitely while holding locks in which case you're just out of luck.

That's quite interesting. I had initially thought that a misbehaving
application could die while holding a lock, but it seems that the
client kernel tracks any remote locks held and releases them regardless
of how the process dies. So the actual problem scenarios seem to be:

    - Client disappears off the net while holding a lock
    - Client kernel fails to clear NFS locks (likely a bug)
    - Rogue or misbehaving client holds a lock indefinitely

In any of these cases, the useful thing to know is which client actually
holds the lock.

> It really is better to use NFSv4 if you can at all get away with it.
> Lease-based locking puts the onus on the client to stay in contact with
> the server if it wants to maintain its state.

I have considered moving to NFSv4 a few times, but the setup here is a
bit fragile and AIUI NFSv4 isn't a straight drop-in for v3. Beyond
changing nfsvers, IIRC at least idmapd needs to be set up, and perhaps
there is more.
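
To check my understanding of the lease-based locking mentioned in the
quoted text above, here is a toy Python model. It is not the NFSv4
protocol or the Linux server implementation. The LeaseTable class and
its methods are invented for illustration, and real servers differ in
details such as grace periods. The point is only that a client which
stops renewing loses its state without any administrator stepping in.

```python
from typing import Dict, List


class LeaseTable:
    """Toy model: locks survive only while their client renews its lease."""

    def __init__(self, lease_seconds: float) -> None:
        self.lease_seconds = lease_seconds
        self._last_renewal: Dict[str, float] = {}
        self._locks: Dict[str, List[str]] = {}

    def renew(self, client: str, now: float) -> None:
        """Record that `client` contacted the server at time `now`."""
        self._last_renewal[client] = now

    def lock(self, client: str, path: str, now: float) -> None:
        """Take a lock; any request from the client also renews its lease."""
        self.renew(client, now)
        self._locks.setdefault(client, []).append(path)

    def expire(self, now: float) -> List[str]:
        """Drop all state of clients whose lease ran out; return their names."""
        expired = [c for c, t in self._last_renewal.items()
                   if now - t > self.lease_seconds]
        for client in expired:
            del self._last_renewal[client]
            self._locks.pop(client, None)
        return sorted(expired)

    def holders(self, path: str) -> List[str]:
        """Return the clients currently holding a lock on `path`."""
        return sorted(c for c, paths in self._locks.items() if path in paths)
```

With this model, a client that vanishes off the net simply ages out on
the next expire() call, which is exactly the case that NLM on v3 cannot
handle by itself.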
-- 
Christian Robottom Reis | [+55 16] 3376 0125   | http://async.com.br/~kiko
                        | [+55 16] 991 126 430 | http://launchpad.net/~kiko
