From: "Frank Filz" <ffilzlnx@mindspring.com>
To: "'NeilBrown'" <neilb@suse.com>,
	"'Linux NFS mailing list'" <linux-nfs@vger.kernel.org>
Subject: RE: Should NLM resends change the xid ??
Date: Mon, 28 Mar 2016 09:04:43 -0700
Message-ID: <00a301d1890b$8b6ac190$a24044b0$@mindspring.com>
In-Reply-To: <877fgnwkuv.fsf@notabene.neil.brown.name>

> I've always thought that NLM was a less-than-perfect locking protocol,
> but I recently discovered an aspect of it that is worse than I imagined.
> 
> Suppose client-A holds a lock on some region of a file, and client-B
> makes a non-blocking lock request for that region.
> Now suppose that just before handling that request the lockd thread on
> the server stalls - for example due to excessive memory pressure causing
> a kmalloc to take 11 seconds (rare, but possible.  Such allocations
> never fail, they just block until they can be served).
> 
> During these 11 seconds (say, at the 5 second mark), client-A releases
> the lock - the UNLOCK request to the server queues up behind the
> non-blocking LOCK from client-B.
> 
> The default retry time for NLM in Linux is 10 seconds (even for TCP!) so
> NLM on client-B resends the non-blocking LOCK request, and it queues up
> behind the UNLOCK request.
> 
> Now finally the lockd thread gets some memory/CPU time and starts
> handling requests:
>  LOCK from client-B  - DENIED
>  UNLOCK from client-A - OK
>  LOCK from client-B - OK
> 
> Both replies to client-B have the same XID so client-B will believe
> whichever one it gets first - DENIED.
> 
> So now we have the situation where client-B doesn't think it holds a
> lock, but the server thinks it does.  This is not good.
> 
> I think this explains a locking problem that a customer is seeing.  The
> application seems to busy-wait for the lock using non-blocking LOCK
> requests.  Each LOCK request has a different 'svid' so I assume each
> comes from a different process. If you busy-wait from the one process
> this problem won't occur.
> 
> Having a reply-cache on the server lockd might help, but such things
> easily fill up and cannot provide a guarantee.
> 
> Having a longer timeout on the client would probably help too.  At the
> very least we should increase the maximum timeout beyond 20 seconds.
> (assuming I am reading the code correctly, the client resend timeout is
> based on nlmsvc_timeout which is set from nlm_timeout which is
> restricted to the range 3-20).
> 
> Forcing the xid to change on every retransmit (for NLM) would ensure that
> we only accept the last reply, which I think is safe.

That sounds like a good solution to me. Since the requests are non-blocking,
each request should be considered separate from the others.
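
To make the sequence in the quoted message concrete, here is a minimal
Python sketch of the race. It is a toy model and not lockd code: the names
ToyLockServer and simulate, and the (op, client, xid) tuple layout, are
made up for illustration. It assumes the server handles the three queued
requests strictly in order, and that the client accepts the first reply
whose XID matches its most recent transmission.

```python
class ToyLockServer:
    """Hypothetical single-region lock table; stands in for the server's lockd."""

    def __init__(self, holder):
        self.holder = holder  # client currently holding the lock, or None

    def handle(self, op, client, xid):
        """Process one request and return a (client, xid, status) reply."""
        if op == "UNLOCK":
            if self.holder == client:
                self.holder = None
            return (client, xid, "OK")
        # Non-blocking LOCK: grant if the region is free, otherwise deny at once.
        if self.holder is None or self.holder == client:
            self.holder = client
            return (client, xid, "OK")
        return (client, xid, "DENIED")


def simulate(new_xid_on_resend):
    """Replay the stalled-lockd scenario; return (client-B's belief, server's holder)."""
    first_xid = 1
    # Assumption: a resend either reuses the original XID or takes a fresh one.
    resend_xid = 2 if new_xid_on_resend else first_xid
    queue = [
        ("LOCK", "client-B", first_xid),    # original request, stuck behind the stall
        ("UNLOCK", "client-A", 100),        # arrives around the 5 second mark
        ("LOCK", "client-B", resend_xid),   # retransmit after the 10 second timeout
    ]
    server = ToyLockServer(holder="client-A")
    replies = [server.handle(*request) for request in queue]

    # client-B accepts the first reply matching the XID of its latest transmission.
    belief = next(status for (client, xid, status) in replies
                  if client == "client-B" and xid == resend_xid)
    return belief, server.holder


if __name__ == "__main__":
    print("same xid on resend: ", simulate(new_xid_on_resend=False))
    print("fresh xid on resend:", simulate(new_xid_on_resend=True))
```

With the XID reused, the toy client settles on DENIED while the toy server
records client-B as the holder. With a fresh XID on the resend, the stale
DENIED reply no longer matches anything the client is waiting for, so only
the last reply is accepted and both sides agree.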

Frank



