Re: [RFC] After server stop nfslock service, client still can get lock success

public inbox for linux-nfs@vger.kernel.org

From: Mi Jinlong <mijinlong@cn.fujitsu.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "Trond.Myklebust" <trond.myklebust@fys.uio.no>,
	NFSv3 list <linux-nfs@vger.kernel.org>,
	"J. Bruce Fields" <bfields@fieldses.org>
Subject: Re: [RFC] After server stop nfslock service, client still can get lock success
Date: Thu, 19 Nov 2009 17:48:23 +0800
Message-ID: <4B051467.70404@cn.fujitsu.com>
In-Reply-To: <799834E0-C52E-462A-A036-64B4C4DF5C06@oracle.com>

Hi

Chuck Lever:
> 
> On Nov 18, 2009, at 4:50 AM, Mi Jinlong wrote:
> 
>> Hi
>>
>> Chuck Lever:
>>>
>>> On Nov 17, 2009, at 4:47 AM, Mi Jinlong wrote:
>>>
>>>> While testing NLM, I found a bug: after the server stops its
>>>> nfslock service, a client can still acquire a lock.
>>>>
>>>> Test process:
>>>>
>>>> Step 1: The client opens a file on the NFS mount.
>>>> Step 2: The client acquires a lock with fcntl.
>>>> Step 3: The client releases the lock with fcntl.
>>>> Step 4: The server stops its nfslock service.
>>>> Step 5: The client tries to acquire the lock again with fcntl.
>>>>
>>>> At step 5 the lock request should fail, but it succeeds.
>>>>
>>>> Reason:
>>>> When the server stops the nfslock service, the client's host struct
>>>> is not unmonitored on the server. When the client requests a lock
>>>> again, the host struct is reused but is not monitored again.
>>>> As a result, the lock request at step 5 succeeds.
>>>
>>> Effectively, the client is still monitored, since it is still in statd's
>>> monitored list.  Shutting down statd does not remove it from the monitor
>>> list.  If the local host reboots, sm-notify will still send the remote
>>> an SM_NOTIFY request, which is correct.
>>>
>>> Additionally, new clients attempting to lock files when statd is down
>>> will fail, which is correct if statd is not available.
>>>
>>> Conversely, if a monitored remote reboots, there is no way to notify the
>>> local lockd of the reboot, since statd normally relays the SM_NOTIFY to
>>> lockd, but isn't running.  That might be a problem.
>>
>>  Yes, that looks like a problem.
>>
>>  I have not confirmed it, so I would like your opinion.
> 
> Currently, there isn't a high degree of coordination between lockd and
> statd.  This is to maintain good scalability when serving NFS lock
> requests.  You offered a couple of alternatives for improving this
> specific situation, but my opinion is that there are larger, more
> general coordination issues here, and that what you observed is expected
> behavior for the current design.
> 
> This still seems to me like a case of "Patient: Doctor, it hurts when I
> do that." "Doctor: Well, then, don't do that."  In other words, we
> assume that "service nfslock stop" won't be used under normal operating
> conditions, and we know that NLM will misbehave if you stop statd during
> normal operation.
> 
>>> However, shutting down statd during normal operation is not a normal or
>>> supported thing to do.
>>>
>>>> Questions:
>>>> 1. Should the server unmonitor the client's host struct
>>>>    when the server stops the nfslock service?
>>>>
>>>> 2. Should rpc.statd tell the kernel its status (on start and stop)
>>>>    by sending an SM_NOTIFY?
>>>
>>> There are a number of other coordination issues around statd start-up
>>> and shut down.  The server's grace period, for instance, is not
>>> synchronized with sending reboot notifications.  So, we do recognize
>>> this is a general problem.
>>>
>>> In this case, however, I would expect indeterminate behavior if statd is
>>> shut down during normal operation, and that's exactly what we get.  I'm
>>> not sure it's even reasonable to support this use case.  Why would
>>> someone shut down statd and expect reliable NFSv2/v3 locking behavior?
>>> In other words, with due respect, what problem would we solve by fixing
>>> this, other than making your test case work?
>>
>>  When the server's nfslock service is stopped, the client sometimes
>>  gets the lock and sometimes does not, which is puzzling.
> 
> On Linux, the user space "nfslock" service is actually nothing more than
> statd.  Linux's NLM service is handled in the kernel, and is started and
> stopped when either a) there are NFS mounts, or b) NFSD is started.  The
> kernel's NLM service has nothing to do with "service nfslock start" any
> more.  I think there used to be a user space NLM implementation.
> 
>>> Out of curiosity, what happens if you try this on a Solaris server?
>>
>>  I'm new to Solaris.
>>  When Solaris's nlockmgr is stopped, the client's lock requests fail right away.
> 
> I should have been more clear: if you stop Solaris' user space NSM
> daemon, can you lock files consistently?  My bet is that Solaris will
> demonstrate a similar degree of inconsistent behavior if you try
> NFSv2/v3 locking while starting and stopping its NSM service daemon.

  ^_^ 

  You are right: when I stop Solaris's NSM daemon, the client can still
  acquire a lock. It may behave the same way as Linux.
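
For reference, the client side of the test sequence discussed in this thread (open, lock, unlock, lock again) can be sketched roughly as below. This is only an illustration, not the program actually used for the test: the function name `lock_cycle` and its `pause` hook are hypothetical, and it assumes Python's `fcntl.lockf`, which issues fcntl-style record locks. The `pause` hook is where the server-side nfslock/statd service would be stopped by hand.

```python
import fcntl
import os


def lock_cycle(path, pause=lambda: None):
    """Lock, unlock, pause, then try to lock `path` again.

    Returns True if the second lock attempt succeeds, False if it
    raises OSError (for example because no lock could be granted).
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Step 2: take a non-blocking exclusive lock on the whole file.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        # Step 3: release it.
        fcntl.lockf(fd, fcntl.LOCK_UN)
        # Step 4 happens out of band (on the server) while we wait here.
        pause()
        # Step 5: try to take the lock again.
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True
    finally:
        os.close(fd)
```

On an NFSv2/v3 mount one might call it as `lock_cycle("/mnt/nfs/testfile", pause=lambda: input("stop nfslock on the server, then press Enter: "))`, where the path is a placeholder for a file on the mount under test.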

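The behaviour described in this thread (an already-monitored client keeps getting locks after statd stops, while a new client is refused) can be illustrated with a toy model. This is a sketch built only from the description above, not kernel or statd code; the class `ToyLockd` and its methods are hypothetical names.

```python
class ToyLockd:
    """Toy model of the lockd/statd interaction described in the thread."""

    def __init__(self):
        self.statd_running = True
        # Hosts for which a monitor request has already succeeded.
        self.monitored = set()

    def stop_statd(self):
        # Stopping statd does not clear the per-host "monitored" state.
        self.statd_running = False

    def lock(self, host):
        """Return True if a lock request from `host` would be granted."""
        if host in self.monitored:
            # Host entry is reused; no new monitor request is sent.
            return True
        if not self.statd_running:
            # A new host cannot be monitored, so the lock is refused.
            return False
        self.monitored.add(host)
        return True
```

In this model, a client that locked once before `stop_statd()` still gets `True` afterwards, while a client seen for the first time gets `False`, which matches the "sometimes succeeds, sometimes fails" observation.
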
-- 
Regards
Mi Jinlong

Thread overview: 6+ messages
2009-11-17  9:47 [RFC] After server stop nfslock service, client still can get lock success Mi Jinlong
2009-11-17 15:34 ` Chuck Lever
2009-11-18  9:50   ` Mi Jinlong
2009-11-18 17:03     ` Chuck Lever
2009-11-19  9:48       ` Mi Jinlong [this message]
2009-11-19 15:41         ` Chuck Lever