From: Mi Jinlong <mijinlong@cn.fujitsu.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: "Trond.Myklebust" <trond.myklebust@fys.uio.no>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	NFSv3 list <linux-nfs@vger.kernel.org>
Subject: Re: [RFC] server's statd and lockd will not sync after its nfslock restart
Date: Wed, 16 Dec 2009 18:27:09 +0800
Message-ID: <4B28B5FD.5000103@cn.fujitsu.com>
In-Reply-To: <F9F5EA38-B51C-44A4-9812-873EEE1891C9@oracle.com>



Chuck Lever:
> On Dec 15, 2009, at 5:02 AM, Mi Jinlong wrote:
>> Hi,
>>
>> When testing NLM on the latest kernel (2.6.32), I found a bug.
>> When a client holds locks and the server restarts its nfslock service,
>> the server's statd is no longer synchronized with lockd.
>> If the server restarts nfslock twice or more, the client's locks are lost.
>>
>> Test process:
>>
>>  Step1: client opens an NFS file.
>>  Step2: client uses fcntl to get a lock.
>>  Step3: server restarts its nfslock service.
> 
> I'll assume here that you mean the equivalent of "service nfslock
> restart".  This restarts statd and possibly runs sm-notify, but it has
> no effect on lockd.

  Yes, I used "service nfslock restart".

  It affects lockd too: when the service stops, lockd gets a KILL signal.
  Lockd then releases all clients' locks, enters its grace period, and waits
  for the clients to reclaim their locks.
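
  For reference, the client side of the test (Step1 and Step2) amounts to
  roughly the following. This is only a minimal sketch: the helper name
  take_lock and the demo path are my own, and in the real test the path
  would be a file on the NFS mount rather than a temporary file.

```python
import fcntl
import os
import tempfile


def take_lock(path):
    """Open `path` and take an exclusive fcntl (POSIX) lock on the whole file.

    Returns the open file descriptor; the lock is held until it is
    unlocked or the descriptor is closed.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # LOCK_NB: raise OSError instead of blocking if the lock is taken.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    return fd


if __name__ == "__main__":
    # Hypothetical path, used only so the sketch runs anywhere.
    demo = os.path.join(tempfile.mkdtemp(), "testfile")
    fd = take_lock(demo)
    print("lock held on", demo)
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

  Keeping the descriptor open while the server runs "service nfslock restart"
  is what leaves the client holding a lock across the restart.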

> 
> Again, this test seems artificial to me.  Is there a real world use case
> where someone would deliberately restart statd while an NFS server is
> serving files?  I pose this question because I've worked on statd only
> for a year or so, and I am quite likely ignorant of all the ways it can
> be deployed.

  ^/^, but sooner or later someone will restart nfslock while an NFS server is serving files.
  It is inevitable.

> 
>> After step3, the server's lockd records the client as holding locks, but
>> statd's /var/lib/nfs/statd/sm/ directory is empty. It means statd and lockd
>> are not in sync. If the server restarts its nfslock again, the client's
>> locks will be lost.
>>
>> The Primary Reason:
>>
>>  At step3, when the client's lock reclaim request is sent to the server,
>> the client's host (the host struct) is reused but is not re-monitored by
>> the server's lockd. After that, statd and lockd are out of sync.
> 
> The kernel squashes SM_MON upcalls for hosts that it already believes
> are monitored.  This is a scalability feature.

  When statd starts, it moves the files from /var/lib/nfs/statd/sm/ to
  /var/lib/nfs/statd/sm.bak/. If lockd doesn't send an SM_MON to statd,
  statd will not monitor the clients that were monitored before the statd restart.
  I am not sure about this; is it right?
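
  To make the directory behaviour concrete, here is a minimal sketch of what
  I observe at startup. It is not the nfs-utils code; the function name
  simulate_statd_start is hypothetical, and it only imitates the effect I see
  on disk (every file leaves sm/ and ends up in sm.bak/).

```python
import os
import tempfile


def simulate_statd_start(state_dir):
    """Move every monitor file from sm/ to sm.bak/, leaving sm/ empty.

    Returns the sorted list of file names that were moved.
    """
    sm = os.path.join(state_dir, "sm")
    bak = os.path.join(state_dir, "sm.bak")
    os.makedirs(sm, exist_ok=True)
    os.makedirs(bak, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(sm)):
        # os.replace overwrites an existing sm.bak entry with the same name.
        os.replace(os.path.join(sm, name), os.path.join(bak, name))
        moved.append(name)
    return moved


if __name__ == "__main__":
    # A throwaway directory stands in for /var/lib/nfs/statd.
    state = tempfile.mkdtemp()
    os.makedirs(os.path.join(state, "sm"))
    open(os.path.join(state, "sm", "client"), "w").close()
    print("moved:", simulate_statd_start(state))
    print("sm/ now contains:", os.listdir(os.path.join(state, "sm")))
```

  After this, sm/ stays empty unless something sends a new SM_MON for the
  client.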

> 
>> Question:
>>
>> In my opinion, if lockd is allowed to reuse the client's host, it should
>> send an SM_MON to statd on reuse. If not allowed, the client's host
>> should be destroyed immediately.
>>
>> What should lockd do?  Reuse?  Destroy?  Or some other action?
> 
> I don't immediately see why lockd should change its behavior.  Perhaps
> statd/sm-notify were incorrect to delete the monitor list when you
> restarted the nfslock service?

  Sorry, maybe I did not express myself clearly.
  I mean, lockd reuses the host struct that was created before the statd restart.

  It seems the monitor list was deleted when nfslock restarted.

> 
> Can you show exactly how statd's state (i.e. its on-disk monitor list in
> /var/lib/nfs/statd/sm) changed across the restart?  Did sm-notify run
> when you restarted statd?  If so, why didn't the sm-notify pid file stop
> it?
> 

  The state of statd and lockd on the server across the nfslock restart:

        lockd                   statd         |  event
                                              |
      host(monitored = 1)      /sm/client     |  client gets locks successfully at first
          (locks)                             |
                                              |
      host(monitored = 1)      /sm/client     |  nfslock stop (lockd releases client's locks)
          (no locks)                          |
                                              |
      host(monitored = 1)      /sm/           |  nfslock start (client reclaims locks)
          (locks)                             |                (but statd doesn't monitor it)

  note: host(monitored=1)   means: the client's host struct is created and is marked as monitored.
        (locks), (no locks) means: the host struct holds locks, or not.
        /sm/client          means: there is a file under the /var/lib/nfs/statd/sm directory.
        /sm/                means: /var/lib/nfs/statd/sm is empty!
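
  The same sequence can be written as a toy model. This is only a sketch of
  my understanding, not kernel or nfs-utils code: the classes Statd, Host and
  Lockd, the function run, and the remonitor_on_reuse switch are all
  hypothetical names. The switch stands for the "send SM_MON on reuse" option
  from my question; here it simply forces a fresh SM_MON whenever an existing
  host is reused.

```python
class Statd:
    def __init__(self):
        self.sm = set()      # stands in for /var/lib/nfs/statd/sm
        self.sm_bak = set()  # stands in for /var/lib/nfs/statd/sm.bak

    def sm_mon(self, client):
        self.sm.add(client)

    def restart(self):
        # Entries move from sm/ to sm.bak/ at startup, so sm/ becomes empty.
        self.sm_bak |= self.sm
        self.sm.clear()


class Host:
    def __init__(self, name):
        self.name = name
        self.monitored = False
        self.locks = 0


class Lockd:
    def __init__(self, statd, remonitor_on_reuse=False):
        self.statd = statd
        self.hosts = {}
        self.remonitor_on_reuse = remonitor_on_reuse

    def lock(self, client):
        host = self.hosts.get(client)
        if host is None:
            host = self.hosts[client] = Host(client)
        elif self.remonitor_on_reuse:
            host.monitored = False   # force a fresh SM_MON for a reused host
        if not host.monitored:       # SM_MON is skipped when already monitored
            self.statd.sm_mon(client)
            host.monitored = True
        host.locks += 1

    def release_all(self):
        # nfslock stop: locks are dropped, but host structs and flags survive.
        for host in self.hosts.values():
            host.locks = 0


def run(remonitor_on_reuse):
    statd = Statd()
    lockd = Lockd(statd, remonitor_on_reuse)
    lockd.lock("client")   # client gets its lock
    lockd.release_all()    # nfslock stop
    statd.restart()        # nfslock start
    lockd.lock("client")   # client reclaims its lock
    host = lockd.hosts["client"]
    return host.monitored, host.locks, sorted(statd.sm)


if __name__ == "__main__":
    print(run(False))  # (True, 1, []): lock held, but sm/ is empty
    print(run(True))   # (True, 1, ['client']): statd monitors the client again
```

  With the default behaviour the model ends in the last row of the table:
  the host is marked monitored and holds a lock, while sm/ is empty.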


thanks,
Mi Jinlong

