From: Mi Jinlong <mijinlong@cn.fujitsu.com>
To: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@redhat.com>,
"Trond.Myklebust" <trond.myklebust@fys.uio.no>,
"J. Bruce Fields" <bfields@fieldses.org>,
NFSv3 list <linux-nfs@vger.kernel.org>
Subject: Re: [RFC] After nfs restart, locks can't be recovered which record by lockd before
Date: Mon, 18 Jan 2010 18:51:50 +0800
Message-ID: <4B543D46.1070900@cn.fujitsu.com>
In-Reply-To: <2479069D-FCE0-42B5-9531-A3B7BA231E2F@oracle.com>

Hi Chuck,

Chuck Lever wrote:
> On Jan 15, 2010, at 4:35 AM, Mi Jinlong wrote:
...snip...
>>>>
>>>> Maybe, the kernel should fix this.
>>>
>>> What did you have in mind?
>>
>> I think when lockd restarts, statd should restart too and send
>> sm-notify to the other clients.
>
> Sending notifications is likely the correct thing to do if lockd is
> restarted while there are active locks. A statd restart isn't
> necessarily required to send reboot notifications, however. You can do
> it with "sm-notify -f".
>
> The problem with "sm-notify -f" is that it deletes the on-disk monitor
> list while statd is still running. This means the on-disk monitor list
> and statd's in-memory monitor list will be out of sync. I seem to
> recall that sm-notify is run by itself by cluster scripts, and that
> could be a real problem.
>
> As implemented on RH, "service nfslock restart" will restart statd and
> force an sm-notify anyway, so no real harm done, but that's pretty
> heavyweight (and requires that admins do "service nfs stop; service
> nfslock restart; service nfs start" or something like that if they want
> to get proper lock recovery).
>
> A simple restart of statd (outside of the nfslock script) probably won't
> be adequate, though. It will respect the sm-notify pidfile, and not
> send notifications when started up. I don't see a flag on statd to
> force it to send notifications on restart (-N only sends notifications;
> it doesn't also start the statd daemon).
>
> In a perfect world, when lockd restarts, it would send up an
> SM_SIMU_CRASH, and statd would do the right thing: if there are
> monitored peers, it would send reboot notifications, and adjust its
> monitor list accordingly; if there were no monitored peers, it would do
> nothing. Thus no statd restart would be needed.

Has this part been implemented in the kernel?
I can't find any code for SM_SIMU_CRASH.

IMO, if SM_SIMU_CRASH worked correctly, lockd and statd would stay
in sync after a lockd restart, and the problem I asked about above
would not happen.

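
To make the "perfect world" behaviour quoted above concrete, here is a rough
sketch of what statd might do on receiving SM_SIMU_CRASH from a restarted
lockd. It is a toy model in Python, not kernel or nfs-utils code: the class,
its fields, and the method name are all made up for illustration, and the
"notification" is just an entry appended to a list.

```python
# Illustrative model only: none of these names exist in nfs-utils or the kernel.
from dataclasses import dataclass, field


@dataclass
class StatdModel:
    """Toy stand-in for statd: a monitor list plus an NSM state number."""
    monitored: set = field(default_factory=set)
    state: int = 1  # assumed convention: an odd state number means "up"
    sent: list = field(default_factory=list)  # (peer, state) pairs "notified"

    def simu_crash(self):
        """Handle a hypothetical SM_SIMU_CRASH upcall from a restarted lockd."""
        if not self.monitored:
            return []  # no monitored peers: do nothing
        self.state += 2  # stays odd, but tells peers this is a new incarnation
        notified = sorted(self.monitored)
        for peer in notified:
            self.sent.append((peer, self.state))  # stands in for SM_NOTIFY
        self.monitored.clear()  # lockd dropped its locks; peers re-monitor later
        return notified
```

With two monitored peers, simu_crash() returns both names and moves the state
from 1 to 3; with an empty monitor list it returns an empty list and changes
nothing, so no statd restart is involved in either case.
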
>
>> But now, in the kernel and nfs-utils, it isn't implemented.
>> Given the communication style between lockd and statd, this is indeed
>> not easy to implement.
>>
>> So, I think it should be easier to implement it through the
>> mechanism you showed me before that exposes the kernel's nlm_host
>> cache via /sys.
>
>>>> I want to know: when the command "service nfslock restart" is used to
>>>> restart the nfslock service (meaning restart statd and lockd), will
>>>> statd call sm-notify to notify the other clients, or not?
>>>
>>> Currently "service nfslock restart" always causes a notification to be
>>> sent. Since "service nfslock restart" causes lockd to drop its locks (I
>>> assume that's what that "killproc lockd" does) I guess we need to force
>>> reboot notifications here. (I still argue that removing the pidfile in
>>> the "start" case is not correct).
>>>
>>> It appears that both the nfs and nfslock start up scripts do something
>>> to lockd (as well as the case when the number of NFS mounts goes to
>>> zero). However, only the nfslock script forces sm-notify to send
>>> notifications.
>>
>> But on RHEL5 and Fedora, when the command "service nfslock restart" is
>> used to restart the nfslock service, lockd isn't shut down and rmmod'd.
>>
>> Is it a bug?
>
> For the "no more NFS mounts case" and the server shutdown case, the NFS
> client or server, both being in the kernel, call lockd_down enough times
> to make the user count go to zero. lockd.ko can be removed at that
> point. I seem to recall there being some kind of automatic mechanism
> for module removal after a period of zero module refcount. In other
> words, lockd.ko is removed as a side effect, afaict.
>
> The nfslock script doesn't stop either the kernel client or server code,
> so it doesn't really cause a lockd_down call. But, nfslock does do a
> "killproc lockd". My assumption is that causes all locks to be
> dropped. So it's not a cold restart of lockd, but we still potentially
> lose a lot of lock state here.

Thanks for your explanation!

Thanks,
Mi Jinlong