From: Peter Staubach <staubach@redhat.com>
To: Jeff Layton <jlayton@redhat.com>
Cc: Neil Brown <neilb@suse.de>,
akpm@linux-foundation.org, linux-nfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] NLM: Add reference counting to lockd
Date: Tue, 08 Jan 2008 11:13:03 -0500
Message-ID: <4783A10F.1080604@redhat.com>
In-Reply-To: <20080108082603.089718fc-RtJpwOs3+0O+kQycOl6kW4xkIHaj4LzF@public.gmane.org>
Jeff Layton wrote:
> On Tue, 8 Jan 2008 17:46:33 +1100
> Neil Brown <neilb@suse.de> wrote:
>
> The comments about patch 5/6 seem sane. I'll plan to incorporate them
> in the respin...
>
>
>> On Saturday January 5, jlayton@redhat.com wrote:
>>
>>> @@ -357,7 +375,18 @@ lockd_down(void)
>>> goto out;
>>> }
>>> warned = 0;
>>> - kthread_stop(nlmsvc_task);
>>> + if (atomic_sub_return(1, &nlmsvc_ref) != 0)
>>> + printk(KERN_WARNING "lockd_down: lockd is waiting for "
>>> + "outstanding requests to complete before exiting.\n");
>>>
>> Why not "atomic_dec_and_test" ??
>>
>>
>
> Temporary amnesia? :-) I'll change that, atomic_dec_and_test will be
> clearer.
>
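For readers following along, the equivalence behind this suggestion can be sketched in userspace C11. This is only an illustration under stated assumptions: the sketch_* helpers below are hypothetical stand-ins built on <stdatomic.h>, not the kernel's atomic_sub_return() or atomic_dec_and_test(), and they rely only on the semantics that the first returns the new value and the second returns true when the new value is zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for atomic_sub_return(1, v):
 * subtracts one and returns the new value. */
static int sketch_sub_return_one(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) - 1;
}

/* Hypothetical userspace stand-in for atomic_dec_and_test(v):
 * subtracts one and returns true only if the new value is zero. */
static bool sketch_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Walk two counters down from `start` and check that both spellings
 * agree on "references are still outstanding" at every step. */
static void sketch_check_equivalence(int start)
{
	atomic_int a, b;

	atomic_init(&a, start);
	atomic_init(&b, start);
	for (int i = 0; i < start; i++) {
		bool outstanding_old = sketch_sub_return_one(&a) != 0;
		bool outstanding_new = !sketch_dec_and_test(&b);

		assert(outstanding_old == outstanding_new);
	}
}
```

Under these assumptions, the warning condition in the hunk above, atomic_sub_return(1, &nlmsvc_ref) != 0, reads the same as !atomic_dec_and_test(&nlmsvc_ref); the second form simply states the intent more directly.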
>
>>> +
>>> + /*
>>> + * Sending a signal is necessary here. If we get to this point and
>>> + * nlm_blocked isn't empty then lockd may be held hostage by clients
>>> + * that are still blocking. Sending the signal makes sure that lockd
>>> + * invalidates all of its locks so that it's just waiting on RPC
>>> + * callbacks to complete
>>> + */
>>> + kill_proc(nlmsvc_task->pid, SIGKILL, 1);
>>>
>> The previous patch removes a kill_proc(... SIGKILL), this one adds it
>> back.
>> That makes me wonder if the intermediate state is 'correct'.
>>
>> But I also wonder what "correct" means.
>> Do we want all locks to be dropped when the last nfsd thread dies?
>> The answer is presumably either "yes" or "no".
>> If "yes", then we don't have that because if there are any NFS mounts
>> active, lockd will not be killed.
>> If "no", then we don't want this kill_proc here.
>>
>> The comment in lockd() which currently reads:
>>
>> /*
>> * The main request loop. We don't terminate until the last
>> * NFS mount or NFS daemon has gone away, and we've been sent a
>> * signal, or else another process has taken over our job.
>> */
>>
>> suggests that someone once thought that lockd could hang around after
>> all nfsd threads and nfs mounts had gone, but I don't think it does.
>>
>> We really should think this through and get it right, because if lockd
>> ever drops its locks, then we really need to make sure sm_notify gets
>> run. So it needs to be a well-defined event.
>>
>> Thoughts?
>>
>>
>
> This is the part I've been struggling with the most -- defining what
> proper behavior should be when lockd is restarted. As you point out,
> restarting lockd without doing a sm_notify could be bad news for data
> integrity.
>
> Then again, we'd like someone to be able to shut down the NFS "service"
> and be able to unmount underlying filesystems without jumping through
> special hoops....
>
> Overall, I think I'd vote "yes". We need to drop locks when the last
> nfsd goes down. If userspace brings down nfsd, then it's userspace's
> responsibility to make sure that a sm_notify is sent when nfsd and lockd
> are restarted.
>
I would vote for the simplest possible model that makes sense.
We need a model that is simple for admins and that is easy to
implement in as bug-free a way as possible. The trick is not
making it so simple that it costs performance, while also not
making it too complicated to implement reasonably or for admins
to figure out.
So, I would vote for "yes" as well. That will yield an
architecture where we can shut down systems cleanly and where
it is easy to understand when locks for clients exist and when
they do not.
Thanx...
ps
> As a side note, I'm not thrilled with this design that mixes signals
> and kthreads, but didn't see another way to do this. I'm open to
> suggestions if anyone has them...
>
>
>> Also, it is sad that the inc/dec of nlmsvc_ref is called in somewhat
>> non-obvious ways.
>> e.g.
>>
>>
>>> + if (!nlmsvc_users && error)
>>> + atomic_dec(&nlmsvc_ref);
>>>
>> and
>>
>>
>>> + if (list_empty(&nlm_blocked))
>>> + atomic_inc(&nlmsvc_ref);
>>> +
>>> if (list_empty(&block->b_list)) {
>>> kref_get(&block->b_count);
>>> } else {
>>>
>> where if we moved the atomic_inc a little bit later next to the
>> "list_add_tail" (which seems to make more sense) it would actually be
>> wrong... But I think that code is correct as it is - just non-obvious.
>>
>>
>
> The nlmsvc_ref logic is pretty convoluted, unfortunately. I'll plan to
> add some comments to clarify what I'm doing there.
>
> Thanks for the review, Neil. I'll see if I can get a new patchset done
> in the next few days.
>
> Cheers,
>