From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Pavel Emelyanov <xemul@openvz.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linux Containers <containers@lists.osdl.org>,
linux-kernel@vger.kernel.org,
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Subject: Re: [RFC][PATCH 2/2] pidns: Remove proc flush races when a pid namespaces are exiting.
Date: Fri, 09 Jul 2010 06:05:54 -0700
Message-ID: <m1lj9k4xfx.fsf@fess.ebiederm.org>
In-Reply-To: <20100709121425.GB18586@hawkmoon.kerlabs.com> (Louis Rilling's message of "Fri\, 9 Jul 2010 14\:14\:25 +0200")
Louis Rilling <Louis.Rilling@kerlabs.com> writes:
> On 08/07/10 21:39 -0700, Eric W. Biederman wrote:
>>
>> Currently it is possible to put proc_mnt before we have flushed the
>> last process that will use the proc_mnt to flush it's proc entries.
>>
>> This race is fixed by not flushing proc entries for dead pid
>> namespaces, and calling pid_ns_release_proc unconditionally from
>> zap_pid_ns_processes after the pid namespace has been declared dead.
>
> One comment below.
>
>>
>> To ensure we don't unnecessarily leak any dcache entries with skipped
>> flushes pid_ns_release_proc flushes the entire proc_mnt when it is
>> called.
>>
>> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> ---
>> fs/proc/base.c | 9 +++++----
>> fs/proc/root.c | 3 +++
>> kernel/pid_namespace.c | 1 +
>> 3 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index acb7ef8..e9d84e1 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -2742,13 +2742,14 @@ void proc_flush_task(struct task_struct *task)
>>
>> for (i = 0; i <= pid->level; i++) {
>> upid = &pid->numbers[i];
>> +
>> + /* Don't bother flushing dead pid namespaces */
>> + if (test_bit(PIDNS_DEAD, &upid->ns->flags))
>> + continue;
>> +
>
> IMHO, nothing prevents zap_pid_ns_processes() from setting PIDNS_DEAD and
> calling pid_ns_release_proc() right now. zap_pid_ns_processes() does not wait
> for EXIT_DEAD (self-reaping) children to be released.

Good point. We need something, probably a lock, to prevent proc_mnt from
going away here. We might do a little better if we were starting with
a specific dentry, since those at least have some RCU properties, but that
isn't a big help.
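
To make the locking idea concrete, here is a minimal user-space sketch. It
is not kernel code and not the fix that was eventually merged. The names
ns_model, mnt_model, ns_flush and ns_release are hypothetical stand-ins for
the pid namespace, its proc_mnt, the flush path and the release path. The
only point is that the "dead" check and the use of the mount happen under
the same lock that the release path takes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the proc mount of a pid namespace. */
struct mnt_model {
	int flushes;		/* counts how often a flush touched the mount */
};

/* Hypothetical stand-in for a pid namespace. */
struct ns_model {
	pthread_mutex_t lock;	/* guards dead and mnt together */
	bool dead;
	struct mnt_model *mnt;	/* NULL once the mount has been released */
};

/*
 * Flush side: test the dead flag and touch the mount under one lock,
 * so the mount cannot be detached between the check and the use.
 * Returns true if a flush actually happened.
 */
static bool ns_flush(struct ns_model *ns)
{
	bool flushed = false;

	assert(ns != NULL);
	pthread_mutex_lock(&ns->lock);
	if (!ns->dead && ns->mnt) {
		ns->mnt->flushes++;
		flushed = true;
	}
	pthread_mutex_unlock(&ns->lock);
	return flushed;
}

/*
 * Release side: mark the namespace dead and detach the mount in one
 * critical section. The caller owns the returned mount and may drop it
 * after this returns, because no flusher can still be inside the lock.
 */
static struct mnt_model *ns_release(struct ns_model *ns)
{
	struct mnt_model *old;

	assert(ns != NULL);
	pthread_mutex_lock(&ns->lock);
	ns->dead = true;
	old = ns->mnt;
	ns->mnt = NULL;
	pthread_mutex_unlock(&ns->lock);
	return old;
}
```

In this model a late flusher simply finds the namespace dead and skips the
flush. A real kernel version would also have to respect the contexts in
which proc_flush_task can run, which this sketch ignores.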

Hmm. Perhaps there is a way to completely restructure this flushing
of dentries. It is just an optimization, after all, so that we don't get
too many stale dentries building up.

It might be worth simply killing proc_flush_mnt altogether. I know
the cost is measurable when we don't do the flushing, but perhaps there can
be a work struct that periodically wakes up and smacks stale proc dentries.
Right now I really don't think proc_flush_task is worth the hassle it
causes.
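
The periodic sweeper idea can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: cache_model,
entry_model, mark_exited and sweep_stale are hypothetical names, and a
fixed-size array stands in for the dcache. The exit path only marks an
entry stale, and a separate pass, which a periodic worker would call,
drops the stale entries later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 8

/* Hypothetical stand-in for one cached /proc/<pid> dentry. */
struct entry_model {
	int pid;	/* 0 means the slot is empty */
	bool stale;	/* set at task exit instead of flushing inline */
};

/* Hypothetical stand-in for the set of cached proc dentries. */
struct cache_model {
	struct entry_model slots[CACHE_SLOTS];
};

/* Exit path: only mark matching entries, never touch the mount here. */
static void mark_exited(struct cache_model *c, int pid)
{
	assert(c != NULL);
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		if (c->slots[i].pid == pid)
			c->slots[i].stale = true;
	}
}

/*
 * Body of the periodic worker: drop every stale entry and report how
 * many were dropped. Live entries are left alone.
 */
static size_t sweep_stale(struct cache_model *c)
{
	size_t dropped = 0;

	assert(c != NULL);
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		struct entry_model *e = &c->slots[i];

		if (e->pid != 0 && e->stale) {
			e->pid = 0;
			e->stale = false;
			dropped++;
		}
	}
	return dropped;
}
```

The trade-off in this model is that stale entries linger until the next
sweep, in exchange for the exit path never needing the mount at all.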

Grumble, grumble. More thinking to do.

Eric