From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
Pavel Emelyanov <xemul@parallels.com>,
Cyrill Gorcunov <gorcunov@openvz.org>,
Louis Rilling <louis.rilling@kerlabs.com>,
Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH 2/3] pidns: Guarantee that the pidns init will be the last pidns process reaped.
Date: Thu, 17 May 2012 15:46:53 -0600
Message-ID: <87d3628oqa.fsf@xmission.com>
In-Reply-To: <20120517170015.GA12436@redhat.com> (Oleg Nesterov's message of "Thu, 17 May 2012 19:00:15 +0200")
Oleg Nesterov <oleg@redhat.com> writes:
> On 05/16, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@redhat.com> writes:
>>
>> > Hmm. I don't think the patch is 100% correct. Afaics, this needs more
>> > delay_pidns_leader() checks.
>> >
>> > For example. Suppose we have a CLONE_NEWPID zombie I, it has an
>> > EXIT_DEAD child D so delay_pidns_leader(I) == T.
>> >
>> > Now suppose that I->real_parent exits; let's denote this task as P.
>> >
>> > Suppose that P->real_parent ignores SIGCHLD.
>> >
>> > In this case P will do release_task(I) prematurely. And worse, when
>> > D finally does release_task(D) it will do release_task(I) again.
>>
>> Good point. I will fix that and post a patch shortly. It doesn't
>> need a full delay_pidns_leader test, just a test for children.
>
> This will add more complications. And even this is not enough, I guess.
> For example __ptrace_detach()...
Agreed. I need to step back and think about this a bit more.
I don't like doing things two different ways, but delay_thread_group
leader and all of that is pretty horrible from a maintenance point
of view, and extending it just makes things worse.
> I agree, the idea to "hack" release_task() so that it switches to
> init is clever, but imho this is too clever ;)
>
> Seriously, what do you think about the patch below? Or something
> like this. It is still based on your suggestion to check ->children,
> but it is much, much more simple and understandable.
>
> Just in case... Even with the PF_EXITING check __wake_up_parent()
> can be wrong, but this is very unlikely and harmless.
>
> What do you think?
I think there is something very compelling about your solution.
We do still need my bit about making the init process ignore SIGCHLD
so that all of init's children self-reap.
Before I go further I am going to play with the code more.
In part I think the current code for waiting for processes to
die etc. is pretty horrible maintenance-wise, and it might just
be worth cleaning up before we extend it with yet another
strange and bizarre case, if for no other reason than to make
it clear what we are doing.
>> In looking for any other weird corner case bugs I am noticing that
>> I don't think I handled the case of a ptraced init quite right.
>> I don't understand the change signaling semantics when the
>> ptracer is our parent.
>
> Do you mean the "if (tsk->ptrace)" code in exit_notify()? Nobody
> understands it ;) Last time this code was modified by me (iirc), but
> I simply tried to preserve the previous behaviour.
Yes. It is some pretty strange code. Especially where we are reading
a return result which is always false. I think there is a bug somewhere
between that code and ptrace detach but I don't know that I could tell
you what it is.
Hopefully I have a follow-on patch in another couple of hours.
Eric
> Oleg.
>
> --- x/kernel/exit.c
> +++ x/kernel/exit.c
> @@ -63,6 +63,13 @@ static void exit_mm(struct task_struct *
>
> static void __unhash_process(struct task_struct *p, bool group_dead)
> {
> + struct task_struct *parent = p->parent;
> + bool parent_is_init = false;
> +
> +#ifdef CONFIG_PID_NS
> + parent_is_init = (task_active_pid_ns(p)->child_reaper == parent);
> +#endif
> +
> nr_threads--;
> detach_pid(p, PIDTYPE_PID);
> if (group_dead) {
> @@ -72,6 +79,11 @@ static void __unhash_process(struct task
> list_del_rcu(&p->tasks);
> list_del_init(&p->sibling);
> __this_cpu_dec(process_counts);
> +
> + if (parent_is_init && (parent->flags & PF_EXITING)) {
> + if (list_empty(&parent->children))
> + __wake_up_parent(p, parent);
> + }
> }
> list_del_rcu(&p->thread_group);
> }
> --- x/kernel/pid_namespace.c
> +++ x/kernel/pid_namespace.c
> @@ -184,6 +184,9 @@ void zap_pid_ns_processes(struct pid_nam
> rc = sys_wait4(-1, NULL, __WALL, NULL);
> } while (rc != -ECHILD);
>
> + wait_event(&current->signal->wait_chldexit,
> + list_empty(&current->children));
> +
> if (pid_ns->reboot)
> current->signal->group_exit_code = pid_ns->reboot;
>
>