public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing
@ 2008-08-24 15:49 Oleg Nesterov
  2008-08-26 21:25 ` Serge E. Hallyn
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Oleg Nesterov @ 2008-08-24 15:49 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Pavel Emelyanov, Robert Rex, Roland McGrath,
	Serge Hallyn, Sukadev Bhattiprolu, linux-kernel

zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.

Yes, we have already killed all tasks in this namespace, and sys_wait4()
doesn't see any child. But this doesn't mean ->children list is empty,
we may have EXIT_DEAD tasks which are not visible to do_wait(). In that
case the subsequent forget_original_parent() will crash the kernel because
it will try to re-parent these tasks to the NULL reaper.

Even if there are no childs, it is not good that forget_original_parent()
uses reaper == NULL.

Change the code to set ->child_reaper = init_pid_ns.child_reaper instead.
We could use pid_ns->parent->child_reaper as well, I think this does not
really matter. These EXIT_DEAD tasks are not visible to the new ->parent
after re-parenting, they will silently do release_task() eventually.

Note that we must change ->child_reaper, otherwise forget_original_parent()
will use reaper == father, and in that case we will hit the (correct)
BUG_ON(!list_empty(&father->children)).

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.27-rc4/kernel/pid_namespace.c~1_ZAP_DONT_CLEAR_REAPER	2008-07-30 13:12:49.000000000 +0400
+++ 2.6.27-rc4/kernel/pid_namespace.c	2008-08-24 17:22:59.000000000 +0400
@@ -179,9 +179,12 @@ void zap_pid_ns_processes(struct pid_nam
 		rc = sys_wait4(-1, NULL, __WALL, NULL);
 	} while (rc != -ECHILD);
 
-
-	/* Child reaper for the pid namespace is going away */
-	pid_ns->child_reaper = NULL;
+	/*
+	 * We can not clear ->child_reaper or leave it alone.
+	 * There may by stealth EXIT_DEAD tasks on ->children,
+	 * forget_original_parent() must move them somewhere.
+	 */
+	pid_ns->child_reaper = init_pid_ns.child_reaper;
 	acct_exit_ns(pid_ns);
 	return;
 }


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing
  2008-08-24 15:49 [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing Oleg Nesterov
@ 2008-08-26 21:25 ` Serge E. Hallyn
  2008-08-27 16:36   ` Oleg Nesterov
  2008-08-27  1:43 ` sukadev
  2008-08-27 11:35 ` Pavel Emelyanov
  2 siblings, 1 reply; 5+ messages in thread
From: Serge E. Hallyn @ 2008-08-26 21:25 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W. Biederman, Pavel Emelyanov, Robert Rex,
	Roland McGrath, Sukadev Bhattiprolu, linux-kernel

Quoting Oleg Nesterov (oleg@tv-sign.ru):
> zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.
> 
> Yes, we have already killed all tasks in this namespace, and sys_wait4()
> doesn't see any child. But this doesn't mean ->children list is empty,
> we may have EXIT_DEAD tasks which are not visible to do_wait(). In that
> case the subsequent forget_original_parent() will crash the kernel because
> it will try to re-parent these tasks to the NULL reaper.
> 
> Even if there are no childs, it is not good that forget_original_parent()
> uses reaper == NULL.
> 
> Change the code to set ->child_reaper = init_pid_ns.child_reaper instead.
> We could use pid_ns->parent->child_reaper as well, I think this does not
> really matter. These EXIT_DEAD tasks are not visible to the new ->parent
> after re-parenting, they will silently do release_task() eventually.
> 
> Note that we must change ->child_reaper, otherwise forget_original_parent()
> will use reaper == father, and in that case we will hit the (correct)
> BUG_ON(!list_empty(&father->children)).
> 
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Well while it looked correct to me all along, I couldn't get the
testcase to cause an oops.  But I see where do_exit() calls
zap_pid_ns_processes() (through group_exit()) before
forget_original_parent(), which can/should end up dereferencing
the null pid_ns, so clearly this is needed and correct.

Acked-by: Serge Hallyn <serue@us.ibm.com>

Thanks, Oleg.

-serge

> --- 2.6.27-rc4/kernel/pid_namespace.c~1_ZAP_DONT_CLEAR_REAPER	2008-07-30 13:12:49.000000000 +0400
> +++ 2.6.27-rc4/kernel/pid_namespace.c	2008-08-24 17:22:59.000000000 +0400
> @@ -179,9 +179,12 @@ void zap_pid_ns_processes(struct pid_nam
>  		rc = sys_wait4(-1, NULL, __WALL, NULL);
>  	} while (rc != -ECHILD);
> 
> -
> -	/* Child reaper for the pid namespace is going away */
> -	pid_ns->child_reaper = NULL;
> +	/*
> +	 * We can not clear ->child_reaper or leave it alone.
> +	 * There may by stealth EXIT_DEAD tasks on ->children,
> +	 * forget_original_parent() must move them somewhere.

Actually this comment is a little bit misleading - the null
deref will happen regardless of whether there are children,
right?

> +	 */
> +	pid_ns->child_reaper = init_pid_ns.child_reaper;
>  	acct_exit_ns(pid_ns);
>  	return;
>  }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing
  2008-08-24 15:49 [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing Oleg Nesterov
  2008-08-26 21:25 ` Serge E. Hallyn
@ 2008-08-27  1:43 ` sukadev
  2008-08-27 11:35 ` Pavel Emelyanov
  2 siblings, 0 replies; 5+ messages in thread
From: sukadev @ 2008-08-27  1:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W. Biederman, Pavel Emelyanov, Robert Rex,
	Roland McGrath, Serge Hallyn, linux-kernel

Oleg Nesterov [oleg@tv-sign.ru] wrote:
| zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.
| 
| Yes, we have already killed all tasks in this namespace, and sys_wait4()
| doesn't see any child. But this doesn't mean ->children list is empty,
| we may have EXIT_DEAD tasks which are not visible to do_wait(). In that
| case the subsequent forget_original_parent() will crash the kernel because
| it will try to re-parent these tasks to the NULL reaper.
| 
| Even if there are no childs, it is not good that forget_original_parent()
| uses reaper == NULL.
| 
| Change the code to set ->child_reaper = init_pid_ns.child_reaper instead.
| We could use pid_ns->parent->child_reaper as well, I think this does not
| really matter. These EXIT_DEAD tasks are not visible to the new ->parent
| after re-parenting, they will silently do release_task() eventually.
| 
| Note that we must change ->child_reaper, otherwise forget_original_parent()
| will use reaper == father, and in that case we will hit the (correct)
| BUG_ON(!list_empty(&father->children)).
| 
| Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>

| 
| --- 2.6.27-rc4/kernel/pid_namespace.c~1_ZAP_DONT_CLEAR_REAPER	2008-07-30 13:12:49.000000000 +0400
| +++ 2.6.27-rc4/kernel/pid_namespace.c	2008-08-24 17:22:59.000000000 +0400
| @@ -179,9 +179,12 @@ void zap_pid_ns_processes(struct pid_nam
|  		rc = sys_wait4(-1, NULL, __WALL, NULL);
|  	} while (rc != -ECHILD);
| 
| -
| -	/* Child reaper for the pid namespace is going away */
| -	pid_ns->child_reaper = NULL;
| +	/*
| +	 * We can not clear ->child_reaper or leave it alone.
| +	 * There may by stealth EXIT_DEAD tasks on ->children,
| +	 * forget_original_parent() must move them somewhere.
| +	 */
| +	pid_ns->child_reaper = init_pid_ns.child_reaper;
|  	acct_exit_ns(pid_ns);
|  	return;
|  }
| 
| --
| To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
| the body of a message to majordomo@vger.kernel.org
| More majordomo info at  http://vger.kernel.org/majordomo-info.html
| Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing
  2008-08-24 15:49 [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing Oleg Nesterov
  2008-08-26 21:25 ` Serge E. Hallyn
  2008-08-27  1:43 ` sukadev
@ 2008-08-27 11:35 ` Pavel Emelyanov
  2 siblings, 0 replies; 5+ messages in thread
From: Pavel Emelyanov @ 2008-08-27 11:35 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W. Biederman, Robert Rex, Roland McGrath,
	Serge Hallyn, Sukadev Bhattiprolu, linux-kernel

> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

Acked-by: Pavel Emelyanov <xemul@openvz.org>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing
  2008-08-26 21:25 ` Serge E. Hallyn
@ 2008-08-27 16:36   ` Oleg Nesterov
  0 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2008-08-27 16:36 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Andrew Morton, Eric W. Biederman, Pavel Emelyanov, Robert Rex,
	Roland McGrath, Sukadev Bhattiprolu, linux-kernel

On 08/26, Serge E. Hallyn wrote:
>
> Quoting Oleg Nesterov (oleg@tv-sign.ru):
> > zap_pid_ns_processes() sets pid_ns->child_reaper = NULL, this is wrong.
> > ...
> > Even if there are no childs, it is not good that forget_original_parent()
> > uses reaper == NULL.
> > ...
> 
> Well while it looked correct to me all along, I couldn't get the
> testcase to cause an oops.

Perhaps you can start several instances or increase the delay, I
forgot that usleep() uses usecs, not msecs. In any case I agree,
the test-case is oversimplified, but since it worked for me I didn't
try to make it more "stubborn".

> Acked-by: Serge Hallyn <serue@us.ibm.com>

Thanks!

> > +	/*
> > +	 * We can not clear ->child_reaper or leave it alone.
> > +	 * There may by stealth EXIT_DEAD tasks on ->children,
> > +	 * forget_original_parent() must move them somewhere.
>
> Actually this comment is a little bit misleading - the null
> deref will happen regardless of whether there are children,
> right?

Yes, the comment could be better ;)

But no, we won't oops without children. Please note that
forget_original_parent() uses reaper inside the
list_for_each_entry_safe(father->children) loop. ->children is
empty, reaper is not used at all.

But please also note the "not good" part of the changelog above.


Thanks to all for review!

Oleg.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2008-08-27 16:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-24 15:49 [PATCH 1/4] pid_ns: zap_pid_ns_processes: fix the ->child_reaper changing Oleg Nesterov
2008-08-26 21:25 ` Serge E. Hallyn
2008-08-27 16:36   ` Oleg Nesterov
2008-08-27  1:43 ` sukadev
2008-08-27 11:35 ` Pavel Emelyanov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox