public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Kirill Korotaev <dev@sw.ru>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: linux-kernel@vger.kernel.org, vserver@list.linux-vserver.org,
	Herbert Poetzl <herbert@13thfloor.at>,
	"Serge E. Hallyn" <serue@us.ibm.com>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Dave Hansen <haveblue@us.ibm.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Suleiman Souhlal <ssouhlal@FreeBSD.org>,
	Hubertus Franke <frankeh@watson.ibm.com>,
	Cedric Le Goater <clg@fr.ibm.com>,
	Kyle Moffett <mrmacman_g4@mac.com>, Greg <gkurz@fr.ibm.com>,
	Linus Torvalds <torvalds@osdl.org>, Andrew Morton <akpm@osdl.org>,
	Greg KH <greg@kroah.com>, Rik van Riel <riel@redhat.com>,
	Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>,
	Andrey Savochkin <saw@sawoct.com>,
	Kirill Korotaev <dev@openvz.org>, Andi Kleen <ak@suse.de>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Jeff Garzik <jgarzik@pobox.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	Jes Sorensen <jes@sgi.com>
Subject: Re: [RFC][PATCH 04/20] pspace: Allow multiple instaces of the process id namespace
Date: Fri, 10 Feb 2006 23:30:59 +0300	[thread overview]
Message-ID: <43ECF803.8080404@sw.ru> (raw)
In-Reply-To: <m1bqxkmgcv.fsf_-_@ebiederm.dsl.xmission.com>

Eric,

All my commments are inline below.

> This patch modifies the fork/exit, signal handling, and pid and
> process group manipulating syscalls to support multiple process
> spaces, and implements the data for allow multiple instaces of the pid
> namespace. 

[ ... skipped .... ]

> +extern struct pspace init_pspace;
> +
> +#define INVALID_PID 0x7fffffff
<<<< what is it for?

> +
> +static inline int pspace_task_visible(struct pspace *pspace, struct task_struct *tsk)
> +{
> +	return (tsk->pspace == pspace) || 
> +		((tsk->pspace->child_reaper.pspace == pspace) &&
> +		 (tsk->pspace->child_reaper.task == tsk));
<<< the logic with child_reaper which can be somehow partly inside 
pspace... and this check is not that abvious.

Actually I can't say your patch is cleaner somehow.
It is very big and most of the changes are trivial, which creates an 
illusion that it is straightforward and clean.

[ ... skipped .... ]

> @@ -788,6 +801,16 @@ fastcall NORET_TYPE void do_exit(long co
>  		panic("Attempted to kill the idle task!");
>  	if (unlikely(is_init(tsk)))
>  		panic("Attempted to kill init!");
> +
> +	/*
> +	 * If we are the pspace leader it is nonsense for the pspace
> +	 * to continue so kill everyone else in the pspace.
> +	 */
> +	if (pspace_leader(tsk)) {
> +		tsk->pspace->flags = PSPACE_EXIT;
> +		kill_pspace_info(SIGKILL, (void *)1, tsk->pspace);
> +	}
> +
<<<<

1.
flags are neither atomic nor protected with any lock.
So if it will be used later for something else in future, there will be 
100% race. You also assigns them here, while everywhere in other places 
a bit is checked.

2. due to 1) you code is buggy. in this respect do_exit() is not 
serialized with copy_process().

3. due to the same 1) reason
 > +		kill_pspace_info(SIGKILL, (void *)1, tsk->pspace);
can miss a task being forked. Bang!!!

4.
So you are effectively inserting a code for cleaning up pspace here and 
the same is actually required for other subsystems like networking/IPC etc.
I think you suppose that other resources are freed when the last 
reference is dropped, but to tell the truth this is a way to deadlocks. 
Because refs are put in too many places, where you can't make a real 
cleanup due to locking etc. You can't for example call syncronize_net() 
from bh, which is required for network cleanup.

Just an another argument why containers are easier/better and why you 
will eventually end up with it.

>  	if (tsk->io_context)
>  		exit_io_context();
>  

[ ... skipped ... ]

> @@ -1147,11 +1150,55 @@ retry:
>  }
>  
>  /*
> + * kill_pspace_info() sends a signal to all processes in a process space.
> + * This is what kill(-1, sig) does.
> + */
> +
> +int __kill_pspace_info(int sig, struct siginfo *info, struct pspace *pspace)
> +{
> +	struct task_struct *p = NULL;
> +	int retval = 0, count = 0;
> +
> +	for_each_process(p) {
> +		int err;
> +		/* Skip the current pspace leader */
> +		if (current_pspace_leader(p))
> +			continue;
> +
> +		/* Skip the sender of the signal */
> +		if (p->signal == current->signal)
> +			continue;
> +
> +		/* Skip processes outside the target process space */
> +		if (!in_pspace(pspace, p))
> +			continue;
> +
> +		/* Finally it is a good process send the signal. */
> +		err = group_send_sig_info(sig, info, p);
> +		++count;
> +		if (err != -EPERM)
> +			retval = err;
<<<<
why EPERM is ok?
do you want to miss some tasks?
> +	}
> +	return count ? retval : -ESRCH;
> +}
> +

Thanks,
Kirill


  parent reply	other threads:[~2006-02-10 20:29 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-02-06 19:19 [RFC][PATCH 0/20] Multiple instances of the process id namespace Eric W. Biederman
2006-02-06 19:22 ` [RFC][PATCH 01/20] pid: Intoduce the concept of a wid (wait id) Eric W. Biederman
2006-02-06 19:27   ` [RFC][PATCH 02/20] pspace: The parent process id of pid 1 is always 0 Eric W. Biederman
2006-02-06 19:29     ` [RFC][PATCH 03/20] pid: Introduce a generic helper to test for init Eric W. Biederman
2006-02-06 19:34       ` [RFC][PATCH 04/20] pspace: Allow multiple instaces of the process id namespace Eric W. Biederman
2006-02-06 19:36         ` [RFC][PATCH 05/20] sched: Fixup the scheduler syscalls to deal with pspaces Eric W. Biederman
2006-02-06 19:39           ` [RFC][PATCH 06/20] cad_pid: Fixup the cad_pid users to assume it is in the initial process id namespace Eric W. Biederman
2006-02-06 19:41             ` [RFC][PATCH 07/20] tty: Update the tty layer to work with pspaces Eric W. Biederman
2006-02-06 19:44               ` [RFC][PATCH 08/20] vt: Update the virtual console to handle pspaces Eric W. Biederman
2006-02-06 19:46                 ` [RFC][PATCH 09/20] ptrace: Update ptrace " Eric W. Biederman
2006-02-06 19:49                   ` [RFC][PATCH 10/20] capabilities: Update the capabilities code to " Eric W. Biederman
2006-02-06 19:51                     ` [RFC][PATCH 11/20] ioprio: Update ioprio " Eric W. Biederman
2006-02-06 19:54                       ` [RFC][PATCH 12/20] fcntl: Update fcntl to work with pspaces Eric W. Biederman
2006-02-06 19:57                         ` [RFC][PATCH 13/20] kthread: Update kthread " Eric W. Biederman
2006-02-06 19:59                           ` [RFC][PATCH 14/20] mm: Update vmscan " Eric W. Biederman
2006-02-06 20:00                             ` [RFC][PATCH 15/20] posix-timers: Update posix timers " Eric W. Biederman
2006-02-06 20:02                               ` [PATCH 16/20] nfs: Don't use pids to track the lockd server process Eric W. Biederman
2006-02-06 20:03                                 ` [RFC][PATCH 17/20] usb: Fixup usb so it works with pspaces Eric W. Biederman
2006-02-06 20:05                                   ` [RFC][PATCH 18/20] posix-mqueue: Make mqueues work with pspspaces Eric W. Biederman
2006-02-06 20:07                                     ` [RFC][PATCH 19/20] pspace: Upcate the pid_max sysctl to work in a per pspace fashion Eric W. Biederman
2006-02-06 20:12                                       ` [RFC][PATCH 20/20] proc: Update /proc to support multiple pid spaces Eric W. Biederman
2006-02-10 20:40                                         ` Kirill Korotaev
2006-02-11 10:10                                           ` Eric W. Biederman
2006-02-06 20:41                                   ` [RFC][PATCH 17/20] usb: Fixup usb so it works with pspaces Serge E. Hallyn
2006-02-06 20:53                                     ` Eric W. Biederman
2006-02-07 16:41         ` [RFC][PATCH 04/20] pspace: Allow multiple instaces of the process id namespace William Lee Irwin III
2006-02-07 17:05           ` Eric W. Biederman
2006-02-10 20:30         ` Kirill Korotaev [this message]
2006-02-11  9:42           ` Eric W. Biederman
2006-02-11 10:11           ` Eric W. Biederman
2006-02-11 10:43           ` Eric W. Biederman
2006-02-13  8:53             ` Kirill Korotaev
2006-02-13 16:40             ` Dave Hansen
2006-02-11 11:03           ` Eric W. Biederman
2006-02-13  9:02             ` Kirill Korotaev
2006-02-13 21:31               ` Serge E. Hallyn
2006-02-13 23:41               ` Eric W. Biederman
2006-02-11 11:57           ` Eric W. Biederman
2006-02-13  9:22             ` Kirill Korotaev
2006-02-13 17:02               ` Serge E. Hallyn
2006-02-14  5:59                 ` Eric W. Biederman
2006-02-14  5:54               ` Eric W. Biederman
2006-02-14 18:45                 ` Dave Hansen
2006-02-15 12:07                 ` Kirill Korotaev
2006-02-15 13:31                   ` Herbert Poetzl
2006-02-16 13:44                     ` Serge E. Hallyn
2006-02-20  9:27                       ` Kirill Korotaev
2006-02-20 17:04                         ` Herbert Poetzl
2006-02-21 16:29                           ` Kirill Korotaev
2006-02-21 23:23                             ` Herbert Poetzl
2006-02-20 11:29                     ` Kirill Korotaev
2006-02-20 12:34                       ` Herbert Poetzl
2006-02-20 14:11                         ` Kirill Korotaev
2006-02-20 15:08                           ` Herbert Poetzl
2006-03-01 22:06                             ` Cedric Le Goater
2006-02-15 18:40                   ` Eric W. Biederman
2006-02-06 19:56       ` [RFC][PATCH 03/20] pid: Introduce a generic helper to test for init Dave Hansen
2006-02-06 20:22         ` Eric W. Biederman
2006-02-07 15:08       ` Geert Uytterhoeven
2006-02-07 15:17         ` Eric W. Biederman
2006-02-06 19:54   ` [RFC][PATCH 01/20] pid: Intoduce the concept of a wid (wait id) Serge E. Hallyn
2006-02-06 20:23     ` Eric W. Biederman
2006-02-07 17:39   ` Jeff Dike
2006-02-07 18:32     ` Eric W. Biederman
2006-02-10 18:51   ` Kirill Korotaev
2006-02-11  9:25     ` Eric W. Biederman
2006-02-13  8:43       ` Kirill Korotaev
2006-02-14  0:04         ` Eric W. Biederman
2006-02-06 20:40 ` [RFC][PATCH 0/20] Multiple instances of the process id namespace Hubertus Franke
2006-02-06 20:51   ` Eric W. Biederman
2006-02-06 21:07     ` Hubertus Franke
2006-02-06 21:56       ` Eric W. Biederman
2006-02-07  0:48 ` Dave Hansen
2006-02-07  5:14   ` Eric W. Biederman
2006-02-07  9:33   ` Andi Kleen
2006-02-08  4:19     ` Randy.Dunlap
2006-02-08  4:28       ` Eric W. Biederman
2006-02-08  4:51         ` Randy.Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=43ECF803.8080404@sw.ru \
    --to=dev@sw.ru \
    --cc=ak@suse.de \
    --cc=akpm@osdl.org \
    --cc=alan@lxorguk.ukuu.org.uk \
    --cc=arjan@infradead.org \
    --cc=benh@kernel.crashing.org \
    --cc=clg@fr.ibm.com \
    --cc=dev@openvz.org \
    --cc=ebiederm@xmission.com \
    --cc=frankeh@watson.ibm.com \
    --cc=gkurz@fr.ibm.com \
    --cc=greg@kroah.com \
    --cc=haveblue@us.ibm.com \
    --cc=herbert@13thfloor.at \
    --cc=jes@sgi.com \
    --cc=jgarzik@pobox.com \
    --cc=kuznet@ms2.inr.ac.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mrmacman_g4@mac.com \
    --cc=riel@redhat.com \
    --cc=saw@sawoct.com \
    --cc=serue@us.ibm.com \
    --cc=ssouhlal@FreeBSD.org \
    --cc=torvalds@osdl.org \
    --cc=trond.myklebust@fys.uio.no \
    --cc=vserver@list.linux-vserver.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox