Linux Container Development
From: Oren Laadan <orenl@cs.columbia.edu>
To: Ingo Molnar <mingo@elte.hu>
Cc: containers@lists.osdl.org, Alexey Dobriyan <adobriyan@gmail.com>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	"Serge E. Hallyn" <serue@us.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Linux-Kernel <linux-kernel@vger.kernel.org>
Subject: Re: Creating tasks on restart: userspace vs kernel
Date: Tue, 14 Apr 2009 10:53:52 -0400
Message-ID: <49E4A380.4070503@cs.columbia.edu>
In-Reply-To: <20090414095904.GD3558@elte.hu>



Ingo Molnar wrote:
> * Oren Laadan <orenl@cs.columbia.edu> wrote:
> 
>> <3> Clone with pid:
>>
>> To restart processes from userspace, there needs to be a way to 
>> request a specific pid--in the current pid_ns--for the child 
>> process (clearly, if it isn't in use).
>>
>> Why is it a disadvantage?  To Linus, a syscall clone_with_pid() 
>> "sounds like a _wonderful_ attack vector against badly written 
>> user-land software...".  Actually, getting a specific pid is 
>> possible without this syscall.  But the point is that it's 
>> undesirable to have this functionality unrestricted.
> 
> The point is that there's a class of a difference between a racy and 
> unreliable method of 'create tens of thousands of tasks to steal the 
> right PID you are interested in' and a built-in syscall that gives 
> this within a couple of microseconds.
> 
> Most signal races are timing dependent so the ability to do it 
> really quickly makes or breaks the practicality of many classes of 
> exploits.

Exactly.

> 
>> So one option is to require root privileges. Another option is to 
>> restrict such action in pid_ns created by the same user. Even more 
>> so, restrict to only containers that are being restarted.
> 
> Requiring root privileges seems to remove much of the appeal of 
> allowing this to be a more generic sub-container creation thing. If 
> regular unprivileged apps cannot use this to save/restore their own 
> local task hierarchy, the whole thing becomes rather pointless, 
> right?

First, I suggest distinguishing between two cases: (1) c/r of a whole
container, and (2) c/r of a task subtree. (#2 is a nice byproduct of
this work, but with more limited scope and applicability.)

#2 is easier: we don't necessarily use a new pid_ns, so we don't need
to (and perhaps can't) restore old pids. So there is no question about
privileges. (This of course requires that the application be c/r-aware
or c/r-agnostic.)

For #1, we need to create a new container to begin with. This already
requires CAP_SYS_ADMIN. So yes, for now we can use a setuid helper to
create a new pid_ns and then do the restart.
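
[Editorial sketch, not part of the original message.] The privilege
requirement for a new pid_ns can be observed from userspace. The snippet
below is a minimal probe, assuming a Linux kernel where unshare(2)
accepts CLONE_NEWPID (added in kernels later than this thread; at the
time only clone(2) took the flag). The function name probe_new_pid_ns
is hypothetical. The attempt runs in a forked child so the calling
process is left untouched.

```python
import ctypes
import errno
import os

CLONE_NEWPID = 0x20000000  # value from <linux/sched.h>


def probe_new_pid_ns():
    """Try to create a new pid namespace in a throwaway child.

    Returns 0 if the child could unshare its pid namespace, otherwise
    the errno value reported by the kernel (or 255 if the call could
    not be made at all, e.g. on a non-Linux system).
    """
    pid = os.fork()
    if pid == 0:
        # Child: always leave through os._exit so that no parent code
        # is ever executed here, even if something raises.
        code = 255
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWPID) == 0:
                code = 0
            else:
                code = ctypes.get_errno() or 255
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    rc = probe_new_pid_ns()
    if rc == 0:
        print("new pid_ns created (caller is privileged)")
    else:
        print("refused:", errno.errorcode.get(rc, rc))
```

Run without privileges, the probe would typically report EPERM; run as
root (or through a setuid helper), it would typically succeed.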

We will eventually need CAP_SYS_ADMIN for other parts of the restart,
for instance to restore a listening socket on a privileged port, or to
restore tasks of multiple users, or to restore an open file accessible
by, say, root only (assume the original task opened the file and then
dropped its privileges).

So for c/r, eventually we'll need to trust something in the checkpoint
image, the way you trust a kernel module. One way to do it is to make
the userland utility (restart in particular) setuid, have it sign the
image during checkpoint, and then verify the signature during restart.
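
[Editorial sketch, not part of the original message.] The sign-then-
verify idea can be illustrated as follows. The message does not name a
signature scheme; HMAC-SHA256 with a secret key readable only by the
setuid utility is an assumption made here for illustration, and the
names sign_image and verify_image are hypothetical.

```python
import hashlib
import hmac


def sign_image(image: bytes, key: bytes) -> bytes:
    """Checkpoint side: compute a tag over the checkpoint image."""
    return hmac.new(key, image, hashlib.sha256).digest()


def verify_image(image: bytes, signature: bytes, key: bytes) -> bool:
    """Restart side: accept the image only if the tag matches."""
    expected = sign_image(image, key)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, signature)


if __name__ == "__main__":
    key = b"secret known only to the setuid utility"
    image = b"...checkpoint image bytes..."
    tag = sign_image(image, key)
    print(verify_image(image, tag, key))         # untouched image
    print(verify_image(image + b"!", tag, key))  # tampered image
```

With a scheme like this, the privileged restart path would refuse any
image that was modified after checkpoint, which is the trust property
the paragraph above asks for.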

Oren.
