linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Andrew Morton
	<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
	linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org,
	dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org,
	mingo-X9Un+BFzKDI@public.gmane.org,
	hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org,
	viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org
Subject: Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint	restart
Date: Tue, 21 Oct 2008 21:33:19 -0400	[thread overview]
Message-ID: <48FE82DF.6030005@cs.columbia.edu> (raw)
In-Reply-To: <20081021202410.GA10423-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>



Serge E. Hallyn wrote:
> Quoting Andrew Morton (akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org):
>> On Mon, 20 Oct 2008 01:40:30 -0400
>> Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> wrote:
>>>  asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
>>>  {
>>> -	pr_debug("sys_checkpoint not implemented yet\n");
>>> -	return -ENOSYS;
>>> +	struct cr_ctx *ctx;
>>> +	int ret;
>>> +
>>> +	/* no flags for now */
>>> +	if (flags)
>>> +		return -EINVAL;
>>> +
>>> +	ctx = cr_ctx_alloc(pid, fd, flags | CR_CTX_CKPT);
>>> +	if (IS_ERR(ctx))
>>> +		return PTR_ERR(ctx);
>>> +
>>> +	ret = do_checkpoint(ctx);
>>> +
>>> +	if (!ret)
>>> +		ret = ctx->crid;
>>> +
>>> +	cr_ctx_free(ctx);
>>> +	return ret;
>>>  }
>> Is it appropriate that this be an unprivileged operation?
> 
> Early versions checked capable(CAP_SYS_ADMIN), and we reasoned that we
> would later attempt to remove the need for privilege so that all users
> could safely use it.
> 
> Arnd Bergmann called us on that nonsense, pointing out that it'd make
> more sense to let unprivileged users use them now, so that we'll be
> more careful about the security as patches roll in.
> 
> So, Oren's patchset right now only checkpoints current, despite pid
> being part of the API.  So the task can access its own data.  When
> the patch supports checkpointing another task (which Oren says he's
> doing right now), then our intent is to check for ptrace access to
> the target task.  (Right, Oren?)

Correct. That's already in the additional patch in the git tree - first
I locate the task and if found, I check ptrace_may_access() (read mode).

see the top patch in:
http://git.ncl.cs.columbia.edu/?p=linux-cr-dev.git;a=shortlog;h=refs/heads/ckpt-v7

> 
>> What happens if I pass it a pid which isn't system-wide unique?
> 
> pid must be checked in the caller's pid namespace.  So if I've create a
> container which I want to checkpoint, pid 1 in that pidns will be, say,
> 3497 in my pid_ns, and so 3497 is the pid I must use.  If I try to pass
> 1, I'll try to checkpoint my own container.  And, if I'm not privileged
> and init is owned by root, the ptrace() check I mentioned above will
> return -EPERM.

Yup.

> 
>> What happens if I pass it a pid of a process which I don't own?  This
>> is super security-sensitive and we need to go over the permission
>> checking with a toothcomb.  It needs to be exhaustively described in
>> the changelog.  It might have security/selinux implications - I don't
>> know, I didn't look, but lights are flashing and bells are ringing over
>> here.

This should be covered by ptrace_may_access() test.

In the longer run, I suppose SElinux people would want a security hook
there to approve or disapprove the operation.

>>
>> What happens if I pass it a pid of a process which I _do_ own, but it
>> does not refer to a container's init process?
> 
> I would assume that do_checkpoint() would return -EINVAL, but it's a
> great question:  Oren, did you have another plan?

Since we intentional provide minimal functionality to keep the patchset
simple and allow easy review - we only checkpoint one task; it doesn't
really matter because we don't deal with the entire container.

With the ability to checkpoint multiple process we will have to ensure
that we checkpoint an entire container. I planned to return -EINVAL if
the target task isn't a container init(1). Another option, if people
prefer, is to use any task in a container to "represent" the entire
container.

> 
>> If `pid' must refer to a container's init process, isn't it always
>> equal to 1??
> 
> Not in the caller's pid_namespace.
> 
>>>  /**
>>>   * sys_restart - restart a container
>>>   * @crid: checkpoint image identifier
>>> @@ -36,6 +234,19 @@ asmlinkage long sys_checkpoint(pid_t pid, int fd, unsigned long flags)
>>>   */
>>>  asmlinkage long sys_restart(int crid, int fd, unsigned long flags)
>>>  {
>>> -	pr_debug("sys_restart not implemented yet\n");
>>> -	return -ENOSYS;
>>> +	struct cr_ctx *ctx;
>>> +	int ret;
>>> +
>>> +	/* no flags for now */
>>> +	if (flags)
>>> +		return -EINVAL;
>>> +
>>> +	ctx = cr_ctx_alloc(crid, fd, flags | CR_CTX_RSTR);
>>> +	if (IS_ERR(ctx))
>>> +		return PTR_ERR(ctx);
>>> +
>>> +	ret = do_restart(ctx);
>>> +
>>> +	cr_ctx_free(ctx);
>>> +	return ret;
>>>  }
>> Again, this is scary stuff.  We're allowing unprivileged userspace to
>> feed random numbers into kernel data structures.
> 
> Yes, all of the file opens and mmaps must not skip the usual security
> checks.  The task credentials are currently unsupported, meaning that
> euid, etc, come from the caller, not the checkpoint image.  When the

Actually, the fact that task credentials are not restored makes it
more secure, because the user can't do anything beyond her current
capabilities.

For the same reason, however, unless we agree on a secure way to
elevate credentials, there are various things that we cannot restore,
even though it may be something we would want to permit.

> restoration of credentials becomes supported, then definately the
> caller (of sys_restore())'s ability to setresuid/setresgid to those
> values must be checked.
> 
> So that's why we don't want CAP_SYS_ADMIN required up-front.  That way
> we will be forced to more carefully review each of those features.
> 
>> I'd like to see the security guys take a real close look at all of
>> this, and for them to do that effectively they should be provided with
>> a full description of the security design of this feature.
> 
> Right, some of the above should be spelled out somewhere.  Should it be
> in the patch description, in the Documentation/checkpoint.txt file,
> or someplace else?  Oren, do you want to filter the above information
> into the right place, or do you want me to do it and send you a patch?

I'll add something to the Documentation/checkpoint.txt.

Thanks,

Oren.

--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2008-10-22  1:33 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-10-20  5:40 [RFC v7][PATCH 0/9] Kernel based checkpoint/restart Oren Laadan
2008-10-20  5:40 ` [RFC v7][PATCH 1/9] Create syscalls: sys_checkpoint, sys_restart Oren Laadan
2008-10-20  5:40 ` [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart Oren Laadan
     [not found]   ` <1224481237-4892-3-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-21 19:41     ` Andrew Morton
     [not found]       ` <20081021124130.a002e838.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2008-10-21 20:24         ` Serge E. Hallyn
     [not found]           ` <20081021202410.GA10423-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-10-21 20:41             ` Andrew Morton
2008-10-22  1:33             ` Oren Laadan [this message]
2008-10-22  2:55               ` Daniel Jacobowitz
     [not found]                 ` <20081022025513.GA7504-FDxGpBj5bhMn2ysHARXsoQ@public.gmane.org>
2008-10-22  3:02                   ` Dave Hansen
2008-10-22 14:29                     ` Daniel Jacobowitz
     [not found]               ` <48FE82DF.6030005-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-22 15:28                 ` Serge E. Hallyn
     [not found]                   ` <20081022152804.GA23821-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-10-22 16:02                     ` Oren Laadan
2008-10-22 17:03                       ` Serge E. Hallyn
     [not found]                         ` <20081022170325.GA4908-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-10-22 18:32                           ` Oren Laadan
2008-10-27  8:27                       ` Peter Chubb
     [not found]                         ` <87tzayh27r.wl%peter-LkDQP0DxSMGxwJ88Py/mJxCuuivNXqWP@public.gmane.org>
2008-10-27 11:03                           ` Oren Laadan
     [not found]                             ` <49059FED.4030202-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-27 16:42                               ` Dave Hansen
2008-10-27 17:11                                 ` Oren Laadan
     [not found]                                   ` <4905F648.4030402-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-27 20:51                                     ` Matt Helsley
2008-10-27 21:20                                       ` Serge E. Hallyn
2008-10-27 21:51                                       ` Oren Laadan
     [not found]                                         ` <490637D8.4080404-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-27 22:09                                           ` Dave Hansen
2008-10-28 18:33                                             ` Serge E. Hallyn
2008-10-20  5:40 ` [RFC v7][PATCH 3/9] x86 support for checkpoint/restart Oren Laadan
2008-10-20  5:40 ` [RFC v7][PATCH 4/9] Dump memory address space Oren Laadan
2008-10-20  5:40 ` [RFC v7][PATCH 6/9] Checkpoint/restart: initial documentation Oren Laadan
     [not found]   ` <1224481237-4892-7-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-28 16:48     ` Michael Kerrisk
     [not found] ` <1224481237-4892-1-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-10-20  5:40   ` [RFC v7][PATCH 5/9] Restore memory address space Oren Laadan
2008-10-20  5:40   ` [RFC v7][PATCH 7/9] Infrastructure for shared objects Oren Laadan
2008-10-21 19:21   ` [RFC v7][PATCH 0/9] Kernel based checkpoint/restart Andrew Morton
     [not found]     ` <20081021122135.4bce362c.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2008-10-21 20:41       ` Dave Hansen
2008-10-22  9:20         ` Ingo Molnar
     [not found]           ` <20081022092024.GC12453-X9Un+BFzKDI@public.gmane.org>
2008-10-22 11:51             ` Daniel Lezcano
2008-10-22 11:55             ` Cedric Le Goater
2008-10-20  5:40 ` [RFC v7][PATCH 8/9] Dump open file descriptors Oren Laadan
2008-10-20  5:40 ` [RFC v7][PATCH 9/9] Restore open file descriprtors Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=48FE82DF.6030005@cs.columbia.edu \
    --to=orenl-eqauephvms7envbuuze7ea@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org \
    --cc=mingo-X9Un+BFzKDI@public.gmane.org \
    --cc=serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=viro-RmSDqhL/yNMiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).