Linux Container Development
From: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
To: Sukadev Bhattiprolu
	<sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Nathan Lynch <nathanl-V7BBcbaFuwjMbYB6QlFGEg@public.gmane.org>
Subject: Re: [v12][PATCH 8/9] Define eclone() syscall
Date: Thu, 12 Nov 2009 19:12:48 -0600
Message-ID: <20091113011248.GA7899@us.ibm.com>
In-Reply-To: <20091113004356.GA23615-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

Quoting Sukadev Bhattiprolu (sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
> (Trimmed Cc to Containers list).
> 
> Updated patch to ignore ->child_stack_size on architectures that don't
> need it.
> 
> ---
> From e1e9b0b6eb511058961c1fb526f44b597790bfd7 Mon Sep 17 00:00:00 2001
> From: Sukadev Bhattiprolu <suka@suka.(none)>
> Date: Tue, 20 Oct 2009 22:04:57 -0700
> Subject: [v13][PATCH 8/9] Define eclone() syscall
> 
> Container restart requires that a task have the same pid it had when it was
> checkpointed. When containers are nested the tasks within the containers
> exist in multiple pid namespaces and hence have multiple pids to specify
> during restart.
> 
> eclone(), intended for use during restart, is the same as
> clone(), except that it takes a 'pids' parameter. This parameter lets
> the caller choose specific pid numbers for the child process, in the
> process's active and ancestor pid namespaces. (Descendant pid namespaces
> in general don't matter since processes don't have pids in them anyway,
> but see comments in copy_target_pids() regarding CLONE_NEWPID).
> 
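For readers following along, here is a minimal sketch of how a
caller-supplied pid list could line up with nested pid namespaces. The
helper name, the oldest-first ordering, and the use of 0 for "no specific
pid requested" are assumptions made for illustration; none of this is
code from the patch.

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not from the patch: spread nr_pids requested pids
 * over a task that will live in 'levels' nested pid namespaces.
 * Index 0 is the oldest (outermost) namespace; the requested pids are
 * applied to the youngest namespaces only. A value of 0 is used here to
 * mean "no specific pid requested at this level" (an assumption).
 * Returns 0 on success, -1 if more pids were requested than levels exist.
 */
int map_target_pids(const pid_t *pids, unsigned int nr_pids,
		    pid_t *per_level, unsigned int levels)
{
	unsigned int i, skip;

	if (nr_pids > levels)
		return -1;

	skip = levels - nr_pids;
	for (i = 0; i < skip; i++)
		per_level[i] = 0;
	for (i = 0; i < nr_pids; i++)
		per_level[skip + i] = pids[i];
	return 0;
}

/* Usage: pids requested for the two youngest of three namespace levels. */
void demo_map_target_pids(void)
{
	pid_t want[2] = { 377, 1 };
	pid_t out[3];

	assert(map_target_pids(want, 2, out, 3) == 0);
	assert(out[0] == 0 && out[1] == 377 && out[2] == 1);
}
```

In this sketch the outermost namespace keeps whatever pid the kernel
picks, while the two inner namespaces get 377 and 1.
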
> eclone() also attempts to address a second limitation of the
> clone() system call. clone() is restricted to 32 clone flags and all but
> one of these are in use. If more new clone flags are needed, we will be
> forced to define a new variant of the clone() system call. To address
> this, eclone() allows at least 64 clone flags with some room
> for more if necessary.
> 
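The v9 changelog entry below describes splitting the flags into 'low' and
'high' words. A minimal sketch of that split, assuming a plain 64-bit flag
set; the helper names and the flag at bit 33 are hypothetical and are not
defined by the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag above bit 31; the patch defines no such flag. */
#define SKETCH_FLAG_BIT33 (UINT64_C(1) << 33)

/* Low word: the 32 flags that clone() already accepts. */
uint32_t flags_low(uint64_t flags)
{
	return (uint32_t)(flags & UINT64_C(0xffffffff));
}

/* High word: the additional 32 flags that do not fit in clone(). */
uint32_t flags_high(uint64_t flags)
{
	return (uint32_t)(flags >> 32);
}

/* Recombine the two words into one 64-bit flag set. */
uint64_t flags_join(uint32_t low, uint32_t high)
{
	return ((uint64_t)high << 32) | low;
}

/* Usage: 0x100 is CLONE_VM; the second flag is the hypothetical one. */
void demo_flags_split(void)
{
	uint64_t flags = SKETCH_FLAG_BIT33 | UINT64_C(0x00000100);

	assert(flags_low(flags) == 0x100);
	assert(flags_high(flags) == 0x2);
	assert(flags_join(flags_low(flags), flags_high(flags)) == flags);
}
```

Passing the low word as the first argument keeps the call looking like
clone() for callers that only use existing flags.
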
> To prevent unprivileged processes from misusing this interface,
> eclone() currently requires CAP_SYS_ADMIN when the 'pids' parameter
> is non-NULL.
> 
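The rule above can be sketched as a small userspace model. The function
name, the boolean standing in for a capability check, and the -EPERM
return value are all assumptions; the patch itself may structure and
report this differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of the privilege rule: choosing pids is only
 * allowed with CAP_SYS_ADMIN, while a NULL 'pids' behaves like an
 * ordinary clone() and needs no extra privilege.
 */
int check_pids_permission(const void *pids, bool has_cap_sys_admin)
{
	if (pids != NULL && !has_cap_sys_admin)
		return -EPERM;	/* assumed error code */
	return 0;
}

/* Usage: only the unprivileged caller that passes pids is refused. */
void demo_check_pids_permission(void)
{
	int pids[1] = { 42 };

	assert(check_pids_permission(NULL, false) == 0);
	assert(check_pids_permission(pids, true) == 0);
	assert(check_pids_permission(pids, false) == -EPERM);
}
```
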
> See Documentation/eclone in next patch for more details and an
> example of its usage.
> 
> NOTE:
> 	- System calls are restricted to 6 parameters and the number and sizes
> 	  of parameters needed for eclone() exceed 6 integers. The new
> 	  prototype works around this restriction while providing some
> 	  flexibility if eclone() needs to be further extended in the
> 	  future.
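
One way to picture the workaround in the NOTE above: the extra values are
packed into a structure whose size is passed alongside it, so the call
itself stays under six parameters. The structure name, field names, and
field order below are assumptions loosely modelled on the changelog
entries (64-bit tid fields, child_stack, child_stack_size, args_size as a
separate parameter); they are not copied from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical argument block. Every field has a fixed width so the
 * layout is the same on 32-bit and 64-bit architectures.
 */
struct clone_args_sketch {
	uint64_t clone_flags_high;	/* upper 32 flags, widened */
	uint64_t child_stack;
	uint64_t child_stack_size;	/* 0 where the arch does not need it */
	uint64_t parent_tid_ptr;
	uint64_t child_tid_ptr;
	uint32_t nr_pids;
	uint32_t reserved0;
};

/* Five 8-byte fields plus two 4-byte fields, with no padding. */
_Static_assert(sizeof(struct clone_args_sketch) == 48,
	       "layout must not depend on word size");

/*
 * Model of a size check: the caller states how large it believes the
 * structure is, which leaves room to extend it in a later version.
 */
int check_args_size(size_t args_size)
{
	return args_size == sizeof(struct clone_args_sketch) ? 0 : -EINVAL;
}

/* Usage: a matching size is accepted, a stale or bogus one is not. */
void demo_check_args_size(void)
{
	assert(check_args_size(sizeof(struct clone_args_sketch)) == 0);
	assert(check_args_size(8) == -EINVAL);
}
```

With this shape the call needs only four parameters in the sketch: the
low flags, a pointer to the block, its size, and the pid list.
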
> TODO:
> 	- We should convert clone-flags to a 64-bit value in all architectures.
> 	  It's probably best to do that as a separate patchset since clone_flags
> 	  touches several functions and that patchset seems independent of this
> 	  new system call.
> 
> Changelog[v13-rc1]:
> 	- [Nathan Lynch, Serge Hallyn] Rename ->child_stack_base to
> 	  ->child_stack and ensure ->child_stack_size is 0 on architectures
> 	  that don't need it (see comments in types.h for details).
> 
> Changelog[v12]:
> 	- [Serge Hallyn] Ignore ->child_stack_size if ->child_stack_base
> 	  is NULL.
> 	- [Oren Laadan, Serge Hallyn] Rename clone_with_pids() to eclone()
> 
> Changelog[v11]:
> 	- [Dave Hansen] Move clone_args validation checks to arch-independent
> 	  code.
> 	- [Oren Laadan] Make args_size a parameter to system call and remove
> 	  it from 'struct clone_args'
> 
> Changelog[v10]:
> 	- Rename clone3() to clone_with_pids()
> 	- [Linus Torvalds] Use PTREGSCALL() rather than the generic syscall
> 	  implementation
> 
> Changelog[v9]:
> 	- [Roland McGrath, H. Peter Anvin] To avoid confusion on 64-bit
> 	  architectures split the new clone-flags into 'low' and 'high'
> 	  words and pass in the 'lower' flags as the first argument.
> 	  This would maintain similarity of the clone3() with clone()/
> 	  clone2(). Also has the side-effect of the name matching the
> 	  number of parameters :-)
> 	- [Roland McGrath] Rename structure to 'clone_args' and add a
> 	  'child_stack_size' field
> 
> Changelog[v8]:
> 	- [Oren Laadan] parent_tid and child_tid fields in 'struct clone_arg'
> 	  must be 64-bit.
> 	- clone2() is in use in IA64. Rename system call to clone3().
> 
> Changelog[v7]:
> 	- [Peter Zijlstra, Arnd Bergmann] Rename system call to clone2()
> 	  and group parameters into a new 'struct clone_struct' object.
> 
> Changelog[v6]:
> 	- (Nathan Lynch, Arnd Bergmann, H. Peter Anvin, Linus Torvalds)
> 	  Change 'pid_set.pids' to a 'pid_t pids[]' so size of 'struct pid_set'
> 	  is constant across architectures.
> 	- (Nathan Lynch) Change pid_set.num_pids to unsigned and remove
> 	  'unum_pids < 0' check.
> 
> Changelog[v4]:
> 	- (Oren Laadan) rename 'struct target_pid_set' to 'struct pid_set'
> 
> Changelog[v3]:
> 	- (Oren Laadan) Allow CLONE_NEWPID flag (by allocating an extra pid
> 	  in the target_pids[] list and setting it 0. See copy_target_pids()).
> 	- (Oren Laadan) Specified target pids should apply only to youngest
> 	  pid-namespaces (see copy_target_pids())
> 	- (Matt Helsley) Update patch description.
> 
> Changelog[v2]:
> 	- Remove unnecessary printk and add a note to callers of
> 	  copy_target_pids() to free target_pids.
> 	- (Serge Hallyn) Mention CAP_SYS_ADMIN restriction in patch description.
> 	- (Oren Laadan) Add checks for 'num_pids < 0' (return -EINVAL) and
> 	  'num_pids == 0' (fall back to normal clone()).
> 	- Move arch-independent code (sanity checks and copy-in of target-pids)
> 	  into kernel/fork.c and simplify sys_clone_with_pids()
> 
> Changelog[v1]:
> 	- Fixed some compile errors (had fixed these errors earlier in my
> 	  git tree but had not refreshed patches before emailing them)
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>

Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
