public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Oren Laadan <orenl@librato.com>
To: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org, arnd@arndb.de,
	Containers <containers@lists.linux-foundation.org>,
	Nathan Lynch <nathanl@austin.ibm.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	hpa@zytor.com, mingo@elte.hu, torvalds@linux-foundation.org,
	Alexey Dobriyan <adobriyan@gmail.com>,
	Pavel Emelyanov <xemul@openvz.org>
Subject: Re: [RFC][v7][PATCH 0/9] Implement clone2() system call
Date: Thu, 01 Oct 2009 11:19:27 -0400	[thread overview]
Message-ID: <4AC4C87F.3000702@librato.com> (raw)
In-Reply-To: <20091001023618.GA31344@us.ibm.com>



Sukadev Bhattiprolu wrote:
> Oren Laadan [orenl@librato.com] wrote:
> | 
> | 
> | Sukadev Bhattiprolu wrote:
> | > === NEW CLONE() SYSTEM CALL:
> | > 
> | > To support application checkpoint/restart, a task must have the same pid it
> | > had when it was checkpointed.  When containers are nested, the tasks within
> | > the containers exist in multiple pid namespaces and hence have multiple pids
> | > to specify during restart.
> | > 
> | > This patchset implements a new system call, clone2() that lets a process
> | > specify the pids of the child process.
> | > 
> | > Patches 1 through 6 are helper patches, needed for choosing a pid for the
> | > child process.
> | > 
> | > Patch 8 defines a prototype of the new system call. Patch 9 adds some
> | > documentation on the new system call, some/all of which will eventually
> | > go into a man page.
> | > 
> | 
> | [...]
> | 
> | > 
> | > Based on these requirements and constraints, we explored a couple of system
> | > call interfaces (in earlier versions of this patchset) and currently define
> | > the system call as:
> | > 
> | > 	struct clone_struct {
> | > 		u64 flags;
> | > 		u64 child_stack;
> | > 		u32 nr_pids;
> | > 		u32 parent_tid;
> | > 		u32 child_tid;
> | 
> | So @parent_tid and @child_tid are pointers to userspace memory and
> | require 'u64' (and it won't hurt to make @reserved1 a 'u64' as well).
> 
> Well, if we make parent_tid and child_tid u64, we could move reserved1
> after ->nr_pids and leave it as a 32-bit value.

Sure. In any case, won't hurt to leave large reserved space -
someone may be thankful for it in the future ;)

> 
> | 
> | > 		u32 reserved1;
> | > 		u64 reserved2;
> | > 	};
> | > 
> | 
> | Also, for forward/backward compatibility, explicitly state in the
> | documentation, and enforce in the kernel, that flags which are not
> | defined must not be set, and that reserved{1,2} must remain 0.
> 
> Agree with checking for reserved1 and reserved2.
> 
> We currently don't check for invalid clone_flags - we just ignore them.
> Adding checks like
> 
> 	if (fls(kcs.flags) > fls(CLONE_LAST_FLAG))
> 
> would assume we always use bits in order (while it seems to make sense, to
> use them in order, we don't seem to have done so in the past).
> 
> Alternatively we could define a CLONE_FLAG_MASK of valid flags and update
> the mask when each new clone flag is added. 
> 
> But do we really need to check for invalid flags ?

I'd go for a a mask.

The idea is that we want to educate userspace to _not_ use unused
flags now. For if userspace sets an unused flag now and we let it
be, the application will break when we give meaning to that flag.

Oren.


  reply	other threads:[~2009-10-01 15:19 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-24 16:55 [RFC][v7][PATCH 0/9] Implement clone2() system call Sukadev Bhattiprolu
2009-09-24 17:00 ` [RFC][v7][PATCH 1/9]: Factor out code to allocate pidmap page Sukadev Bhattiprolu
2009-09-24 17:00 ` [RFC][v7][PATCH 2/9]: Have alloc_pidmap() return actual error code Sukadev Bhattiprolu
2009-09-24 17:01 ` [RFC][v7][PATCH 3/9] Make pid_max a pid_ns property Sukadev Bhattiprolu
2009-09-24 17:45   ` Oren Laadan
2009-09-24 17:01 ` [RFC][v7][PATCH 4/9]: Add target_pid parameter to alloc_pidmap() Sukadev Bhattiprolu
2009-09-24 17:47   ` Oren Laadan
2009-09-24 17:01 ` [RFC][v7][PATCH 5/9]: Add target_pids parameter to alloc_pid() Sukadev Bhattiprolu
2009-09-24 17:02 ` [RFC][v7][PATCH 6/9]: Add target_pids parameter to copy_process() Sukadev Bhattiprolu
2009-09-24 17:02 ` [RFC][v7][PATCH 7/9]: Define do_fork_with_pids() Sukadev Bhattiprolu
2009-09-24 17:03 ` [RFC][v7][PATCH 8/9]: Define clone2() syscall Sukadev Bhattiprolu
2009-09-24 21:43   ` Arnd Bergmann
2009-09-25  8:23     ` Louis Rilling
2009-09-25 10:56       ` Louis Rilling
2009-09-29 18:05         ` Sukadev Bhattiprolu
2009-09-29 18:40           ` Roland McGrath
2009-09-29 18:44             ` H. Peter Anvin
2009-09-29 19:02               ` Arjan van de Ven
2009-09-29 19:10                 ` Linus Torvalds
2009-09-29 20:02                   ` H. Peter Anvin
2009-09-29 22:11                     ` Linus Torvalds
2009-09-29 22:19                       ` H. Peter Anvin
2009-09-30 16:15                         ` Arnd Bergmann
2009-09-30 16:27                           ` Linus Torvalds
2009-09-30 17:59                             ` Arnd Bergmann
2009-09-30 19:14                               ` Linus Torvalds
2009-09-30  6:48                       ` Roland McGrath
2009-09-29 20:00                 ` H. Peter Anvin
2009-09-29 21:58           ` Oren Laadan
2009-09-24 17:03 ` [RFC][v7][PATCH 9/9]: Document " Sukadev Bhattiprolu
2009-09-24 18:05   ` Randy Dunlap
2009-09-25  2:31   ` KOSAKI Motohiro
2009-09-24 17:44 ` [RFC][v7][PATCH 0/9] Implement clone2() system call Oren Laadan
2009-09-24 20:15   ` Sukadev Bhattiprolu
2009-09-24 22:06     ` Oren Laadan
2009-09-24 22:21       ` Arnd Bergmann
2009-09-24 23:19         ` Oren Laadan
2009-10-01  2:36   ` Sukadev Bhattiprolu
2009-10-01 15:19     ` Oren Laadan [this message]
2009-09-24 17:57 ` Alexey Dobriyan
2009-09-24 18:35   ` Serge E. Hallyn
2009-09-30  5:34     ` Alexey Dobriyan
2009-09-30 17:41       ` Oren Laadan
2009-10-02 20:27         ` Alexey Dobriyan
2009-10-02 21:06           ` Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AC4C87F.3000702@librato.com \
    --to=orenl@librato.com \
    --cc=adobriyan@gmail.com \
    --cc=arnd@arndb.de \
    --cc=containers@lists.linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=nathanl@austin.ibm.com \
    --cc=sukadev@linux.vnet.ibm.com \
    --cc=torvalds@linux-foundation.org \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox