Linux Container Development
 help / color / mirror / Atom feed
From: Oren Laadan <orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
To: "Serge E. Hallyn" <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Sukadev Bhattiprolu
	<sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Alexey Dobriyan
	<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC][v4][PATCH 7/7]: Define clone_extended() syscall
Date: Thu, 06 Aug 2009 12:05:53 -0400	[thread overview]
Message-ID: <4A7AFF61.8050802@librato.com> (raw)
In-Reply-To: <20090806155520.GA904-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>



Serge E. Hallyn wrote:
> Quoting Oren Laadan (orenl-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org):
>>
>> Serge E. Hallyn wrote:
>>> Quoting Sukadev Bhattiprolu (sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org):
>>>> Subject: [RFC][v4][PATCH 7/7]: Define clone_extended() syscall
>>>>
>>>> Container restart requires that a task have the same pid it had when it was
>>>> checkpointed. When containers are nested the tasks within the containers
>>>> exist in multiple pid namespaces and hence have multiple pids to specify
>>>> during restart.
>>>>
>>>> This patch defines, a new system call, clone_extended() which is like clone(),
>>>> but takes a new 'pid_set' parameter.  This parameter lets caller choose
>>>> specific pid numbers for the child process, in the process's active and
>>>> ancestor pid namespaces. (Descendant pid namespaces in general don't matter
>>>> since processes don't have pids in them anyway, but see comments in
>>>> copy_target_pids() regarding CLONE_NEWPID).
>>>>
>>>> Unlike clone(), however, clone_extended() needs CAP_SYS_ADMIN, at least for
>>>> now, to prevent unprivileged processes from misusing this interface.
>>> It only needs that when specifying pids.
>>>
>>>> While the main motivation for this interface is the need to let a process
>>>> choose its 'pid numbers', the clone_extended() interface uses 64-bit clone
>>>> flags.  The 'higher' portion of the clone flags are unused and are only
>>>> included to preclude yet another version of clone when a new clone flag is
>>>> needed. 
>>>>
>>>> ===== Interface:
>>>>
>>>> Compared to clone(), clone_extended() needs to pass in three more pieces
>>>> of information:
>>>>
>>>> 	- additional 32-bit of clone_flags
>>>> 	- number of pids in the set
>>>> 	- user buffer containing the list of pids.
>>>>
>>>> But since clone() already takes 5 parameters and some (all ?) architectures
>>>> are restricted to 6 parameters to a system-call, additional data-structures
>>>> (and copy_from_user()) are needed.
>>>>
>>>> The proposed interface for clone_extended() is:
>>>>
>>>> 	struct clone_tid_info {
>>>> 		void *parent_tid; 	/* parent_tid_ptr parameter */
>>>> 		void *child_tid; 	/* child_tid_ptr parameter */
>>>> 	};
>>>>
>>>> 	struct pid_set {
>>>> 		int num_pids;
>>>> 		pid_t *pids;
>>>> 	};
>>>>
>>>> 	int clone_extended(int flags_low, int flags_high, void *child_stack,
>>>> 			void *unused, struct clone_tid_info *tid_ptrs,
>>>> 			struct pid_set *pid_setp);
>>> I was thinking additional flags would be passed in the (renamed)
>>> struct pid_set.
>> Yes.
>>
>> But maybe in (renamed) 'struct clone_info' instead of 'struct pid_set' ?
>>
>> I vaguely recall a strong preference to not require copy-from-user
>> during a fast-path clone, because it may hurt performance.
>>
>> *If* this is the case, then maybe place extra flags among the
>> "base" args, or at least a CLONE_EXTRA would indicate that more
>> arguments need to be pulled from user-space ?
> 
> Wouldn't passing NULL for struct clone_info suffice?

:o

Actually, I misread the original prototype, and I prefer Suka's
current suggestion.

Oren.

> 
>> Do you intend to get feedback from LKML too ?
>>
>> Oren.

  parent reply	other threads:[~2009-08-06 16:05 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-06  6:10 [RFC][v4][PATCH 0/7] clone_extended() syscall Sukadev Bhattiprolu
     [not found] ` <20090806061056.GA1044-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06  6:22   ` [RFC][v4][PATCH 1/7] Factor out code to allocate pidmap page Sukadev Bhattiprolu
2009-08-06  6:23   ` [RFC][v4][PATCH 2/7]: Have alloc_pidmap() return actual error code Sukadev Bhattiprolu
2009-08-06  6:23   ` [RFC][v4][PATCH 3/7]: Add target_pid parameter to alloc_pidmap() Sukadev Bhattiprolu
2009-08-06  6:24   ` [RFC][v4][PATCH 4/7]: Add target_pids parameter to alloc_pid() Sukadev Bhattiprolu
2009-08-06  6:24   ` [RFC][v4][PATCH 5/7]: Add target_pids parameter to copy_process() Sukadev Bhattiprolu
2009-08-06  6:24   ` [RFC][v4][PATCH 6/7]: Define do_fork_with_pids() Sukadev Bhattiprolu
2009-08-06  6:25   ` [RFC][v4][PATCH 7/7]: Define clone_extended() syscall Sukadev Bhattiprolu
     [not found]     ` <20090806062505.GG5619-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06 13:38       ` Serge E. Hallyn
     [not found]         ` <20090806133847.GA28392-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06 15:37           ` Oren Laadan
     [not found]             ` <4A7AF8AD.4070805-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-08-06 15:55               ` Serge E. Hallyn
     [not found]                 ` <20090806155520.GA904-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06 16:05                   ` Oren Laadan [this message]
     [not found]                     ` <4A7AFF61.8050802-RdfvBDnrOixBDgjK7y7TUQ@public.gmane.org>
2009-08-06 16:16                       ` Serge E. Hallyn
     [not found]                         ` <20090806161616.GA1472-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06 18:23                           ` Sukadev Bhattiprolu
     [not found]                             ` <20090806182340.GA2579-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-06 18:35                               ` Serge E. Hallyn
2009-08-06 20:38       ` Matt Helsley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4A7AFF61.8050802@librato.com \
    --to=orenl-rdfvbdnroixbdgjk7y7tuq@public.gmane.org \
    --cc=adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org \
    --cc=sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox