From: "Serge E. Hallyn" <serue@us.ibm.com>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Containers <containers@lists.linux-foundation.org>,
Oleg Nesterov <oleg@redhat.com>,
linux-kernel@vger.kernel.org, hpa@zytor.com, mingo@elte.hu,
Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>,
torvalds@linux-foundation.org,
Alexey Dobriyan <adobriyan@gmail.com>,
Pavel Emelyanov <xemul@openvz.org>
Subject: Re: [RFC][v4][PATCH 0/7] clone_with_pids() system call
Date: Fri, 21 Aug 2009 11:11:20 -0500
Message-ID: <20090821161120.GA21094@us.ibm.com>
In-Reply-To: <20090813194616.GA10493@us.ibm.com>
Quoting Serge E. Hallyn (serue@us.ibm.com):
> Quoting Eric W. Biederman (ebiederm@xmission.com):
> > Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> >
> > > Eric W. Biederman [ebiederm@xmission.com] wrote:
> > > | Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> writes:
> > > |
> > > | > === NEW CLONE() SYSTEM CALL:
> > > | >
> > > | > To support application checkpoint/restart, a task must have the same pid it
> > > | > had when it was checkpointed. When containers are nested, the tasks within
> > > | > the containers exist in multiple pid namespaces and hence have multiple pids
> > > | > to specify during restart.
> > > | >
> > > | > This patchset implements a new system call, clone_with_pids() that lets a
> > > | > process specify the pids of the child process.
> > > | >
> > > | > Patches 1 through 5 are helpers and we believe they are needed for application
> > > | > restart, regardless of the kernel implementation of application restart.
> > > |
> > > | I'm not very impressed.
> > > |
> > > | - static int alloc_pidmap(struct pid_namespace *pid_ns)
> > > | + static int alloc_pidmap(struct pid_namespace *pid_ns, int pid_max, int last_pid)
> > > |
> > > | Do that.
> > > |
> > > | That is pass in pid_max and last_pid, and you don't have to do weird
> > > | things in alloc_pidmap, and no set_pidmap is needed.
> > >
> > > But last_pid is from the pid_ns. Do you mean to have alloc_pidmap()
> > > take a pid_min and pid_max and when choosing a specific pid, have
> > > pid_min == pid_max == target_pid ?
> >
> > Yes. It already takes a pid_min and a pid_max from the environment.
> > I guess the pid_min is RESERVED_PIDS by default.
> >
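To make the pid_min/pid_max idea above concrete, here is a minimal
user-space sketch, not the kernel's alloc_pidmap(). It assumes a
byte-per-pid table instead of the real bitmap pages, and the names
pidmap_sketch, alloc_pid_in_range, PID_MAX_SKETCH and
RESERVED_PIDS_SKETCH are all made up for illustration. It only shows
that "give me exactly pid N" falls out of a ranged allocator when
pid_min == pid_max == N.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define PID_MAX_SKETCH       1024  /* hypothetical limit, kept small for the demo */
#define RESERVED_PIDS_SKETCH 300   /* hypothetical default lower bound */

struct pidmap_sketch {
	unsigned char used[PID_MAX_SKETCH]; /* one byte per pid, not a real bitmap */
	int last_pid;                       /* last pid handed out by a ranged search */
};

/*
 * Allocate a free pid in [pid_min, pid_max]. The search starts just
 * after last_pid and wraps inside the range. A one-element range is a
 * request for one specific pid and fails with -EBUSY if it is taken.
 */
static int alloc_pid_in_range(struct pidmap_sketch *map, int pid_min, int pid_max)
{
	int span, start, i;

	assert(map != NULL);
	if (pid_min < 1 || pid_max >= PID_MAX_SKETCH || pid_min > pid_max)
		return -EINVAL;

	span = pid_max - pid_min + 1;
	start = map->last_pid + 1;
	if (start < pid_min || start > pid_max)
		start = pid_min;

	for (i = 0; i < span; i++) {
		int pid = pid_min + (start - pid_min + i) % span;

		if (!map->used[pid]) {
			map->used[pid] = 1;
			/* Assumption: a targeted pid does not move last_pid. */
			if (span > 1)
				map->last_pid = pid;
			return pid;
		}
	}
	return span == 1 ? -EBUSY : -EAGAIN;
}
```

With a zeroed map and last_pid set to RESERVED_PIDS_SKETCH - 1, a
normal call with the full range hands out 300, then 301, while
alloc_pid_in_range(&map, 953, 953) either returns 953 or fails with
-EBUSY.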
> > > | No changes to copy_process are needed it already takes a struct pid
> > > | argument.
> > >
> > >
> > > I see your point about passing in both 'struct pid*' and target_pids[].
> > > But in the common case the struct pid passed into copy_process() is
> > > NULL - allocating pid in do_fork() would significantly alter the
> > > existing control flow - no ? alloc_pid() assumes any new pid namespace
> > > has been created - in copy_namespaces(). Moving the alloc_pid() to
> > > do_fork() would require parsing clone_flags in do_fork() and pulling
> > > pid namespace code out of copy_namespaces().
> >
> > Why change do_fork?
> >
> > > | I haven't been following closely what is gained by having a clone_with_pids
> > > | syscall?
> > >
> > > When restarting an application from a checkpoint, the application must get
> > > the same pid it had at the time of checkpoint. clone_with_pids() would be
> > > used during restart so the child can be created with a specific set of pids.
> >
> > That part I understand. What I don't understand is why have that one part be
> > special and have user space do the work?
>
> How would this be used then? Let's say I'm recreating a process tree
> with two nested pid namespaces. So just using clone(CLONE_NEWPID) we'd
> have P{500} creates P{1501,1} which creates P{1502,1,2} which creates
> P{1502,2,3} (1502 in top namespace, 2 in child ns, 3 in lowest pid ns).
> But now we want to create P{X, 27, 953} (i.e. X can be anything). How
> do we specify that for pidns 2 we want pid_min=pid_max=27, and for
> pidns 3 pid_min=pid_max=953?
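To spell out what has to cross the user/kernel boundary for the
P{X, 27, 953} case, here is a rough user-space model. It is only a
sketch under assumptions: ns_sketch, alloc_one and alloc_pids are
hypothetical names, each namespace is a plain byte table, and a target
of 0 is taken to mean "any pid". The point is that the caller needs to
hand over one value per namespace level, which a single
pid_min/pid_max pair cannot carry.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_PID_SKETCH 2048  /* hypothetical per-namespace limit */

struct ns_sketch {
	unsigned char used[MAX_PID_SKETCH]; /* one byte per pid in this namespace */
	int last_pid;                       /* last pid handed out without a target */
	int level;                          /* 0 is the top namespace */
	struct ns_sketch *parent;           /* NULL for the top namespace */
};

/* Allocate 'target' in one namespace, or the next free pid if target == 0. */
static int alloc_one(struct ns_sketch *ns, int target)
{
	int i;

	if (target < 0 || target >= MAX_PID_SKETCH)
		return -EINVAL;
	if (target > 0) {
		if (ns->used[target])
			return -EBUSY;
		ns->used[target] = 1;
		return target;
	}
	for (i = 1; i < MAX_PID_SKETCH; i++) {
		int pid = (ns->last_pid + i) % MAX_PID_SKETCH;

		if (pid != 0 && !ns->used[pid]) {
			ns->used[pid] = 1;
			ns->last_pid = pid;
			return pid;
		}
	}
	return -EAGAIN;
}

/*
 * Walk from the deepest namespace up to the top one, taking one pid per
 * level. target_pids[level] == 0 means "don't care". On failure, pids
 * already taken in deeper levels are released again.
 */
static int alloc_pids(struct ns_sketch *leaf, const int *target_pids, int *out)
{
	struct ns_sketch *ns, *done;
	int nr = 0;

	assert(leaf != NULL && out != NULL);
	for (ns = leaf; ns != NULL; ns = ns->parent) {
		nr = alloc_one(ns, target_pids ? target_pids[ns->level] : 0);
		if (nr < 0)
			goto undo;
		out[ns->level] = nr;
	}
	return 0;
undo:
	for (done = leaf; done != ns; done = done->parent)
		done->used[out[done->level]] = 0;
	return nr;
}
```

For three chained namespaces (levels 0, 1, 2), calling alloc_pids()
on the deepest one with target_pids = {0, 27, 953} yields
out = {X, 27, 953}, where X is whatever the top namespace hands out
next.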

Eric, if you have an idea for how to do this, please let me know,
and I'll set about trying a new patchset to do it. But as it stands
I don't see how to make your suggestion useful from userspace.

thanks,
-serge