Linux Container Development
From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: Albert Cahalan <acahalan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Randy Dunlap
	<randy.dunlap-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>,
	Serge Hallyn
	<serge.hallyn-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>,
	Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>,
	Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>,
	Nathan Lynch <nathanl-V7BBcbaFuwjMbYB6QlFGEg@public.gmane.org>,
	"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>,
	Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
	Sukadev Bhattiprolu
	<sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	Roland McGrath <roland-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Subject: Re: [PATCH 11/11][v15]: Document sys_eclone
Date: Tue, 06 Jul 2010 11:12:10 -0400
Message-ID: <4C3347CA.8060703@cs.columbia.edu>
In-Reply-To: <AANLkTinr_2u-_0S2UvMDc7hOE_JOVIOjGtVo9Tzuk21E-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>



Albert Cahalan wrote:
> On Mon, Jul 5, 2010 at 12:18 AM, Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org> wrote:
>> Matt Helsley wrote:
>>> On Sat, Jul 03, 2010 at 07:41:30PM -0400, Albert Cahalan wrote:
>>>> On Sat, Jul 3, 2010 at 4:32 PM, Sukadev Bhattiprolu
>>>> <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> 
>> It follows that trying to set pids in pid namespaces _below_ you
>> simply doesn't make sense (beyond the CLONE_NEWPID case).
> 
> I may have some wrong ideas about how process restart works,
> but I'd thought it would normally be done from above or from PID 1
> in the same pid namespace.
> 
>> Finally, there have been objections before to allowing pid selection
>> by non-privileged processes.
> 
> Eh, I dearly hope that privileged processes are generally not
> even addressable (never mind creatable or accessible) from
> inside anything other than the top-level pid namespace.
> 
> Well, at least nothing should get more privilege than PID 1.
> This would include having UID values that PID 1 can switch
> to and having capability sets that PID 1 can switch to, and
> any other (SE Linux, AppArmor, etc.) things too.
> 
> Restarting a privileged process with a less privileged PID 1
> should result in privilege loss, and ought to require some sort of
> "--force" option to ensure the person accepts possible breakage.
> 
>>>>> +static int do_clone(int (*child_fn)(void *), void *child_arg,
>>>>> +               unsigned int flags_low, int nr_pids, pid_t *pids_list)
>>>> There needs to be a way to pass child_fn and child_arg
>>>> via the kernel. Besides being required for kernel-managed
>>>> stacks, it's normally a saner interface. Stack setup would
>>>> be much like the stack setup for signal handlers. Imagine
>>> I'm inclined to say this is a bad idea.
>>>
>>> I didn't think we had "kernel-managed stacks" in mainline. The most we
>>> have, to my knowledge, is the sigaltstack support and kernel threads.
>>>
>>> I don't see how being able to pass in child_fn and child_arg to the
>>> kernel improves the sanity of the interface. If anything it will make
>>> eclone even more exotic -- now at the end of the syscall we'll
>>> need to mess with the registers/stack of the child much like when we're
>>> invoking a signal handler. That just adds more arch-specific code than is
>>> necessary.
>>>
>>> Userspace wrappers are perfectly capable of invoking the child function
>>> and passing the arguments. Furthermore, passing those arguments requires
>>> expanding the argument structure or putting even greater pressure on
>>> registers (which, as you pointed out below, is an issue for vfork).
> 
> BSD's rfork_thread has, among other things, these two arguments:
> 
> int (*func)(void *arg)
> void *arg
> 
>>>> using this for a vfork-like interface that didn't have painful
>>>> interactions with the compiler.
>> Pardon my ignorance - what sort of painful interactions ?
> 
> The child returns from vfork, via the same return address that
> the parent will later use. (on the stack for many architectures)
> The child then calls a function which might not have the same
> stack layout as vfork, scrambling whatever may be on the stack
> that the parent will be using to return from vfork. The parent may
> then end up using a return address that has been corrupted.
> To make this work, gcc actually recognizes vfork and has
> special handling for it.

I assumed that this is taken care of by libc rather than the
compiler, like it is done for clone(2).
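
For reference, a minimal sketch of the clone(2) case, assuming Linux
with glibc. The names child_fn, run_child and CHILD_STACK_SIZE are
made up for illustration and are not part of the eclone patches. The
point is only that the libc wrapper takes the function and its
argument and calls the function on the new stack, so the child never
returns through the parent's stack frame:

```c
#define _GNU_SOURCE             /* must precede all includes to expose clone() */
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)    /* illustrative size, not a requirement */

/* Hypothetical child entry point. The glibc clone() wrapper invokes it
 * on the child's stack; its return value becomes the child's exit status. */
static int child_fn(void *arg)
{
    assert(arg != NULL);
    return *(int *)arg == 42 ? 0 : 1;
}

/* Start a child with clone(), wait for it, and return its exit status,
 * or -1 on any failure. */
static int run_child(int value)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    int status;
    pid_t pid;

    if (stack == NULL)
        return -1;

    /* Without CLONE_VM the child gets its own copy of the address space,
     * so &value stays valid in the child. The stack pointer passed is the
     * top of the buffer because the stack grows down on most architectures. */
    pid = clone(child_fn, stack + CHILD_STACK_SIZE, SIGCHLD, &value);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, &status, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_child(42) from a small main() should give 0 and any other
value 1. Whether an eclone() wrapper would look the same is an open
question in this thread; the sketch only shows the existing clone(2)
convention.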

Oren.
