public inbox for linux-kernel@vger.kernel.org
From: ebiederm@xmission.com (Eric W. Biederman)
To: "Jürg Billeter" <j@bitron.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Filipe Brandenburger <filbranden@google.com>,
	David Wilcox <davidvsthegiant@gmail.com>,
	hansecke@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH] prctl: add PR_[GS]ET_PDEATHSIG_PROC
Date: Tue, 03 Oct 2017 12:40:43 -0500
Message-ID: <8760bwb04k.fsf@xmission.com>
In-Reply-To: <1507050019.19102.51.camel@bitron.ch> ("Jürg Billeter"'s message of "Tue, 03 Oct 2017 19:00:19 +0200")

Jürg Billeter <j@bitron.ch> writes:

> On Tue, 2017-10-03 at 09:46 -0500, Eric W. Biederman wrote:
>> There is a general need to find out about the death of other processes,
>> if you are not the parent of the process.   I would be inclined to call
>> it waitfd.  Something that you give a pid.  It performs a permission
>> check and the pid becomes readable when the process dies.  With poll
>> working on the fd, and the fd returning wstatus of the dead child.
>> 
>> Support SIGIO on the fd and you have a signal delivery mechanism,
>> if you want it.
>
> File descriptors for processes (waitfd/clonefd) are definitely
> interesting.  Especially if reaping the process (and reparenting its
> children) is delayed until the last process file descriptor is closed. 
> However, this would be a much larger addition and also less intuitive
> to use if all you want is killing the process tree.
>
>> For killing all children when the parent dies, the mechanism you are
>> proposing is escapable.  We already have an inescapable version of it
>> with init in a pid namespace.  We already have an escapable version of
>> it with orphaned process groups and SIGHUP.
>> 
>> So I would really appreciate a very clear use case for what we are
>> building here.  As it appears the killing of children can already be
>> done another way, and that the waiting for the parent can be done better
>> another way.
>
> My use case is to provide a way for a process to spawn a child and
> ensure that no descendants survive when that child dies.  Avoiding
> runaway processes is desirable in many situations.  My motivation is
> very lightweight (nested) sandboxing (every process is potentially
> sandboxed).
>
> I.e., pid namespaces would be a pretty good fit (assuming they are
> sufficiently lightweight) but CLONE_NEWPID requires CAP_SYS_ADMIN. 
> User namespaces can help here, but creating tons of user namespaces
> just for this doesn't sound sensible. MAX_PID_NS_LEVEL could be an
> issue as well at some point but 32 levels are likely fine in practice.
>
> For my particular scenario I may actually be able to create a single
> user namespace, run all processes with (namespaced) CAP_SYS_ADMIN and
> use CLONE_NEWPID for every process.  However, I would prefer not
> requiring CAP_SYS_ADMIN and a regular application that wants to avoid
> runaway processes for a spawned helper process cannot rely on
> CAP_SYS_ADMIN.
>
> My plan was to use PR_SET_PDEATHSIG_PROC with PR_NO_NEW_PRIVS and a
> suitable seccomp filter to prevent changes to pdeath_signal_proc.  For
> my SIGKILL use case it would be even better to simply require
> PR_NO_NEW_PRIVS and make pdeath_signal_proc sticky, avoiding the need
> for seccomp.  I wanted to keep the differences to the existing
> PR_SET_PDEATHSIG minimal but if we argue that the non-SIGKILL use case
> is better solved with waitfd (or maybe the process events connector),
> we could tailor the prctl for the SIGKILL use case (or support both via
> prctl arg3).
>
> I have another small patch locally that adds a prctl that restricts
> kill(2) to direct children of the current thread group for lightweight
> sandboxing.  That would also be redundant if it was possible to use
> CLONE_NEWPID for every process.
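
For reference, the existing PR_SET_PDEATHSIG that the proposed prctl is
modeled on can be used roughly as below.  This is only a sketch: the
helper names set_parent_death_signal() and get_parent_death_signal() are
made up for illustration, and nothing here is taken from the patch under
discussion.  Note that the setting is per process, is cleared in the
child of a fork(2), and can be changed again by the process itself,
which is why it is escapable on its own.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to send `sig` to the calling
 * process when its parent dies.  `expected_parent` is the parent's pid
 * as observed earlier (for example, saved before fork()).
 * Returns 0 on success, -1 if prctl() failed or the parent is already gone.
 */
static int set_parent_death_signal(int sig, pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long)sig, 0, 0, 0) == -1)
        return -1;

    /*
     * Close the race: if the parent died before prctl() ran, we have
     * already been reparented and no signal will ever be delivered.
     */
    if (getppid() != expected_parent)
        return -1;

    return 0;
}

/* Read back the currently configured signal (0 if none), or -1 on error. */
static int get_parent_death_signal(void)
{
    int sig = 0;

    if (prctl(PR_GET_PDEATHSIG, (unsigned long)&sig, 0, 0, 0) == -1)
        return -1;
    return sig;
}
```

A child would typically call set_parent_death_signal(SIGKILL, ppid)
right after fork(), with ppid saved by the parent before forking, and
exit immediately if the call returns -1.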

I believe the current default limits allow using CLONE_NEWPID for every
process.  The data structures seem light enough as well.
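
A minimal sketch of the CLONE_NEWUSER|CLONE_NEWPID approach, assuming
unprivileged user namespaces are enabled on the system and the caller is
single-threaded.  The helper run_in_new_pidns() and the payload
child_is_init() are hypothetical names used only for illustration.  The
child becomes pid 1 of a fresh pid namespace; when that pid 1 exits, the
kernel sends SIGKILL to every process left in the namespace, which is
the inescapable behaviour mentioned earlier.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024)

/* Example payload: exits 0 only if it really is pid 1 of its namespace. */
static int child_is_init(void *arg)
{
    (void)arg;
    return getpid() == 1 ? 0 : 1;
}

/*
 * Hypothetical helper: run fn(arg) as init of a new pid namespace that
 * lives inside a new user namespace, wait for it, and store its wait
 * status in *wstatus.  Returns 0 on success, -1 on failure with errno
 * set (for example EPERM when user namespaces are not permitted).
 */
static int run_in_new_pidns(int (*fn)(void *), void *arg, int *wstatus)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on common architectures, so pass its top. */
    pid_t pid = clone(fn, stack + CHILD_STACK_SIZE,
                      CLONE_NEWUSER | CLONE_NEWPID | SIGCHLD, arg);
    if (pid == -1) {
        int saved = errno;
        free(stack);
        errno = saved;
        return -1;
    }

    pid_t waited;
    do {
        waited = waitpid(pid, wstatus, 0);
    } while (waited == -1 && errno == EINTR);

    /* No CLONE_VM: the child ran on its own copy, so this is safe. */
    free(stack);
    return waited == pid ? 0 : -1;
}
```

Calling run_in_new_pidns(child_is_init, NULL, &status) and checking
WIFEXITED(status) && WEXITSTATUS(status) == 0 confirms that the child
saw itself as pid 1.  A real sandbox would have the payload fork or exec
the helper and would also set up uid/gid maps, which this sketch omits.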

> What's actually the reason that CLONE_NEWPID requires CAP_SYS_ADMIN? 
> Does CLONE_NEWPID pose any risks that don't exist for
> CLONE_NEWUSER|CLONE_NEWPID?  Assuming we can't simply drop the
> CAP_SYS_ADMIN requirement, do you see a better solution for this use
> case?

CLONE_NEWPID without a permission check would allow running a setuid root
application in a pid namespace.  Off the top of my head I can't think of
a really good exploit.  But when you can mess up pid files and hide
information from a privileged application, I can completely imagine
forcing that application to misbehave in ways the attacker can control,
leading to bad things.

Eric

