Re: [RESEND PATCH] prctl: add PR_[GS]ET_PDEATHSIG_PROC

From: ebiederm@xmission.com (Eric W. Biederman)
To: "Jürg Billeter" <j@bitron.ch>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Oleg Nesterov <oleg@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	Filipe Brandenburger <filbranden@google.com>,
	David Wilcox <davidvsthegiant@gmail.com>,
	hansecke@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: [RESEND PATCH] prctl: add PR_[GS]ET_PDEATHSIG_PROC
Date: Tue, 03 Oct 2017 12:40:43 -0500
Message-ID: <8760bwb04k.fsf@xmission.com>
In-Reply-To: <1507050019.19102.51.camel@bitron.ch> ("Jürg Billeter"'s message of "Tue, 03 Oct 2017 19:00:19 +0200")

Jürg Billeter <j@bitron.ch> writes:

> On Tue, 2017-10-03 at 09:46 -0500, Eric W. Biederman wrote:
>> There is a general need to find out about the death of other processes,
>> if you are not the parent of the process.   I would be inclined to call
>> it waitfd.  Something that you give a pid.  It performs a permission
>> check and the pid becomes readable when the process dies.  With poll
>> working on the fd, and the fd returning wstatus of the dead child.
>> 
>> Support SIGIO on the fd and you have a signal delivery mechanism,
>> if you want it.
>
> File descriptors for processes (waitfd/clonefd) are definitely
> interesting.  Especially if reaping the process (and reparenting its
> children) is delayed until the last process file descriptor is closed. 
> However, this would be a much larger addition and also less intuitive
> to use if all you want is killing the process tree.
>
>> For the kill all children when the parent dies the mechanism you are
>> proposing is escapable.  We already have an inescapable version of it
>> with init in a pid namespace.  We already have an escapable version of
>> it with orphaned process groups and SIGHUP.
>> 
>> So I would really appreciate a very clear use case for what we are
>> building here.  As it appears the killing of children can already be
>> done another way, and that the waiting for the parent can be done better
>> another way.
>
> My use case is to provide a way for a process to spawn a child and
> ensure that no descendants survive when that child dies.  Avoiding
> runaway processes is desirable in many situations.  My motivation is
> very lightweight (nested) sandboxing (every process is potentially
> sandboxed).
>
> I.e., pid namespaces would be a pretty good fit (assuming they are
> sufficiently lightweight) but CLONE_NEWPID requires CAP_SYS_ADMIN. 
> User namespaces can help here, but creating tons of user namespaces
> just for this doesn't sound sensible. MAX_PID_NS_LEVEL could be an
> issue as well at some point but 32 levels are likely fine in practice.
>
> For my particular scenario I may actually be able to create a single
> user namespace, run all processes with (namespaced) CAP_SYS_ADMIN and
> use CLONE_NEWPID for every process.  However, I would prefer not
> requiring CAP_SYS_ADMIN and a regular application that wants to avoid
> runaway processes for a spawned helper process cannot rely on
> CAP_SYS_ADMIN.
>
> My plan was to use PR_SET_PDEATHSIG_PROC with PR_NO_NEW_PRIVS and a
> suitable seccomp filter to prevent changes to pdeath_signal_proc.  For
> my SIGKILL use case it would be even better to simply require
> PR_NO_NEW_PRIVS and make pdeath_signal_proc sticky, avoiding the need
> for seccomp.  I wanted to keep the differences to the existing
> PR_SET_PDEATHSIG minimal but if we argue that the non-SIGKILL use case
> is better solved with waitfd (or maybe the process events connector),
> we could tailor the prctl for the SIGKILL use case (or support both via
> prctl arg3).
>
> I have another small patch locally that adds a prctl that restricts
> kill(2) to direct children of the current thread group for lightweight
> sandboxing.  That would also be redundant if it was possible to use
> CLONE_NEWPID for every process.

I believe the current default limits allow using CLONE_NEWPID for every
process.  The data structures seem light enough as well.
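
To make the pid namespace approach concrete, here is a minimal sketch,
assuming unprivileged user namespaces are enabled on the running kernel.
It uses unshare(2) through ctypes rather than clone(2), and the function
name descendants_die_with_init is purely illustrative.  A helper creates
a user+pid namespace, the first child it forks becomes the namespace
init, and when that init exits the kernel kills the remaining process in
the namespace.

```python
import ctypes
import os
import select
import time

CLONE_NEWUSER = 0x10000000  # values from <linux/sched.h>
CLONE_NEWPID = 0x20000000

libc = ctypes.CDLL(None, use_errno=True)


def descendants_die_with_init(timeout=5.0):
    """Hypothetical helper: True if a grandchild died when its namespace
    init exited, False if it survived past the timeout, None if the
    namespaces could not be set up."""
    alive_r, alive_w = os.pipe()  # write end stays open while the grandchild lives
    helper = os.fork()
    if helper == 0:
        try:
            os.close(alive_r)
            # Creating the user namespace first grants the (namespaced)
            # CAP_SYS_ADMIN that CLONE_NEWPID requires.
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0:
                init = os.fork()  # first child is pid 1 in the new namespace
                if init == 0:
                    if os.fork() == 0:
                        time.sleep(60)  # grandchild: would outlive init if not killed
                    else:
                        os.write(alive_w, b"g")  # init: grandchild exists, now exit
                else:
                    os.close(alive_w)
                    os.waitpid(init, 0)
        finally:
            os._exit(0)

    os.close(alive_w)
    deadline = time.monotonic() + timeout
    data, eof = b"", False
    while not eof:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([alive_r], [], [], remaining)
        if ready:
            chunk = os.read(alive_r, 16)
            if chunk:
                data += chunk
            else:
                eof = True  # every holder of the write end is gone
    os.close(alive_r)
    os.waitpid(helper, 0)
    if not data:
        return None  # unshare or fork failed; nothing was demonstrated
    return eof
```

Calling descendants_die_with_init() returns None where unshare is not
permitted (for example under a restrictive seccomp policy), so the
sketch degrades quietly instead of failing.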

> What's actually the reason that CLONE_NEWPID requires CAP_SYS_ADMIN? 
> Does CLONE_NEWPID pose any risks that don't exist for
> CLONE_NEWUSER|CLONE_NEWPID?  Assuming we can't simply drop the
> CAP_SYS_ADMIN requirement, do you see a better solution for this use
> case?

CLONE_NEWPID without a permission check would allow running a setuid root
application in a pid namespace.  Off the top of my head I can't think of
a really good exploit.  But when you can mess up pid files and hide
information from a privileged application, I can completely imagine
forcing that application to misbehave in ways the attacker can control,
leading to bad things.
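
As a rough illustration of the pid file concern (a sketch only, not an
exploit, and again assuming unprivileged user namespaces are available),
the following compares the pid a process sees for itself inside a new
pid namespace with the pid its parent sees from outside.  The function
name pid_views is hypothetical.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # values from <linux/sched.h>
CLONE_NEWPID = 0x20000000

libc = ctypes.CDLL(None, use_errno=True)


def pid_views():
    """Hypothetical helper: return (pid seen inside the namespace,
    pid seen by the parent outside it), or None if setup failed."""
    r, w = os.pipe()
    helper = os.fork()
    if helper == 0:
        try:
            os.close(r)
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0:
                child = os.fork()
                if child == 0:
                    # What a daemon would record in its pid file.
                    os.write(w, b"%d " % os.getpid())
                else:
                    os.waitpid(child, 0)
                    # The helper itself stays in the old pid namespace, so
                    # this is the child's pid as seen from outside.
                    os.write(w, b"%d" % child)
        finally:
            os._exit(0)

    os.close(w)
    with os.fdopen(r, "rb") as f:
        fields = f.read().split()
    os.waitpid(helper, 0)
    if len(fields) != 2:
        return None
    return int(fields[0]), int(fields[1])
```

When the namespaces can be created, the first element is 1 while the
second is an ordinary pid, so a pid file written from inside names a
different process once it is read from outside the namespace.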

Eric
