All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
To: Roland McGrath <roland@redhat.com>
Cc: oleg@redhat.com, ebiederm@xmission.com, daniel@hozac.com,
	xemul@openvz.org, containers@lists.osdl.org,
	linux-kernel@vger.kernel.org, sukadev@us.ibm.com
Subject: Re: [RFC][PATCH 3/5] Determine if sender is from ancestor ns
Date: Mon, 8 Dec 2008 19:22:07 -0800	[thread overview]
Message-ID: <20081209032207.GA30016@us.ibm.com> (raw)
In-Reply-To: <20081204010636.A8465FC053@magilla.sf.frob.com>

Roland McGrath [roland@redhat.com] wrote:


Thanks for the review and your comments.
> 
> I don't quarrel with the intent, but I don't like this approach.
> I think we can do it more cleanly.  I don't have it all worked out,
> but a couple of thoughts come to mind.
> 
> Firstly, it's no problem to split SIGNAL_UNKILLABLE out into multiple
> flags.  It's quite noninvasive, only signal.c looks at those flags.  Then
> we can implement the signal magic cleanly keyed on a few separate flags,
> using different flag combinations for global init and container inits.

Ok. Sounds good. Let say for this discussion, container-inits set
SIGNAL_UNKILLABLE_CINIT, global inits continue to set SIGNAL_UNKILLABLE.

> 
> I don't mind a feature that lets an unprivileged container init magically
> ignore normal exit signals.  That just does something it could do itself
> with sigaction, except its /proc/pid/status "SigIgn:" line will lie.

Yes, but after Bastian mentioned it recently, I was wondering if we can
keep "SigIgn:" honest by tweaking procfs code - i.e have procfs check
the SIGNAL_UNKILLABLE flags and include SIG_DFL exit signals in the
SigIgn: line.

We could include blocked SIG_DFL signals in the SigIgn: line ? Hacky ?

> That's OK.  Key that on its own flag and set that in both inits.  Just
> check that flag in prepare_signal() and in get_signal_to_deliver().
> 
> It's a different story for the signals that cannot be caught, blocked, or
> ignored, SIGKILL and SIGSTOP.  At least when sent from outside the
> container, circumventing either of those would be a privilege escalation
> from what's always been understood by admins and so on.  

So if SIGKILL or SIGSTOP are sent from inside the container, we drop them
in prepare_signal() if SIGNAL_UNKILLABLE flags are set. If sent from outside
the container, we queue them and get_signal_to_deliver() delivers these
unless SIGNAL_UNKILLABLE is set (i.e the signals are delivered to cinits but
ignored for global init).

> 
> It's convenient for the implementation that we can treat these differently
> from other signals, precisely because they can never be caught, blocked, or
> ignored.  That is, when we decide in prepare_signal() to queue a SIGKILL or
> SIGSTOP, it can never turn out that we're later going to drop it because it
> went from caught or ignored or blocked to uncaught, unignored, and
> unblocked.  (It's only because of that possibility that there is any need
> to check for a suppressed exit signal in get_signal_to_deliver() rather
> than only in prepare_signal().)  That means that when the decision hinges
> on the namespace correlation of the sender and receiver, you can check that
> when it's handy (current vs p in prepare_signal) rather than trying to
> reconstruct the answer for a queued signal.

Yes. One thing I am not clear on yet is how we decide in prepare_signal()
if it is safe to check the sender's namespace (since caller of send_signal()
could be an interrupt handler or workqueue).

As I was discussing with Bastian Blank, SI_FROMUSER() is not sufficient
since SI_ASYNCIO appears as an user-space signal.

Now is SI_ASYNCIO really supposed to be a user signal ? Or like SI_MESGQ
and SI_POLL can it be a kernel signal ? If we fix SI_ASYNCIO, can we
safely use SI_FROMUSER() for this  or could there be other 'user' signals
from kernel ?

> 
> Finally, one more thought.  This may be moot for the problem at hand if you
> take the approach I just suggested, but probably should be fixed anyway.
> It seems to me that the si_pid and si_uid fields of siginfo_t ought to be
> translated to the namespaces of the receiver.  I think it makes most sense
> to do this on the front end, i.e. in the callers that fill in the siginfo_t
> in the first place (sys_kill et al, or maybe a few layers down?).
> Currently it's inconsistent, but mostly wrong.

Yes, that makes a lot of sense to push that initialization to the callers.
We have been going through and identifying the 'si_pid's that need to be fixed.
Nadia sent out one patch for ipc/mqueue.c

> do_notify_parent() and
> do_notify_parent_cldstop() use the receiver's namespace to compute si_pid,
> but the rest of the signal.c routines do not.  A free side effect of doing
> this is that si_pid for a sender whose PID is not visible to the receiver
> (i.e. outside its container) would be distinctively 0 or -1 or something.
> (-1 might be the best choice, since si_[pu]id=0 already arises now in case
> of signal queue exhaustion and the like.)  Hence, possibly one could simply
> use si_pid>0 as a "sent from inside the container" check on a queued siginfo_t.

Yes, its probably moot with the other approach, but the only drawback with
relying on the "->si_pid > 0" check in get_signal_to_deliver() was that
allocation of siginfo_t could have failed. In which case it would be
unclear whether signal was from parent or same namespace.

But if we fix si_pid problems and correctly pass in siginfo_t, can we
use "si_pid > 0" check in prepare_signal() to decide whether or not
to queue the signal ?

Sukadev

  reply	other threads:[~2008-12-09  3:22 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-11-26  3:42 [RFC][PATCH 0/5] Container init signal semantics Sukadev Bhattiprolu
2008-11-26  3:44 ` [RFC][PATCH 1/5] pid: Implement ns_of_pid Sukadev Bhattiprolu
2008-11-26  3:44   ` Sukadev Bhattiprolu
2008-11-27  1:19   ` Bastian Blank
2008-12-01 20:24     ` Sukadev Bhattiprolu
2008-12-02 11:58       ` Bastian Blank
2008-12-02 22:12         ` Sukadev Bhattiprolu
2008-12-03  0:34         ` Valdis.Kletnieks
2008-11-26  3:45 ` [RFC][PATCH 2/5] pid: Generalize task_active_pid_ns Sukadev Bhattiprolu
2008-11-26  3:45   ` Sukadev Bhattiprolu
2008-11-27  1:17   ` Bastian Blank
2008-11-27 21:19     ` Greg Kurz
2008-12-01 21:15       ` Sukadev Bhattiprolu
2008-12-02 11:57         ` Bastian Blank
2008-12-03  7:41           ` Sukadev Bhattiprolu
2008-12-03  7:41             ` Sukadev Bhattiprolu
2008-12-04 12:58             ` Bastian Blank
2008-11-27 13:09   ` Nadia Derbey
2008-12-01 20:38     ` Sukadev Bhattiprolu
2008-11-26  3:46 ` [RFC][PATCH 3/5] Determine if sender is from ancestor ns Sukadev Bhattiprolu
2008-11-26  3:46   ` Sukadev Bhattiprolu
     [not found]   ` <20081126034611.GC23238-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-11-27  1:01     ` Bastian Blank
2008-11-27  1:01       ` Bastian Blank
2008-12-01 20:15       ` Sukadev Bhattiprolu
2008-12-02 11:48         ` Bastian Blank
2008-12-02 19:59           ` Sukadev Bhattiprolu
2008-12-04 12:45             ` [RFC][PATCH 3/5] Determine if sender is from ancestor ns+ Bastian Blank
2008-12-04  1:06     ` [RFC][PATCH 3/5] Determine if sender is from ancestor ns Roland McGrath
2008-12-04  1:06       ` Roland McGrath
2008-12-09  3:22       ` Sukadev Bhattiprolu [this message]
2008-12-02  3:07   ` Roland McGrath
2008-11-26  3:46 ` [RFC][PATCH 4/5] Protect cinit from fatal signals Sukadev Bhattiprolu
2008-11-26  3:46   ` Sukadev Bhattiprolu
2008-11-27  1:07   ` Bastian Blank
2008-12-01 20:21     ` Sukadev Bhattiprolu
2008-12-02 12:06       ` Bastian Blank
2008-12-02 20:51         ` Sukadev Bhattiprolu
2008-12-04 12:52           ` Bastian Blank
2008-12-04 18:58             ` Sukadev Bhattiprolu
2008-11-26  3:46 ` [RFC][PATCH 5/5] Clear si_pid for signal from ancestor ns Sukadev Bhattiprolu
2008-11-26  3:46   ` Sukadev Bhattiprolu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20081209032207.GA30016@us.ibm.com \
    --to=sukadev@linux.vnet.ibm.com \
    --cc=containers@lists.osdl.org \
    --cc=daniel@hozac.com \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=roland@redhat.com \
    --cc=sukadev@us.ibm.com \
    --cc=xemul@openvz.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.