From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>,
	roland@redhat.com, "Eric W. Biederman" <ebiederm@xmission.com>,
	daniel@hozac.com, Containers <containers@lists.osdl.org>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 0/7][v8] Container-init signal semantics
Date: Wed, 18 Feb 2009 19:02:07 -0800
Message-ID: <20090219030207.GA18783@us.ibm.com>


Patch 5/7 is new in this set and fixes a bug. The remaining patches are
just forward-ports from the previous version, and I believe they address
all the comments I have received.

Oleg, please sign off/ack if you agree.

---

Container-init must behave like global-init to processes within the
container, and hence it must be immune to unhandled fatal signals from
within the container (i.e. SIG_DFL signals that terminate the process).

But the same container-init must behave like a normal process to
processes in ancestor namespaces, so if it receives the same fatal
signal from a process in an ancestor namespace, the signal must be
processed.

Implementing these semantics requires that send_signal() determine the
pid namespace of the sender, but since signals can originate from
workqueues or interrupt handlers, determining the pid namespace of the
sender may not always be possible or safe.

This patchset implements the design/simplified semantics suggested by
Oleg Nesterov.  The simplified semantics for container-init are:

	- container-init must never be terminated by a signal from a
	  descendant process.

	- container-init must never be immune to SIGKILL from an ancestor
	  namespace (so a process in the parent namespace must always be
	  able to terminate a descendant container).

	- container-init may be immune to unhandled fatal signals (like
	  SIGUSR1) even if they are from an ancestor namespace. SIGKILL and
	  SIGSTOP are the only reliable signals to a container-init from an
	  ancestor namespace.
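
The three rules above can be read as a small decision table. The
following is only a userspace model of that table, not code from the
patches: the type and function names (sig_outcome, sig_event,
cinit_signal_outcome) are made up for illustration, and the
"may be immune" case is modelled as "dropped". Dropping SIGSTOP from
within the container follows the v3 changelog entry further down.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome type for this model; not a kernel type. */
enum sig_outcome {
	SIG_OUTCOME_DELIVER,	/* signal is processed normally */
	SIG_OUTCOME_DROP	/* signal is discarded */
};

/* Hypothetical description of one signal being sent. */
struct sig_event {
	int  signo;		/* e.g. SIGKILL, SIGUSR1 */
	bool target_is_cinit;	/* receiver is a container-init */
	bool from_ancestor_ns;	/* sender lives in an ancestor pid namespace */
	bool has_handler;	/* receiver installed a handler for signo */
};

/*
 * Model of the simplified semantics. Only signals whose default action
 * is to terminate (plus SIGKILL/SIGSTOP) are considered here.
 */
static enum sig_outcome cinit_signal_outcome(const struct sig_event *ev)
{
	assert(ev != NULL);

	/* Ordinary processes are outside the scope of these rules. */
	if (!ev->target_is_cinit)
		return SIG_OUTCOME_DELIVER;

	/* SIGKILL/SIGSTOP: reliable from an ancestor, dropped otherwise. */
	if (ev->signo == SIGKILL || ev->signo == SIGSTOP)
		return ev->from_ancestor_ns ? SIG_OUTCOME_DELIVER
					    : SIG_OUTCOME_DROP;

	/* A handled signal runs the handler; it cannot terminate cinit. */
	if (ev->has_handler)
		return SIG_OUTCOME_DELIVER;

	/* Unhandled fatal signal: cinit may be immune, whatever the sender. */
	return SIG_OUTCOME_DROP;
}
```

In this model, SIGUSR1 without a handler is dropped regardless of the
sender, while SIGKILL is delivered only when from_ancestor_ns is true.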

Patches in this set:

	[PATCH 1/7] Remove 'handler' parameter to tracehook functions
	[PATCH 2/7] Protect init from unwanted signals more
	[PATCH 3/7] Add from_ancestor_ns parameter to send_signal()
	[PATCH 4/7] Protect cinit from unblocked SIG_DFL signals
	[PATCH 5/7] zap_pid_ns_process() should use force_sig()
	[PATCH 6/7] Protect cinit from blocked fatal signals
	[PATCH 7/7] SI_USER: Masquerade si_pid when crossing pid ns boundary

Changelog[v8]:

	- Bugfix (new patch, 5/7): Nested container-init not terminated when
	  parent container-init exits and calls zap_pid_ns_processes().
	- Dropped old patch 7/7 which showed SIG_DFL signals to init as
	  "ignored" in /proc (we were undecided on whether it's good or bad).

Changelog[v7]:
	- siginfo_from_user() and siginfo_from_ancestor_ns() are fairly simple
	  and used only in send_signal(). Remove them and move the logic into
	  send_signal() (Patch 4/7)

	- Update /proc/pid/status to include SIG_DFL signals to init in the
	  "ignored" set (and remove the TODO in Patch 0/7) (Patch 7/7)

Changelog[v6]:

	- Patches 3,4: Have kill_pid_info_as_uid() pass in 'from_ancestor_ns'
	  parameter to __send_signal() and remove SI_ASYNCIO check in
	  siginfo_from_user().
	- Patches 4,6: Update changelog and simplify code

Changelog[v5]:
	- Patch 2/6: Remove SIG_IGN check in sig_task_ignored() and let
	  sig_handler_ignored() check SIG_IGN.
	- Patch 3/6: Put siginfo_from_ancestor_ns() back under CONFIG_PID_NS
	  and remove warning in rt_sigqueueinfo().
	- Patch 5/6: Simplify check in get_signal_to_deliver()
	- Patch 6/6: Simplify masquerading pid
	- LTP-20081219-intermediate showed no new errors on 2.6.28-rc5-mm2.

Changelog[v4]:
	- [Bugfix] Patch 3/7: Check ns == NULL in siginfo_from_ancestor_ns().
	  Although http://lkml.org/lkml/2008/12/16/502 makes it less likely
	  that ns == NULL, it looks like an explicit check won't hurt.
	- Remove SIGNAL_UNKILLABLE_FROM_NS flag and simplify logic as
	  suggested by Oleg Nesterov.
	- Dropped patch that set SIGNAL_UNKILLABLE_FROM_NS and set
	  SIGNAL_UNKILLABLE in patch 5/7 to be bisect-safe.
	- Add a warning in rt_sigqueueinfo() if SI_ASYNCIO is used
	  (patch 3/7)
	- Added two patches (6/7 and 7/7) to masquerade si_pid for
	  SI_USER and SI_TKILL


Changelog[v3]:
	Changes based on discussions of previous version:
		http://lkml.org/lkml/2008/11/25/458

	Major changes:

	- Define SIGNAL_UNKILLABLE_FROM_NS and use in container-inits to
	  skip fatal signals from same namespace but process SIGKILL/SIGSTOP
	  from ancestor namespace.
	- Use SI_FROMUSER() and si_code != SI_ASYNCIO to determine if
	  it is safe to dereference pid-namespace of caller. Highly
	  experimental :-)
	- Masquerading si_pid when crossing namespace boundary: relevant
	  patches merged in -mm and dropped from this set.

	Minor changes:

	- Remove 'handler' parameter to tracehook functions
	- Update sig_ignored() to drop SIG_DFL signals to global init early
	  (tried to address Roland's and Oleg's comments)
	- Use 'same_ns' flag to drop SIGKILL/SIGSTOP to cinit from same
	  namespace


Limitations/side-effects of current design:

	- Container-init is immune to suicide - kill(getpid(), SIGKILL) is
	  ignored. Use exit() :-)
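
As an illustration of what this means for a userspace init, here is a
minimal sketch of a container-init that shuts down on request. It
assumes (this is not spelled out in the patches) that SIGTERM is the
chosen shutdown signal; because only unhandled signals are dropped, the
sketch installs a handler and then leaves via exit() rather than by
signalling itself. All function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Set from the handler, polled from the main loop. */
static volatile sig_atomic_t shutdown_requested;

static void on_sigterm(int signo)
{
	(void)signo;
	shutdown_requested = 1;
}

/* Install a SIGTERM handler so the signal is no longer "unhandled". */
static int install_shutdown_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigterm;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGTERM, &sa, NULL);
}

static int shutdown_pending(void)
{
	return shutdown_requested != 0;
}

/* Terminate with exit(); kill(getpid(), SIGKILL) would be ignored. */
static void exit_if_shutdown_pending(void)
{
	if (shutdown_pending())
		exit(EXIT_SUCCESS);
}
```

An init's main loop would call install_shutdown_handler() once at
startup and exit_if_shutdown_pending() on each iteration.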

Thread overview: 29+ messages
2009-02-19  3:02 Sukadev Bhattiprolu [this message]
2009-02-19  3:05 ` [PATCH 1/7][v8] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
2009-02-19  3:05 ` [PATCH 2/7][v8] Protect init from unwanted signals more Sukadev Bhattiprolu
2009-02-19  3:06 ` [PATCH 3/7][v8] Add from_ancestor_ns parameter to send_signal() Sukadev Bhattiprolu
2009-02-19  3:06 ` [PATCH 4/7][v8] Protect cinit from unblocked SIG_DFL signals Sukadev Bhattiprolu
2009-02-19  3:07 ` [PATCH 5/7][v8] zap_pid_ns_process() should use force_sig() Sukadev Bhattiprolu
2009-02-19 18:59   ` Oleg Nesterov
2009-02-19 20:26     ` Sukadev Bhattiprolu
2009-02-19  3:07 ` [PATCH 6/7][v8] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
2009-02-19  3:07 ` [PATCH 7/7][v8] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
2009-02-19 16:11   ` Eric W. Biederman
2009-02-19 18:51     ` Oleg Nesterov
2009-02-19 22:18       ` Eric W. Biederman
2009-02-19 22:31         ` Oleg Nesterov
2009-02-19 23:21           ` Eric W. Biederman
2009-02-19 23:51             ` Roland McGrath
2009-02-20  0:35               ` Eric W. Biederman
2009-02-20  1:06                 ` Roland McGrath
2009-02-20  2:12                   ` Eric W. Biederman
2009-02-20  3:10                     ` Roland McGrath
2009-02-20  4:05                       ` Eric W. Biederman
2009-02-20  0:28             ` Oleg Nesterov
2009-02-20  1:16               ` Eric W. Biederman
2009-02-19 14:59 ` [PATCH 0/7][v8] Container-init signal semantics Daniel Lezcano
2009-03-07 19:04   ` Sukadev Bhattiprolu
2009-03-07 19:43     ` Daniel Lezcano
2009-03-07 19:51       ` Greg Kurz
2009-03-07 19:59         ` Daniel Lezcano
2009-02-19 20:53 ` Oleg Nesterov
