All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
To: Andrew Morton <akpm@osdl.org>
Cc: Oleg Nesterov <oleg@tv-sign.ru>,
	roland@redhat.com, "Eric W. Biederman" <ebiederm@xmission.com>,
	daniel@hozac.com, Containers <containers@lists.osdl.org>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 0/7][v8] Container-init signal semantics
Date: Wed, 18 Feb 2009 19:02:07 -0800	[thread overview]
Message-ID: <20090219030207.GA18783@us.ibm.com> (raw)


Patch 5/7 is new in this set and fixes a bug. Remaining patches are
just a forward-port from previous version and I believe they address
all comments I have received.

Oleg please sign-off/ack if you agree.

---

Container-init must behave like global-init to processes within the
container and hence it must be immune to unhandled fatal signals from
within the container (i.e SIG_DFL signals that terminate the process).

But the same container-init must behave like a normal process to 
processes in ancestor namespaces and so if it receives the same fatal
signal from a process in ancestor namespace, the signal must be
processed.

Implementing these semantics requires that send_signal() determine pid
namespace of the sender but since signals can originate from workqueues/
interrupt-handlers, determining pid namespace of sender may not always
be possible or safe.

This patchset implements the design/simplified semantics suggested by
Oleg Nesterov.  The simplified semantics for container-init are:

	- container-init must never be terminated by a signal from a
	  descendant process.

	- container-init must never be immune to SIGKILL from an ancestor
	  namespace (so a process in parent namespace must always be able
	  to terminate a descendant container).

	- container-init may be immune to unhandled fatal signals (like
	  SIGUSR1) even if they are from ancestor namespace. SIGKILL/SIGSTOP
	  are the only reliable signals to a container-init from ancestor
	  namespace.

Patches in this set:

	[PATCH 1/7] Remove 'handler' parameter to tracehook functions
	[PATCH 2/7] Protect init from unwanted signals more
	[PATCH 3/7] Add from_ancestor_ns parameter to send_signal()
	[PATCH 4/7] Protect cinit from unblocked SIG_DFL signals
	[PATCH 5/7] zap_pid_ns_process() should use force_sig()
	[PATCH 6/7] Protect cinit from blocked fatal signals
	[PATCH 7/7] SI_USER: Masquerade si_pid when crossing pid ns boundary

Changelog[v8]:

	- Bugfix (new patch, 5/7): Nested container-init not terminated when
	  parent container-init exits and calls zap_pid_ns_processes().
	- Dropped old patch 7/7 which showed SIG_DFL signals to init as
	  "ignored" in /proc (we were undecided on whether its good or bad).

Changelog[v7]:
	- siginfo_from_user() and siginfo_from_ancestor_ns() are fairly simple
	  and used only in send_signal(). Remove them and move the logic into
	  send_signal() (Patch 4/7)

	- Update /proc/pid/status to include SIG_DFL signals to init in the
	  "ignored" set (and remove the TODO in Patch 0/7) (Patch 7/7)

Changelog[v6]:

	- Patches 3,4: Have kill_pid_info_as_uid() pass in 'from_ancestor_ns'
	  parameter to __send_signal() and remove SI_ASYNCIO check in
	  siginfo_from_user().
	- Patches 4,6: Update changelog and simplify code

Changelog[v5]:
	- Patch 2/6: Remove SIG_IGN check in sig_task_ignored() and let
	  sig_handler_ignored() check SIG_IGN.
        - Patch 3/6. Put siginfo_from_ancestor_ns() back under CONFIG_PID_NS
	  and remove warning in rt_sigqueueinfo().
	- (Patch 5/6)Simplify check in get_signal_to_deliver()
	- (Patch 6/6)Simplify masquerading pid
	- LTP-20081219-intermediate showed no new errors on 2.6.28-rc5-mm2.

Changelog[v4]:
	- [Bugfix] Patch 3/7. Check ns == NULL in siginfo_from_ancestor_ns().
	  Although http://lkml.org/lkml/2008/12/16/502 makes it less likely
	  that ns == NULL, looks like an explicit check won't hurt ?
	- Remove SIGNAL_UNKILLABLE_FROM_NS flag and simplify logic as
	  suggested by Oleg Nesterov.
	- Dropped patch that set SIGNAL_UNKILLABLE_FROM_NS and set
	  SIGNAL_UNKILLABLE in patch 5/7 to be bisect-safe.
	- Add a warning in rt_sigqueueinfo() if SI_ASYNCIO is used
	  (patch 3/7)
	- Added two patches (6/7 and 7/7) to masquerade si_pid for
	  SI_USER and SI_TKILL


Changelog[v3]:
	Changes based on discussions of previous version:
		http://lkml.org/lkml/2008/11/25/458

	Major changes:

	- Define SIGNAL_UNKILLABLE_FROM_NS and use in container-inits to
	  skip fatal signals from same namespace but process SIGKILL/SIGSTOP
	  from ancestor namespace.
	- Use SI_FROMUSER() and si_code != SI_ASYNCIO to determine if
	  it is safe to dereference pid-namespace of caller. Highly
	  experimental :-)
	- Masquerading si_pid when crossing namespace boundary: relevant
	  patches merged in -mm and dropped from this set.

	Minor changes:

	- Remove 'handler' parameter to tracehook functions
	- Update sig_ignored() to drop SIG_DFL signals to global init early
	  (tried to address Roland's  and Oleg's comments)
	- Use 'same_ns' flag to drop SIGKILL/SIGSTOP to cinit from same
	  namespace


Limitations/side-effects of current design

	- Container-init is immune to suicide - kill(getpid(), SIGKILL) is
	  ignored. Use exit() :-)

             reply	other threads:[~2009-02-19  3:02 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-19  3:02 Sukadev Bhattiprolu [this message]
2009-02-19  3:05 ` [PATCH 1/7][v8] Remove 'handler' parameter to tracehook functions Sukadev Bhattiprolu
2009-02-19  3:05 ` [PATCH 2/7][v8] Protect init from unwanted signals more Sukadev Bhattiprolu
2009-02-19  3:06 ` [PATCH 3/7][v8] Add from_ancestor_ns parameter to send_signal() Sukadev Bhattiprolu
2009-02-19  3:06 ` [PATCH 4/7][v8] Protect cinit from unblocked SIG_DFL signals Sukadev Bhattiprolu
2009-02-19  3:07 ` [PATCH 7/7][v8] SI_USER: Masquerade si_pid when crossing pid ns boundary Sukadev Bhattiprolu
2009-02-19 16:11   ` Eric W. Biederman
     [not found]     ` <m1y6w21k6d.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-02-19 18:51       ` Oleg Nesterov
2009-02-19 18:51         ` Oleg Nesterov
2009-02-19 22:18         ` Eric W. Biederman
     [not found]           ` <m1fxiayss9.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-02-19 22:31             ` Oleg Nesterov
2009-02-19 22:31               ` Oleg Nesterov
2009-02-19 23:21               ` Eric W. Biederman
2009-02-19 23:51                 ` Roland McGrath
2009-02-19 23:51                   ` Roland McGrath
2009-02-20  0:35                   ` Eric W. Biederman
     [not found]                     ` <m1bpsyt05t.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-02-20  1:06                       ` Roland McGrath
2009-02-20  1:06                         ` Roland McGrath
2009-02-20  2:12                         ` Eric W. Biederman
2009-02-20  3:10                           ` Roland McGrath
2009-02-20  3:10                             ` Roland McGrath
2009-02-20  4:05                             ` Eric W. Biederman
     [not found]                 ` <m1fxiaxbb5.fsf-+imSwln9KH6u2/kzUuoCbdi2O/JbrIOy@public.gmane.org>
2009-02-20  0:28                   ` Oleg Nesterov
2009-02-20  0:28                     ` Oleg Nesterov
2009-02-20  1:16                     ` Eric W. Biederman
2009-02-19 14:59 ` [PATCH 0/7][v8] Container-init signal semantics Daniel Lezcano
2009-03-07 19:04   ` Sukadev Bhattiprolu
2009-03-07 19:43     ` Daniel Lezcano
2009-03-07 19:51       ` Greg Kurz
2009-03-07 19:59         ` Daniel Lezcano
     [not found] ` <20090219030207.GA18783-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-02-19  3:07   ` [PATCH 5/7][v8] zap_pid_ns_process() should use force_sig() Sukadev Bhattiprolu
2009-02-19  3:07     ` Sukadev Bhattiprolu
     [not found]     ` <20090219030704.GE18990-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-02-19 18:59       ` Oleg Nesterov
2009-02-19 18:59         ` Oleg Nesterov
2009-02-19 20:26         ` Sukadev Bhattiprolu
2009-02-19  3:07   ` [PATCH 6/7][v8] Protect cinit from blocked fatal signals Sukadev Bhattiprolu
2009-02-19  3:07     ` Sukadev Bhattiprolu
2009-02-19 20:53   ` [PATCH 0/7][v8] Container-init signal semantics Oleg Nesterov
2009-02-19 20:53     ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090219030207.GA18783@us.ibm.com \
    --to=sukadev@linux.vnet.ibm.com \
    --cc=akpm@osdl.org \
    --cc=containers@lists.osdl.org \
    --cc=daniel@hozac.com \
    --cc=ebiederm@xmission.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@tv-sign.ru \
    --cc=roland@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.