All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Matt Brown <matt@nmatt.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	jmorris@namei.org, gregkh@linuxfoundation.org, jslaby@suse.com,
	akpm@linux-foundation.org, jannh@google.com,
	keescook@chromium.org, kernel-hardening@lists.openwall.com,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [kernel-hardening] Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN
Date: Sat, 22 Apr 2017 14:50:34 -0500	[thread overview]
Message-ID: <20170422195034.GA17556@mail.hallyn.com> (raw)
In-Reply-To: <c1977923-f68d-6ea9-2fd1-d2939848eec6@nmatt.com>

Quoting Matt Brown (matt@nmatt.com):
> On 04/21/2017 01:24 AM, Serge E. Hallyn wrote:
> >On Fri, Apr 21, 2017 at 01:09:59AM -0400, Matt Brown wrote:
> >>On 04/20/2017 01:41 PM, Serge E. Hallyn wrote:
> >>>Quoting matt@nmatt.com (matt@nmatt.com):
> >>>>On 2017-04-20 11:19, Serge E. Hallyn wrote:
> >>>>>Quoting Matt Brown (matt@nmatt.com):
> >>>>>>On 04/19/2017 07:53 PM, Serge E. Hallyn wrote:
> >>>>>>>Quoting Matt Brown (matt@nmatt.com):
> >>>>>>>>On 04/19/2017 12:58 AM, Serge E. Hallyn wrote:
> >>>>>>>>>On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
> >>>>>>>>>>This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
> >>>>>>>>>>project in-kernel.
> >>>>>>>>>>
> >>>>>>>>>>This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
> >>>>>>>>>>sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
> >>>>>>>>>>ioctl calls from non CAP_SYS_ADMIN users.
> >>>>>>>>>>
> >>>>>>>>>>Possible effects on userland:
> >>>>>>>>>>
> >>>>>>>>>>There could be a few user programs that would be effected by this
> >>>>>>>>>>change.
> >>>>>>>>>>See: <https://codesearch.debian.net/search?q=ioctl%5C%28.*TIOCSTI>
> >>>>>>>>>>notable programs are: agetty, csh, xemacs and tcsh
> >>>>>>>>>>
> >>>>>>>>>>However, I still believe that this change is worth it given that the
> >>>>>>>>>>Kconfig defaults to n. This will be a feature that is turned on for the
> >>>>>>>>>
> >>>>>>>>>It's not worthless, but note that for instance before this was fixed
> >>>>>>>>>in lxc, this patch would not have helped with escapes from privileged
> >>>>>>>>>containers.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>I assume you are talking about this CVE:
> >>>>>>>>https://bugzilla.redhat.com/show_bug.cgi?id=1411256
> >>>>>>>>
> >>>>>>>>In retrospect, is there any way that an escape from a privileged
> >>>>>>>>container with the this bug could have been prevented?
> >>>>>>>
> >>>>>>>I don't know, that's what I was probing for.  Detecting that the pgrp
> >>>>>>>or session - heck, the pid namespace - has changed would seem like a
> >>>>>>>good indicator that it shouldn't be able to push.
> >>>>>>>
> >>>>>>
> >>>>>>pgrp and session won't do because in the case we are discussing
> >>>>>>current->signal->tty is the same as tty.
> >>>>>>
> >>>>>>This is the current check that is already in place:
> >>>>>>| if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
> >>>>>>| 	return -EPERM;
> >>>>>
> >>>>>Yeah...
> >>>>>
> >>>>>>The only thing I could find to detect the tty message coming from a
> >>>>>>container is as follows:
> >>>>>>| task_active_pid_ns(current)->level
> >>>>>>
> >>>>>>This will be zero when run on the host, but 1 when run inside a
> >>>>>>container. However this is very much a hack and could probably break
> >>>>>>some userland stuff where there are multiple levels of namespaces.
> >>>>>
> >>>>>Yes.  This is also however why I don't like the current patch, because
> >>>>>capable() will never be true in a container, so nested containers
> >>>>>break.
> >>>>>
> >>>>
> >>>>What do you mean by "capable() will never be true in a container"?
> >>>>My understanding
> >>>>is that if a container is given CAP_SYS_ADMIN then
> >>>>capable(CAP_SYS_ADMIN) will return
> >>>>true?
> >>>
> >>>No, capable(X) checks for X with respect to the initial user namespace.
> >>>So for root-owned containers it will be true, but containers running in
> >>>non-initial user namespaces cannot pass that check.
> >>>
> >>>To check for privilege with respect to another user namespace, you need
> >>>to use ns_capable.  But for that you need a user_ns to target.
> >>>
> >>
> >>How about: ns_capable(current_user_ns(),CAP_SYS_ADMIN) ?
> >>
> >>current_user_ns() was found in include/linux/cred.h
> >
> >Any user can create a new user namespace and pass the above check.  What we
> >want is to find the user namespace which opened the tty.
> >
> 
> I believe I have a working solution that I can show in the next version
> of the patch later today, but I just want to run the logic by you first.
> 
> I added: "struct user_namespace *owner_user_ns;" as a field in
> tty_struct (include/linux/tty.h) Note: I am totally open to suggestions
> for a better name.
> 
> Then I added "tty->owner_user_ns = current_user_ns();" to the
> alloc_tty_struct function. (drivers/tty/tty_io.c)

That's what I was hoping could work.  Then you can check ns_capable
with respect to that.

You'll want to grab a reference to the user_ns, and drop it on
final close, but otherwise this sounds good to me.  I don't really
know the tty layer well though so we'll need some sanity checking
from someone who does.

> When testing with a docker container, running in a different user
> namespace, I printed out current_user_ns()->level, which returned 1,
> and tty->owner_user_ns->level, which returned 0. This seems to prove
> that I am correctly storing the user namespace which opened the tty.
> 
> Please let me know if there are any edge cases that I am missing with
> this approach.

Thanks for posting this!  This seems like the best solution to me.

WARNING: multiple messages have this Message-ID (diff)
From: serge@hallyn.com (Serge E. Hallyn)
To: linux-security-module@vger.kernel.org
Subject: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN
Date: Sat, 22 Apr 2017 14:50:34 -0500	[thread overview]
Message-ID: <20170422195034.GA17556@mail.hallyn.com> (raw)
In-Reply-To: <c1977923-f68d-6ea9-2fd1-d2939848eec6@nmatt.com>

Quoting Matt Brown (matt at nmatt.com):
> On 04/21/2017 01:24 AM, Serge E. Hallyn wrote:
> >On Fri, Apr 21, 2017 at 01:09:59AM -0400, Matt Brown wrote:
> >>On 04/20/2017 01:41 PM, Serge E. Hallyn wrote:
> >>>Quoting matt at nmatt.com (matt at nmatt.com):
> >>>>On 2017-04-20 11:19, Serge E. Hallyn wrote:
> >>>>>Quoting Matt Brown (matt at nmatt.com):
> >>>>>>On 04/19/2017 07:53 PM, Serge E. Hallyn wrote:
> >>>>>>>Quoting Matt Brown (matt at nmatt.com):
> >>>>>>>>On 04/19/2017 12:58 AM, Serge E. Hallyn wrote:
> >>>>>>>>>On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
> >>>>>>>>>>This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
> >>>>>>>>>>project in-kernel.
> >>>>>>>>>>
> >>>>>>>>>>This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
> >>>>>>>>>>sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
> >>>>>>>>>>ioctl calls from non CAP_SYS_ADMIN users.
> >>>>>>>>>>
> >>>>>>>>>>Possible effects on userland:
> >>>>>>>>>>
> >>>>>>>>>>There could be a few user programs that would be effected by this
> >>>>>>>>>>change.
> >>>>>>>>>>See: <https://codesearch.debian.net/search?q=ioctl%5C%28.*TIOCSTI>
> >>>>>>>>>>notable programs are: agetty, csh, xemacs and tcsh
> >>>>>>>>>>
> >>>>>>>>>>However, I still believe that this change is worth it given that the
> >>>>>>>>>>Kconfig defaults to n. This will be a feature that is turned on for the
> >>>>>>>>>
> >>>>>>>>>It's not worthless, but note that for instance before this was fixed
> >>>>>>>>>in lxc, this patch would not have helped with escapes from privileged
> >>>>>>>>>containers.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>I assume you are talking about this CVE:
> >>>>>>>>https://bugzilla.redhat.com/show_bug.cgi?id=1411256
> >>>>>>>>
> >>>>>>>>In retrospect, is there any way that an escape from a privileged
> >>>>>>>>container with the this bug could have been prevented?
> >>>>>>>
> >>>>>>>I don't know, that's what I was probing for.  Detecting that the pgrp
> >>>>>>>or session - heck, the pid namespace - has changed would seem like a
> >>>>>>>good indicator that it shouldn't be able to push.
> >>>>>>>
> >>>>>>
> >>>>>>pgrp and session won't do because in the case we are discussing
> >>>>>>current->signal->tty is the same as tty.
> >>>>>>
> >>>>>>This is the current check that is already in place:
> >>>>>>| if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
> >>>>>>| 	return -EPERM;
> >>>>>
> >>>>>Yeah...
> >>>>>
> >>>>>>The only thing I could find to detect the tty message coming from a
> >>>>>>container is as follows:
> >>>>>>| task_active_pid_ns(current)->level
> >>>>>>
> >>>>>>This will be zero when run on the host, but 1 when run inside a
> >>>>>>container. However this is very much a hack and could probably break
> >>>>>>some userland stuff where there are multiple levels of namespaces.
> >>>>>
> >>>>>Yes.  This is also however why I don't like the current patch, because
> >>>>>capable() will never be true in a container, so nested containers
> >>>>>break.
> >>>>>
> >>>>
> >>>>What do you mean by "capable() will never be true in a container"?
> >>>>My understanding
> >>>>is that if a container is given CAP_SYS_ADMIN then
> >>>>capable(CAP_SYS_ADMIN) will return
> >>>>true?
> >>>
> >>>No, capable(X) checks for X with respect to the initial user namespace.
> >>>So for root-owned containers it will be true, but containers running in
> >>>non-initial user namespaces cannot pass that check.
> >>>
> >>>To check for privilege with respect to another user namespace, you need
> >>>to use ns_capable.  But for that you need a user_ns to target.
> >>>
> >>
> >>How about: ns_capable(current_user_ns(),CAP_SYS_ADMIN) ?
> >>
> >>current_user_ns() was found in include/linux/cred.h
> >
> >Any user can create a new user namespace and pass the above check.  What we
> >want is to find the user namespace which opened the tty.
> >
> 
> I believe I have a working solution that I can show in the next version
> of the patch later today, but I just want to run the logic by you first.
> 
> I added: "struct user_namespace *owner_user_ns;" as a field in
> tty_struct (include/linux/tty.h) Note: I am totally open to suggestions
> for a better name.
> 
> Then I added "tty->owner_user_ns = current_user_ns();" to the
> alloc_tty_struct function. (drivers/tty/tty_io.c)

That's what I was hoping could work.  Then you can check ns_capable
with respect to that.

You'll want to grab a reference to the user_ns, and drop it on
final close, but otherwise this sounds good to me.  I don't really
know the tty layer well though so we'll need some sanity checking
from someone who does.

> When testing with a docker container, running in a different user
> namespace, I printed out current_user_ns()->level, which returned 1,
> and tty->owner_user_ns->level, which returned 0. This seems to prove
> that I am correctly storing the user namespace which opened the tty.
> 
> Please let me know if there are any edge cases that I am missing with
> this approach.

Thanks for posting this!  This seems like the best solution to me.
--
To unsubscribe from this list: send the line "unsubscribe linux-security-module" in
the body of a message to majordomo at vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: "Serge E. Hallyn" <serge@hallyn.com>
To: Matt Brown <matt@nmatt.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
	jmorris@namei.org, gregkh@linuxfoundation.org, jslaby@suse.com,
	akpm@linux-foundation.org, jannh@google.com,
	keescook@chromium.org, kernel-hardening@lists.openwall.com,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN
Date: Sat, 22 Apr 2017 14:50:34 -0500	[thread overview]
Message-ID: <20170422195034.GA17556@mail.hallyn.com> (raw)
In-Reply-To: <c1977923-f68d-6ea9-2fd1-d2939848eec6@nmatt.com>

Quoting Matt Brown (matt@nmatt.com):
> On 04/21/2017 01:24 AM, Serge E. Hallyn wrote:
> >On Fri, Apr 21, 2017 at 01:09:59AM -0400, Matt Brown wrote:
> >>On 04/20/2017 01:41 PM, Serge E. Hallyn wrote:
> >>>Quoting matt@nmatt.com (matt@nmatt.com):
> >>>>On 2017-04-20 11:19, Serge E. Hallyn wrote:
> >>>>>Quoting Matt Brown (matt@nmatt.com):
> >>>>>>On 04/19/2017 07:53 PM, Serge E. Hallyn wrote:
> >>>>>>>Quoting Matt Brown (matt@nmatt.com):
> >>>>>>>>On 04/19/2017 12:58 AM, Serge E. Hallyn wrote:
> >>>>>>>>>On Tue, Apr 18, 2017 at 11:45:26PM -0400, Matt Brown wrote:
> >>>>>>>>>>This patch reproduces GRKERNSEC_HARDEN_TTY functionality from the grsecurity
> >>>>>>>>>>project in-kernel.
> >>>>>>>>>>
> >>>>>>>>>>This will create the Kconfig SECURITY_TIOCSTI_RESTRICT and the corresponding
> >>>>>>>>>>sysctl kernel.tiocsti_restrict that, when activated, restrict all TIOCSTI
> >>>>>>>>>>ioctl calls from non CAP_SYS_ADMIN users.
> >>>>>>>>>>
> >>>>>>>>>>Possible effects on userland:
> >>>>>>>>>>
> >>>>>>>>>>There could be a few user programs that would be effected by this
> >>>>>>>>>>change.
> >>>>>>>>>>See: <https://codesearch.debian.net/search?q=ioctl%5C%28.*TIOCSTI>
> >>>>>>>>>>notable programs are: agetty, csh, xemacs and tcsh
> >>>>>>>>>>
> >>>>>>>>>>However, I still believe that this change is worth it given that the
> >>>>>>>>>>Kconfig defaults to n. This will be a feature that is turned on for the
> >>>>>>>>>
> >>>>>>>>>It's not worthless, but note that for instance before this was fixed
> >>>>>>>>>in lxc, this patch would not have helped with escapes from privileged
> >>>>>>>>>containers.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>I assume you are talking about this CVE:
> >>>>>>>>https://bugzilla.redhat.com/show_bug.cgi?id=1411256
> >>>>>>>>
> >>>>>>>>In retrospect, is there any way that an escape from a privileged
> >>>>>>>>container with the this bug could have been prevented?
> >>>>>>>
> >>>>>>>I don't know, that's what I was probing for.  Detecting that the pgrp
> >>>>>>>or session - heck, the pid namespace - has changed would seem like a
> >>>>>>>good indicator that it shouldn't be able to push.
> >>>>>>>
> >>>>>>
> >>>>>>pgrp and session won't do because in the case we are discussing
> >>>>>>current->signal->tty is the same as tty.
> >>>>>>
> >>>>>>This is the current check that is already in place:
> >>>>>>| if ((current->signal->tty != tty) && !capable(CAP_SYS_ADMIN))
> >>>>>>| 	return -EPERM;
> >>>>>
> >>>>>Yeah...
> >>>>>
> >>>>>>The only thing I could find to detect the tty message coming from a
> >>>>>>container is as follows:
> >>>>>>| task_active_pid_ns(current)->level
> >>>>>>
> >>>>>>This will be zero when run on the host, but 1 when run inside a
> >>>>>>container. However this is very much a hack and could probably break
> >>>>>>some userland stuff where there are multiple levels of namespaces.
> >>>>>
> >>>>>Yes.  This is also however why I don't like the current patch, because
> >>>>>capable() will never be true in a container, so nested containers
> >>>>>break.
> >>>>>
> >>>>
> >>>>What do you mean by "capable() will never be true in a container"?
> >>>>My understanding
> >>>>is that if a container is given CAP_SYS_ADMIN then
> >>>>capable(CAP_SYS_ADMIN) will return
> >>>>true?
> >>>
> >>>No, capable(X) checks for X with respect to the initial user namespace.
> >>>So for root-owned containers it will be true, but containers running in
> >>>non-initial user namespaces cannot pass that check.
> >>>
> >>>To check for privilege with respect to another user namespace, you need
> >>>to use ns_capable.  But for that you need a user_ns to target.
> >>>
> >>
> >>How about: ns_capable(current_user_ns(),CAP_SYS_ADMIN) ?
> >>
> >>current_user_ns() was found in include/linux/cred.h
> >
> >Any user can create a new user namespace and pass the above check.  What we
> >want is to find the user namespace which opened the tty.
> >
> 
> I believe I have a working solution that I can show in the next version
> of the patch later today, but I just want to run the logic by you first.
> 
> I added: "struct user_namespace *owner_user_ns;" as a field in
> tty_struct (include/linux/tty.h) Note: I am totally open to suggestions
> for a better name.
> 
> Then I added "tty->owner_user_ns = current_user_ns();" to the
> alloc_tty_struct function. (drivers/tty/tty_io.c)

That's what I was hoping could work.  Then you can check ns_capable
with respect to that.

You'll want to grab a reference to the user_ns, and drop it on
final close, but otherwise this sounds good to me.  I don't really
know the tty layer well though so we'll need some sanity checking
from someone who does.

> When testing with a docker container, running in a different user
> namespace, I printed out current_user_ns()->level, which returned 1,
> and tty->owner_user_ns->level, which returned 0. This seems to prove
> that I am correctly storing the user namespace which opened the tty.
> 
> Please let me know if there are any edge cases that I am missing with
> this approach.

Thanks for posting this!  This seems like the best solution to me.

  reply	other threads:[~2017-04-22 19:50 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-19  3:45 [kernel-hardening] [PATCH] make TIOCSTI ioctl require CAP_SYS_ADMIN Matt Brown
2017-04-19  3:45 ` Matt Brown
2017-04-19  3:45 ` Matt Brown
2017-04-19  4:58 ` [kernel-hardening] " Serge E. Hallyn
2017-04-19  4:58   ` Serge E. Hallyn
2017-04-19  4:58   ` Serge E. Hallyn
2017-04-19  5:20   ` [kernel-hardening] " Kees Cook
2017-04-19  5:20     ` Kees Cook
2017-04-19  5:20     ` Kees Cook
2017-04-19 23:43     ` [kernel-hardening] " Matt Brown
2017-04-19 23:43       ` Matt Brown
2017-04-19 23:43       ` Matt Brown
2017-04-19 23:21   ` [kernel-hardening] " Matt Brown
2017-04-19 23:21     ` Matt Brown
2017-04-19 23:21     ` Matt Brown
2017-04-19 23:53     ` [kernel-hardening] " Serge E. Hallyn
2017-04-19 23:53       ` Serge E. Hallyn
2017-04-19 23:53       ` Serge E. Hallyn
2017-04-20  4:44       ` [kernel-hardening] " Matt Brown
2017-04-20  4:44         ` Matt Brown
2017-04-20  4:44         ` Matt Brown
2017-04-20 15:19         ` [kernel-hardening] " Serge E. Hallyn
2017-04-20 15:19           ` Serge E. Hallyn
2017-04-20 15:19           ` Serge E. Hallyn
2017-04-20 15:24           ` [kernel-hardening] " Serge E. Hallyn
2017-04-20 15:24             ` Serge E. Hallyn
2017-04-20 17:15           ` matt
2017-04-20 17:15             ` matt
2017-04-20 17:15             ` matt at nmatt.com
2017-04-20 17:41             ` [kernel-hardening] " Serge E. Hallyn
2017-04-20 17:41               ` Serge E. Hallyn
2017-04-20 17:41               ` Serge E. Hallyn
2017-04-21  5:09               ` [kernel-hardening] " Matt Brown
2017-04-21  5:09                 ` Matt Brown
2017-04-21  5:09                 ` Matt Brown
2017-04-21  5:24                 ` [kernel-hardening] " Serge E. Hallyn
2017-04-21  5:24                   ` Serge E. Hallyn
2017-04-21  5:24                   ` Serge E. Hallyn
2017-04-21  6:01                   ` [kernel-hardening] " Kees Cook
2017-04-21  6:01                     ` Kees Cook
2017-04-21  6:01                     ` Kees Cook
2017-04-22 17:09                   ` [kernel-hardening] " Matt Brown
2017-04-22 17:09                     ` Matt Brown
2017-04-22 17:09                     ` Matt Brown
2017-04-22 19:50                     ` Serge E. Hallyn [this message]
2017-04-22 19:50                       ` Serge E. Hallyn
2017-04-22 19:50                       ` Serge E. Hallyn
2017-04-19 11:18 ` [kernel-hardening] " James Morris
2017-04-19 11:18   ` James Morris
2017-04-19 11:18   ` James Morris
2017-04-20  0:08   ` [kernel-hardening] " Matt Brown
2017-04-20  0:08     ` Matt Brown
2017-04-20  0:08     ` Matt Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170422195034.GA17556@mail.hallyn.com \
    --to=serge@hallyn.com \
    --cc=akpm@linux-foundation.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=jmorris@namei.org \
    --cc=jslaby@suse.com \
    --cc=keescook@chromium.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=matt@nmatt.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.