[kernel-hardening] Re: [RFC v1 00/17] seccomp-object: From attack surface reduction to sandboxing

From: Daniel Borkmann <daniel@iogearbox.net>
To: "Kees Cook" <keescook@chromium.org>, "Mickaël Salaün" <mic@digikod.net>
Cc: linux-security-module <linux-security-module@vger.kernel.org>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Andy Lutomirski <luto@amacapital.net>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Casey Schaufler <casey@schaufler-ca.com>,
	David Drysdale <drysdale@google.com>,
	Eric Paris <eparis@redhat.com>,
	James Morris <james.l.morris@oracle.com>,
	Jeff Dike <jdike@addtoit.com>, Julien Tinnes <jln@google.com>,
	Michael Kerrisk <mtk@man7.org>, Paul Moore <pmoore@redhat.com>,
	Richard Weinberger <richard@nod.at>,
	"Serge E . Hallyn" <serge@hallyn.com>,
	Stephen Smalley <sds@tycho.nsa.gov>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Will Drewry <wad@chromium.org>,
	Linux API <linux-api@vger.kernel.org>,
	"kernel-hardening@lists.openwall.com"
	<kernel-hardening@lists.openwall.com>,
	alexei.starovoitov@gmail.com
Subject: [kernel-hardening] Re: [RFC v1 00/17] seccomp-object: From attack surface reduction to sandboxing
Date: Sat, 21 May 2016 17:19:52 +0200
Message-ID: <57407C98.3090508@iogearbox.net>
In-Reply-To: <CAGXu5jK1U12vMk11HD_x_gNz3Rk4ZgEfdThY7DHvm4e4sPRh4g@mail.gmail.com>

Hi Mickaël,

[ sorry for commenting so late ... ]

On 04/28/2016 04:36 AM, Kees Cook wrote:
> On Wed, Mar 23, 2016 at 6:46 PM, Mickaël Salaün <mic@digikod.net> wrote:
>> Hi,
>>
>> This series is a proof of concept (not ready for production) to extend seccomp
>> with the ability to check argument pointers of syscalls as kernel objects (e.g.
>> file paths). This adds a needed feature for creating a full sandbox managed by
>> userland, like the Seatbelt/XNU Sandbox or OpenBSD Pledge. It was initially
>> inspired by a partial seccomp-LSM prototype [1] but has evolved a lot since :)
>>
>> The audience for this RFC is limited to security-related actors, to discuss
>> this new feature before enlarging the scope to a wider audience. This
>> aims to focus on the security goal, usability and architecture before entering
>> into the gory details of each subsystem. I also wish to get constructive
>> criticisms about the userland API and intrusiveness of the code (and what could
>> be the other ways to do it better) before going further (and addressing the
>> TODO and FIXME in the code).
>>
>> The approach taken is to add the minimum amount of code while still allowing
>> the userland to create access rules via seccomp. The current limitation of
>> seccomp is that it only sees raw syscall argument values; there is no way to
>> dereference a pointer to check its content (e.g. the first argument of the open
>> syscall). This seccomp evolution brings a generic way to check argument
>> pointers regardless of the syscall, unlike current LSMs.
>
> Okay, I've read through this whole series now (sorry for the huge
> delay). I think that it is overly complex for what it results in
> providing. Here are some background thoughts I had:
>
> 1) People have asked for "dereferenced argument inspection" (I will
> call this DAI...), in that they would like to be able to process
> arguments like how BPF traditionally processes packets. This series
> doesn't provide that. Rather, it provides static checks against
> specific argument types (currently just path checks).
>
> 2) When I dig into the requirements people have around DAI, it's
> mostly about checking path names. There is some interest in some of
> the network structures, but mostly it's path names. This series
> certainly underscores this since your first example is path names. :)

Out of curiosity, did you look at whether adding some very basic
eBPF support to seccomp-BPF could also give you the option of
inspecting arguments eventually?

By basic, I mean adding a new eBPF program type BPF_PROG_TYPE_SECCOMP
where the only thing allowed would be a very limited set of
helpers. No maps etc. would be allowed for this type. If needed for
extracting args, you could extend struct seccomp_data for eBPF use and
add a new set of helper functions that would allow you to extract/walk
arguments and, for example, pass the extracted buffer back to the eBPF
prog for further inspection.
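
To make the idea above a bit more concrete, here is a rough user-space
mock of what such a policy could look like. This is only a sketch under
stated assumptions: BPF_PROG_TYPE_SECCOMP does not exist, and every name
below (mock_seccomp_data, mock_get_arg_str(), mock_policy(), the MOCK_*
constants) is made up for illustration, with plain C standing in for a
restricted eBPF program. The helper copies the string behind a syscall
argument into a local buffer, and the policy then inspects that copy.

```c
/*
 * User-space mock only. None of these names exist in the kernel; they
 * are hypothetical stand-ins for an eBPF seccomp program and its helper.
 */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MOCK_RET_ALLOW 0	/* stand-in for "let the syscall through" */
#define MOCK_RET_ERRNO 1	/* stand-in for "fail the syscall" */
#define MOCK_NR_OPEN   2	/* open(2) syscall number on x86_64 */

/* Stand-in for a struct seccomp_data extended for eBPF use. */
struct mock_seccomp_data {
	int nr;			/* syscall number */
	uint64_t args[6];	/* raw syscall arguments */
};

/*
 * Hypothetical helper: copy the NUL-terminated string that argument
 * `idx` points to into `buf`. A kernel helper would copy from user
 * memory; this mock just reads an ordinary pointer. Returns the string
 * length, or -1 if the argument is unusable or would not fit.
 */
long mock_get_arg_str(const struct mock_seccomp_data *sd, unsigned int idx,
		      char *buf, size_t len)
{
	const char *src;
	size_t n;

	if (idx >= 6 || len == 0)
		return -1;
	src = (const char *)(uintptr_t)sd->args[idx];
	if (!src)
		return -1;
	n = strlen(src);
	if (n >= len)
		return -1;	/* refuse to judge a truncated path */
	memcpy(buf, src, n + 1);
	return (long)n;
}

/*
 * Hypothetical policy: open() is only allowed below /tmp/. This is a
 * naive prefix compare and does not resolve ".." or symlinks.
 */
int mock_policy(const struct mock_seccomp_data *sd)
{
	char path[64];

	if (sd->nr != MOCK_NR_OPEN)
		return MOCK_RET_ALLOW;
	if (mock_get_arg_str(sd, 0, path, sizeof(path)) < 0)
		return MOCK_RET_ERRNO;
	if (strncmp(path, "/tmp/", 5) == 0)
		return MOCK_RET_ALLOW;
	return MOCK_RET_ERRNO;
}
```

With nr set to MOCK_NR_OPEN, args[0] pointing at "/tmp/scratch" gives
MOCK_RET_ALLOW and "/etc/passwd" gives MOCK_RET_ERRNO. Note that a real
helper would presumably still need the copy it inspects to be the one
the syscall later uses, otherwise the ToCToU concern raised in this
thread remains.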

Have a look at the samples in [1,2]. They are for tracing, but possibly
this could be designed in a more or less similar way, where clang compiles
the policy down into eBPF bytecode. Did you look in this direction, or do
you have any thoughts on it?

   [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/tracex5_kern.c
   [2] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/tracex1_kern.c

> 3) Solving ToCToU should also solve performance problems. For example,
> this series, on a successful syscall, will look up a pathname twice
> (once in seccomp, then again in the syscall, and then compares the
> results in the LSM as a ToCToU back-stop). This seems like a waste of
> effort, since this reimplements the work the kernel is already doing
> to pass the resulting structure to the LSM hooks. As such, since this
> series is doing static checks and not allowing byte processing for
> DAI, I'm convinced that it should entirely happen in the LSM hooks.
>
> 4) Performing the checks in the LSM hooks carries a risk of exposing
> the syscall's argument processing code to an attacker, but I think
> that is okay since very similar code would already need to be written
> to do the same thing before entering the syscall. The only way out of
> this, I think, would be to standardize syscall argument processing.
>
> 5) If we can standardize syscall argument processing, we could also
> change when it happens, and retain the results for the syscall,
> allowing for full byte processing style of DAI. e.g. copy userspace to
> kernel space, do BPF on the argument, if okay, pass the kernel copy to
> the syscall where it continues the processing. If the kernel copy
> wasn't already created by seccomp, the syscall would just make that
> copy itself, etc.
>
> So, I see DAI as going one of two ways:
>
> a) rewrite all syscall entry to use a common cacheable argument parser
> and offering true BPF processing of the argument bytes.
>
> b) use the existing LSM hooks and define a policy language that can be
> loaded ahead of time.
>
> Doing "a" has many problems, I think. Not the least of which is that I
> can't imagine a way for such an architectural change to not have
> negative performance impacts for the regular case.
>
> Doing "b" means writing a policy engine. I would expect it to look a
> lot like either AppArmor or TOMOYO. TOMOYO has network structure
> processing, so probably it would look more like TOMOYO if you wanted
> more than just file paths. Maybe a seccomp LSM could share logic from
> one of the existing path-based LSMs.
>
> Another note I had for this series was that because the checker tries
> to keep a cached struct path, it allows unprivileged users to check
> for path names existing or not, regardless of the user's permissions.
> Instead, you have to check the path against the policy each time.
> AppArmor does this efficiently with pre-built deterministic finite
> automata (built from regular expressions), and TOMOYO just does
> string compares and limited glob parsing every time.
>
> So, I can't take this as-is, but I'll take the one fix near the start.
> :) I hope this isn't too discouraging, since I'd love to see this
> solved. Hopefully you can keep chipping away at it!
> Thanks!
>
> -Kees
>
