netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tycho Andersen <tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
To: "Michael Kerrisk (man-pages)"
	<mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Cc: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Linux API <linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Alexei Starovoitov <ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
	Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>,
	Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>,
	"Serge E. Hallyn"
	<serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>,
	Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>,
	LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Network Development
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
Subject: Re: [PATCH 5/6] seccomp: add a way to attach a filter via eBPF fd
Date: Tue, 8 Sep 2015 07:40:44 -0600	[thread overview]
Message-ID: <20150908134044.GV26679@smitten> (raw)
In-Reply-To: <55EA95FE.7000006-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On Sat, Sep 05, 2015 at 09:13:02AM +0200, Michael Kerrisk (man-pages) wrote:
> On 09/04/2015 10:41 PM, Kees Cook wrote:
> > On Fri, Sep 4, 2015 at 9:04 AM, Tycho Andersen
> > <tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org> wrote:
> >> This is the final bit needed to support seccomp filters created via the bpf
> >> syscall.
> 
> Hmm. Thanks Kees, for CCinf linux-api@. That really should have been done at
> the outset.

Apologies, I'll cc the list on future versions.

> Tycho, where's the man-pages patch describing this new kernel-userspace
> API feature? :-)

Once we get the API finalized I'm happy to write it.

> >> One concern with this patch is exactly what the interface should look like
> >> for users, since seccomp()'s second argument is a pointer, we could ask
> >> people to pass a pointer to the fd, but implies we might write to it which
> >> seems impolite. Right now we cast the pointer (and force the user to cast
> >> it), which generates ugly warnings. I'm not sure what the right answer is
> >> here.
> >>
> >> Signed-off-by: Tycho Andersen <tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
> >> CC: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> >> CC: Will Drewry <wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> >> CC: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> >> CC: Andy Lutomirski <luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org>
> >> CC: Pavel Emelyanov <xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
> >> CC: Serge E. Hallyn <serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org>
> >> CC: Alexei Starovoitov <ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> >> CC: Daniel Borkmann <daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org>
> >> ---
> >>  include/linux/seccomp.h      |  3 +-
> >>  include/uapi/linux/seccomp.h |  1 +
> >>  kernel/seccomp.c             | 70 ++++++++++++++++++++++++++++++++++++--------
> >>  3 files changed, 61 insertions(+), 13 deletions(-)
> >>
> >> diff --git a/include/linux/seccomp.h b/include/linux/seccomp.h
> >> index d1a86ed..a725dd5 100644
> >> --- a/include/linux/seccomp.h
> >> +++ b/include/linux/seccomp.h
> >> @@ -3,7 +3,8 @@
> >>
> >>  #include <uapi/linux/seccomp.h>
> >>
> >> -#define SECCOMP_FILTER_FLAG_MASK       (SECCOMP_FILTER_FLAG_TSYNC)
> >> +#define SECCOMP_FILTER_FLAG_MASK       (\
> >> +       SECCOMP_FILTER_FLAG_TSYNC | SECCOMP_FILTER_FLAG_EBPF)
> >>
> >>  #ifdef CONFIG_SECCOMP
> >>
> >> diff --git a/include/uapi/linux/seccomp.h b/include/uapi/linux/seccomp.h
> >> index 0f238a4..c29a423 100644
> >> --- a/include/uapi/linux/seccomp.h
> >> +++ b/include/uapi/linux/seccomp.h
> >> @@ -16,6 +16,7 @@
> >>
> >>  /* Valid flags for SECCOMP_SET_MODE_FILTER */
> >>  #define SECCOMP_FILTER_FLAG_TSYNC      1
> >> +#define SECCOMP_FILTER_FLAG_EBPF       (1 << 1)
> >>
> >>  /*
> >>   * All BPF programs must return a 32-bit value.
> >> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> >> index a2c5b32..9c6bea6 100644
> >> --- a/kernel/seccomp.c
> >> +++ b/kernel/seccomp.c
> >> @@ -355,17 +355,6 @@ static struct seccomp_filter *seccomp_prepare_filter(struct sock_fprog *fprog)
> >>
> >>         BUG_ON(INT_MAX / fprog->len < sizeof(struct sock_filter));
> >>
> >> -       /*
> >> -        * Installing a seccomp filter requires that the task has
> >> -        * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
> >> -        * This avoids scenarios where unprivileged tasks can affect the
> >> -        * behavior of privileged children.
> >> -        */
> >> -       if (!task_no_new_privs(current) &&
> >> -           security_capable_noaudit(current_cred(), current_user_ns(),
> >> -                                    CAP_SYS_ADMIN) != 0)
> >> -               return ERR_PTR(-EACCES);
> >> -
> >>         /* Allocate a new seccomp_filter */
> >>         sfilter = kzalloc(sizeof(*sfilter), GFP_KERNEL | __GFP_NOWARN);
> >>         if (!sfilter)
> >> @@ -509,6 +498,48 @@ static void seccomp_send_sigsys(int syscall, int reason)
> >>         info.si_syscall = syscall;
> >>         force_sig_info(SIGSYS, &info, current);
> >>  }
> >> +
> >> +#ifdef CONFIG_BPF_SYSCALL
> >> +static struct seccomp_filter *seccomp_prepare_ebpf(const char __user *filter)
> >> +{
> >> +       /* XXX: this cast generates a warning. should we make people pass in
> >> +        * &fd, or is there some nicer way of doing this?
> >> +        */
> >> +       u32 fd = (u32) filter;
> > 
> > I think this is probably the right way to do it, modulo getting the
> > warning fixed. Let me invoke the great linux-api subscribers to get
> > some more opinions.
> 
> Sigh. It's sad, but the using a cast does seem the simplest option.
> But, how about another idea...
> 
> > tl;dr: adding SECCOMP_FILTER_FLAG_EBPF to the flags changes the
> > pointer argument into an fd argument. Is this sane, should it be a
> > pointer to an fd, or should it not be a flag at all, creating a new
> > seccomp command instead (SECCOMP_MODE_FILTER_EBPF)?
> 
> What about
> 
>     seccomp(SECCOMP_MODE_FILTER_EBPF, flags, structp)
> 
> Where structp is a pointer to something like
> 
> struct seccomp_ebpf {
>     int size;      /* Size of this whole struct */
>     int fd;
> }
> 
> 'size' allows for future expansion of the struct (in case we want to 
> expand it later), and placing 'fd' inside a struct avoids unpleasant
> implication that would be made by passing a pointer to an fd as the
> third argument.

I like this; although perhaps something like bpf() has, with the
unions inside the struct so that we don't have to declare another
struct if we add another seccomp command. Kees?

Tycho

> Cheers,
> 
> Michael
> 
> 
> > -Kees
> > 
> >> +       struct seccomp_filter *ret;
> >> +       struct bpf_prog *prog;
> >> +
> >> +       prog = bpf_prog_get(fd);
> >> +       if (IS_ERR(prog))
> >> +               return (struct seccomp_filter *) prog;
> >> +
> >> +       if (prog->type != BPF_PROG_TYPE_SECCOMP) {
> >> +               bpf_prog_put(prog);
> >> +               return ERR_PTR(-EINVAL);
> >> +       }
> >> +
> >> +       ret = kzalloc(sizeof(*ret), GFP_KERNEL | __GFP_NOWARN);
> >> +       if (!ret) {
> >> +               bpf_prog_put(prog);
> >> +               return ERR_PTR(-ENOMEM);
> >> +       }
> >> +
> >> +       ret->prog = prog;
> >> +       atomic_set(&ret->usage, 1);
> >> +
> >> +       /* Intentionally don't bpf_prog_put() here, because the underlying prog
> >> +        * is refcounted too and we're holding a reference from the struct
> >> +        * seccomp_filter object.
> >> +        */
> >> +
> >> +       return ret;
> >> +}
> >> +#else
> >> +static struct seccomp_filter *seccomp_prepare_ebpf(const char __user *filter)
> >> +{
> >> +       return ERR_PTR(-EINVAL);
> >> +}
> >> +#endif
> >>  #endif /* CONFIG_SECCOMP_FILTER */
> >>
> >>  /*
> >> @@ -775,8 +806,23 @@ static long seccomp_set_mode_filter(unsigned int flags,
> >>         if (flags & ~SECCOMP_FILTER_FLAG_MASK)
> >>                 return -EINVAL;
> >>
> >> +       /*
> >> +        * Installing a seccomp filter requires that the task has
> >> +        * CAP_SYS_ADMIN in its namespace or be running with no_new_privs.
> >> +        * This avoids scenarios where unprivileged tasks can affect the
> >> +        * behavior of privileged children.
> >> +        */
> >> +       if (!task_no_new_privs(current) &&
> >> +           security_capable_noaudit(current_cred(), current_user_ns(),
> >> +                                    CAP_SYS_ADMIN) != 0)
> >> +               return -EACCES;
> >> +
> >>         /* Prepare the new filter before holding any locks. */
> >> -       prepared = seccomp_prepare_user_filter(filter);
> >> +       if (flags & SECCOMP_FILTER_FLAG_EBPF)
> >> +               prepared = seccomp_prepare_ebpf(filter);
> >> +       else
> >> +               prepared = seccomp_prepare_user_filter(filter);
> >> +
> >>         if (IS_ERR(prepared))
> >>                 return PTR_ERR(prepared);
> >>
> >> --
> >> 2.1.4
> >>
> > 
> > 
> > 
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

  parent reply	other threads:[~2015-09-08 13:40 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-04 16:04 c/r of seccomp filters via underlying eBPF Tycho Andersen
2015-09-04 16:04 ` [PATCH 1/6] ebpf: add a seccomp program type Tycho Andersen
2015-09-04 20:17   ` Alexei Starovoitov
2015-09-04 21:09     ` Tycho Andersen
2015-09-04 20:34   ` Kees Cook
2015-09-04 21:06     ` Tycho Andersen
2015-09-04 21:08       ` Kees Cook
2015-09-09 15:50         ` Tycho Andersen
2015-09-09 16:07           ` Alexei Starovoitov
2015-09-09 16:09             ` Daniel Borkmann
2015-09-09 16:37               ` Kees Cook
2015-09-09 16:52                 ` Alexei Starovoitov
2015-09-09 17:27                   ` Kees Cook
2015-09-09 17:31                     ` Tycho Andersen
2015-09-09 16:07           ` Daniel Borkmann
2015-09-04 21:50   ` Andy Lutomirski
2015-09-09 16:13     ` Daniel Borkmann
2015-09-04 16:04 ` [PATCH 2/6] seccomp: make underlying bpf ref counted as well Tycho Andersen
2015-09-04 21:53   ` Andy Lutomirski
2015-09-04 16:04 ` [PATCH 3/6] ebpf: add a way to dump an eBPF program Tycho Andersen
2015-09-04 20:17   ` Kees Cook
2015-09-04 20:45     ` Tycho Andersen
2015-09-04 20:50       ` Kees Cook
2015-09-04 20:58         ` Alexei Starovoitov
2015-09-04 21:00           ` Tycho Andersen
2015-09-04 21:48       ` Andy Lutomirski
2015-09-04 22:28         ` Tycho Andersen
2015-09-04 23:08           ` Andy Lutomirski
2015-09-05  0:27             ` Tycho Andersen
2015-09-09 22:34               ` Tycho Andersen
2015-09-09 23:44                 ` Andy Lutomirski
2015-09-10  0:13                   ` Tycho Andersen
2015-09-10  0:44                     ` Andy Lutomirski
2015-09-10  0:58                       ` Tycho Andersen
2015-09-04 23:27           ` Kees Cook
2015-09-05  0:08             ` Andy Lutomirski
2015-09-04 20:27   ` Alexei Starovoitov
2015-09-04 20:42     ` Tycho Andersen
2015-09-04 16:04 ` [PATCH 4/6] seccomp: add a way to access filters via bpf fds Tycho Andersen
2015-09-04 20:26   ` Kees Cook
2015-09-04 20:29     ` Alexei Starovoitov
2015-09-04 20:58       ` Tycho Andersen
2015-09-04 16:04 ` [PATCH 5/6] seccomp: add a way to attach a filter via eBPF fd Tycho Andersen
2015-09-04 20:40   ` Alexei Starovoitov
     [not found]   ` <1441382664-17437-6-git-send-email-tycho.andersen-Z7WLFzj8eWMS+FvcfC7Uqw@public.gmane.org>
2015-09-04 20:41     ` Kees Cook
     [not found]       ` <CAGXu5jKke44txdYqEgPRrkn8SyWGjJuHxT2qMdq2ztp_16mQyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-05  7:13         ` Michael Kerrisk (man-pages)
     [not found]           ` <55EA95FE.7000006-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-09-08 13:40             ` Tycho Andersen [this message]
2015-09-09  0:07               ` Kees Cook
     [not found]                 ` <CAGXu5jKS0yX92XXhL6ZkqMrxkqFpPyyBd7wbsvEEx4rqZ0VG6g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2015-09-09 14:47                   ` Tycho Andersen
2015-09-09 15:14                     ` Alexei Starovoitov
     [not found]                       ` <20150909151402.GA3429-2RGepAHry04KGsCuBW9QBvb0xQGhdpdCAL8bYrjMMd8@public.gmane.org>
2015-09-09 15:55                         ` Tycho Andersen
2015-09-04 16:04 ` [PATCH 6/6] ebpf: allow BPF_REG_X in src_reg conditional jumps Tycho Andersen
2015-09-04 21:06   ` Alexei Starovoitov
2015-09-04 22:43     ` Tycho Andersen
2015-09-05  4:12       ` Alexei Starovoitov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150908134044.GV26679@smitten \
    --to=tycho.andersen-z7wlfzj8ewms+fvcfc7uqw@public.gmane.org \
    --cc=ast-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org \
    --cc=daniel-FeC+5ew28dpmcu3hnIyYJQ@public.gmane.org \
    --cc=keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=luto-kltTT9wpgjJwATOyAt5JVQ@public.gmane.org \
    --cc=mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=serge.hallyn-GeWIH/nMZzLQT0dZR+AlfA@public.gmane.org \
    --cc=wad-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org \
    --cc=xemul-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).