From: Andrew Morton <akpm@linux-foundation.org>
To: Kees Cook <keescook@chromium.org>
Cc: Will Drewry <wad@chromium.org>,
	linux-kernel@vger.kernel.org,
	linux-security-module@vger.kernel.org,
	linux-arch@vger.kernel.org, linux-doc@vger.kernel.org,
	kernel-hardening@lists.openwall.com, netdev@vger.kernel.org,
	x86@kernel.org, arnd@arndb.de, davem@davemloft.net,
	hpa@zytor.com, mingo@redhat.com, oleg@redhat.com,
	peterz@infradead.org, rdunlap@xenotime.net,
	mcgrathr@chromium.org, tglx@linutronix.de, luto@mit.edu,
	eparis@redhat.com, serge.hallyn@canonical.com, djm@mindrot.org,
	scarybeasts@gmail.com, indan@nul.nu, pmoore@redhat.com,
	corbet@lwn.net, eric.dumazet@gmail.com, markus@chromium.org,
	coreyb@linux.vnet.ibm.com, jmorris@namei.org
Subject: [kernel-hardening] Re: [PATCH v17 08/15] seccomp: add system call filtering using BPF
Date: Fri, 6 Apr 2012 14:05:03 -0700
Message-ID: <20120406140503.10b75c5b.akpm@linux-foundation.org>
In-Reply-To: <CAGXu5jJVDLzCOyCo-q6+pF-_3v4Ws8jBzYXwac419uv2QQS0mA@mail.gmail.com>

On Fri, 6 Apr 2012 13:44:43 -0700
Kees Cook <keescook@chromium.org> wrote:

> On Fri, Apr 6, 2012 at 1:23 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> > On Thu, 29 Mar 2012 15:01:53 -0500
> > Will Drewry <wad@chromium.org> wrote:
> >
> >> [This patch depends on luto@mit.edu's no_new_privs patch:
> >>    https://lkml.org/lkml/2012/1/30/264
> >>  included in this series for ease of consumption.
> >> ]
> >>
> >> This patch adds support for seccomp mode 2.  Mode 2 introduces the
> >> ability for unprivileged processes to install system call filtering
> >> policy expressed in terms of a Berkeley Packet Filter (BPF) program.
> >> This program will be evaluated in the kernel for each system call
> >> the task makes and computes a result based on data in the format
> >> of struct seccomp_data.
> >> ...
> >> +static void seccomp_filter_log_failure(int syscall)
> >> +{
> >> +     int compat = 0;
> >> +#ifdef CONFIG_COMPAT
> >> +     compat = is_compat_task();
> >> +#endif
> >
> > hm, I'm surprised that we don't have a zero-returning implementation of
> > is_compat_task() when CONFIG_COMPAT=n.  Seems silly.  Blames Arnd.
> 
> There is

I can't find it.  The definition in include/linux/compat.h is inside
#ifdef CONFIG_COMPAT.
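
For illustration, a zero-returning fallback of the kind asked for above
could look like the following. This is a userspace sketch of the usual
stub pattern, not what include/linux/compat.h contained at the time.
CONFIG_COMPAT is deliberately left undefined so the fallback branch is
the one compiled, and log_failure_compat_flag() is a hypothetical caller
invented for the example.

```c
#include <assert.h>

/*
 * Sketch only: CONFIG_COMPAT is not defined in this file, so the
 * #else branch below is what the compiler sees.
 */
#ifdef CONFIG_COMPAT
int is_compat_task(void);	/* the real version would be per-arch */
#else
/* With no compat support built in, no task can be a compat task. */
static inline int is_compat_task(void)
{
	return 0;
}
#endif

/*
 * Hypothetical caller: with the stub in place, the call site no
 * longer needs its own #ifdef CONFIG_COMPAT block.
 */
static int log_failure_compat_flag(void)
{
	return is_compat_task();
}
```

With such a stub, the three-line #ifdef in seccomp_filter_log_failure()
would reduce to a plain call.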

> >> +static long seccomp_attach_filter(struct sock_fprog *fprog)
> >> +{
> >> +     struct seccomp_filter *filter;
> >> +     unsigned long fp_size = fprog->len * sizeof(struct sock_filter);
> >> +     unsigned long total_insns = fprog->len;
> >> +     long ret;
> >> +
> >> +     if (fprog->len == 0 || fprog->len > BPF_MAXINSNS)
> >> +             return -EINVAL;
> >> +
> >> +     for (filter = current->seccomp.filter; filter; filter = filter->prev)
> >> +             total_insns += filter->len + 4;  /* include a 4 instr penalty */
> >
> > So tasks don't share filters?  We copy them by value at fork?  Do we do
> > this at vfork() too?
> 
> The filter chain is shared (and refcounted).

So what's the locking rule for accessing and modifying that
singly-linked list?

> ...
> >> +/* put_seccomp_filter - decrements the ref count of tsk->seccomp.filter */
> >> +void put_seccomp_filter(struct task_struct *tsk)
> >> +{
> >> +     struct seccomp_filter *orig = tsk->seccomp.filter;
> >> +     /* Clean up single-reference branches iteratively. */
> >> +     while (orig && atomic_dec_and_test(&orig->usage)) {
> >> +             struct seccomp_filter *freeme = orig;
> >> +             orig = orig->prev;
> >> +             kfree(freeme);
> >> +     }
> >> +}
> >
> > So if one of the filters in the list has an elevated refcount, we bail
> > out on the remainder of the list.  Seems odd.
> 
> This is so that every filter in the list doesn't need to have its
> refcount raised. As long as the counting up matches the counting
> down, it's fine. This allows process trees that branch the filter
> list at different times to still be safe. IIUC, this code was based on
> how namespace refcounting is handled. I spent some time proving to
> myself that it was correctly refcounted a while back. More eyes are
> better, of course. :)

Please ensure that future readers of this code have a description of
how it is supposed to work.
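
As a starting point for such a description, here is a small userspace
model of the scheme as it reads from the quoted code and the explanation
above. It is a sketch under assumptions, not the patch itself: a plain
int stands in for atomic_t, malloc/free for the kernel allocator, and
struct task, attach_filter(), fork_task() and the freed counter are
names made up for the example. The assumed rules are that each task
holds one reference on the head of its chain, that attaching a filter
hands the task's reference on the old head over to the new filter's
prev pointer, and that fork takes one extra reference on the head only.

```c
#include <assert.h>
#include <stdlib.h>

struct filter {
	int usage;		/* plain int standing in for atomic_t */
	struct filter *prev;	/* older filter, towards the root */
};

struct task {
	struct filter *filter;	/* head of this task's chain */
};

static int freed;		/* counts frees, for the example only */

/*
 * Attach a new head. The new filter starts with one reference (the
 * task's); the task's old reference on the previous head is now held
 * by the new filter's prev pointer, so no count changes there.
 */
static int attach_filter(struct task *t)
{
	struct filter *f = malloc(sizeof(*f));

	if (!f)
		return -1;
	f->usage = 1;
	f->prev = t->filter;
	t->filter = f;
	return 0;
}

/* Fork: the child shares the chain and pins only its head. */
static void fork_task(const struct task *parent, struct task *child)
{
	child->filter = parent->filter;
	if (child->filter)
		child->filter->usage++;
}

/*
 * Drop the task's reference. Each freed filter gives up its reference
 * on prev, so the walk continues until it reaches a filter that some
 * other task or filter still holds.
 */
static void put_filter(struct task *t)
{
	struct filter *orig = t->filter;

	while (orig && --orig->usage == 0) {
		struct filter *freeme = orig;

		orig = orig->prev;
		free(freeme);
		freed++;
	}
	t->filter = NULL;
}
```

In this model, stopping at the first filter whose count stays above zero
is what keeps a shared prefix alive: everything below that filter is
still reachable through whoever holds the remaining reference.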
