From: Jann Horn <jannh@google.com>
To: Kees Cook <keescook@chromium.org>
Cc: YiFei Zhu <yifeifz2@illinois.edu>,
Christian Brauner <christian.brauner@ubuntu.com>,
Tycho Andersen <tycho@tycho.pizza>,
Andy Lutomirski <luto@amacapital.net>,
Will Drewry <wad@chromium.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Giuseppe Scrivano <gscrivan@redhat.com>,
Tobin Feldman-Fitzthum <tobin@ibm.com>,
Dimitrios Skarlatos <dskarlat@cs.cmu.edu>,
Valentin Rothberg <vrothber@redhat.com>,
Hubertus Franke <frankeh@us.ibm.com>,
Jack Chen <jianyan2@illinois.edu>,
Josep Torrellas <torrella@illinois.edu>,
Tianyin Xu <tyxu@illinois.edu>, bpf <bpf@vger.kernel.org>,
Linux Containers <containers@lists.linux-foundation.org>,
Linux API <linux-api@vger.kernel.org>,
kernel list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 3/6] seccomp: Implement constant action bitmaps
Date: Thu, 24 Sep 2020 14:28:32 +0200 [thread overview]
Message-ID: <CAG48ez1MWhrtkbWTNpc1v-WqWYiLM_JrCKvuE6DdH6vBY3MJzQ@mail.gmail.com> (raw)
In-Reply-To: <202009240018.A4D8274F@keescook>
On Thu, Sep 24, 2020 at 9:37 AM Kees Cook <keescook@chromium.org> wrote:
> On Thu, Sep 24, 2020 at 02:25:03AM +0200, Jann Horn wrote:
> > On Thu, Sep 24, 2020 at 1:29 AM Kees Cook <keescook@chromium.org> wrote:
[...]
> > (However, a "which syscalls have a fixed result" bitmap might make
> > sense if we want to export the list of permitted syscalls as a text
> > file in procfs, as I mentioned over at
> > <https://lore.kernel.org/lkml/CAG48ez3Ofqp4crXGksLmZY6=fGrF_tWyUCg7PBkAetvbbOPeOA@mail.gmail.com/>.)
>
> I haven't found a data structure I'm happy with for this. It seemed like
> NR_syscalls * sizeof(u32) was rather a lot (i.e. to store the BPF_RET
> value). However, let me discuss that more in the "why in in thread?"
> below...
[...]
> > > +#endif
> > > +};
> > > +
> > > struct seccomp_filter;
> > > /**
> > > * struct seccomp - the state of a seccomp'ed process
> > > @@ -45,6 +56,13 @@ struct seccomp {
> > > #endif
> > > atomic_t filter_count;
> > > struct seccomp_filter *filter;
> > > + struct seccomp_bitmaps native;
> > > +#ifdef CONFIG_COMPAT
> > > + struct seccomp_bitmaps compat;
> > > +#endif
> > > +#ifdef SECCOMP_MULTIPLEXED_SYSCALL_TABLE_ARCH
> > > + struct seccomp_bitmaps multiplex;
> > > +#endif
> >
> > Why do we have one bitmap per thread (in struct seccomp) instead of
> > putting the bitmap for a given filter and all its ancestors into the
> > seccomp_filter?
>
> I explicitly didn't want to add code that was run per-filter; I wanted
> O(1), not O(n) even if the n work was a small constant. There is
> obviously a memory/perf tradeoff here. I wonder if the middle ground
> would be to put a bitmap and "constant action" results in the filter....
> oh duh. The "top" filter is already going to be composed with its
> ancestors. That's all that needs to be checked.
Yeah - when adding a new filter, you can evaluate each syscall for the
newly added filter. For both the "accept" bitmap and the "constant
action" bitmap, you can AND the bitmap of the existing filter into the
new filter's bitmap.
Although actually, I think my "constant action" bitmap proposal was a
stupid idea... when someone asks for an analysis of the filter via
procfs (which shouldn't be a common action, so speed doesn't really
matter there), we can just dynamically evaluate the entire filter tree
using our filter-evaluation helper. Let's drop the "constant action"
bitmap idea.
> Then the tri-state can be:
>
> bitmap accept[NR_syscalls]: accept or check "known" bitmap
> bitmap filter[NR_syscalls]: run filter or return known action
> u32 known_action[NR_syscalls];
Actually, maybe we should just have an "accept" list, nothing else, to
keep it straightforward and with minimal memory usage...
> (times syscall numbering "architecture" counts)
>
> Though perhaps it would be just as fast as:
>
> bitmap run_filter[NR_syscalls]: run filter or return known_action
> u32 known_action[NR_syscalls];
>
> where accept isn't treated special...
Using a bitset for accepted syscalls instead of a big array would
probably have far less cache impact on the syscall entry path. If we
just have an "accept" bitmask, we can store information about 512
syscalls per cache line - that's almost the entire syscall table. In
contrast, a known_action list can only store information about 16
syscalls in a cache line, and we'd additionally still have to query
the "filter" bitmap.
I think our goal here should be that if a syscall is always allowed,
seccomp should execute the smallest amount of instructions we can get
away with, and touch the smallest amount of memory possible (and
preferably that memory should be shared between threads). The bitmap
fastpath should probably also avoid populate_seccomp_data().
next prev parent reply other threads:[~2020-09-24 12:29 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-23 23:29 [PATCH v1 0/6] seccomp: Implement constant action bitmaps Kees Cook
2020-09-23 23:29 ` [PATCH 1/6] seccomp: Introduce SECCOMP_PIN_ARCHITECTURE Kees Cook
2020-09-24 0:41 ` Jann Horn
2020-09-24 7:11 ` Kees Cook
2020-09-23 23:29 ` [PATCH 2/6] x86: Enable seccomp architecture tracking Kees Cook
2020-09-24 0:45 ` Jann Horn
2020-09-24 7:12 ` Kees Cook
2020-09-23 23:29 ` [PATCH 3/6] seccomp: Implement constant action bitmaps Kees Cook
2020-09-24 0:25 ` Jann Horn
2020-09-24 7:36 ` Kees Cook
2020-09-24 8:07 ` YiFei Zhu
2020-09-24 8:15 ` Kees Cook
2020-09-24 8:22 ` YiFei Zhu
2020-09-24 12:28 ` Jann Horn [this message]
2020-09-24 12:37 ` David Laight
2020-09-24 12:56 ` Jann Horn
[not found] ` <DM6PR11MB271492D0565E91475D949F5DEF390@DM6PR11MB2714.namprd11.prod.outlook.com>
2020-09-24 0:36 ` YiFei Zhu
2020-09-24 7:38 ` Kees Cook
2020-09-24 7:51 ` YiFei Zhu
2020-09-23 23:29 ` [PATCH 4/6] seccomp: Emulate basic filters for constant action results Kees Cook
2020-09-23 23:47 ` Jann Horn
2020-09-24 7:46 ` Kees Cook
2020-09-24 15:28 ` Paul Moore
2020-09-24 19:52 ` Kees Cook
2020-09-24 20:46 ` Paul Moore
2020-09-24 21:35 ` Kees Cook
2020-09-23 23:29 ` [PATCH 5/6] selftests/seccomp: Compare bitmap vs filter overhead Kees Cook
2020-09-23 23:29 ` [PATCH 6/6] [DEBUG] seccomp: Report bitmap coverage ranges Kees Cook
2020-09-24 13:40 ` [PATCH v1 0/6] seccomp: Implement constant action bitmaps Rasmus Villemoes
2020-09-24 13:58 ` YiFei Zhu
2020-09-25 5:56 ` Rasmus Villemoes
2020-09-25 7:07 ` YiFei Zhu
2020-09-26 18:11 ` YiFei Zhu
2020-09-28 20:04 ` Kees Cook
2020-09-28 20:16 ` YiFei Zhu
2020-09-24 14:05 ` Jann Horn
2020-09-24 18:57 ` Andrea Arcangeli
2020-09-24 19:18 ` Jann Horn
[not found] ` <9dbe8e3bbdad43a1872202ff38c34ca2@DM5PR11MB1692.namprd11.prod.outlook.com>
2020-09-24 19:48 ` Tianyin Xu
2020-09-24 20:00 ` Kees Cook
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAG48ez1MWhrtkbWTNpc1v-WqWYiLM_JrCKvuE6DdH6vBY3MJzQ@mail.gmail.com \
--to=jannh@google.com \
--cc=aarcange@redhat.com \
--cc=bpf@vger.kernel.org \
--cc=christian.brauner@ubuntu.com \
--cc=containers@lists.linux-foundation.org \
--cc=dskarlat@cs.cmu.edu \
--cc=frankeh@us.ibm.com \
--cc=gscrivan@redhat.com \
--cc=jianyan2@illinois.edu \
--cc=keescook@chromium.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=tobin@ibm.com \
--cc=torrella@illinois.edu \
--cc=tycho@tycho.pizza \
--cc=tyxu@illinois.edu \
--cc=vrothber@redhat.com \
--cc=wad@chromium.org \
--cc=yifeifz2@illinois.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).