[PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

From: wad@chromium.org (Will Drewry)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering
Date: Wed, 18 May 2011 21:07:15 -0700	[thread overview]
Message-ID: <BANLkTikBK3-KZ10eErQ6Eex_L6Qe2aZang@mail.gmail.com> (raw)
In-Reply-To: <20110517131902.GF21441@elte.hu>

On Tue, May 17, 2011 at 6:19 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> On Tue, 2011-05-17 at 14:42 +0200, Ingo Molnar wrote:
>> > * Steven Rostedt <rostedt@goodmis.org> wrote:
>> >
>> > > On Mon, 2011-05-16 at 18:52 +0200, Ingo Molnar wrote:
>> > > > * Steven Rostedt <rostedt@goodmis.org> wrote:
>> > > >
>> > > > > I'm a bit nervous about the 'active' role of (trace_)events, because of the
>> > > > > way multiple callbacks can be registered. How would:
>> > > > >
>> > > > > ? ? ? err = event_x();
>> > > > > ? ? ? if (err == -EACCESS) {
>> > > > >
>> > > > > be handled? [...]
>> > > >
>> > > > The default behavior would be something obvious: to trigger all callbacks and
>> > > > use the first non-zero return value.
>> > >
>> > > But how do we know which callback that was from? There's no ordering of what
>> > > callbacks are called first.
>> >
>> > We do not have to know that - nor do the calling sites care in general. Do you
>> > have some specific usecase in mind where the identity of the callback that
>> > generates a match matters?
>>
>> Maybe I'm confused. I was thinking that these event_*() are what we
>> currently call trace_*(), but the event_*(), I assume, can return a
>> value if a call back returns one.
>
> Yeah - and the call site can treat it as:
>
> ?- Ugh, if i get an error i need to abort whatever i was about to do
>
> or (more advanced future use):
>
> ?- If i get a positive value i need to re-evaluate the parameters that were
> ? passed in, they were changed

Do event_* that return non-void exist in the tree at all now?  I've
looked at the various tracepoint macros as well as some of the other
handlers (trace_function, perf_tp_event, etc) and I'm not seeing any
places where a return value is honored nor could be.  At best, the
perf_tp_event can be short-circuited it in the hlist_for_each, but
it'd still need a way to bubble up a failure and result in not calling
the trace/event that the hook precedes.

Am I missing something really obvious?  I don't feel I've gotten a
good handle on exactly how all the tracing code gets triggered, so
perhaps I'm still a level (or three) too shallow. (I can see the asm
hooks for trace functions and I can see where that translates to
registered calls - like trace_function - but I don't see how the
hooked calls can be trivially aborted).

As is, I'm not sure how the perf and ftrace infrastructure could be
reused cleanly without a fair number of hacks to the interface and a
good bit of reworking.  I can already see a number of challenges
around reusing the sys_perf_event_open interface and the fact that
reimplementing something even as simple as seccomp mode=1 seems to
require a fair amount of tweaking to avoid from being leaky.  (E.g.,
enabling all TRACE_EVENT()s for syscalls will miss unhooked syscalls
so either acceptance matching needs to be propagated up the stack
along with some seccomp-like task modality or seccomp-on-perf would
have to depend on sys_enter events with syscall number predicate
matching and fail when a filter discard applies to all active events.)

At present, I'm leaning back towards the v2 series (plus the requested
minor changes) for the benefit of code clarity and its fail-secure
behavior.  Even just considering the reduced case of seccomp mode 1
being implemented on the shared infrastructure, I feel like I missing
something that makes it viable.  Any clues?

If not, I don't think a seccomp mode 2 interface via prctl would be
intractable if the long term movement is to a ftrace/perf backend - it
just means that the in-kernel code would change to wrap whatever the
final design ended up being.

Thanks and sorry if I'm being dense!

>> Thus, we now have the ability to dynamically attach function calls to
>> arbitrary points in the kernel that can have an affect on the code that
>> called it. Right now, we only have the ability to attach function calls to
>> these locations that have passive affects (tracing/profiling).
>
> Well, they can only have the effect that the calling site accepts and handles.
> So the 'effect' is not arbitrary and not defined by the callbacks, it is
> controlled and handled by the calling code.
>
> We do not want invisible side-effects, opaque hooks, etc.
>
> Instead of that we want (this is the getname() example i cited in the thread)
> explicit effects, like:
>
> ?if (event_vfs_getname(result))
> ? ? ? ?return ERR_PTR(-EPERM);
>
>> But you say, "nor do the calling sites care in general". Then what do
>> these calling sites do with the return code? Are we limiting these
>> actions to security only? Or can we have some other feature. [...]
>
> Yeah, not just security. One other example that came up recently is whether to
> panic the box on certain (bad) events such as NMI errors. This too could be
> made flexible via the event filter code: we already capture many events, so
> places that might conceivably do some policy could do so based on a filter
> condition.

This sounds great - I just wish I could figure out how it'd work :)

>> [...] I can envision that we can make the Linux kernel quite dynamic here
>> with "self modifying code". That is, anywhere we have "hooks", perhaps we
>> could replace them with dynamic switches (jump labels). Maybe events would
>> not be the best use, but they could be a generic one.
>
> events and explicit function calls and explicit side-effects are pretty much
> the only thing that are acceptable. We do not want opaque hooks and arbitrary
> side-effects.
>
>> Knowing what callback returned the result would be beneficial. Right now, you
>> are saying if the call back return anything, just abort the call, not knowing
>> what callback was called.
>
> Yeah, and that's a feature: that way a number of conditions can be attached.
> Multiple security frameworks may have effect on a task or multiple tools might
> set policy action on a given event.
>
> Thanks,
>
> ? ? ? ?Ingo
>

next prev parent reply	other threads:[~2011-05-19  4:07 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <1304017638.18763.205.camel@gandalf.stny.rr.com>
2011-05-12  3:02 ` [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering Will Drewry
2011-05-12  7:48   ` Ingo Molnar
2011-05-12  9:24     ` Kees Cook
2011-05-12 10:49       ` Ingo Molnar
2011-05-12 11:44     ` James Morris
2011-05-12 13:01       ` Ingo Molnar
2011-05-12 16:26         ` Will Drewry
2011-05-16 12:55           ` Ingo Molnar
2011-05-16 14:42             ` Will Drewry
2011-05-13  0:18         ` James Morris
2011-05-13 12:10           ` Ingo Molnar
2011-05-13 12:19             ` Peter Zijlstra
2011-05-13 12:26               ` Ingo Molnar
2011-05-13 12:39                 ` Peter Zijlstra
2011-05-13 12:43                   ` Peter Zijlstra
2011-05-13 12:54                     ` Ingo Molnar
2011-05-13 13:08                       ` Peter Zijlstra
2011-05-13 13:18                         ` Ingo Molnar
2011-05-13 13:55                           ` Peter Zijlstra
2011-05-13 14:57                             ` Ingo Molnar
2011-05-13 15:27                               ` Peter Zijlstra
2011-05-14  7:05                                 ` Ingo Molnar
2011-05-16 16:23                               ` Steven Rostedt
2011-05-16 16:52                                 ` Ingo Molnar
2011-05-16 17:03                                   ` Steven Rostedt
2011-05-17 12:42                                     ` Ingo Molnar
2011-05-17 13:05                                       ` Steven Rostedt
2011-05-17 13:19                                         ` Ingo Molnar
2011-05-19  4:07                                           ` Will Drewry [this message]
2011-05-19 12:22                                             ` Steven Rostedt
2011-05-19 21:05                                               ` Will Drewry
2011-05-24 15:59                                                 ` Will Drewry
2011-05-24 16:20                                                   ` Peter Zijlstra
2011-05-24 16:25                                                     ` Thomas Gleixner
2011-05-24 19:00                                                       ` Will Drewry
2011-05-24 19:54                                                     ` Ingo Molnar
2011-05-24 20:10                                                       ` Ingo Molnar
2011-05-25 10:35                                                       ` Thomas Gleixner
2011-05-25 15:01                                                         ` Ingo Molnar
2011-05-25 17:43                                                           ` Peter Zijlstra
2011-05-29 20:17                                                             ` Ingo Molnar
2011-05-25 17:48                                                           ` Thomas Gleixner
2011-05-26  8:43                                                             ` Ingo Molnar
2011-05-26  9:15                                                             ` Ingo Molnar
2011-05-24 20:08                                                   ` Ingo Molnar
2011-05-24 20:14                                                     ` Steven Rostedt
2011-05-13 15:17                           ` Eric Paris
2011-05-13 15:29                             ` [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system callfiltering David Laight
2011-05-16 12:03                               ` Ingo Molnar
2011-05-13 12:49                   ` [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering Ingo Molnar
2011-05-13 13:55                     ` Peter Zijlstra
2011-05-13 15:02                       ` Ingo Molnar
2011-05-13 15:10             ` Eric Paris
2011-05-13 15:23               ` Peter Zijlstra
2011-05-13 15:55                 ` Eric Paris
2011-05-13 16:29                   ` Will Drewry
2011-05-14  7:30               ` Ingo Molnar
2011-05-14 20:57                 ` Will Drewry
2011-05-16 12:43                   ` Ingo Molnar
2011-05-16 15:29                     ` Will Drewry
2011-05-17 12:57                       ` Ingo Molnar
2011-05-16  0:36             ` James Morris
2011-05-16 15:08               ` Ingo Molnar
2011-05-17  2:24                 ` James Morris
2011-05-17 13:10                   ` Ingo Molnar
2011-05-17 13:29                     ` James Morris
2011-05-17 18:34                       ` Ingo Molnar
2011-05-26  6:27               ` Pavel Machek
2011-05-26  8:35                 ` Ingo Molnar
2011-05-12 12:15     ` Frederic Weisbecker
2011-05-12 11:33   ` James Morris
2011-05-13 19:35   ` Arnd Bergmann
2011-05-14 20:58     ` Will Drewry
2011-05-15  6:42       ` Arnd Bergmann
2011-05-16 12:00         ` Ingo Molnar
2011-05-16 15:26   ` Steven Rostedt
2011-05-16 15:28     ` Will Drewry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BANLkTikBK3-KZ10eErQ6Eex_L6Qe2aZang@mail.gmail.com \
    --to=wad@chromium.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).