* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-02 20:51 ` Kees Cook
@ 2016-08-02 21:06 ` Jeffrey Vander Stoep
2016-08-03 8:28 ` Ingo Molnar
` (2 subsequent siblings)
3 siblings, 0 replies; 51+ messages in thread
From: Jeffrey Vander Stoep @ 2016-08-02 21:06 UTC (permalink / raw)
To: Kees Cook, Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet, Eric W. Biederman
Far from trying to kill perf, we want (and require) perf to be available to
developers on Android. All that this patch enables us to do is gate it
behind developer settings - just like we do with other developer targeted
features.
On Tue, Aug 2, 2016 at 1:51 PM Kees Cook <keescook@chromium.org> wrote:
> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra <peterz@infradead.org>
> wrote:
> > On Tue, Aug 02, 2016 at 12:04:34PM -0700, Kees Cook wrote:
> >
> >> Now, obviously, these APIs have huge value, otherwise they wouldn't
> >> exist in the first place, and they wouldn't be built into end-user
> >> kernels if they were universally undesirable. But that's not the
> >> situation: the APIs are needed, but they lack the appropriate knobs to
> >> control their availability.
> >
> > So far so good, but I take exception with the suggestion that the
> > proposed knob is appropriate.
> >
> >> And this isn't just about Android: regular
> >> distro kernels (like Debian, who also uses this patch) tend to build
> >> in everything so people can use whatever they want. But for admins
> >> that want to reduce their systems' attack surface, there needs to be
> >> ways to disable things like this.
> >
> > And here I think you're overestimating the knowledge of most admins.
>
> Well, I suspect that's why both Android and Debian default this to off
> right now. :) But, regardless, I think it's weird to block admins who
> DO understand about attack surface from being able to limit it.
>
> >> > So the problem I have with this is that it will completely inhibit
> >> > development of things like JITs that self-profile to re-compile
> >> > frequently used code.
> >>
> >> This is a good example of a use-case where this knob would be turned
> >> down. But for many many other use-cases, when presented with a
> >> pre-built kernel, there isn't a way to remove the attack surface.
> >
> > No, quite the opposite. Having this knob will completely inhibit
> > development of such applications. Worse it will probably render perf
> > dead for quite a large body of developers.
>
> I hear what you're saying, but I think there's a few problems: the
> proposed self-profiling JIT doesn't exist (so it's pointless to
> discuss), and the number of developers that don't have access to their
> own system is impossibly tiny when compared to the hundreds of
> millions of end-users for whom perf is not needed.
>
> > The moment you frame it like: perf or sekjurity, and even default to
> > no-perf-because-sekjurity, a whole bunch of corporate IT departments
> > will not enable this, even for their developers.
>
> It's not "vs security", it's a risk analysis of attack surface. The
> vast majority of people running Linux do not use perf (right now).
> I've never suggested it be default disabled: I'm wanting to upstream
> the sysctl setting that is already in use on distros where the distro
> kernel teams have deemed this a needed knob for their end-users. All
> of the objections you're talking about assume that the knob doesn't
> exist, but it does already. It's just not in upstream. :)
>
> > Have you never had to 'root' your work machine to get work done? I have.
> > Luckily this was pre-secure-boot times so it was trivial since I had
> > physical access to the machine. But it still sucked I had to fight IT
> > over mostly 'trivial' crap.
>
> Yeah, I've had to do similar. Frankly most people use VMs or non-corp
> hardware for doing development work, so I don't think this is a useful
> comparison.
>
> >
> >> > I would much rather have an LSM hook where the security stuff can do
> >> > more fine grained control of things. Allowing some apps perf usage while
> >> > denying others.
> >>
> >> I'm not against an LSM, but I think it's needless complexity when
> >> there is already a knob for this but it just doesn't go "high" enough.
> >> :)
> >
> > So what will you do the moment the Google Dalvik guys come to you and
> > say: "Hey, we want to do active profiling to do better on-line code
> > generation?".
>
> That hasn't happened yet. When it does, we can revisit this. But right
> now, today, there is a need for this knob.
>
> > I see 0 up-sides of this approach and, as per the above, a whole bunch
> > of very serious downsides.
> >
> > A global (esp. default inhibited) knob is too coarse and limiting.
>
> I haven't suggested it be default inhibit in the upstream Kconfig. And
> having this knob already with the 0, 1, and 2 settings seems
> incomplete to me without this highest level of restriction that 3
> would provide. That seems rather arbitrary to me. :)
>
> Let me take this another way instead. What would be a better way to
> provide a mechanism for system owners to disable perf without an LSM?
> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
> wide a coverage as possible.)
>
> -Kees
>
> --
> Kees Cook
> Chrome OS & Brillo Security
>
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-02 20:51 ` Kees Cook
2016-08-02 21:06 ` Jeffrey Vander Stoep
@ 2016-08-03 8:28 ` Ingo Molnar
2016-08-03 12:28 ` Daniel Micay
2016-08-03 14:41 ` Peter Zijlstra
2016-08-03 17:25 ` Eric W. Biederman
3 siblings, 1 reply; 51+ messages in thread
From: Ingo Molnar @ 2016-08-03 8:28 UTC (permalink / raw)
To: Kees Cook
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet, Eric W. Biederman
* Kees Cook <keescook@chromium.org> wrote:
> > I see 0 up-sides of this approach and, as per the above, a whole bunch of very
> > serious downsides.
> >
> > A global (esp. default inhibited) knob is too coarse and limiting.
>
> I haven't suggested it be default inhibit in the upstream Kconfig. And
> having this knob already with the 0, 1, and 2 settings seems
> incomplete to me without this highest level of restriction that 3
> would provide. That seems rather arbitrary to me. :)
The default has no impact on the "it's too coarse and limiting" negative property
of this patch, which is the show-stopper aspect. Please fix that aspect instead of
trying to argue around it.
This isn't some narrow debugging mechanism we can turn on/off globally and forget
about, this is a wide scope performance measurement and event logging
infrastructure that is being utilized not just by developers but by apps and
runtimes as well.
> Let me take this another way instead. What would be a better way to provide a
> mechanism for system owners to disable perf without an LSM? (Since far fewer
> folks run with an enforcing "big" LSM: I'm seeking as wide a coverage as
> possible.)
Because in practice what will happen is that if the only option is to do something
drastic for sekjurity, IT departments will do it - while if there's a more
flexible mechanism that does not throw out the baby with the bath water that is
going to be used.
This is as if 20 years ago you had submitted a patch to the early Linux TCP/IP
networking code to be on/off via a global sysctl switch and told people that
"in developer mode you can have networking, talk to your admin".
We'd have told you: "this switch is too coarse and limiting, please implement
something better, like a list of routes which defines which IP ranges are
accessible, and a privileged range of listen sockets ports and some flexible
kernel side filtering mechanism to inhibit outgoing/incoming connections".
Global sysctls are way too coarse.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 8:28 ` Ingo Molnar
@ 2016-08-03 12:28 ` Daniel Micay
2016-08-03 12:53 ` Daniel Micay
2016-08-03 13:36 ` Peter Zijlstra
0 siblings, 2 replies; 51+ messages in thread
From: Daniel Micay @ 2016-08-03 12:28 UTC (permalink / raw)
To: kernel-hardening, Kees Cook
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet,
Eric W. Biederman
> The default has no impact on the "it's too coarse and limiting"
> negative property
> of this patch, which is the show-stopper aspect. Please fix that
> aspect instead of
> trying to argue around it.
Disabling perf events in the kernel configuration is even more limiting,
and is currently the alternative to doing it this way. It makes sense
for debugging features to be disabled in production releases of products
unless there's a way to control them where risk isn't increased. Having
a way to toggle it dynamically will allow it to remain available.
> This isn't some narrow debugging mechanism we can turn on/off globally
> and forget
> about, this is a wide scope performance measurement and event logging
> infrastructure that is being utilized not just by developers but by
> apps and
> runtimes as well.
The incredibly wide scope is why it's such a big security problem. If it
wasn't such a frequent source of vulnerabilities, it wouldn't have been
disabled for unprivileged users in grsecurity, Debian and then Android.
It's extended by lots of vendor code specific to platforms too, so it
isn't just some core kernel code that's properly reviewed.
I don't think there are runtimes using this for JIT tracing. Perhaps it
doesn't actually suit their needs. It's a theoretical use case.
> > Let me take this another way instead. What would be a better way to
> > provide a
> > mechanism for system owners to disable perf without an LSM? (Since
> > far fewer
> > folks run with an enforcing "big" LSM: I'm seeking as wide a
> > coverage as
> > possible.)
>
> Because in practice what will happen is that if the only option is to
> do something
> drastic for sekjurity, IT departments will do it - while if there's a
> more
> flexible mechanism that does not throw out the baby with the bath
> water that is
> going to be used.
It's already done: Android and Debian disable it for unprivileged users
by default. It's already done for most of desktop and mobile Linux and
perhaps even most servers. Not providing functionality desired by
downstream doesn't mean it won't be provided there.
They'll keep doing it whether or not this lands. If it doesn't land, it
will only mean that mainline kernels aren't usable for making Android
devices. They need to pass the compatibility test suite, which means
having this. The mechanism could change but I don't see why the actual
requirement would.
> This is as if 20 years ago you had submitted a patch to the early
> Linux TCP/IP
> networking code to be on/off via a global sysctl switch and told
> people that
> "in developer mode you can have networking, talk to your admin".
>
> We'd have told you: "this switch is too coarse and limiting, please
> implement
> something better, like a list of routes which defines which IP ranges
> are
> accessible, and a privileged range of listen sockets ports and some
> flexible
> kernel side filtering mechanism to inhibit outgoing/incoming
> connections".
>
> Global sysctls are way too coarse.
The end result of submitting an LSM hook instead of using this would
probably come down to having a global sysctl toggle in Yama. There would
be two sysctl controls for perf restrictions rather than one which is
needless complexity for the interface.
Android and Debian would be using a fine-grained perf LSM to accomplish
the same thing: globally disabling it for unprivileged users when not
doing development. The nice thing about fine-grained control would be
implementing a *more* restrictive policy for the case when it's toggled
on. It wouldn't necessarily have to be globally enabled, only enabled
for the relevant user or even a specific process, but that would be
complicated to implement.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 12:28 ` Daniel Micay
@ 2016-08-03 12:53 ` Daniel Micay
2016-08-03 13:36 ` Peter Zijlstra
1 sibling, 0 replies; 51+ messages in thread
From: Daniel Micay @ 2016-08-03 12:53 UTC (permalink / raw)
To: kernel-hardening, Kees Cook
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet,
Eric W. Biederman
Having this in Yama would also make it probable that there would be a
security-centric default. It would end up wiping out unprivileged perf
events access on distributions using Yama for ptrace_scope unless they
make the explicit decision to disable it. Having the perf subsystem
extend the existing perf_event_paranoid sysctl leaves the control over
the upstream default in the hands of the perf subsystem, not LSMs.
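For readers following the thread, the perf_event_paranoid levels being argued over can be sketched roughly as below. This is an illustrative Python model, not kernel code. The descriptions for levels 0 to 2 paraphrase the kernel sysctl documentation as best I understand it, level 3 is the restriction proposed by this patch and is not upstream, and the helper names (describe_paranoid, open_allowed, read_paranoid) are made up for the example.

```python
from pathlib import Path

# Location of the existing sysctl discussed in this thread.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")

# Restrictions on unprivileged users are cumulative: each level implies
# the ones below it. Level 3 is the proposed addition, not upstream.
LEVELS = [
    (0, "no raw tracepoint access"),
    (1, "no CPU event access"),
    (2, "no kernel profiling"),
    (3, "no perf_event_open() at all (proposed)"),
]


def describe_paranoid(level):
    """Return the restrictions an unprivileged user sees at `level`."""
    return [text for threshold, text in LEVELS if level >= threshold]


def open_allowed(level, privileged):
    """Model the proposed gate: level >= 3 refuses unprivileged callers."""
    return privileged or level < 3


def read_paranoid(path=PARANOID_PATH):
    """Read the current setting, or return None if it is unavailable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None
```

Calling describe_paranoid(read_paranoid()) on a machine where the sysctl is readable lists what is currently denied to unprivileged users. The point of contention in the thread is only the last row: whether a single global value should be able to deny perf_event_open() outright.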
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 12:28 ` Daniel Micay
2016-08-03 12:53 ` Daniel Micay
@ 2016-08-03 13:36 ` Peter Zijlstra
1 sibling, 0 replies; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-03 13:36 UTC (permalink / raw)
To: Daniel Micay
Cc: kernel-hardening, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet,
Eric W. Biederman
On Wed, Aug 03, 2016 at 08:28:10AM -0400, Daniel Micay wrote:
> I don't think there are runtimes using this for JIT tracing. Perhaps it
> doesn't actually suit their needs. It's a theoretical use case.
I know there are compiler teams using perf for FDO, see for example:
https://gcc.gnu.org/wiki/AutoFDO/Tutorial
LLVM also has AutoFDO support AFAIU.
There is no reason JITs could not also do this, and IIRC there's JITs
built on top of LLVM, so it shouldn't be too hard to imagine an AutoFDO
enabled JIT.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-02 20:51 ` Kees Cook
2016-08-02 21:06 ` Jeffrey Vander Stoep
2016-08-03 8:28 ` Ingo Molnar
@ 2016-08-03 14:41 ` Peter Zijlstra
2016-08-03 15:42 ` Schaufler, Casey
2016-08-03 17:25 ` Eric W. Biederman
3 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-03 14:41 UTC (permalink / raw)
To: Kees Cook
Cc: Jeff Vander Stoep, Ingo Molnar, Arnaldo Carvalho de Melo,
Alexander Shishkin, linux-doc@vger.kernel.org,
kernel-hardening@lists.openwall.com, LKML, Jonathan Corbet,
Eric W. Biederman
On Tue, Aug 02, 2016 at 01:51:47PM -0700, Kees Cook wrote:
> Let me take this another way instead. What would be a better way to
> provide a mechanism for system owners to disable perf without an LSM?
> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
> wide a coverage as possible.)
Could something like a new capability bit work?
I'm thinking that applications that have network connections already
drop all possible capabilities (I know, unlikely to be true, but should
be true for most stuff I hope). This would disable perf for remote code
execution exploits, including web-browsers and the lot.
It would keep perf working for local stuff by default, although
obviously with pam_cap you can limit this when and where needed.
For Android this could mean the JVM explicitly dropping the cap for its
'children' while retaining the use itself. And this would also keep perf
working on the ADB shell stuff.
And, I think this would allow a JIT executable to gain the cap using
file caps, even when the user using it doesn't have it, which would keep
things usable even in restricted environments.
Or am I misunderstanding capabilities -- which is entirely possible.
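A rough way to check the idea above is to model how the permitted set is recomputed across execve(). The sketch below is a toy Python model under stated assumptions: CAP_PERF is a hypothetical name for the proposed capability bit, the rule follows the permitted-set formula in capabilities(7) for a non-root process, and ambient capabilities are left out for brevity.

```python
# Hypothetical name for the proposed capability bit; an assumption
# made for this sketch, not an existing kernel constant.
CAP_PERF = "cap_perf"


def caps_after_exec(inheritable, bounding,
                    file_permitted=frozenset(),
                    file_inheritable=frozenset()):
    """Permitted set after execve() for a non-root process.

    Simplified from capabilities(7), ignoring ambient capabilities:
        P'(permitted) = (P(inheritable) & F(inheritable))
                      | (F(permitted)   & P(bounding))
    """
    return (inheritable & file_inheritable) | (file_permitted & bounding)


NO_CAPS = frozenset()
PERF = frozenset({CAP_PERF})

# An ordinary user session: nothing inheritable, full bounding set.
plain_app = caps_after_exec(NO_CAPS, PERF)

# A JIT binary carrying the bit as a file capability gains it even
# though the invoking user does not hold it.
jit_binary = caps_after_exec(NO_CAPS, PERF, file_permitted=PERF)

# A runtime that dropped the bit from its bounding set: descendants
# cannot regain it, even by executing the file-capable JIT binary.
sandboxed_jit = caps_after_exec(NO_CAPS, NO_CAPS, file_permitted=PERF)
```

In this model plain_app and sandboxed_jit end up with an empty permitted set while jit_binary holds CAP_PERF, which matches the three cases described above: network-facing code without the bit, a file-capable JIT that keeps working, and children of a runtime that dropped the bit.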
^ permalink raw reply [flat|nested] 51+ messages in thread
* RE: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 14:41 ` Peter Zijlstra
@ 2016-08-03 15:42 ` Schaufler, Casey
0 siblings, 0 replies; 51+ messages in thread
From: Schaufler, Casey @ 2016-08-03 15:42 UTC (permalink / raw)
To: kernel-hardening@lists.openwall.com, Kees Cook
Cc: Jeff Vander Stoep, Ingo Molnar, Arnaldo Carvalho de Melo,
Alexander Shishkin, linux-doc@vger.kernel.org, LKML,
Jonathan Corbet, Eric W. Biederman
> -----Original Message-----
> From: Peter Zijlstra [mailto:peterz@infradead.org]
> Sent: Wednesday, August 03, 2016 7:41 AM
> To: Kees Cook <keescook@chromium.org>
> Cc: Jeff Vander Stoep <jeffv@google.com>; Ingo Molnar <mingo@redhat.com>;
> Arnaldo Carvalho de Melo <acme@kernel.org>; Alexander Shishkin
> <alexander.shishkin@linux.intel.com>; linux-doc@vger.kernel.org; kernel-
> hardening@lists.openwall.com; LKML <linux-kernel@vger.kernel.org>;
> Jonathan Corbet <corbet@lwn.net>; Eric W. Biederman
> <ebiederm@xmission.com>
> Subject: Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further
> restriction of perf_event_open
>
> On Tue, Aug 02, 2016 at 01:51:47PM -0700, Kees Cook wrote:
> > Let me take this another way instead. What would be a better way to
> > provide a mechanism for system owners to disable perf without an LSM?
> > (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
> > wide a coverage as possible.)
>
> Could something like a new capability bit work?
Yes, it could, but we don't have a real good experience
with the advanced use of capabilities. The big distros and
the "OS"s have yet to adopt them seriously. It's the same
sort of issue, really. The granularity gets to you.
> I'm thinking that applications that have network connections already
> drop all possible capabilities (I know, unlikely to be true, but should
> be true for most stuff I hope). This would disable perf for remote code
> execution exploits, including web-browsers and the lot.
>
> It would keep perf working for local stuff by default, although
> obviously with pam_cap you can limit this when and where needed.
>
> For Android this could mean the JVM explicitly dropping the cap for its
> 'children' while retaining the use itself. And this would also keep perf
> working on the ADB shell stuff.
>
>
> And, I think this would allow a JIT executable to gain the cap using
> file caps, even when the user using it doesn't have it, which would keep
> things usable even in restricted environments.
>
>
> Or am I misunderstanding capabilities -- which is entirely possible.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-02 20:51 ` Kees Cook
@ 2016-08-03 17:25 ` Eric W. Biederman
2016-08-03 8:28 ` Ingo Molnar
` (2 subsequent siblings)
3 siblings, 0 replies; 51+ messages in thread
From: Eric W. Biederman @ 2016-08-03 17:25 UTC (permalink / raw)
To: Kees Cook
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
Sigh.
Kees we have already had this conversation about user namespaces and
apparently you missed the point.
As I have said before the problem with a system wide off switch is what
happens when you have a single application that needs to use the
feature. Without care your system wide protection disappears.
That is very brittle design.
What I see as much more palatable is a design that allows for
features to be turned off in sandboxes.
So please if you are going to worry about disabling large swaths of
the kernel to reduce the attack surface please come up with designs
that are not brittle in allowing users to use a feature nor are they
brittle in keeping the feature disabled where you want it disabled.
One of the strengths of linux is applications of features the authors of
the software had not imagined. Your proposals seem to be trying to put
the world in a tiny little box where if someone had not imagined and
preapproved a use of a feature it should not happen. Let's please
avoid implementing totalitarianism to avoid malicious code exploiting
bugs in the kernel. I am not interested in that future.
Especially when dealing with disabling code to reduce attack surface,
when there are no known attacks what we are actually dealing with
is a small percentage probability reduction that a malicious attacker
will be able to exploit the attack.
Remember security is as much about availability as it is about
integrity. You keep imagining features that are great big denial of
service attacks on legitimate users.
Kees Cook <keescook@chromium.org> writes:
> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> Let me take this another way instead. What would be a better way to
> provide a mechanism for system owners to disable perf without an LSM?
> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
> wide a coverage as possible.)
I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
Perhaps something else.
Eric
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 17:25 ` Eric W. Biederman
(?)
@ 2016-08-03 18:53 ` Kees Cook
2016-08-03 21:44 ` Peter Zijlstra
-1 siblings, 1 reply; 51+ messages in thread
From: Kees Cook @ 2016-08-03 18:53 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
On Wed, Aug 3, 2016 at 10:25 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> Sigh.
>
> Kees we have already had this conversation about user namespaces and
> apparently you missed the point.
Well, I didn't miss the point: that's why I CCed you. :) This is
nearly the same discussion (though in this case there is already a
sysctl, hence this thread).
> As I have said before the problem with a system wide off switch is what
> happens when you have a single application that needs to use the
> feature. Without care your system wide protection disappears.
> That is very brittle design.
Yeah, the use-cases tend to be "never", "always", "this process", and
"this user". The way Android would like to be handling the permission,
though, gets back to another thing we talked about briefly when
discussing userns: revocation. Android wants to turn the entire
permission on and off across the entire system, so things like
capabilities or privileged fds or something don't quite fit that
either.
> What I see as much more palatable is a design that allows for
> features to be turned off in sandboxes.
>
> So please if you are going to worry about disabling large swaths of
> the kernel to reduce the attack surface please come up with designs
> that are not brittle in allowing users to use a feature nor are they
> brittle in keeping the feature disabled where you want it disabled.
Yup, Linus asked for this too. I haven't come up with anything
particularly great yet. :P
> One of the strengths of linux is applications of features the authors of
> the software had not imagined. Your proposals seem to be trying to put
> the world in a tiny little box where if someone had not imagined and
> preapproved a use of a feature it should not happen. Let's please
> avoid implementing totalitarianism to avoid malicious code exploiting
> bugs in the kernel. I am not interested in that future.
To be clear: I'm interested in giving system owners greater control
over what's exposed. That's not about limiting access everywhere. And
I'm interested in making sure that the upstream kernel actually
provides what end-users want. The most extreme version of this is
when distros carry kernel patches to get it done (this was true with
userns and is true again here with perf). This IS a desired feature,
and it exists in the world. I want to avoid the confusion that arises
from people running patched kernels: upstream developers don't realize
what state their features are in when they reach end users,
documentation doesn't match, etc etc.
However, I do accept that the existing mechanisms that have served
Linux over the years (sysctls, capabilities, etc) are woefully
insufficient in many of these cases.
> Especially when dealing with disabling code to reduce attack surface,
> when there are no known attacks what we are actually dealing with
> is a small percentage probability reduction that a malicious attacker
> will be able to exploit the attack.
But there are bugs. We just haven't found them. Based on the past
research by Jon Corbet (and again recently by me) we're sitting on at
least 1 new critical bug that we won't find for the next 5 years (and
we've got at least 1 more each that are on average 1, 2, 3, and 4
years old that we haven't found yet either).
> Remember security is as much about availability as it is about
> integrity. You keep imagining features that are great big denial of
> service attacks on legitimate users.
That's not my goal: legitimate users should have access. That's up to
system owners. But I'd like to provide ways for system owners to keep
illegitimate users from having access. :)
> Kees Cook <keescook@chromium.org> writes:
>
>> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> Let me take this another way instead. What would be a better way to
>> provide a mechanism for system owners to disable perf without an LSM?
>> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
>> wide a coverage as possible.)
>
> I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
> Perhaps something else.
Peter, did you happen to see Eric's solution to this problem for
namespaces? Basically, a per-userns sysctl instead of a global sysctl.
Is that something that would be acceptable here?
-Kees
--
Kees Cook
Brillo & Chrome OS Security
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 18:53 ` Kees Cook
@ 2016-08-03 21:44 ` Peter Zijlstra
2016-08-04 2:50 ` Eric W. Biederman
0 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-03 21:44 UTC (permalink / raw)
To: Kees Cook
Cc: Eric W. Biederman, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
On Wed, Aug 03, 2016 at 11:53:41AM -0700, Kees Cook wrote:
> > Kees Cook <keescook@chromium.org> writes:
> >
> >> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> >> Let me take this another way instead. What would be a better way to
> >> provide a mechanism for system owners to disable perf without an LSM?
> >> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
> >> wide a coverage as possible.)
> >
> > I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
> > Perhaps something else.
>
> Peter, did you happen to see Eric's solution to this problem for
> namespaces? Basically, a per-userns sysctl instead of a global sysctl.
> Is that something that would be acceptable here?
Someone would have to educate me on what a userns is and how that would
help here.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 21:44 ` Peter Zijlstra
@ 2016-08-04 2:50 ` Eric W. Biederman
0 siblings, 0 replies; 51+ messages in thread
From: Eric W. Biederman @ 2016-08-04 2:50 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Aug 03, 2016 at 11:53:41AM -0700, Kees Cook wrote:
>> > Kees Cook <keescook@chromium.org> writes:
>> >
>> >> On Tue, Aug 2, 2016 at 1:30 PM, Peter Zijlstra <peterz@infradead.org> wrote:
>> >> Let me take this another way instead. What would be a better way to
>> >> provide a mechanism for system owners to disable perf without an LSM?
>> >> (Since far fewer folks run with an enforcing "big" LSM: I'm seeking as
>> >> wide a coverage as possible.)
>> >
>> > I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
>> > Perhaps something else.
>>
>> Peter, did you happen to see Eric's solution to this problem for
>> namespaces? Basically, a per-userns sysctl instead of a global sysctl.
>> Is that something that would be acceptable here?
>
> Someone would have to educate me on what a userns is and how that would
> help here.
userns is an abbreviation for user namespace.
How it might help is that it is an easy unescapable context for
processes. Essentially the idea is to limit the scope of the sysctl to a
container.
User namespaces run into flak because, while tremendously simple in
themselves, the code takes advantage of the fact that suid root
executables in a user namespace do not have privileges on anything
outside of the user namespace. That means it is semantically safe
to allow operations like creating mount namespaces, mounting filesystems,
creating network namespaces, manipulating the network stack, etc. All of
which allows unprivileged users (who can create network namespaces) to
exercise more kernel code and hit whatever bugs it contains.
Fundamentally, user namespaces, as objects you can create, need a limit on
the maximum number you can create, to catch runaway
processes. Set that limit to 0 and you get what Kees
wants.
In my pending patches that were not quite ready for the merge window, I
added a sysctl that describes the maximum number of user namespaces that
can be created (default value threads-max), and implemented the sysctl
in a per user way, such that counts and limits are kept for every user
namespace. In a nested user namespace (which is all of them except for
the initial user namespace) the count and limit are checked in the
current user namespace, then the count is incremented in the parent
and verified to be below the limit in the parent user namespace.
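The nested count-and-limit check described above can be sketched as a toy
model. This is only a minimal Python illustration, not code from the pending
patches: the class name UserNamespace, the create_child method, and the choice
to let a child start with its parent's limit are all assumptions made for the
example.

```python
class UserNamespace:
    """Toy model of a user namespace with its own creation limit (hypothetical)."""

    def __init__(self, parent=None, max_user_namespaces=1000):
        self.parent = parent
        # Stands in for the per-namespace max_user_namespaces sysctl.
        self.max_user_namespaces = max_user_namespaces
        # Number of namespaces currently charged to this level.
        self.count = 0

    def create_child(self):
        """Return a new nested namespace, or None if any level refuses it."""
        charged = []
        ns = self
        # Walk from the current namespace up to the initial one,
        # checking the limit and charging the count at every level.
        while ns is not None:
            if ns.count >= ns.max_user_namespaces:
                # Roll back the charges already taken lower in the chain.
                for level in charged:
                    level.count -= 1
                return None
            ns.count += 1
            charged.append(ns)
            ns = ns.parent
        # Assumption: the child starts with the same limit as its parent.
        return UserNamespace(parent=self,
                             max_user_namespaces=self.max_user_namespaces)
```

In this model, setting max_user_namespaces to 0 inside a sandbox makes every
create_child call from that sandbox fail, while the initial namespace keeps
its own, larger limit.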
What this means in practice is user namespaces can be enabled by default
on a system, and yet you can easily disable them in a sandbox that was
built with a user namespace.
I named the new sysctls in my patch:
/proc/sys/userns/max_user_namespaces
/proc/sys/userns/max_pid_namespaces
/proc/sys/userns/max_net_namespaces
/proc/sys/userns/max_uts_namespaces
/proc/sys/userns/max_ipc_namespaces
/proc/sys/userns/max_cgroup_namespaces
/proc/sys/userns/max_mnt_namespaces
What Kees was suggesting was to add a similar sysctl say:
/proc/sys/userns/perf_event_enabled
And have the ability to disable perf events in each user namespace,
while still being able to leave use of perf events enabled by default.
I don't know if any of that is a good fit for perf events.
For purposes of this discussion I assume we are limiting ourselves to
discussing userspace tracing, which semantically is 100% fine for
access by userspace.
Eric
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 2:50 ` Eric W. Biederman
(?)
@ 2016-08-04 9:11 ` Peter Zijlstra
2016-08-04 15:13 ` Eric W. Biederman
-1 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-04 9:11 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
On Wed, Aug 03, 2016 at 09:50:37PM -0500, Eric W. Biederman wrote:
> What this means in practice is user namespaces can be enabled by default
> on a system, and yet you can easily disable them in a sandbox that was
> built with a user namespace.
>
> I named the new sysctls in my patch:
> /proc/sys/userns/max_user_namespaces
> /proc/sys/userns/max_pid_namespaces
> /proc/sys/userns/max_net_namespaces
> /proc/sys/userns/max_uts_namespaces
> /proc/sys/userns/max_ipc_namespaces
> /proc/sys/userns/max_cgroup_namespaces
> /proc/sys/userns/max_mnt_namespaces
>
> What Kees was suggesting was to add a similar sysctl say:
> /proc/sys/userns/perf_event_enabled
>
> And have the ability to disable perf events in each user namespace,
> while still being able to leave use of perf events enabled by default.
>
> I don't know if any of that is a good fit for perf events.
>
> For purposes of this discussion I assume we are limiting ourselves to
> discussing userspace tracing, which semantically is 100% fine for
> access by userspace.
Right, so it's basically a 'root' namespace. Not sure how this would
help, or cover the use-cases with perf, though.
Do they really only care about the sandbox? I can imagine this being
sufficient for Android as that could do these userns thingies for each
app or whatnot. But does this cover the case Debian disabled perf for?
I'm not sure I've ever seen it described _why_ they did it.
So far I'm still liking the new capability bit better, assuming I
understood those right.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 9:11 ` Peter Zijlstra
@ 2016-08-04 15:13 ` Eric W. Biederman
0 siblings, 0 replies; 51+ messages in thread
From: Eric W. Biederman @ 2016-08-04 15:13 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Aug 03, 2016 at 09:50:37PM -0500, Eric W. Biederman wrote:
>
>> What this means in practice is user namespaces can be enabled by default
>> on a system, and yet you can easily disable them in a sandbox that was
>> built with a user namespace.
>>
>> I named the new sysctls in my patch:
>> /proc/sys/userns/max_user_namespaces
>> /proc/sys/userns/max_pid_namespaces
>> /proc/sys/userns/max_net_namespaces
>> /proc/sys/userns/max_uts_namespaces
>> /proc/sys/userns/max_ipc_namespaces
>> /proc/sys/userns/max_cgroup_namespaces
>> /proc/sys/userns/max_mnt_namespaces
>>
>> What Kees was suggesting was to add a similar sysctl say:
>> /proc/sys/userns/perf_event_enabled
>>
>> And have the ability to disable perf events in each user namespace,
>> while still being able to leave use of perf events enabled by default.
>>
>> I don't know if any of that is a good fit for perf events.
>>
>> For purposes of this discussion I assume we are limiting ourselves to
>> discussing userspace tracing, which semantically is 100% fine for
>> access by userspace.
>
> Right, so it's basically a 'root' namespace. Not sure how this would
> help, or cover the use-cases with perf, though.
The bits useful to the perf situation are:
- user namespaces nest.
- anyone can create a user namespace.
- a sysctl can be bound to the userns that takes local privilege to
change so you can't override it arbitrarily.
Which is a long way of saying a user namespace is one way of marking
processes that may or may not use perf.
It was given in this case as an example of something that has been
looked at that appears to solve people's concerns.
Another way to achieve a similar effect is to build something like an
rlimit.
What is attractive to me semantically about something like this is
applications that have perf_event disabled can still be traced with perf.
> Do they really only care about the sandbox? I can imagine this being
> sufficient for Android as that could do these userns thingies for each
> app or whatnot.
So the question is how do we want to apply policy in this case.
If the only concern is that there might be some bug somewhere
in the code that is undiscovered and people who don't use a feature
don't want to have to worry about it, disabling things at the
application level makes sense.
In my mind a sandbox is policy like this that I apply to my own application,
in contrast to a global disable or some other
specific policy that the system administrator applies.
The really important property that I think needs to exist is a finer than
system-wide granularity. A solution that doesn't disable it for everyone
and doesn't disable something by default can then be deployed, while still
allowing the feature to be disabled where people don't want to take
the chance (such as in network-facing daemons like Apache).
> But does this cover the case Debian disabled perf for?
> I'm not sure I've ever seen it described _why_ they did it.
Good question. I suspect someone should ask. Especially since Debian
defaults to 3, meaning perf events are disabled for everyone.
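For reference, the level being discussed can be inspected from userspace. The
following is a small Python sketch, with the caveat that the helper names
read_paranoid and describe_paranoid are made up for this example and that
level 3 only exists on kernels carrying the patch under discussion (as the
Debian and Android kernels mentioned in this thread do); the wording of each
description is a paraphrase, not kernel text.

```python
from pathlib import Path

# Standard location of the perf_event_paranoid sysctl.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid(path=PARANOID_PATH):
    """Return the current level as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_paranoid(level):
    """Summarise what an unprivileged user is denied at a given level."""
    if level >= 3:
        # Only meaningful with the proposed patch applied.
        return "perf_event_open is denied entirely without privilege"
    if level == 2:
        return "kernel profiling is denied; user-space profiling is allowed"
    if level == 1:
        return "CPU-wide events are denied; per-process events are allowed"
    if level == 0:
        return "raw and ftrace function tracepoints are denied"
    return "almost all events are allowed for all users"
```

A caller would typically combine the two, for example
describe_paranoid(read_paranoid()) after checking that the read did not
return None.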
> So far I'm still liking the new capability bit better, assuming I
> understood those right.
Your subsystem, your call. I have never had much luck with capability
bits. They are not particularly flexible, and are hard to get rid of
permanently, since any suid root app gains them all.
But it isn't a particularly easy problem and I don't think we have any
solutions that have stood the test of time for this kind of thing other
than seccomp.
Eric
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 15:13 ` Eric W. Biederman
(?)
@ 2016-08-04 15:37 ` Peter Zijlstra
-1 siblings, 0 replies; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-04 15:37 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, kernel-hardening@lists.openwall.com,
LKML, Jonathan Corbet
On Thu, Aug 04, 2016 at 10:13:29AM -0500, Eric W. Biederman wrote:
> The bits useful to the perf situation are:
> - user namespaces nest.
> - anyone can create a user namespace.
> - a sysctl can be bound to the userns that takes local privilege to
> change so you can't override it arbitrarily.
>
> Which is a long way of saying a user namespace is one way of marking
> processes that may or may not use perf.
>
> It was given in this case as an example of something that has been
> looked at that appears to solve people's concerns.
> What is attractive to me semantically about something like this is
> applications that have perf_event disabled can still be traced with perf.
> > So far I'm still liking the new capability bit better, assuming I
> > understood those right.
>
> Your subsystem, your call. I have never had much luck with capability
> bits. They are not particularly flexible, and are hard to get rid of
> permanently, since any suid root app gains them all.
Right, so I've no experience with any of this.
But from what I understood amluto recently made capabilities much more
useful with: 58319057b784 ("capabilities: ambient capabilities").
And the thing I like is having file capabilities, so even though the
user cannot in general create perf events, we could mark the
/usr/bin/perf executable as having CAP_PERF and allow it to create them,
because it's a 'trusted' executable.
Can something like that be done with userns? Afaiu once you create a
userns with perf disabled, even a nested one cannot re-enable it,
otherwise you cannot create sandboxes.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 17:25 ` Eric W. Biederman
(?)
(?)
@ 2016-08-03 19:36 ` Daniel Micay
2016-08-04 10:28 ` Mark Rutland
-1 siblings, 1 reply; 51+ messages in thread
From: Daniel Micay @ 2016-08-03 19:36 UTC (permalink / raw)
To: kernel-hardening, Kees Cook
Cc: Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
[-- Attachment #1: Type: text/plain, Size: 2606 bytes --]
> One of the strengths of linux is applications of features the authors
> of the software had not imagined. Your proposals seem to be trying to
> put the world in a tiny little box where if someone had not imagined and
> preapproved a use of a feature it should not happen. Let's please
> avoid implementing totalitarianism to avoid malicious code exploiting
> bugs in the kernel. I am not interested in that future.
You're describing operating systems like Android, ChromeOS and iOS.
That future is already here and the Linux kernel is the major weak point
in the attempts to build those systems based on Linux. Even for the very
restricted Chrome sandbox, it's the easiest way out.
Android similarly allows near zero access to /sys for apps and little
access to /proc beyond the /proc/PID directories belonging to an app.
> Especially when dealing with disabling code to reduce attack surface,
> when there are no known attacks what we are actually dealing with
> is a small percentage probability reduction that a malicious attacker
> will be able to exploit the attack.
There are perf events vulnerabilities being exploited in the wild to
gain root on Android. It's not a theoretical attack vector. They're used
in both malware and rooting tools. Local privilege escalation bugs in
the kernel are common so there are a lot of alternatives but it's one of
the major sources for vulnerabilities. There's a lot of architecture and
vendor specific perf events code and lots of bleeding edge features. On
Android, a lot of the perf events vulnerabilities have been specific to
the Qualcomm SoC platform. Other platforms are likely just receiving a
lot less attention.
> Remember security is as much about availability as it is about
> integrity. You keep imagining features that are great big denial of
> service attacks on legitimate users.
Only developers care about perf events and they still have access to it.
JIT compilers have other ways to do tracing and even if they consider
this to be the ideal API, it's not particularly important if they have
to settle for something else. In reality, it's a small compromise.
> I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl.
> Perhaps something else.
It's not possible to use the current incarnation of seccomp for this
since it can't be dynamically granted/revoked. Perhaps it would be
possible to support adding/removing or at least toggling seccomp filters
for groups of processes. That would be good enough to take care of user
ns, ptrace, perf events, etc.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-03 19:36 ` Daniel Micay
@ 2016-08-04 10:28 ` Mark Rutland
2016-08-04 13:45 ` Daniel Micay
0 siblings, 1 reply; 51+ messages in thread
From: Mark Rutland @ 2016-08-04 10:28 UTC (permalink / raw)
To: kernel-hardening
Cc: Kees Cook, Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
On Wed, Aug 03, 2016 at 03:36:16PM -0400, Daniel Micay wrote:
> There's a lot of architecture and vendor specific perf events code and
> lots of bleeding edge features. On Android, a lot of the perf events
> vulnerabilities have been specific to the Qualcomm SoC platform. Other
> platforms are likely just receiving a lot less attention.
Are the relevant perf drivers for those platforms upstream? I've seen no
patches addressing security issues in the ARMv7 krait+Scorpion PMU
driver since it was added, and there's no ARMv8 QCOM PMU driver.
If there are outstanding issues, please report them upstream.
FWIW, I've used Vince Weaver's perf fuzzer to test the ARM PMU code
(both the framework and drivers), so other platforms are seeing some
attention. That said, I haven't done that recently.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 10:28 ` Mark Rutland
@ 2016-08-04 13:45 ` Daniel Micay
2016-08-04 14:11 ` Peter Zijlstra
0 siblings, 1 reply; 51+ messages in thread
From: Daniel Micay @ 2016-08-04 13:45 UTC (permalink / raw)
To: kernel-hardening
Cc: Kees Cook, Peter Zijlstra, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
[-- Attachment #1: Type: text/plain, Size: 1461 bytes --]
On Thu, 2016-08-04 at 11:28 +0100, Mark Rutland wrote:
> On Wed, Aug 03, 2016 at 03:36:16PM -0400, Daniel Micay wrote:
> >
> > There's a lot of architecture and vendor specific perf events code
> > and lots of bleeding edge features. On Android, a lot of the perf
> > events vulnerabilities have been specific to the Qualcomm SoC
> > platform. Other platforms are likely just receiving a lot less
> > attention.
>
> Are the relevant perf drivers for those platforms upstream? I've seen
> no patches addressing security issues in the ARMv7 krait+Scorpion PMU
> driver since it was added, and there's no ARMv8 QCOM PMU driver.
>
> If there are outstanding issues, please report them upstream.
>
> FWIW, I've used Vince Weaver's perf fuzzer to test the ARM PMU code
> (both the framework and drivers), so other platforms are seeing some
> attention. That said, I haven't done that recently.
Qualcomm's perf driver is out-of-tree along with most of their other
drivers. Their drivers add up to a LOT of code shared across over a
billion mobile devices, leading to the focus on them. It also helps that
there are bounties for Nexus devices, so there are multi thousand dollar
rewards for bugs in the Qualcomm drivers compared to nothing for other
platforms / drivers. Now that perf is only available via ADB debugging,
further perf bugs no longer technically qualify for their bounties (but
they might still pay, I don't know).
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 13:45 ` Daniel Micay
@ 2016-08-04 14:11 ` Peter Zijlstra
2016-08-04 15:44 ` Daniel Micay
0 siblings, 1 reply; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-04 14:11 UTC (permalink / raw)
To: Daniel Micay
Cc: kernel-hardening, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
On Thu, Aug 04, 2016 at 09:45:23AM -0400, Daniel Micay wrote:
> Qualcomm's perf driver is out-of-tree along with most of their other
> drivers.
So you're asking us to maim upstream perf for some out of tree junk?
Seriously? *plonk*
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 14:11 ` Peter Zijlstra
@ 2016-08-04 15:44 ` Daniel Micay
2016-08-04 15:55 ` Peter Zijlstra
2016-08-04 16:10 ` Mark Rutland
0 siblings, 2 replies; 51+ messages in thread
From: Daniel Micay @ 2016-08-04 15:44 UTC (permalink / raw)
To: Peter Zijlstra
Cc: kernel-hardening, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
[-- Attachment #1: Type: text/plain, Size: 900 bytes --]
On Thu, 2016-08-04 at 16:11 +0200, Peter Zijlstra wrote:
> On Thu, Aug 04, 2016 at 09:45:23AM -0400, Daniel Micay wrote:
> >
> > Qualcomm's perf driver is out-of-tree along with most of their other
> > drivers.
>
>
> So you're asking us to maim upstream perf for some out of tree junk?
> Seriously? *plonk*
This feature doesn't come from Android. The perf events subsystem in the
mainline kernel is packed full of vulnerabilities too. The problem is so
bad that pointing one of the public fuzzers at it for a short period of
time is all that's required to start finding them.
Qualcomm's drivers might be lower quality than core kernel code, but
they're way above the baseline set by mainline kernel drivers...
Shining the same light on mainline drivers wouldn't be pretty. The work
going into hardening the Qualcomm drivers isn't happening upstream to
any comparable extent.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 15:44 ` Daniel Micay
@ 2016-08-04 15:55 ` Peter Zijlstra
2016-08-04 16:10 ` Mark Rutland
1 sibling, 0 replies; 51+ messages in thread
From: Peter Zijlstra @ 2016-08-04 15:55 UTC (permalink / raw)
To: Daniel Micay
Cc: kernel-hardening, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
On Thu, Aug 04, 2016 at 11:44:28AM -0400, Daniel Micay wrote:
> This feature doesn't come from Android. The perf events subsystem in the
> mainline kernel is packed full of vulnerabilities too.
Uhh, not so much. I spent a _lot_ of time a while back to get the core
and x86 solid. I could run the fuzzers for hours on end at some point.
> The problem is so bad that pointing one of the public fuzzers at it
> for a short period of time is all that's required to start finding
> them.
If you know of any that reproduce on x86 I'll go fix. For anything else
you need to complain elsewhere as I don't have hardware nor bandwidth.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 15:44 ` Daniel Micay
2016-08-04 15:55 ` Peter Zijlstra
@ 2016-08-04 16:10 ` Mark Rutland
2016-08-04 16:32 ` Daniel Micay
1 sibling, 1 reply; 51+ messages in thread
From: Mark Rutland @ 2016-08-04 16:10 UTC (permalink / raw)
To: kernel-hardening
Cc: Peter Zijlstra, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
On Thu, Aug 04, 2016 at 11:44:28AM -0400, Daniel Micay wrote:
> Qualcomm's drivers might be lower quality than core kernel code, but
> they're way above the baseline set by mainline kernel drivers...
I don't think that's true for the arm/arm64 perf code.
I think we've done a reasonable job of testing and fixing those, along
with core infrastructure issues. The perf fuzzer runs for a very long
time on a mainline kernel without issues, while on my Nexus 5x I get a
hard lockup after ~85 seconds (and prior to the last android update the
lockup was instantaneous).
> Shining the same light on mainline drivers wouldn't be pretty. The work
> going into hardening the Qualcomm drivers isn't happening upstream to
> any comparable extent.
From my personal experience (and as above), and talking specifically
about PMU drivers, I think that the opposite is true. This is not to say
there aren't issues; I would not be surprised if there are. But it's
disingenuous to say that mainline code is worse than that which exists
in a vendor kernel when the latter is demonstrably much easier to break
than the former.
If there are issues you are aware of, please report them. If those
issues only exist in non-upstream code, then the applicable concerns are
somewhat different (though certainly still exist).
But please, let's frame the argument to match reality.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 51+ messages in thread
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 16:10 ` Mark Rutland
@ 2016-08-04 16:32 ` Daniel Micay
2016-08-04 17:09 ` Mark Rutland
0 siblings, 1 reply; 51+ messages in thread
From: Daniel Micay @ 2016-08-04 16:32 UTC (permalink / raw)
To: kernel-hardening
Cc: Peter Zijlstra, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
[-- Attachment #1: Type: text/plain, Size: 2015 bytes --]
On Thu, 2016-08-04 at 17:10 +0100, Mark Rutland wrote:
> On Thu, Aug 04, 2016 at 11:44:28AM -0400, Daniel Micay wrote:
> >
> > Qualcomm's drivers might be lower quality than core kernel code, but
> > they're way above the baseline set by mainline kernel drivers...
>
> I don't think that's true for the arm/arm64 perf code.
The baseline architecture support is essentially core kernel code. I
agree it's much better than the SoC vendor code. You're spending a lot
of time auditing, fuzzing and improving the code in general, which is
not true for most drivers. They don't get that attention.
> I think we've done a reasonable job of testing and fixing those, along
> with core infrastructure issues. The perf fuzzer runs for a very long
> time on a mainline kernel without issues, while on my Nexus 5x I get a
> hard lockup after ~85 seconds (and prior to the last android update
> the lockup was instantaneous).
>
> From my personal experience (and as above), and talking specifically
> about PMU drivers, I think that the opposite is true. This is not to
> say there aren't issues; I would not be surprised if there are. But it's
> disingenuous to say that mainline code is worse than that which exists
> in a vendor kernel when the latter is demonstrably much easier to
> break than the former.
I wasn't talking specifically about perf.
> If there are issues you are aware of, please report them. If those
> issues only exist in non-upstream code, then the applicable concerns
> are somewhat different (though certainly still exist).
I'm not going to do volunteer work for a corporation. I've learned that
lesson after spending years doing it.
> But please, let's frame the argument to match reality.
The argument is framed in reality. Stating that it now often takes a few
hours to find a vulnerability with the unaltered, widely known public
perf fuzzer is not impressive. It's really an argument for claiming that
it's a significant security issue.
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 16:32 ` Daniel Micay
@ 2016-08-04 17:09 ` Mark Rutland
2016-08-04 17:36 ` Daniel Micay
0 siblings, 1 reply; 51+ messages in thread
From: Mark Rutland @ 2016-08-04 17:09 UTC (permalink / raw)
To: kernel-hardening
Cc: Peter Zijlstra, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
On Thu, Aug 04, 2016 at 12:32:32PM -0400, Daniel Micay wrote:
> On Thu, 2016-08-04 at 17:10 +0100, Mark Rutland wrote:
> I wasn't talking specifically about perf.
Then this is irrelevant to a discussion about limiting access to the
perf interface.
Hardening drivers in general is a very interesting topic, but it is a
different topic.
> > But please, let's frame the argument to match reality.
>
> The argument is framed in reality. Stating that it now often takes a
> few hours to find a vulnerability with the unaltered, widely known
> public perf fuzzer is not impressive. It's really an argument for
> claiming that it's a significant security issue.
My claim was not that the mainline code was impressively perfect, but
rather that the vendor code was worse, countering a prior claim
otherwise. Hence, reality.
There is certainly much that can be done to improve things, if we discuss
that which is actually applicable.
Thanks,
Mark.
* Re: [kernel-hardening] Re: [PATCH 1/2] security, perf: allow further restriction of perf_event_open
2016-08-04 17:09 ` Mark Rutland
@ 2016-08-04 17:36 ` Daniel Micay
0 siblings, 0 replies; 51+ messages in thread
From: Daniel Micay @ 2016-08-04 17:36 UTC (permalink / raw)
To: kernel-hardening
Cc: Peter Zijlstra, Kees Cook, Jeff Vander Stoep, Ingo Molnar,
Arnaldo Carvalho de Melo, Alexander Shishkin,
linux-doc@vger.kernel.org, LKML, Jonathan Corbet
[-- Attachment #1: Type: text/plain, Size: 548 bytes --]
> My claim was not that the mainline code was impressively perfect, but
> rather that the vendor code was worse, countering a prior claim
> otherwise. Hence, reality.
You're arguing with a straw man.
I was responding to a comment about out-of-tree code, not generic
architecture perf drivers vs. alternative versions by SoC vendors.
Qualcomm and other vendors landing their drivers in mainline would be
nice, but it wouldn't make it inherently higher quality. I don't really
see what it has to do with this, which is why I responded...