From: Tejun Heo <tj@kernel.org>
To: Michal Hocko <mhocko@kernel.org>
Cc: "Peter Zijlstra" <peterz@infradead.org>,
"Andy Lutomirski" <luto@amacapital.net>,
"David Ahern" <dsahern@gmail.com>,
"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"Andy Lutomirski" <luto@kernel.org>,
"Daniel Mack" <daniel@zonque.org>,
"Mickaël Salaün" <mic@digikod.net>,
"Kees Cook" <keescook@chromium.org>, "Jann Horn" <jann@thejh.net>,
"David S. Miller" <davem@davemloft.net>,
"Thomas Graf" <tgraf@suug.ch>,
"Michael Kerrisk" <mtk.manpages@gmail.com>,
"Linux API" <linux-api@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Network Development" <netdev@vger.kernel.org>
Subject: Re: Potential issues (security and otherwise) with the current cgroup-bpf API
Date: Sun, 15 Jan 2017 20:19:01 -0500
Message-ID: <20170116011901.GH14446@mtj.duckdns.org>
In-Reply-To: <20170103102559.GA30129@dhcp22.suse.cz>
Hello,
Sorry about the delay. Some firefighting followed the holidays.
On Tue, Jan 03, 2017 at 11:25:59AM +0100, Michal Hocko wrote:
> > So from what I understand the proposed cgroup is not in fact
> > hierarchical at all.
> >
> > @TJ, I thought you were enforcing all new cgroups to be properly
> > hierarchical, that would very much include this one.
>
> I would be interested in that as well. We have made that mistake in
> memcg v1 where hierarchy could be disabled for performance reasons and
> that turned out to be major PITA in the end. Why do we want to repeat
> the same mistake here?
Across the different threads on this subject, there have been multiple
explanations but I'll try to sum it up more clearly.
The big issue here is whether this is a cgroup thing or a bpf thing.
I don't think there's anything inherently wrong with one approach or
the other. Forget about the proposed cgroup bpf extensions for a
moment and think about how iptables does cgroups. Whether it's
netcls/netprio in v1 or direct membership matching in v2, it is the
network side testing for cgroup membership one way or the other. The
only part cgroup is involved in is answering that test.
This also holds true for the perf controller. While it is implemented
as a controller, it isn't visible to cgroup users in any way and the
only function it serves is providing the membership test to the perf
subsystem. perf is the one which decides whether and how it is to be
used. cgroup providing a membership test to other subsystems is
completely acceptable and established.
Now coming back to bpf, the current implementation is just that.
Sure, cgroup hosts the rules in its data structures but that isn't
something conceptually relevant. We might as well implement it as a
prefixed hash table from bpf side. Having pointers in struct cgroup
is just a more efficient and easier way of achieving the same result.
In fact, IIUC, this whole thing was born out of discussions around
implementing scalable cgroup membership matching from bpf programs.
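As a rough illustration of what a table kept on the bpf side could look
like, here is a minimal userspace sketch. It is not kernel code and not
taken from the patch set. The names (attach_entry, effective_prog), the
use of cgroup paths as keys, and the "nearest ancestor with a program
wins" policy are all assumptions made for the example, and a linear scan
stands in for the hash lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a cgroup path and the program attached to it. */
struct attach_entry {
    const char *cgroup_path;   /* absolute path, "/" is the root cgroup */
    int prog_id;               /* illustrative program identifier, >= 0 */
};

/* Exact-match lookup of the first `len` bytes of `path`; -1 if absent. */
static int lookup_exact(const struct attach_entry *tbl, size_t n,
                        const char *path, size_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (strlen(tbl[i].cgroup_path) == len &&
            memcmp(tbl[i].cgroup_path, path, len) == 0)
            return tbl[i].prog_id;
    }
    return -1;
}

/*
 * Return the program attached to `path` or, failing that, to its
 * nearest ancestor. Returns -1 if nothing is attached up to the root.
 */
int effective_prog(const struct attach_entry *tbl, size_t n, const char *path)
{
    assert(tbl != NULL || n == 0);
    assert(path != NULL && path[0] == '/');

    size_t len = strlen(path);

    while (len > 1) {
        int id = lookup_exact(tbl, n, path, len);
        if (id >= 0)
            return id;
        /* Drop the last path component to move to the parent. */
        while (len > 1 && path[len - 1] != '/')
            len--;
        if (len > 1)
            len--;             /* also drop the separating '/' */
    }
    return lookup_exact(tbl, n, "/", 1);
}
```

With entries for "/" and "/a/b", a lookup for "/a/b/c" resolves to the
program on "/a/b", while "/a" and "/a/bc" fall back to the one on "/".
Keeping a pointer in struct cgroup gives the same answer without the
repeated lookups.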
So, what's proposed is a proper part of bpf. In terms of
implementation, cgroup helps by hosting the pointers but that doesn't
necessarily affect the conceptual structure of it. Given that, I
don't think it'd be a good idea to add anything to cgroup interface
for this feature. Introspection is great to have but this should be
introspectable together with other bpf programs using the same
mechanism. That's where it belongs.
None of the issues that people have been raising here is actually an
issue if one thinks of it as a part of bpf. Its security model is
exactly the same as that of any other bpf program. Recursive behavior
is exactly the same as how other external cgroup descendant membership
testing works. There is no issue here whatsoever.
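For readers unfamiliar with what a descendant membership test amounts
to, a minimal userspace sketch follows. It compares cgroup paths as
strings, which is an assumption made for illustration; the kernel works
on its own cgroup structures, and the function name
cgroup_path_is_descendant is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if cgroup path `cg` is `ancestor` itself or lies anywhere
 * below it. Both arguments are absolute paths without a trailing slash
 * (the root is "/").
 */
bool cgroup_path_is_descendant(const char *cg, const char *ancestor)
{
    assert(cg != NULL && ancestor != NULL);

    size_t alen = strlen(ancestor);

    /* Every absolute path is under the root. */
    if (alen == 1 && ancestor[0] == '/')
        return cg[0] == '/';

    /* `ancestor` must be a prefix of `cg` ... */
    if (strncmp(cg, ancestor, alen) != 0)
        return false;

    /* ... ending on a component boundary, so "/a/bc" is not under "/a/b". */
    return cg[alen] == '\0' || cg[alen] == '/';
}
```

A rule keyed on "/a/b" then matches tasks in "/a/b" and "/a/b/c" but
not tasks in "/a" or "/a/bc", which is the recursive behavior described
above.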
Now, I'm not claiming that a bpf mechanism which is a proper part of
cgroup isn't attractive. It is, especially with delegation; however,
that is also where we don't quite know how to proceed. This doesn't
have much to do with cgroup. If something is delegatable to non-priv
users and scoped, cgroup's fine with it. If that's not possible, it
simply isn't something which is delegatable, and putting it on cgroup
doesn't change that.
I'm far from being a bpf expert, so I could be wrong here, but I don't
think there's anything fundamental which prevents bpf from being
delegatable. At the same time, bpf is extremely flexible and nobody
has really thought about or worked much on delegating it. If there's
enough need for it, I'm sure we'll eventually get there, but from what
I hear it isn't something we can pull off in a restricted timeframe.
There's nothing which makes the currently implemented mechanism
mutually exclusive with a cgroup controller based one. The hooks are
the expensive part but can be shared; the rest is just about which
programs to execute in what order and how they should be chained.
There are a lot of immediate use cases which can benefit from the
proposed cgroup bpf mechanism and they're all fine with it being a
part of bpf and behaving like any other network mechanism behaves in
terms of configuration and delegation. I don't see a reason why we
would hold back on merging this. All the raised issues come from
mistaking this for a part of cgroup. It isn't. It is a part of
bpf. If we want a bpf cgroup controller, great, but that is a
separate thing.
Thanks.
--
tejun