From: Tejun Heo <tj@kernel.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Michal Hocko" <mhocko@kernel.org>,
"Peter Zijlstra" <peterz@infradead.org>,
"David Ahern" <dsahern@gmail.com>,
"Alexei Starovoitov" <alexei.starovoitov@gmail.com>,
"Andy Lutomirski" <luto@kernel.org>,
"Daniel Mack" <daniel@zonque.org>,
"Mickaël Salaün" <mic@digikod.net>,
"Kees Cook" <keescook@chromium.org>, "Jann Horn" <jann@thejh.net>,
"David S. Miller" <davem@davemloft.net>,
"Thomas Graf" <tgraf@suug.ch>,
"Michael Kerrisk" <mtk.manpages@gmail.com>,
"Linux API" <linux-api@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Network Development" <netdev@vger.kernel.org>
Subject: Re: Potential issues (security and otherwise) with the current cgroup-bpf API
Date: Wed, 18 Jan 2017 14:41:20 -0800 [thread overview]
Message-ID: <20170118224120.GG9171@mtj.duckdns.org> (raw)
In-Reply-To: <CALCETrW+ZQ1xMEfdEOzx6RK9ZoCvQiqQSnOyhDJ=-FZMwUbucg@mail.gmail.com>
Helloo, Andy.
On Mon, Jan 16, 2017 at 09:18:36PM -0800, Andy Lutomirski wrote:
> Perhaps this is a net feature, though, not a cgroup feature. This
> would IMO make a certain amount of sense. Iptables cgroup matches,
> for example, logically are an iptables (i.e., net) feature. The
Yeap.
> problem here is that network configuration (and this explicitly
> includes iptables) is done on a per-netns level, whereas these new
> hooks entirely ignore network namespaces. I've pointed out that
> trying to enable them in a namespaced world is problematic (e.g.
> switching to ns_capable() will cause serious problems), and Alexei
> seems to think that this will never happen. So I don't think we can
> really call this a net feature that works the way that other net
> features work.
>
> (Suppose that this actually tried to be netns-enabled. Then you'd
> have to map from (netns, cgroup) -> hook, and this would probably be
> slow and messy.)
I'm not sure how important netns support for these bpf programs
actually is but I have no objection to making it behave in an
equivalent manner to iptables where appropriate. This is kinda
trivial to do too. For now, we can simply not run the programs if the
associated socket belongs to non-root netns. Later, if necessary, we
can add per-ns bpf programs. And I can't think of a reason why
(netns, cgroup) -> function mapping would be slow and messy. Why
would that be?
> So I think that leaves it being a cgroup feature. And it definitely
> does *not* play by the rules of cgroups right now.
Because this in no way is a cgroup feature. As you wrote above, it is
something similar to iptables with lacking netns considerations.
Let's address that and make it a proper network thing.
> > I'm sure we'll
> > eventually get there but from what I hear it isn't something we can
> > pull off in a restricted timeframe.
>
> To me, this sounds like "the API isn't that great, we know how to do
> better, but we really really want this feature ASAP so we want to ship
> it with a sub-par API." I think that's a bad idea.
I see your point but this isn't something which is black and white.
There are arguments for both directions. I'm leaning the other
direction because
* My impression with bpf is that while delegation is something
theoretically possible it is not something which is gonna take place
in any immediate time frame. If I'm wrong on this, please feel free
to correct me.
* There are a lot of use cases which can immediately take advantage of
cgroup-aware bpf programs operating as proposed. The model is
semantically equivalent to iptables (let's address the netns part if
that's an issue) which net people are familiar with.
* It isn't exclusive with adding cgroup bpf controller down the road
if necessary and practical. Sure, this isn't the perfect situation
but it seems like an acceptable trade-off to me. What ever is
perfect?
> > This also holds true for the perf controller. While it is implemented
> > as a controller, it isn't visible to cgroup users in any way and the
> > only function it serves is providing the membership test to perf
> > subsystem. perf is the one which decides whether and how it is to be
> > used. cgroup providing membership test to other subsystems is
> > completely acceptable and established.
>
> Unless I'm mistaken, "perf_event" is an actual cgroup controller, and
Yeah, it is implemented as a controller but in essence what it does is
just tagging the hierarchy to tell perf to use that hierarchy for
membership test purposes.
> it explicitly respects the cgroup hierarchy. It shows up in
> /proc/cgroups, and I had no problem mounting a cgroupfs instance with
> perf_event enabled. So I'm not sure what you mean.
That all it's doing is providing membership information.
Thanks.
--
tejun
next prev parent reply other threads:[~2017-01-18 22:41 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-01-17 5:18 Andy Lutomirski
2017-01-18 22:41 ` Tejun Heo [this message]
2017-01-19 0:18 ` Potential issues (security and otherwise) with the current cgroup-bpf API Andy Lutomirski
2017-01-19 0:59 ` Tejun Heo
2017-01-19 2:29 ` Andy Lutomirski
2017-01-20 2:39 ` Alexei Starovoitov
2017-01-20 4:04 ` Andy Lutomirski
2017-01-23 4:31 ` Alexei Starovoitov
2017-01-23 20:20 ` Andy Lutomirski
2017-02-03 21:07 ` Andy Lutomirski
2017-02-03 23:21 ` Alexei Starovoitov
2017-02-04 17:10 ` Andy Lutomirski
2017-01-19 1:01 ` Mickaël Salaün
-- strict thread matches above, loose matches on Subject: below --
2016-12-17 18:18 Andy Lutomirski
2016-12-17 19:26 ` Mickaël Salaün
2016-12-17 20:02 ` Andy Lutomirski
2016-12-19 20:56 ` Alexei Starovoitov
2016-12-19 21:23 ` Andy Lutomirski
2016-12-20 0:02 ` Alexei Starovoitov
2016-12-20 0:25 ` Andy Lutomirski
2016-12-20 1:43 ` Andy Lutomirski
2016-12-20 1:44 ` David Ahern
2016-12-20 1:56 ` Andy Lutomirski
2016-12-20 2:52 ` David Ahern
2016-12-20 3:12 ` Andy Lutomirski
2016-12-20 4:44 ` Alexei Starovoitov
2016-12-20 5:27 ` Andy Lutomirski
2016-12-20 5:32 ` Alexei Starovoitov
2016-12-20 9:11 ` Peter Zijlstra
2017-01-03 10:25 ` Michal Hocko
2017-01-16 1:19 ` Tejun Heo
2017-01-17 13:03 ` Michal Hocko
2017-01-17 13:32 ` Peter Zijlstra
2017-01-17 13:58 ` Michal Hocko
2017-01-17 20:23 ` Andy Lutomirski
2017-01-18 22:18 ` Tejun Heo
2017-01-19 9:00 ` Michal Hocko
2016-12-20 3:18 ` Alexei Starovoitov
2016-12-20 3:50 ` Andy Lutomirski
2016-12-20 4:41 ` Alexei Starovoitov
2016-12-20 10:21 ` Daniel Mack
2016-12-20 17:23 ` Andy Lutomirski
2016-12-20 18:36 ` Daniel Mack
2016-12-20 18:49 ` Andy Lutomirski
2016-12-21 4:01 ` Alexei Starovoitov
2016-12-20 1:34 ` David Miller
2016-12-20 1:40 ` Andy Lutomirski
2016-12-20 4:51 ` Alexei Starovoitov
2016-12-20 5:26 ` Andy Lutomirski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170118224120.GG9171@mtj.duckdns.org \
--to=tj@kernel.org \
--cc=alexei.starovoitov@gmail.com \
--cc=daniel@zonque.org \
--cc=davem@davemloft.net \
--cc=dsahern@gmail.com \
--cc=jann@thejh.net \
--cc=keescook@chromium.org \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=luto@kernel.org \
--cc=mhocko@kernel.org \
--cc=mic@digikod.net \
--cc=mtk.manpages@gmail.com \
--cc=netdev@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=tgraf@suug.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox