From: Alexei Starovoitov <ast@fb.com>
To: Sargun Dhillon <sargun@sargun.me>, Daniel Mack <daniel@zonque.org>
Cc: <htejun@fb.com>, <daniel@iogearbox.net>, <davem@davemloft.net>,
<kafai@fb.com>, <fw@strlen.de>, <pablo@netfilter.org>,
<harald@redhat.com>, <netdev@vger.kernel.org>
Subject: Re: [PATCH v3 2/6] cgroup: add support for eBPF programs
Date: Mon, 5 Sep 2016 15:39:03 -0700
Message-ID: <57CDF407.8020706@fb.com>
In-Reply-To: <20160905214001.GA30050@ircssh.c.rugged-nimbus-611.internal>

On 9/5/16 2:40 PM, Sargun Dhillon wrote:
> On Mon, Sep 05, 2016 at 04:49:26PM +0200, Daniel Mack wrote:
>> Hi,
>>
>> On 08/30/2016 01:04 AM, Sargun Dhillon wrote:
>>> On Fri, Aug 26, 2016 at 09:58:48PM +0200, Daniel Mack wrote:
>>>> This patch adds two sets of eBPF program pointers to struct cgroup:
>>>> one for programs that are directly pinned to a cgroup, and one for
>>>> programs that are effective for it.
>>>>
>>>> To illustrate the logic behind that, assume the following example
>>>> cgroup hierarchy.
>>>>
>>>>   A - B - C
>>>>         \ D - E
>>>>
>>>> If only B has a program attached, it will be effective for B, C, D
>>>> and E. If D then attaches a program itself, that will be effective for
>>>> both D and E, and the program in B will only affect B and C. Only one
>>>> program of a given type is effective for a cgroup.
>>>>
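The inheritance rule quoted above can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct prog`, `struct cg`, `cg_resolve()` and `cg_attach()` are hypothetical names, not the structures or functions from the patch; a single program type is modeled; and the caller is assumed to pass every cgroup in parents-before-children order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct prog { const char *name; };

struct cg {
	struct cg *parent;
	struct prog *pinned;     /* program attached directly to this cgroup */
	struct prog *effective;  /* program that actually applies to this cgroup */
};

/* A cgroup's own program wins; otherwise it inherits its parent's effective one. */
struct prog *cg_resolve(const struct cg *cg)
{
	if (cg->pinned)
		return cg->pinned;
	return cg->parent ? cg->parent->effective : NULL;
}

/*
 * Attach prog to target (or detach with prog == NULL), then recompute every
 * effective pointer. Assumption: all[] lists parents before their children,
 * so a parent's effective pointer is already up to date when a child reads it.
 */
void cg_attach(struct cg *target, struct prog *prog, struct cg **all, size_t n)
{
	assert(target != NULL);
	target->pinned = prog;
	for (size_t i = 0; i < n; i++)
		all[i]->effective = cg_resolve(all[i]);
}
```

In this model the cost of propagation is paid at attach time, and a lookup reads a single `effective` pointer without walking the hierarchy.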
>>> How does this work when running an orchestrator within an orchestrator? The
>>> Docker in Docker / Mesos in Mesos use case, where the top level orchestrator is
>>> observing the traffic, and there is an orchestrator within that also needs to run
>>> it.
>>>
>>> In this case, I'd like to run E's filter, then if it returns 0, D's, and B's,
>>> and so on.
>>
>> Running multiple programs was an idea I had in one of my earlier drafts,
>> but after some discussion, I refrained from it again because potentially
>> walking the cgroup hierarchy on every packet is just too expensive.
>>
> I think you're correct here. Maybe this is something I do with the LSM-attached
> filters, and not for skb filters. Do you think there might be a way to opt-in to
> this option?
>
>>> Is it possible to allow this, either by flattening out the
>>> datastructure (copy a ref to the bpf programs to C and E) or
>>> something similar?
>>
>> That would mean we carry a list of eBPF program pointers of dynamic
>> size. IOW, the deeper inside the cgroup hierarchy, the bigger the list,
>> so it can store a reference to all programs of all of its ancestors.
>>
>> While I think that would be possible, even at some later point, I'd
>> really like to avoid it for the sake of simplicity.
>>
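The flattened layout discussed above, where each cgroup carries references to the programs of all of its ancestors, could look roughly like the following userspace sketch. The names `struct prog`, `struct cg` and `cg_flatten()` are hypothetical and do not come from the patch, and the ordering (own program first, root-most last) is an assumption chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's definitions. */
struct prog { const char *name; };

struct cg {
	struct cg *parent;
	struct prog *pinned;   /* program attached directly to this cgroup */
	struct prog **chain;   /* flattened list: own program, then ancestors' */
	size_t chain_len;
};

/*
 * Rebuild cg->chain from the programs pinned at cg and at every ancestor.
 * Returns 0 on success, -1 if the allocation fails (chain left unchanged).
 */
int cg_flatten(struct cg *cg)
{
	size_t n = 0, i = 0;
	struct prog **chain = NULL;

	assert(cg != NULL);

	/* First pass: count pinned programs on the path to the root. */
	for (const struct cg *p = cg; p; p = p->parent)
		if (p->pinned)
			n++;

	if (n) {
		chain = malloc(n * sizeof(*chain));
		if (!chain)
			return -1;
		/* Second pass: copy references, nearest cgroup first. */
		for (const struct cg *p = cg; p; p = p->parent)
			if (p->pinned)
				chain[i++] = p->pinned;
	}

	free(cg->chain);
	cg->chain = chain;
	cg->chain_len = n;
	return 0;
}
```

The sketch makes the size concern concrete: the list length equals the number of ancestors with a program attached, so it grows with depth in the hierarchy.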
>> Is there any reason why this can't be done in userspace? Compile a
>> program X for A, and overload it with Y, with Y doing the same as X
>> but adding some extra checks? Note that all users of the bpf(2) syscall API
>> will need CAP_NET_ADMIN anyway, so there is no delegation to
>> unprivileged sub-orchestrators or anything like that.
>
> One of the use-cases that's becoming more and more common is
> containers-in-containers. In this, you have a privileged container that's
> running something like build orchestration, and you want to do macro-isolation
> (say, limit access to only that tenant's infrastructure). Then, when the build
> orchestrator runs a build, it may want to monitor and further isolate the tasks
> that run in the build job. This is a side-effect of composing different
> container technologies. Typically you use one system for images, then another
> for orchestration, and the actual program running inside of it can also leverage
> containerization.
>
> Example:
> K8s->Docker->Jenkins Agent->Jenkins Build Job

Frankly, I don't buy this argument, since the above
and other 'examples' of container-in-container look
fake to me. There is a ton of work to be done for such
a scheme to be even remotely feasible. The cgroup+bpf
stuff would be the last on my list to 'fix' for such
deployments. I don't think we should worry about it
at present.