Re: [PATCH v3 2/6] cgroup: add support for eBPF programs

From: Alexei Starovoitov <ast@fb.com>
To: Sargun Dhillon <sargun@sargun.me>, Daniel Mack <daniel@zonque.org>
Cc: <htejun@fb.com>, <daniel@iogearbox.net>, <davem@davemloft.net>,
	<kafai@fb.com>, <fw@strlen.de>, <pablo@netfilter.org>,
	<harald@redhat.com>, <netdev@vger.kernel.org>
Subject: Re: [PATCH v3 2/6] cgroup: add support for eBPF programs
Date: Mon, 5 Sep 2016 15:39:03 -0700
Message-ID: <57CDF407.8020706@fb.com>
In-Reply-To: <20160905214001.GA30050@ircssh.c.rugged-nimbus-611.internal>

On 9/5/16 2:40 PM, Sargun Dhillon wrote:
> On Mon, Sep 05, 2016 at 04:49:26PM +0200, Daniel Mack wrote:
>> Hi,
>>
>> On 08/30/2016 01:04 AM, Sargun Dhillon wrote:
>>> On Fri, Aug 26, 2016 at 09:58:48PM +0200, Daniel Mack wrote:
>>>> This patch adds two sets of eBPF program pointers to struct cgroup:
>>>> one for programs that are pinned directly to a cgroup, and one for
>>>> programs that are effective for it.
>>>>
>>>> To illustrate the logic behind that, assume the following example
>>>> cgroup hierarchy.
>>>>
>>>>    A - B - C
>>>>          \ D - E
>>>>
>>>> If only B has a program attached, it will be effective for B, C, D
>>>> and E. If D then attaches a program itself, that will be effective for
>>>> both D and E, and the program in B will only affect B and C. Only one
>>>> program of a given type is effective for a cgroup.
>>>>
>>> How does this work when running an orchestrator within an orchestrator? The
>>> Docker in Docker / Mesos in Mesos use case, where the top level orchestrator is
>>> observing the traffic, and there is an orchestrator within that also needs to run
>>> it.
>>>
>>> In this case, I'd like to run E's filter, then if it returns 0, D's, and B's,
>>> and so on.
>>
>> Running multiple programs was an idea I had in one of my earlier drafts,
>> but after some discussion, I refrained from it again because potentially
>> walking the cgroup hierarchy on every packet is just too expensive.
>>
> I think you're correct here. Maybe this is something I do with the LSM-attached
> filters, and not for skb filters. Do you think there might be a way to opt-in to
> this option?
>
>>> Is it possible to allow this, either by flattening out the
>>> data structure (copy a ref to the bpf programs to C and E) or
>>> something similar?
>>
>> That would mean we carry a list of eBPF program pointers of dynamic
>> size. IOW, the deeper inside the cgroup hierarchy, the bigger the list,
>> so it can store a reference to all programs of all of its ancestors.
>>
>> While I think that would be possible, even at some later point, I'd
>> really like to avoid it for the sake of simplicity.
>>
>> Is there any reason why this can't be done in userspace? Compile a
>> program X for A, and overload it with Y, with Y doing the same as X
>> but adding some extra checks? Note that all users of the bpf(2) syscall API
>> will need CAP_NET_ADMIN anyway, so there is no delegation to
>> unprivileged sub-orchestrators or anything like that really.
>
> One of the use cases that's becoming more and more common is
> containers-in-containers. In this, you have a privileged container that's
> running something like build orchestration, and you want to do macro-isolation
> (say limit access to only that tenant's infrastructure). Then, when the build
> orchestrator runs a build, it may want to monitor, and further isolate the tasks
> that run in the build job. This is a side-effect of composing different
> container technologies. Typically you use one system for images, then another
> for orchestration, and the actual program running inside of it can also leverage
> containerization.
>
> Example:
> K8s->Docker->Jenkins Agent->Jenkins Build Job

Frankly, I don't buy this argument, since the above
and other 'examples' of container-in-container look
fake to me. There is a ton of work to be done for such
a scheme to be even remotely feasible. The cgroup+bpf
stuff would be the last on my list to 'fix' for such
deployments. I don't think we should worry about it
at present.
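
For readers following the thread, the effective-program rule from the
patch description quoted above can be modelled in a few lines. This is a
rough userspace sketch in Python, not kernel code and not the patch's
implementation. The names Cgroup, attach, detach, build_example and
effective_map are hypothetical, and recomputing the effective set for the
whole subtree at attach time is an assumption made to illustrate why no
hierarchy walk is needed per packet.

```python
class Cgroup:
    """Toy stand-in for struct cgroup (hypothetical, for illustration only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.pinned = {}  # program type -> program attached directly here
        # A new cgroup starts with whatever is effective for its parent.
        self.effective = dict(parent.effective) if parent else {}
        if parent is not None:
            parent.children.append(self)


def _propagate(cg, prog_type):
    """Recompute the effective program for cg and its whole subtree."""
    if prog_type in cg.pinned:
        prog = cg.pinned[prog_type]          # own program wins
    elif cg.parent is not None:
        prog = cg.parent.effective.get(prog_type)  # inherit from parent
    else:
        prog = None
    if prog is None:
        cg.effective.pop(prog_type, None)
    else:
        cg.effective[prog_type] = prog
    for child in cg.children:
        _propagate(child, prog_type)


def attach(cg, prog_type, prog):
    """Pin a program to cg; only one program per type can be effective."""
    cg.pinned[prog_type] = prog
    _propagate(cg, prog_type)


def detach(cg, prog_type):
    """Remove the pinned program; the subtree falls back to the ancestor's."""
    cg.pinned.pop(prog_type, None)
    _propagate(cg, prog_type)


def build_example():
    """Build the A - B - C / B - D - E hierarchy from the patch description."""
    a = Cgroup("A")
    b = Cgroup("B", a)
    c = Cgroup("C", b)
    d = Cgroup("D", b)
    e = Cgroup("E", d)
    return {cg.name: cg for cg in (a, b, c, d, e)}


def effective_map(groups, prog_type):
    """Return {cgroup name: effective program or None} for one program type."""
    return {name: cg.effective.get(prog_type) for name, cg in groups.items()}
```

In this model, attaching a program to B makes it effective for B, C, D and
E, and a later attach on D replaces it for D and E only. The lookup on the
hot path is a single read of cg.effective, which is the trade-off the
thread is discussing: cheap per-packet cost, at the price of running only
one program per type.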

Thread overview: 35+ messages
2016-08-26 19:58 [PATCH v3 0/6] Add eBPF hooks for cgroups Daniel Mack
2016-08-26 19:58 ` [PATCH v3 1/6] bpf: add new prog type for cgroup socket filtering Daniel Mack
2016-08-29 22:14   ` Daniel Borkmann
2016-09-05 12:48     ` Daniel Mack
2016-08-26 19:58 ` [PATCH v3 2/6] cgroup: add support for eBPF programs Daniel Mack
2016-08-27  0:03   ` Alexei Starovoitov
2016-09-05 12:47     ` Daniel Mack
2016-08-29 22:42   ` Daniel Borkmann
2016-09-05 12:50     ` Daniel Mack
2016-08-29 23:04   ` Sargun Dhillon
2016-09-05 14:49     ` Daniel Mack
2016-09-05 21:40       ` Sargun Dhillon
2016-09-05 22:39         ` Alexei Starovoitov [this message]
2016-08-26 19:58 ` [PATCH v3 3/6] bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands Daniel Mack
2016-08-27  0:08   ` Alexei Starovoitov
2016-09-05 12:56     ` Daniel Mack
2016-09-05 15:30       ` David Laight
2016-09-05 15:40         ` Daniel Mack
2016-09-05 17:29       ` Joe Perches
2016-08-29 23:00   ` Daniel Borkmann
2016-09-05 12:54     ` Daniel Mack
2016-09-05 13:56       ` Daniel Borkmann
2016-09-05 14:09         ` Daniel Mack
2016-09-05 17:09           ` Daniel Borkmann
2016-09-05 18:32             ` Alexei Starovoitov
2016-09-05 18:43               ` Daniel Mack
2016-08-26 19:58 ` [PATCH v3 4/6] net: filter: run cgroup eBPF ingress programs Daniel Mack
2016-08-29 23:15   ` Daniel Borkmann
2016-08-26 19:58 ` [PATCH v3 5/6] net: core: run cgroup eBPF egress programs Daniel Mack
2016-08-29 22:03   ` Daniel Borkmann
2016-08-29 22:23     ` Sargun Dhillon
2016-09-05 14:22     ` Daniel Mack
2016-09-06 17:14       ` Daniel Borkmann
2016-08-26 19:58 ` [PATCH v3 6/6] samples: bpf: add userspace example for attaching eBPF programs to cgroups Daniel Mack
2016-08-27 13:00 ` [PATCH v3 0/6] Add eBPF hooks for cgroups Rami Rosen
