From: Stephen Hemminger <stephen@networkplumber.org>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: alexei.starovoitov@gmail.com, netdev@vger.kernel.org
Subject: Re: [PATCH iproute2 -next] {f,m}_bpf: allow for sharing maps
Date: Mon, 23 Nov 2015 16:11:48 -0800
Message-ID: <20151123161148.7553cecb@xeon-e3>
In-Reply-To: <fb36dba8b867c3fce59560e2a8b295c0ae13a24c.1447323729.git.daniel@iogearbox.net>
On Fri, 13 Nov 2015 00:39:29 +0100
Daniel Borkmann <daniel@iogearbox.net> wrote:
> This larger work addresses one of the bigger remaining issues on
> tc's eBPF frontend, that is, to allow for persistent file descriptors.
> Whenever tc parses the ELF object, extracts and loads maps into the
> kernel, these file descriptors will be out of reach after the tc
> instance exits.
>
> Meaning, for simple (unnested) programs which contain one or
> multiple maps, the kernel holds a reference, and they will live
> on inside the kernel until the program holding them is unloaded,
> but they will be out of reach for user space, even worse with
> (also multiple nested) tail calls.
>
> For this issue, we introduced the concept of an agent that can
> receive the set of file descriptors from the tc instance creating
> them, in order to be able to further inspect/update map data for
> a specific use case. However, while that is more tied towards
> specific applications, it still doesn't easily allow for sharing
> maps across multiple tc instances and would require a daemon to
> be running in the background. F.e. when a map should be shared by
> two eBPF programs, one attached to ingress, one to egress, this
> currently doesn't work with the tc frontend.
>
> This work solves exactly that, i.e. if requested, maps can now be
> _arbitrarily_ shared between object files (PIN_GLOBAL_NS) or within
> a single object (but various program sections, PIN_OBJECT_NS) without
> "losing" the file descriptor set. To make that happen, we use eBPF
> object pinning introduced in kernel commit b2197755b263 ("bpf: add
> support for persistent maps/progs") for exactly this purpose.
>
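To illustrate the two pinning scopes described above, here is a minimal sketch in plain C of how a loader might derive the pin location for a map from its pinning mode. The numeric values of the PIN_* constants, the simplified struct map_spec, the helper map_pin_path(), the obj_id argument and the /sys/fs/bpf/tc/... directory layout are all assumptions made for illustration; they are not taken from the patch itself.

```c
#include <stdio.h>
#include <stdint.h>

/* Pinning modes named in the patch description. The numeric values
 * are an assumption for this sketch. */
enum pin_mode { PIN_NONE = 0, PIN_OBJECT_NS = 1, PIN_GLOBAL_NS = 2 };

/* Hypothetical, simplified description of a map as a loader might
 * see it after parsing the ELF object. */
struct map_spec {
	const char *name;
	uint32_t type;
	uint32_t size_key;
	uint32_t size_value;
	uint32_t max_elem;
	uint32_t pinning;
};

/* Hypothetical helper: write the pin path for map m into buf.
 * Returns 0 if a path was written, 1 if the map is not pinned
 * (PIN_NONE), and -1 on an unknown mode or if buf is too small.
 * The directory layout below is assumed, not confirmed. */
static int map_pin_path(const struct map_spec *m, const char *obj_id,
			char *buf, size_t len)
{
	int n;

	switch (m->pinning) {
	case PIN_NONE:
		/* Private map: every load creates a fresh instance. */
		return 1;
	case PIN_OBJECT_NS:
		/* Shared only between sections of the same object. */
		n = snprintf(buf, len, "/sys/fs/bpf/tc/%s/%s", obj_id, m->name);
		break;
	case PIN_GLOBAL_NS:
		/* Shared between arbitrary object files. */
		n = snprintf(buf, len, "/sys/fs/bpf/tc/globals/%s", m->name);
		break;
	default:
		return -1;
	}

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Under these assumptions, a second tc invocation that computes the same path for a map would find the existing pinned object and reuse its file descriptor rather than creating a new map. A PIN_NONE map never gets a path, so it is always created anew.
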
> The shipped examples/bpf/bpf_shared.c code from this patch can be
> easily applied, for instance, as:
>
> - classifier-classifier shared:
>
> tc filter add dev foo parent 1: bpf obj shared.o sec egress
> tc filter add dev foo parent ffff: bpf obj shared.o sec ingress
>
> - classifier-action shared (here: late binding to a dummy classifier):
>
> tc actions add action bpf obj shared.o sec egress pass index 42
> tc filter add dev foo parent ffff: bpf obj shared.o sec ingress
> tc filter add dev foo parent 1: bpf bytecode '1,6 0 0 4294967295,' \
> action bpf index 42
>
> The toy example increments a shared counter on egress and dumps its
> value on ingress (had no sharing (PIN_NONE) been chosen, the map value
> would of course be 0, since two separate map instances are created):
>
> [...]
> <idle>-0 [002] ..s. 38264.788234: : map val: 4
> <idle>-0 [002] ..s. 38264.788919: : map val: 4
> <idle>-0 [002] ..s. 38264.789599: : map val: 5
> [...]
>
> ... thus if both sections reference the pinned map(s) in question,
> tc will take care of fetching the appropriate file descriptor.
>
> The patch has been tested extensively on both the classifier and
> action sides.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Applied to net-next branch