From: Josh Triplett <josh@joshtriplett.org>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Eric Dumazet <edumazet@google.com>,
Daniel Borkmann <dborkman@redhat.com>,
Network Development <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net] bpf: split eBPF out of NET
Date: Fri, 24 Oct 2014 01:11:39 -0700 [thread overview]
Message-ID: <20141024081139.GA8861@thin> (raw)
In-Reply-To: <CAMEtUuxsk=iLDsD4XXZ8EcurFXgFxD-9iePv=NbBZn+b3YOXJA@mail.gmail.com>
On Thu, Oct 23, 2014 at 10:32:50PM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 23, 2014 at 8:23 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> > On Thu, Oct 23, 2014 at 06:41:08PM -0700, Alexei Starovoitov wrote:
> >> introduce two configs:
> >> - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
> >> depend on
> >> - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use
> >>
> >> that solves several problems:
> >> - tracing and others that wish to use eBPF don't need to depend on NET.
> >> They can use BPF_SYSCALL to allow loading from userspace or select BPF
> >> to use it directly from kernel in NET-less configs.
> >> - in 3.18 programs cannot be attached to events yet, so don't force it on
> >> - when the rest of eBPF infra is there in 3.19+, it's still useful to
> >> switch it off to minimize kernel size
> >>
> >> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> >
> > Thanks for working on this! A few nits below, but otherwise this looks
> > good to me. Once this gets appropriate reviews from net and bpf folks,
> > please let me know if you want this to go through the net tree, the tiny
> > tree, or some other tree.
>
> Thanks :)
> I've sent it to Dave and marked it as 'net', so it's for
> his net tree. I don't mind if he decides to steer it into net-next
> when it opens, since changing Kconfig is always tricky.
> I just felt that this patch deserves to be in 'net' and in 3.18-rc
Ah, nice; yes, getting it into 3.18-rc would be excellent if possible.
> >> bloat-o-meter on x64 shows:
> >> add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)
> >
> > Very nice! Please do include the bloat-o-meter stats in the commit
> > message.
>
> I don't think that's necessary. eBPF is in early stages of adoption.
> More things to come, so bloat-o-meter stats will be obsolete
> very quickly.
I don't mean the full list of symbols, just the summary saying this
saves 15k.
> >> +# interpreter that classic socket filters depend on
> >> +config BPF
> >> + boolean
> >
> > s/boolean/bool/
>
> Is there a difference? I thought it's an alias.
It's an alias, but almost everything uses "bool":
~/src/linux$ git grep -w bool -- '*Kconfig*' | wc -l
7064
~/src/linux$ git grep -w boolean -- '*Kconfig*' | wc -l
94
> >> +config BPF_SYSCALL
> >> + bool "Enable bpf() system call" if EXPERT
> >> + select ANON_INODES
> >> + select BPF
> >> + default n
> >> + help
> >> + Enable the bpf() system call that allows to manipulate eBPF
> >> + programs and maps via file descriptors.
> >
> > Not sure this one goes under EXPERT, especially since it currently has
> > "default n".
>
> I followed the same style as EPOLL, EVENTFD and others
> in the same category.
I was thinking of CROSS_MEMORY_ATTACH and FHANDLE in the same file.
> >> +/* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call
> >> + * skb_copy_bits(), so provide a weak definition of it for NET-less config.
> >> + */
> >> +int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
> >> + int len)
> >> +{
> >> + return -EFAULT;
> >> +}
> >
> > Please discuss this in the commit message. What are the implications of
> > ending up with this implementation that always returns -EFAULT?
>
> because that's what real skb_copy_bits() would return.
> In this case it's actually irrelevant, since non-socket programs
> are not allowed to have LD_ABS/LD_IND instructions and
> I'm only resolving linker error here.
> But returning negative error helps prevent bugs in cases
> where verifier or some in-kernel generated program uses
> LD_ABS by mistake.
Makes sense.
> I don't think these type of explanations are necessary in
> commit logs.
>
> >> @@ -6,7 +6,7 @@ menuconfig NET
> >> bool "Networking support"
> >> select NLATTR
> >> select GENERIC_NET_UTILS
> >> - select ANON_INODES
> >> + select BPF
> >
> > Why does this not need to select ANON_INODES anymore? Did *only* BPF
> > use that, so it only needs to occur via BPF_SYSCALL? If so, can you
> > document that in the commit message?
>
> I hope that folks who were following this work on netdev
> remember commit 38b3629adb8c04 that added it.
> So here I'm actually removing this ANON_INODES dependency
> from NET and moving it into BPF_SYSCALL where it belongs.
Thanks for the clarification.
> btw, the goal of this patch is not tinification, but rather being
> good citizen and not forcing new syscall on everyone.
A critical part of the tinification effort is not having the kernel get
gratuitously bigger in other areas while we're trying to shrink it. So,
I really appreciate your work. :)
- Josh Triplett
next prev parent reply other threads:[~2014-10-24 8:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-24 1:41 [PATCH net] bpf: split eBPF out of NET Alexei Starovoitov
2014-10-24 3:23 ` Josh Triplett
2014-10-24 5:32 ` Alexei Starovoitov
2014-10-24 8:11 ` Josh Triplett [this message]
2014-10-24 8:19 ` Geert Uytterhoeven
2014-10-24 8:37 ` Daniel Borkmann
2014-10-27 23:10 ` David Miller
2014-10-28 0:18 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141024081139.GA8861@thin \
--to=josh@joshtriplett.org \
--cc=ast@plumgrid.com \
--cc=davem@davemloft.net \
--cc=dborkman@redhat.com \
--cc=edumazet@google.com \
--cc=geert@linux-m68k.org \
--cc=hannes@stressinduktion.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).