From: Josh Triplett <josh@joshtriplett.org>
To: Alexei Starovoitov <ast@plumgrid.com>
Cc: "David S. Miller" <davem@davemloft.net>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Ingo Molnar <mingo@kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Hannes Frederic Sowa <hannes@stressinduktion.org>,
Eric Dumazet <edumazet@google.com>,
Daniel Borkmann <dborkman@redhat.com>,
Network Development <netdev@vger.kernel.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net] bpf: split eBPF out of NET
Date: Fri, 24 Oct 2014 01:11:39 -0700 [thread overview]
Message-ID: <20141024081139.GA8861@thin> (raw)
In-Reply-To: <CAMEtUuxsk=iLDsD4XXZ8EcurFXgFxD-9iePv=NbBZn+b3YOXJA@mail.gmail.com>
On Thu, Oct 23, 2014 at 10:32:50PM -0700, Alexei Starovoitov wrote:
> On Thu, Oct 23, 2014 at 8:23 PM, Josh Triplett <josh@joshtriplett.org> wrote:
> > On Thu, Oct 23, 2014 at 06:41:08PM -0700, Alexei Starovoitov wrote:
> >> introduce two configs:
> >> - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters
> >> depend on
> >> - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use
> >>
> >> that solves several problems:
> >> - tracing and others that wish to use eBPF don't need to depend on NET.
> >> They can use BPF_SYSCALL to allow loading from userspace or select BPF
> >> to use it directly from kernel in NET-less configs.
> >> - in 3.18 programs cannot be attached to events yet, so don't force it on
> >> - when the rest of eBPF infra is there in 3.19+, it's still useful to
> >> switch it off to minimize kernel size
> >>
> >> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> >
> > Thanks for working on this! A few nits below, but otherwise this looks
> > good to me. Once this gets appropriate reviews from net and bpf folks,
> > please let me know if you want this to go through the net tree, the tiny
> > tree, or some other tree.
>
> Thanks :)
> I've sent it to Dave and marked it as 'net', so it's for
> his net tree. I don't mind if he decides to steer it into net-next
> when it opens, since changing Kconfig is always tricky.
> I just felt that this patch deserves to be in 'net' and in 3.18-rc
Ah, nice; yes, getting it into 3.18-rc would be excellent if possible.
> >> bloat-o-meter on x64 shows:
> >> add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601)
> >
> > Very nice! Please do include the bloat-o-meter stats in the commit
> > message.
>
> I don't think that's necessary. eBPF is in early stages of adoption.
> More things to come, so bloat-o-meter stats will be obsolete
> very quickly.
I don't mean the full list of symbols, just the summary saying this
saves 15k.
> >> +# interpreter that classic socket filters depend on
> >> +config BPF
> >> + boolean
> >
> > s/boolean/bool/
>
> Is there a difference? I thought it's an alias.
It's an alias, but almost everything uses "bool":
~/src/linux$ git grep -w bool -- '*Kconfig*' | wc -l
7064
~/src/linux$ git grep -w boolean -- '*Kconfig*' | wc -l
94
> >> +config BPF_SYSCALL
> >> + bool "Enable bpf() system call" if EXPERT
> >> + select ANON_INODES
> >> + select BPF
> >> + default n
> >> + help
> >> + Enable the bpf() system call that allows to manipulate eBPF
> >> + programs and maps via file descriptors.
> >
> > Not sure this one goes under EXPERT, especially since it currently has
> > "default n".
>
> I followed the same style as EPOLL, EVENTFD and others
> in the same category.
I was thinking of CROSS_MEMORY_ATTACH and FHANDLE in the same file.
> >> +/* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call
> >> + * skb_copy_bits(), so provide a weak definition of it for NET-less config.
> >> + */
> >> +int __weak skb_copy_bits(const struct sk_buff *skb, int offset, void *to,
> >> + int len)
> >> +{
> >> + return -EFAULT;
> >> +}
> >
> > Please discuss this in the commit message. What are the implications of
> > ending up with this implementation that always returns -EFAULT?
>
> because that's what real skb_copy_bits() would return.
> In this case it's actually irrelevant, since non-socket programs
> are not allowed to have LD_ABS/LD_IND instructions and
> I'm only resolving linker error here.
> But returning negative error helps prevent bugs in cases
> where verifier or some in-kernel generated program uses
> LD_ABS by mistake.
Makes sense.
> I don't think these type of explanations are necessary in
> commit logs.
>
> >> @@ -6,7 +6,7 @@ menuconfig NET
> >> bool "Networking support"
> >> select NLATTR
> >> select GENERIC_NET_UTILS
> >> - select ANON_INODES
> >> + select BPF
> >
> > Why does this not need to select ANON_INODES anymore? Did *only* BPF
> > use that, so it only needs to occur via BPF_SYSCALL? If so, can you
> > document that in the commit message?
>
> I hope that folks who were following this work on netdev
> remember commit 38b3629adb8c04 that added it.
> So here I'm actually removing this ANON_INODES dependency
> from NET and moving it into BPF_SYSCALL where it belongs.
Thanks for the clarification.
> btw, the goal of this patch is not tinification, but rather being
> good citizen and not forcing new syscall on everyone.
A critical part of the tinification effort is not having the kernel get
gratuitously bigger in other areas while we're trying to shrink it. So,
I really appreciate your work. :)
- Josh Triplett
next prev parent reply other threads:[~2014-10-24 8:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-24 1:41 [PATCH net] bpf: split eBPF out of NET Alexei Starovoitov
2014-10-24 3:23 ` Josh Triplett
2014-10-24 5:32 ` Alexei Starovoitov
2014-10-24 8:11 ` Josh Triplett [this message]
2014-10-24 8:19 ` Geert Uytterhoeven
2014-10-24 8:37 ` Daniel Borkmann
2014-10-27 23:10 ` David Miller
2014-10-28 0:18 ` Alexei Starovoitov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20141024081139.GA8861@thin \
--to=josh@joshtriplett.org \
--cc=ast@plumgrid.com \
--cc=davem@davemloft.net \
--cc=dborkman@redhat.com \
--cc=edumazet@google.com \
--cc=geert@linux-m68k.org \
--cc=hannes@stressinduktion.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=rostedt@goodmis.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.