From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alexei Starovoitov
Subject: Re: [PATCH v4 net-next 1/3] Extended BPF interpreter and converter
Date: Tue, 4 Mar 2014 09:53:01 -0800
Message-ID:
References: <1393910304-4004-1-git-send-email-ast@plumgrid.com>
 <1393910304-4004-2-git-send-email-ast@plumgrid.com>
 <20140304142824.GA1083@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Cc: Daniel Borkmann, "David S. Miller", Ingo Molnar, Will Drewry,
 Steven Rostedt, Peter Zijlstra, "H. Peter Anvin", Jesse Gross,
 Thomas Gleixner, Masami Hiramatsu, Tom Zanussi, Jovi Zhangwei,
 Eric Dumazet, Linus Torvalds, Andrew Morton, Frederic Weisbecker,
 Arnaldo Carvalho de Melo, Pekka Enberg, Arjan van de Ven,
 Christoph Hellwig, linux-kernel@vger.kernel.org, netdev@vger.kernel.org
To: Hagen Paul Pfeifer
Return-path:
In-Reply-To: <20140304142824.GA1083@localhost.localdomain>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: netdev.vger.kernel.org

On Tue, Mar 4, 2014 at 6:28 AM, Hagen Paul Pfeifer wrote:
> If all issues raised by Daniel are addressed:
>
> Acked-by: Hagen Paul Pfeifer

Thanks!

> But ...
>
>>Future work:
>>
>>0. seccomp
>>
>>1. add extended BPF JIT for x86_64
>>
>>2. add inband old/new demux and extended BPF verifier, so that new programs
>>   can be loaded through old sk_attach_filter() and sk_unattached_filter_create()
>>   interfaces
>>
>>3. tracing filters systemtap-like with extended BPF
>>
>>4. OVS with extended BPF
>>
>>5. nftables with extended BPF
>
> ... this is shit (not your fault). (Jitted) BPF evolved into a direction
> which is just not the right way to do it. You try to fix things, bypass
> architectural shortcomings of BPF, perf issues because and so on.
>
> The right direction is to write a new general purpose in-kernel interpreter
> from scratch. Capability layers should provide a compatible API for BPF and
> seccomp.
> You have the knowledge to do exactly this, you nearly already did
> this - you should start this undertaking!

This insn set evolved over a few years. Initially we had an nft-like
high-level state machine, but it wasn't fast; then kprobe-like pure x86_64,
which was fast but very hard to analyze from a safety point of view. Then a
reduced x86-64 insn set, and finally ebpf.

I think any brand new instruction set will have a steep learning curve,
just because it's all new.
ebpf tries to reuse as much as possible: the opcode encoding is the same,
the instruction size is fixed at 8 bytes, and so on.
Yeah, these restrictions make a few things not 100% optimal, but imo
a common look and feel is more important.

What ebpf already has should be enough to do all of the above 'future work'.
The built-in JIT-ability of ebpf is the key to performance.
The ability to call some kernel functions from ebpf makes it ultimately
extensible. Socket filters and seccomp don't use this feature yet, but
tracing filters will.

Regards,
Alexei

>
> --
> Hagen Paul Pfeifer
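
For readers following the thread, here is a minimal C sketch of what "same opcode encoding, fixed 8-byte instructions" can look like. The classic layout mirrors the UAPI struct sock_filter. The extended layout, the struct names, the SK_* macro names and the helper are hypothetical and only for illustration; the patch itself is the authoritative definition. The class/op/source bit masks and values are the ones classic BPF uses.

```c
#include <assert.h>
#include <stdint.h>

/* Classic BPF instruction, same fields as the UAPI struct sock_filter. */
struct classic_insn {
	uint16_t code;	/* opcode */
	uint8_t  jt;	/* jump offset if true */
	uint8_t  jf;	/* jump offset if false */
	uint32_t k;	/* generic constant */
};

/*
 * ASSUMED extended layout (hypothetical names): still 8 bytes, with an
 * 8-bit opcode, two 4-bit register numbers packed into one byte, a
 * signed offset and a signed immediate.
 */
struct ext_insn {
	uint8_t  code;	/* opcode, same class/op/src bits as classic BPF */
	uint8_t  regs;	/* low nibble: dest reg, high nibble: source reg */
	int16_t  off;	/* signed offset */
	int32_t  imm;	/* signed immediate */
};

/* Both formats keep the fixed 8-byte instruction size. */
static_assert(sizeof(struct classic_insn) == 8, "classic insn is 8 bytes");
static_assert(sizeof(struct ext_insn) == 8, "extended insn is 8 bytes");

/* Opcode decoding with the classic BPF masks and values. */
#define SK_CLASS(code)	((code) & 0x07)
#define SK_OP(code)	((code) & 0xf0)
#define SK_SRC(code)	((code) & 0x08)
#define SK_ALU		0x04	/* instruction class: ALU */
#define SK_ADD		0x00	/* ALU operation: add */
#define SK_X		0x08	/* source operand is a register */

/* Hypothetical helper: encode "dst += src" as one extended instruction. */
static inline struct ext_insn ext_alu_add_reg(unsigned dst, unsigned src)
{
	struct ext_insn insn = {
		.code = SK_ALU | SK_ADD | SK_X,
		.regs = (uint8_t)((dst & 0x0f) | ((src & 0x0f) << 4)),
		.off  = 0,
		.imm  = 0,
	};
	return insn;
}
```

Because the opcode byte keeps the classic class/op/source split, a decoder written against the classic masks can classify an extended instruction without new tables; only the operand fields differ.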