From mboxrd@z Thu Jan 1 00:00:00 1970
From: Pablo Neira Ayuso
Subject: Re: [PATCH next] iptables: add xt_bpf match
Date: Tue, 8 Jan 2013 04:21:23 +0100
Message-ID: <20130108032123.GA16502@1984>
References: <20121208033111.GB28114@1984>
	<1355089978-24463-1-git-send-email-willemb@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: netfilter-devel@vger.kernel.org
To: Willem de Bruijn
Return-path:
Received: from mail.us.es ([193.147.175.20]:57549 "EHLO mail.us.es"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750934Ab3AHDVd (ORCPT ); Mon, 7 Jan 2013 22:21:33 -0500
Content-Disposition: inline
In-Reply-To: <1355089978-24463-1-git-send-email-willemb@google.com>
Sender: netfilter-devel-owner@vger.kernel.org
List-ID:

Hi Willem,

On Sun, Dec 09, 2012 at 04:52:58PM -0500, Willem de Bruijn wrote:
> Support arbitrary linux socket filter (BPF) programs as iptables
> match rules. This allows for very expressive filters, and on
> platforms with BPF JIT appears competitive with traditional hardcoded
> iptables rules.
>
> At least, on an x86_64 that achieves 40K netperf TCP_STREAM without
> any iptables rules (40 GBps),
>
> inserting 100x this bpf rule gives 28K
>
> ./iptables -A OUTPUT -m bpf --bytecode '6,40 0 0 14, 21 0 3 2048,48 0 0 25,21 0 1 20,6 0 0 96,6 0 0 0,' -j
>
> (as generated by tcpdump -i any -ddd ip proto 20 | tr '\n' ',')
>
> inserting 100x this u32 rule gives 21K
>
> ./iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP
>
> The two are logically equivalent, as far as I can tell. Let me know
> if my test methodology is flawed in some way. Even in cases where
> slower, the filter adds functionality currently lacking in iptables,
> such as access to sk_buff fields like rxhash and queue_mapping.
>
> Signed-off-by: Willem de Bruijn
> ---
>  include/linux/netfilter/xt_bpf.h |   17 +++++++
>  net/netfilter/Kconfig            |    9 ++++
>  net/netfilter/Makefile           |    1 +
>  net/netfilter/x_tables.c         |    5 +-
>  net/netfilter/xt_bpf.c           |   86 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 116 insertions(+), 2 deletions(-)
>  create mode 100644 include/linux/netfilter/xt_bpf.h
>  create mode 100644 net/netfilter/xt_bpf.c
>
> diff --git a/include/linux/netfilter/xt_bpf.h b/include/linux/netfilter/xt_bpf.h
> new file mode 100644
> index 0000000..23502c0
> --- /dev/null
> +++ b/include/linux/netfilter/xt_bpf.h
> @@ -0,0 +1,17 @@
> +#ifndef _XT_BPF_H
> +#define _XT_BPF_H
> +
> +#include
> +#include
> +
> +struct xt_bpf_info {
> +	__u16 bpf_program_num_elem;
> +
> +	/* only used in kernel */
> +	struct sk_filter *filter __attribute__((aligned(8)));

I see. You set match->userspacesize to zero in libxt_bpf to skip the
comparison of that internal struct sk_filter *filter.

> +
> +	/* variable size, based on program_num_elem */
> +	struct sock_filter bpf_program[0];

While testing this I noticed:

iptables -I OUTPUT -m bpf --bytecode \
        '6,40 0 0 14, 21 0 3 2048,48 0 0 25,21 0 1 20,6 0 0 96,6 0 0 0' -j ACCEPT

Note that this works but it should not.

iptables -D OUTPUT -m bpf --bytecode \
        '6,40 0 0 14, 21 0 3 2048,48 0 0 25,21 0 1 20,6 0 0 96,1 0 0 0' -j ACCEPT
                                                               ^

Mind that 1, it's a different filter, but it deletes the previous filter
without problems here.

A quick look at make_delete_mask() in iptables tells me that the
changes you made to userspace to allow variable size matches are not
enough to generate a sane mask (which is fundamental while looking for
a matching rule during the deletion).
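
To make the problem concrete, here is a minimal sketch of a masked rule
comparison. It is not the iptables source. It assumes a simplified model
in which the delete mask marks only the first userspacesize bytes of the
match data as significant and ignores the rest. struct insn, PROG_LEN
and rules_match() are hypothetical names used only for illustration; the
two programs are the ones from the -I and -D commands above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one BPF instruction: code, jt, jf, k. */
struct insn {
	uint16_t code;
	uint8_t  jt;
	uint8_t  jf;
	uint32_t k;
};

#define PROG_LEN 6

/* Program from the -I command. */
static const struct insn prog_inserted[PROG_LEN] = {
	{ 40, 0, 0, 14 }, { 21, 0, 3, 2048 }, { 48, 0, 0, 25 },
	{ 21, 0, 1, 20 }, {  6, 0, 0, 96 },   {  6, 0, 0, 0 },
};

/* Program from the -D command: only the last instruction differs. */
static const struct insn prog_deleted[PROG_LEN] = {
	{ 40, 0, 0, 14 }, { 21, 0, 3, 2048 }, { 48, 0, 0, 25 },
	{ 21, 0, 1, 20 }, {  6, 0, 0, 96 },   {  1, 0, 0, 0 },
};

/*
 * Assumed model of the delete mask: the first `userspacesize` bytes of
 * the match data are significant (0xFF), every later byte is ignored.
 * Returns 1 when both programs agree on all significant bytes.
 */
static int rules_match(const struct insn *a, const struct insn *b,
		       size_t userspacesize)
{
	unsigned char mask[PROG_LEN * sizeof(struct insn)];
	const unsigned char *pa = (const unsigned char *)a;
	const unsigned char *pb = (const unsigned char *)b;
	size_t i;

	assert(userspacesize <= sizeof(mask));
	memset(mask, 0x00, sizeof(mask));
	memset(mask, 0xFF, userspacesize);

	for (i = 0; i < sizeof(mask); i++)
		if ((pa[i] ^ pb[i]) & mask[i])
			return 0;
	return 1;
}
```

Under this model, rules_match(prog_inserted, prog_deleted, 0) reports a
match because no byte of the bytecode is significant, while passing
sizeof(prog_inserted) makes the differing last instruction visible. So
the mask would have to cover the variable-size bytecode, and still skip
the kernel-only pointer, for deletion to tell the two rules apart.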