Re: [PATCH next v3] iptables: add xt_bpf match

netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Willem de Bruijn <willemb@google.com>
To: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: netfilter-devel <netfilter-devel@vger.kernel.org>
Subject: Re: [PATCH next v3] iptables: add xt_bpf match
Date: Fri, 18 Jan 2013 11:48:34 -0500	[thread overview]
Message-ID: <CA+FuTSeU8m8Hect+2cT6QOmfNH3oJ8mASOat-zn3-3WwJxJNZg@mail.gmail.com> (raw)
In-Reply-To: <20130117235328.GA16224@1984>

On Thu, Jan 17, 2013 at 6:53 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Hi Willem,
>
> On Wed, Jan 09, 2013 at 07:15:44PM -0500, Willem de Bruijn wrote:
>> Changes:
>> - v3: reverted no longer needed changes to x_tables.c
>> - v2: use a fixed size match structure to communicate between
>>       kernel and userspace.
>>
>> Support arbitrary linux socket filter (BPF) programs as iptables
>> match rules. This allows for very expressive filters, and on
>> platforms with BPF JIT appears competitive with traditional hardcoded
>> iptables rules.
>>
>> At least, on an x86_64 that achieves 40K netperf TCP_STREAM without
>> any iptables rules (40 GBps),
>>
>> inserting 100x this bpf rule gives 28K
>>
>>     ./iptables -A OUTPUT -m bpf --bytecode '6,40 0 0 14, 21 0 3 2048,48 0 0 25,21 0 1 20,6 0 0 96,6 0 0 0,' -j
>>
>>     (as generated by tcpdump -i any -ddd ip proto 20 | tr '\n' ',')
>
> That code generated by tcpdump will not work.
>
> tcpdump generates BPF code assuming that offset 0 is the link layer
> header, while iptables considers that offset 0 is the network layer.

Ah, yes, of course. We discussed that earlier. I removed the statement
from the commit message. Such hints belong in the libxt_bpf man page,
if anywhere.

To compile code right now, the little bpf compiler that I emailed
before can be downloaded from
http://code.google.com/p/kernel/downloads/detail?name=bpf2decimal.c

I don't think that a compiler has to be shipped with iptables itself,
let alone make iptables link against libraries. That said,  it is not
impossible to detect pcap.h in configure.ac and optionally enable a
"-m bpf --string" mode that calls pcap_compile_nopcap from within
libxt_bpf, so let me know if you would like me to code that up. I can
also try to send a patch to tcpdump that extends compilation (`-ddd -y
<type>`) to arbitrary link layer types.

> More comments below:
>
>> inserting 100x this u32 rule gives 21K
>>
>>     ./iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP
>>
>> The two are logically equivalent, as far as I can tell. Let me know
>> if my test methodology is flawed in some way. Even in cases where
>> slower, the filter adds functionality currently lacking in iptables,
>> such as access to sk_buff fields like rxhash and queue_mapping.
>> ---
>>  include/uapi/linux/netfilter/xt_bpf.h |   17 ++++++++
>>  net/netfilter/Kconfig                 |    9 ++++
>>  net/netfilter/Makefile                |    1 +
>>  net/netfilter/xt_bpf.c                |   73 +++++++++++++++++++++++++++++++++
>>  4 files changed, 100 insertions(+), 0 deletions(-)
>>  create mode 100644 include/uapi/linux/netfilter/xt_bpf.h
>>  create mode 100644 net/netfilter/xt_bpf.c
>>
>> diff --git a/include/uapi/linux/netfilter/xt_bpf.h b/include/uapi/linux/netfilter/xt_bpf.h
>> new file mode 100644
>> index 0000000..5dda450
>> --- /dev/null
>> +++ b/include/uapi/linux/netfilter/xt_bpf.h
>> @@ -0,0 +1,17 @@
>> +#ifndef _XT_BPF_H
>> +#define _XT_BPF_H
>> +
>> +#include <linux/filter.h>
>> +#include <linux/types.h>
>> +
>> +#define XT_BPF_MAX_NUM_INSTR 64
>> +
>> +struct xt_bpf_info {
>> +     __u16 bpf_program_num_elem;
>> +     struct sock_filter bpf_program[XT_BPF_MAX_NUM_INSTR];
>> +
>> +     /* only used in the kernel */
>> +     struct sk_filter *filter __attribute__((aligned(8)));
>> +};
>> +
>> +#endif /*_XT_BPF_H */
>> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
>> index fefa514..d45720f 100644
>> --- a/net/netfilter/Kconfig
>> +++ b/net/netfilter/Kconfig
>> @@ -798,6 +798,15 @@ config NETFILTER_XT_MATCH_ADDRTYPE
>>         If you want to compile it as a module, say M here and read
>>         <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
>>
>> +config NETFILTER_XT_MATCH_BPF
>> +     tristate '"bpf" match support'
>> +     depends on NETFILTER_ADVANCED
>> +     help
>> +       BPF matching applies a linux socket filter to each packet and
>> +          accepts those for which the filter returns non-zero.
>> +
>> +       To compile it as a module, choose M here.  If unsure, say N.
>> +
>>  config NETFILTER_XT_MATCH_CLUSTER
>>       tristate '"cluster" match support'
>>       depends on NF_CONNTRACK
>> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
>> index 3259697..6d6194525 100644
>> --- a/net/netfilter/Makefile
>> +++ b/net/netfilter/Makefile
>> @@ -98,6 +98,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
>>
>>  # matches
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_ADDRTYPE) += xt_addrtype.o
>> +obj-$(CONFIG_NETFILTER_XT_MATCH_BPF) += xt_bpf.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
>> diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
>> new file mode 100644
>> index 0000000..1bdfab8
>> --- /dev/null
>> +++ b/net/netfilter/xt_bpf.c
>> @@ -0,0 +1,73 @@
>> +/* Xtables module to match packets using a BPF filter.
>> + * Copyright 2013 Google Inc.
>> + * Written by Willem de Bruijn <willemb@google.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/ipv6.h>
>> +#include <linux/filter.h>
>> +#include <net/ip.h>
>> +
>> +#include <linux/netfilter/xt_bpf.h>
>> +#include <linux/netfilter/x_tables.h>
>> +
>> +MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
>> +MODULE_DESCRIPTION("Xtables: BPF filter match");
>> +MODULE_LICENSE("GPL");
>
> Please, add
>
> MODULE_ALIAS("ipt_bpf");
> MODULE_ALIAS("ip6t_bpf");

Done.

> otherwise module auto-loading will not work.
>
>> +
>> +static int bpf_mt_check(const struct xt_mtchk_param *par)
>> +{
>> +     struct xt_bpf_info *info = par->matchinfo;
>> +     struct sock_fprog program;
>> +
>> +     program.len = info->bpf_program_num_elem;
>> +     program.filter = info->bpf_program;
>
> sparse reports a warning here above. I've been trying to find a quick
> solution, please, have a look at it.

Thanks for catching this. Apparently, program.filter is annotated as
__user. A cast made the warning disappear, and is safe as far as I can
see, since it only forces the kernel to be more conservative in
accessing the memory.

>> +     if (sk_unattached_filter_create(&info->filter, &program)) {
>> +             pr_info("bpf: check failed: parse error\n");
>> +             return -EINVAL;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static bool bpf_mt(const struct sk_buff *skb, struct xt_action_param *par)
>> +{
>> +     const struct xt_bpf_info *info = par->matchinfo;
>> +
>> +     return SK_RUN_FILTER(info->filter, skb);
>> +}
>> +
>> +static void bpf_mt_destroy(const struct xt_mtdtor_param *par)
>> +{
>> +     const struct xt_bpf_info *info = par->matchinfo;
>> +     sk_unattached_filter_destroy(info->filter);
>> +}
>> +
>> +static struct xt_match bpf_mt_reg __read_mostly = {
>> +     .name           = "bpf",
>> +     .revision       = 0,
>> +     .family         = NFPROTO_UNSPEC,
>> +     .checkentry     = bpf_mt_check,
>> +     .match          = bpf_mt,
>> +     .destroy        = bpf_mt_destroy,
>> +     .matchsize      = sizeof(struct xt_bpf_info),
>> +     .me             = THIS_MODULE,
>> +};
>> +
>> +static int __init bpf_mt_init(void)
>> +{
>> +     return xt_register_match(&bpf_mt_reg);
>> +}
>> +
>> +static void __exit bpf_mt_exit(void)
>> +{
>> +     xt_unregister_match(&bpf_mt_reg);
>> +}
>> +
>> +module_init(bpf_mt_init);
>> +module_exit(bpf_mt_exit);
>> --
>> 1.7.7.3
>>

next prev parent reply	other threads:[~2013-01-18 16:49 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-05 19:22 [PATCH rfc] netfilter: two xtables matches Willem de Bruijn
2012-12-05 19:22 ` [PATCH 1/2] netfilter: add xt_priority xtables match Willem de Bruijn
2012-12-08  0:04   ` [PATCH] [RFC] netfilter: add xt_skbuff " Willem de Bruijn
2012-12-08  3:23     ` Pablo Neira Ayuso
2012-12-09 20:24       ` Willem de Bruijn
2012-12-09 20:28         ` [PATCH] " Willem de Bruijn
2012-12-05 19:22 ` [PATCH 2/2] netfilter: add xt_bpf " Willem de Bruijn
2012-12-05 19:48   ` Pablo Neira Ayuso
2012-12-05 20:10     ` Willem de Bruijn
2012-12-07 13:16       ` Pablo Neira Ayuso
2012-12-07 16:56         ` Willem de Bruijn
2012-12-08  3:31           ` Pablo Neira Ayuso
2012-12-08 16:02             ` Daniel Borkmann
2012-12-09 21:52             ` [PATCH next] iptables: add xt_bpf match Willem de Bruijn
2013-01-08  3:21               ` Pablo Neira Ayuso
2013-01-09  1:58                 ` Willem de Bruijn
2013-01-09  9:52                   ` Pablo Neira Ayuso
2013-01-10  0:08                     ` Willem de Bruijn
2013-01-10  0:08                       ` [PATCH next v2] " Willem de Bruijn
2013-01-10  0:15                         ` [PATCH next v3] " Willem de Bruijn
2013-01-17 23:53                           ` Pablo Neira Ayuso
2013-01-18 16:48                             ` Willem de Bruijn [this message]
2013-01-18 17:17                               ` [PATCH next] " Willem de Bruijn
2013-01-21 11:28                                 ` Pablo Neira Ayuso
2013-01-21 11:33                                   ` Pablo Neira Ayuso
2013-01-21 11:42                                     ` Florian Westphal
2013-01-21 12:03                                       ` Pablo Neira Ayuso
2013-01-21 16:02                                   ` Willem de Bruijn
2013-01-21 13:44                               ` [PATCH next v3] " Pablo Neira Ayuso
2013-01-22  8:46                                 ` Florian Westphal
2013-01-22  9:46                                   ` Jozsef Kadlecsik
2013-01-22 10:03                                     ` Maciej Żenczykowski
2013-01-22 11:11                                     ` Pablo Neira Ayuso
2013-01-23 15:59                                   ` Willem de Bruijn
2013-01-23 16:21                                     ` Pablo Neira Ayuso
2013-01-23 16:38                                       ` Willem de Bruijn
2013-01-23 18:56                                         ` Pablo Neira Ayuso
2013-02-18  3:44                                           ` [PATCH] utils: bpf_compile Willem de Bruijn
2013-02-20 10:38                                             ` Daniel Borkmann
2013-02-21  4:35                                               ` Willem de Bruijn
2013-02-21 13:43                                                 ` Daniel Borkmann
2013-03-12 15:44                                                   ` [PATCH next] " Willem de Bruijn
2013-04-01 22:20                                                     ` Pablo Neira Ayuso
2013-04-03 15:32                                                       ` Willem de Bruijn
2013-04-04  9:34                                                         ` Pablo Neira Ayuso
2013-02-18  3:52                                           ` [PATCH next v3] iptables: add xt_bpf match Willem de Bruijn
2013-02-24  2:15                                             ` Maciej Żenczykowski
2013-02-27 20:39                                               ` Willem de Bruijn
2012-12-05 19:28 ` [PATCH rfc] netfilter: two xtables matches Willem de Bruijn
2012-12-05 20:00   ` Jan Engelhardt
2012-12-05 21:45     ` Willem de Bruijn
2012-12-05 21:50       ` Willem de Bruijn
2012-12-05 22:35       ` Jan Engelhardt
2012-12-06  5:22     ` Pablo Neira Ayuso
2012-12-06 21:12       ` Willem de Bruijn
2012-12-07  7:22         ` Pablo Neira Ayuso
2012-12-07 13:20         ` Pablo Neira Ayuso
2012-12-07 17:26           ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+FuTSeU8m8Hect+2cT6QOmfNH3oJ8mASOat-zn3-3WwJxJNZg@mail.gmail.com \
    --to=willemb@google.com \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).