netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Willem de Bruijn <willemb@google.com>
Cc: netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	edumazet@google.com, davem@davemloft.net, kaber@trash.net
Subject: Re: [PATCH 2/2] netfilter: add xt_bpf xtables match
Date: Wed, 5 Dec 2012 20:48:54 +0100	[thread overview]
Message-ID: <20121205194854.GB28730@1984> (raw)
In-Reply-To: <1354735339-13402-3-git-send-email-willemb@google.com>

Hi Willem,

On Wed, Dec 05, 2012 at 02:22:19PM -0500, Willem de Bruijn wrote:
> A new match that executes sk_run_filter on every packet. BPF filters
> can access skbuff fields that are out of scope for existing iptables
> rules, allow more expressive logic, and on platforms with JIT support
> can even be faster.
> 
> I have a corresponding iptables patch that takes `tcpdump -ddd`
> output, as used in the examples below. The two parts communicate
> using a variable length structure. This is similar to ebt_among,
> but new for iptables.
> 
> Verified functionality by inserting an ip source filter on chain
> INPUT and an ip dest filter on chain OUTPUT and noting that ping
> failed while a rule was active:
> 
> iptables -v -A INPUT -m bpf --bytecode '4,32 0 0 12,21 0 1 $SADDR,6 0 0 96,6 0 0 0,' -j DROP
> iptables -v -A OUTPUT -m bpf --bytecode '4,32 0 0 16,21 0 1 $DADDR,6 0 0 96,6 0 0 0,' -j DROP

I like this BPF idea for iptables.

I made a similar extension time ago, but it was taking a file as
parameter. That file contained in BPF code. I made a simple bison
parser that takes BPF code and put it into the bpf array of
instructions. It would be a bit more intuitive to define a filter and
we can distribute it with iptables.

Let me check on my internal trees, I can put that user-space code
somewhere in case you're interested.

> Evaluated throughput by running netperf TCP_STREAM over loopback on
> x86_64. I expected the BPF filter to outperform hardcoded iptables
> filters when replacing multiple matches with a single bpf match, but
> even a single comparison to u32 appears to do better. Relative to the
> benchmark with no filter applied, rate with 100 BPF filters dropped
> to 81%. With 100 U32 filters it dropped to 55%. The difference sounds
> excessive to me, but was consistent on my hardware. Commands used:
> 
> for i in `seq 100`; do iptables -A OUTPUT -m bpf --bytecode '4,48 0 0 9,21 0 1 20,6 0 0 96,6 0 0 0,' -j DROP; done
> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
> 
> iptables -F OUTPUT
> 
> for i in `seq 100`; do iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP; done
> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
> 
> FYI: perf top
> 
> [bpf]
>     33.94%  [kernel]          [k] copy_user_generic_string
>      8.92%  [kernel]          [k] sk_run_filter
>      7.77%  [ip_tables]       [k] ipt_do_table
> 
> [u32]
>     22.63%  [kernel]          [k] copy_user_generic_string
>     14.46%  [kernel]          [k] memcpy
>      9.19%  [ip_tables]       [k] ipt_do_table
>      8.47%  [xt_u32]          [k] u32_mt
>      5.32%  [kernel]          [k] skb_copy_bits
> 
> The big difference appears to be in memory copying. I have not
> looked into u32, so cannot explain this right now. More interestingly,
> at higher rate, sk_run_filter appears to use as many cycles as u32_mt
> (both traces have roughly the same number of events).
> 
> One caveat: to work independent of device link layer, the filter
> expects DLT_RAW style BPF programs, i.e., those that expect the
> packet to start at the IP layer.
> ---
>  include/linux/netfilter/xt_bpf.h |   17 +++++++
>  net/netfilter/Kconfig            |    9 ++++
>  net/netfilter/Makefile           |    1 +
>  net/netfilter/x_tables.c         |    5 +-
>  net/netfilter/xt_bpf.c           |   88 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 118 insertions(+), 2 deletions(-)
>  create mode 100644 include/linux/netfilter/xt_bpf.h
>  create mode 100644 net/netfilter/xt_bpf.c
> 
> diff --git a/include/linux/netfilter/xt_bpf.h b/include/linux/netfilter/xt_bpf.h
> new file mode 100644
> index 0000000..23502c0
> --- /dev/null
> +++ b/include/linux/netfilter/xt_bpf.h
> @@ -0,0 +1,17 @@
> +#ifndef _XT_BPF_H
> +#define _XT_BPF_H
> +
> +#include <linux/filter.h>
> +#include <linux/types.h>
> +
> +struct xt_bpf_info {
> +	__u16 bpf_program_num_elem;
> +
> +	/* only used in kernel */
> +	struct sk_filter *filter __attribute__((aligned(8)));
> +
> +	/* variable size, based on program_num_elem */
> +	struct sock_filter bpf_program[0];
> +};
> +
> +#endif /*_XT_BPF_H */
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index c9739c6..c7cc0b8 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -798,6 +798,15 @@ config NETFILTER_XT_MATCH_ADDRTYPE
>  	  If you want to compile it as a module, say M here and read
>  	  <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
>  
> +config NETFILTER_XT_MATCH_BPF
> +	tristate '"bpf" match support'
> +	depends on NETFILTER_ADVANCED
> +	help
> +	  BPF matching applies a linux socket filter to each packet and
> +          accepts those for which the filter returns non-zero.
> +
> +	  To compile it as a module, choose M here.  If unsure, say N.
> +
>  config NETFILTER_XT_MATCH_CLUSTER
>  	tristate '"cluster" match support'
>  	depends on NF_CONNTRACK
> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
> index 8e5602f..9f12eeb 100644
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -98,6 +98,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
>  
>  # matches
>  obj-$(CONFIG_NETFILTER_XT_MATCH_ADDRTYPE) += xt_addrtype.o
> +obj-$(CONFIG_NETFILTER_XT_MATCH_BPF) += xt_bpf.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index 8d987c3..26306be 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -379,8 +379,9 @@ int xt_check_match(struct xt_mtchk_param *par,
>  	if (XT_ALIGN(par->match->matchsize) != size &&
>  	    par->match->matchsize != -1) {
>  		/*
> -		 * ebt_among is exempt from centralized matchsize checking
> -		 * because it uses a dynamic-size data set.
> +		 * matches of variable size length, such as ebt_among,
> +		 * are exempt from centralized matchsize checking. They
> +		 * skip the test by setting xt_match.matchsize to -1.
>  		 */
>  		pr_err("%s_tables: %s.%u match: invalid size "
>  		       "%u (kernel) != (user) %u\n",
> diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
> new file mode 100644
> index 0000000..07077c5
> --- /dev/null
> +++ b/net/netfilter/xt_bpf.c
> @@ -0,0 +1,88 @@
> +/* Xtables module to match packets using a BPF filter.
> + * Copyright 2012 Google Inc.
> + * Written by Willem de Bruijn <willemb@google.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/ipv6.h>
> +#include <linux/filter.h>
> +#include <net/ip.h>
> +
> +#include <linux/netfilter/xt_bpf.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
> +MODULE_DESCRIPTION("Xtables: BPF filter match");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS("ipt_bpf");
> +MODULE_ALIAS("ip6t_bpf");
> +
> +static int bpf_mt_check(const struct xt_mtchk_param *par)
> +{
> +	struct xt_bpf_info *info = par->matchinfo;
> +	const struct xt_entry_match *match;
> +	struct sock_fprog program;
> +	int expected_len;
> +
> +	match = container_of(par->matchinfo, const struct xt_entry_match, data);
> +	expected_len = sizeof(struct xt_entry_match) +
> +		       sizeof(struct xt_bpf_info) +
> +		       (sizeof(struct sock_filter) *
> +			info->bpf_program_num_elem);
> +
> +	if (match->u.match_size != expected_len) {
> +		pr_info("bpf: check failed: incorrect length\n");
> +		return -EINVAL;
> +	}
> +
> +	program.len = info->bpf_program_num_elem;
> +	program.filter = info->bpf_program;
> +	if (sk_unattached_filter_create(&info->filter, &program)) {
> +		pr_info("bpf: check failed: parse error\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool bpf_mt(const struct sk_buff *skb, struct xt_action_param *par)
> +{
> +	const struct xt_bpf_info *info = par->matchinfo;
> +
> +	return SK_RUN_FILTER(info->filter, skb);
> +}
> +
> +static void bpf_mt_destroy(const struct xt_mtdtor_param *par)
> +{
> +	const struct xt_bpf_info *info = par->matchinfo;
> +	sk_unattached_filter_destroy(info->filter);
> +}
> +
> +static struct xt_match bpf_mt_reg __read_mostly = {
> +	.name		= "bpf",
> +	.revision	= 0,
> +	.family		= NFPROTO_UNSPEC,
> +	.checkentry	= bpf_mt_check,
> +	.match		= bpf_mt,
> +	.destroy	= bpf_mt_destroy,
> +	.matchsize	= -1, /* skip xt_check_match because of dynamic len */
> +	.me		= THIS_MODULE,
> +};
> +
> +static int __init bpf_mt_init(void)
> +{
> +	return xt_register_match(&bpf_mt_reg);
> +}
> +
> +static void __exit bpf_mt_exit(void)
> +{
> +	xt_unregister_match(&bpf_mt_reg);
> +}
> +
> +module_init(bpf_mt_init);
> +module_exit(bpf_mt_exit);
> -- 
> 1.7.7.3
> 

  reply	other threads:[~2012-12-05 19:48 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-05 19:22 [PATCH rfc] netfilter: two xtables matches Willem de Bruijn
2012-12-05 19:22 ` [PATCH 1/2] netfilter: add xt_priority xtables match Willem de Bruijn
2012-12-08  0:04   ` [PATCH] [RFC] netfilter: add xt_skbuff " Willem de Bruijn
2012-12-08  3:23     ` Pablo Neira Ayuso
2012-12-09 20:24       ` Willem de Bruijn
2012-12-09 20:28         ` [PATCH] " Willem de Bruijn
2012-12-05 19:22 ` [PATCH 2/2] netfilter: add xt_bpf " Willem de Bruijn
2012-12-05 19:48   ` Pablo Neira Ayuso [this message]
2012-12-05 20:10     ` Willem de Bruijn
2012-12-07 13:16       ` Pablo Neira Ayuso
2012-12-07 16:56         ` Willem de Bruijn
2012-12-08  3:31           ` Pablo Neira Ayuso
2012-12-08 16:02             ` Daniel Borkmann
2012-12-09 21:52             ` [PATCH next] iptables: add xt_bpf match Willem de Bruijn
2013-01-08  3:21               ` Pablo Neira Ayuso
2013-01-09  1:58                 ` Willem de Bruijn
2013-01-09  9:52                   ` Pablo Neira Ayuso
2013-01-10  0:08                     ` Willem de Bruijn
2013-01-10  0:08                       ` [PATCH next v2] " Willem de Bruijn
2013-01-10  0:15                         ` [PATCH next v3] " Willem de Bruijn
2013-01-17 23:53                           ` Pablo Neira Ayuso
2013-01-18 16:48                             ` Willem de Bruijn
2013-01-18 17:17                               ` [PATCH next] " Willem de Bruijn
2013-01-21 11:28                                 ` Pablo Neira Ayuso
2013-01-21 11:33                                   ` Pablo Neira Ayuso
2013-01-21 11:42                                     ` Florian Westphal
2013-01-21 12:03                                       ` Pablo Neira Ayuso
2013-01-21 16:02                                   ` Willem de Bruijn
2013-01-21 13:44                               ` [PATCH next v3] " Pablo Neira Ayuso
2013-01-22  8:46                                 ` Florian Westphal
2013-01-22  9:46                                   ` Jozsef Kadlecsik
2013-01-22 10:03                                     ` Maciej Żenczykowski
2013-01-22 11:11                                     ` Pablo Neira Ayuso
2013-01-23 15:59                                   ` Willem de Bruijn
2013-01-23 16:21                                     ` Pablo Neira Ayuso
2013-01-23 16:38                                       ` Willem de Bruijn
2013-01-23 18:56                                         ` Pablo Neira Ayuso
2013-02-18  3:44                                           ` [PATCH] utils: bpf_compile Willem de Bruijn
2013-02-20 10:38                                             ` Daniel Borkmann
2013-02-21  4:35                                               ` Willem de Bruijn
2013-02-21 13:43                                                 ` Daniel Borkmann
2013-03-12 15:44                                                   ` [PATCH next] " Willem de Bruijn
2013-04-01 22:20                                                     ` Pablo Neira Ayuso
2013-04-03 15:32                                                       ` Willem de Bruijn
2013-04-04  9:34                                                         ` Pablo Neira Ayuso
2013-02-18  3:52                                           ` [PATCH next v3] iptables: add xt_bpf match Willem de Bruijn
2013-02-24  2:15                                             ` Maciej Żenczykowski
2013-02-27 20:39                                               ` Willem de Bruijn
2012-12-05 19:28 ` [PATCH rfc] netfilter: two xtables matches Willem de Bruijn
2012-12-05 20:00   ` Jan Engelhardt
2012-12-05 21:45     ` Willem de Bruijn
2012-12-05 21:50       ` Willem de Bruijn
2012-12-05 22:35       ` Jan Engelhardt
2012-12-06  5:22     ` Pablo Neira Ayuso
2012-12-06 21:12       ` Willem de Bruijn
2012-12-07  7:22         ` Pablo Neira Ayuso
2012-12-07 13:20         ` Pablo Neira Ayuso
2012-12-07 17:26           ` Willem de Bruijn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121205194854.GB28730@1984 \
    --to=pablo@netfilter.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kaber@trash.net \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).