Netdev List
 help / color / mirror / Atom feed
* Re: [RFC PATCH 2/2] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-05 16:00 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S. Tsirkin, netdev, linux-security-module, selinux
In-Reply-To: <2433879.zRVUYBGg1f@jason-thinkpad-t430s>

On Wednesday, December 05, 2012 10:01:31 PM Jason Wang wrote:
> On Wednesday, December 05, 2012 01:44:55 PM Michael S. Tsirkin wrote:
> > On Wed, Dec 05, 2012 at 02:19:22PM +0800, Jason Wang wrote:
> > > On 12/05/2012 02:17 AM, Paul Moore wrote:
> > > > On Tuesday, December 04, 2012 07:36:26 PM Michael S. Tsirkin wrote:
> > > >> On Tue, Dec 04, 2012 at 11:18:57AM -0500, Paul Moore wrote:
> > > >>> Okay, based on your explanation of TUNSETQUEUE, the steps below are
> > > >>> what I
> > > >>> believe we need to do ... if you disagree speak up quickly please.
> > > >>> 
> > > >>> A. TUNSETIFF (new, non-persistent device)
> > > >>> 
> > > >>> [Allocate and initialize the tun_struct LSM state based on the
> > > >>> calling
> > > >>> process, use this state to label the TUN socket.]
> > > >>> 
> > > >>> 1. Call security_tun_dev_create() which authorizes the action.
> > > >>> 2. Call security_tun_dev_alloc_security() which allocates the
> > > >>> tun_struct
> > > >>> LSM blob and SELinux sets some internal blob state to record the
> > > >>> label
> > > >>> of
> > > >>> the calling process.
> > > >>> 3. Call security_tun_dev_attach() which sets the label of the TUN
> > > >>> socket
> > > >>> to match the label stored in the tun_struct LSM blob during A2.  No
> > > >>> authorization is done at this point since the socket is
> > > >>> new/unlabeled.
> > > >>> 
> > > >>> B. TUNSETIFF (existing, persistent device)
> > > >>> 
> > > >>> [Relabel the existing tun_struct LSM state based on the calling
> > > >>> process,
> > > >>> use this state to label the TUN socket.]
> > > >>> 
> > > >>> 1. Attempt to relabel/reset the tun_struct LSM blob from the
> > > >>> currently
> > > >>> stored value, set during A2, to the label of the current calling
> > > >>> process.
> > > >>> *** THIS IS NOT CURRENTLY DONE IN THE RFC PATCH ***
> > > >>> 2. Call security_tun_dev_attach() which sets the label of the TUN
> > > >>> socket
> > > >>> to match the label stored in the tun_struct LSM blob during B1. No
> > > >>> authorization is done at this point since the socket is
> > > >>> new/unlabeled.
> > > >>> 
> > > >>> C. TUNSETQUEUE
> > > >>> 
> > > >>> [Use the existing tun_struct LSM state to label the new TUN socket.]
> > > >>> 
> > > >>> 1. Call security_tun_dev_attach() which sets the label of the TUN
> > > >>> socket
> > > >>> to match the label stored in the tun_struct LSM blob set during
> > > >>> either
> > > >>> A2
> > > >>> or B1. No authorization is done at this point since the socket is
> > > >>> new/unlabeled.
> > > >> 
> > > >> Here's what bothers me. libvirt currently opens tun and passes
> > > >> fd to qemu. What would prevent qemu from attaching fd using
> > > >> TUNSETQUEUE
> > > >> to another device it does not own?
> > > > 
> > > > True, assuming all the above is correct and that I'm understanding it
> > > > correctly (Jason?), we should probably add a new SELinux access
> > > > control
> > > > for
> > > > TUNSETQUEUE.
> > > 
> > > Yes, we need make sure qemu can call TUNSETQUEUE for the device it does
> > > not own.
> > 
> > Meaning can *not* call?
> 
> Sorry for not being clear, I mean qemu can call TUNSETQUEUE for the device
> it owns and for the device it does not own, it can't call.

Okay, let me add a access control for TUNSETQUEUE and I'll post an updated 
patchset later today.

-- 
paul moore
security and virtualization @ redhat


^ permalink raw reply

* Re: [RFC PATCH 1/2] tun: correctly report an error in tun_flow_init()
From: Paul Moore @ 2012-12-05 16:02 UTC (permalink / raw)
  To: jasowang; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <20121129220629.30020.99947.stgit@sifl>

On Thursday, November 29, 2012 05:06:29 PM Paul Moore wrote:
> On error, the error code from tun_flow_init() is lost inside
> tun_set_iff(), this patch fixes this by assigning the tun_flow_init()
> error code to the "err" variable which is returned by
> the tun_flow_init() function on error.
> 
> Signed-off-by: Paul Moore <pmoore@redhat.com>

Jason, we've had some good discussion around patch 2/2 but nothing on this 
fix; can I assume you are okay with this patch?  If so I think we should go 
ahead and apply this ...

> ---
>  drivers/net/tun.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 607a3a5..877ffe2 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1605,7 +1605,8 @@ static int tun_set_iff(struct net *net, struct file
> *file, struct ifreq *ifr)
> 
>  		tun_net_init(dev);
> 
> -		if (tun_flow_init(tun))
> +		err = tun_flow_init(tun);
> +		if (err < 0)
>  			goto err_free_dev;
> 
>  		dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST |
-- 
paul moore
security and virtualization @ redhat


^ permalink raw reply

* RE: kernel BUG at /build/buildd/linux-2.6.32/mm/mempolicy.c
From: Devendra C @ 2012-12-05 16:46 UTC (permalink / raw)
  To: Bokhan Artem, linux-kernel@vger.kernel.org, Borislav Petkov (
  Cc: netdev@vger.kernel.org
In-Reply-To: <50BF66A0.1050503@eml.ru>

> Date: Wed, 5 Dec 2012 22:22:08 +0700
> From: art@eml.ru
> To: linux-kernel@vger.kernel.org
> Subject: kernel BUG at /build/buildd/linux-2.6.32/mm/mempolicy.c
> 
> Hello.
> 
> We have several servers with mongodb running. Each server has several mongodb 
> instances. Mongodb dataset is larger then availiable memory (mongodb uses 
> memory-mapped files for all disk I/O).
> 2.6.32 and 2.6.38 kernels periodically crash and crash happens only with mongodb 
> servers.
> 
> 2.6.38's trace is in attachment.
> For 2.6.32 I only have "kernel BUG at 
> /build/buildd/linux-2.6.32/mm/mempolicy.c:1489!"
> 
> Heed help! :)

Adding netdev to cc,

seems like the networking bug?

if so please attach some tcpdump traces, lspci -vv -n, 

please forgive me if its not related to networking. :(

thanks, 		 	   		  

^ permalink raw reply

* Re: WARNING: drivers/net/ethernet/dlink/sundance.o(.text+0x2e87): Section mismatch in reference from the function sundance_probe1() to the variable .devinit.rodata:sundance_pci_tbl
From: David Miller @ 2012-12-05 17:50 UTC (permalink / raw)
  To: kda; +Cc: fengguang.wu, wfp5p, netdev, gregkh
In-Reply-To: <CAOJe8K3xBAuhVND-9dJgSxShhpHQ3zko=bE2CNrAKVa-gvDouA@mail.gmail.com>

From: Denis Kirjanov <kda@linux-powerpc.org>
Date: Wed, 5 Dec 2012 11:12:32 +0300

> I"ll fix it.

You can't.

It's only going to get fixed by a change in Greg KH's device tree
which updates the PCI table macros to not use __dev* section tags.

Fengguant, _PLEASE_, as Greg requested, stop reporting these section
mismatch errors, they aren't helpful and are wasting valuable
developer time.

^ permalink raw reply

* Re: [PATCH net-next 0/7] Allow to monitor multicast cache event via rtnetlink
From: David Miller @ 2012-12-05 17:53 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev
In-Reply-To: <50BF29DA.7020903@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 05 Dec 2012 12:02:50 +0100

> Le 04/12/2012 21:02, Nicolas Dichtel a écrit :
>> I can have a try on a tile platform. I don't have access to sparc or
>> mips.
> Hmm, I've read arm instead of mips! So I've tried on mips. Data are
> aligned on 32-bit, like for all netlink messages. nla_put_u64() will
> do the same, as it calls nla_put().
> 
> And the kernel will only use memcpy() to treat this attribute. Reader
> will be in userland.

Then userland will trap if the 64-bit values are only 32-bit aligned.

That's the problem I'm talking about.

I don't want to export any more unaligned 64-bit values in netlink
messages, it's a complete mess.

^ permalink raw reply

* Re: [PATCH net-next 0/7] Allow to monitor multicast cache event via rtnetlink
From: David Miller @ 2012-12-05 17:54 UTC (permalink / raw)
  To: David.Laight; +Cc: nicolas.dichtel, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70DA@saturn3.aculab.com>

From: "David Laight" <David.Laight@ACULAB.COM>
Date: Wed, 5 Dec 2012 11:41:33 -0000

> Probably worth commenting that the 64bit items might only be 32bit aligned.
> Just to stop anyone trying to read/write them with pointer casts.

Rather, let's not create this situation at all.

It's totally inappropriate to have special code to handle every single
time we want to put 64-bit values into netlink messages.

We need a real solution to this issue.

^ permalink raw reply

* Re: [PATCH] net: ICMPv6 packets transmitted on wrong interface if nfmark is mangled
From: David Miller @ 2012-12-05 17:57 UTC (permalink / raw)
  To: dries.dewinter; +Cc: pablo, kaber, netdev, netfilter-devel
In-Reply-To: <CA+e04fgiMDoaCx0fpdK5F6YfJ_CSOZ13q1fn9xs7De4JO02c6g@mail.gmail.com>

From: Dries De Winter <dries.dewinter@gmail.com>
Date: Wed, 5 Dec 2012 14:41:59 +0100

> My "noreroute" patch will not fix this. Therefore it's indeed maybe
> better to add a simple check to ip6_route_me_harder(): not a check for
> ICMPv6, but a check for (ipv6_addr_type(&iph->daddr) &
> IPV6_ADDR_LINKLOCAL) instead. What do you think?

What if a packet is rewritten from a non-link-local destination address
into a link-local one?  Or vice versa?

Your test will fail in those cases.

^ permalink raw reply

* Re: [PATCH] 3com: make 3c59x depend on HAS_IOPORT
From: David Miller @ 2012-12-05 17:58 UTC (permalink / raw)
  To: jang; +Cc: netdev
In-Reply-To: <1354716280.29038.2.camel@hal>

From: Jan Glauber <jang@linux.vnet.ibm.com>
Date: Wed, 05 Dec 2012 15:04:40 +0100

> From: Jan Glauber <jang@linux.vnet.ibm.com>
> 
> The 3com driver for 3c59x requires ioport_map. Since not all
> architectures support IO port mapping make 3c59x dependent on HAS_IOPORT.
> 
> Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>

Which platforms support PCI or EISA yet do not set HAS_IOPORT?

^ permalink raw reply

* Re: [PATCH] net/macb: increase RX buffer size for GEM
From: David Miller @ 2012-12-05 17:58 UTC (permalink / raw)
  To: nicolas.ferre; +Cc: netdev, linux-arm-kernel, linux-kernel, manabian, plagnioj
In-Reply-To: <50BF6366.8080600@atmel.com>

From: Nicolas Ferre <nicolas.ferre@atmel.com>
Date: Wed, 5 Dec 2012 16:08:22 +0100

> On 12/04/2012 07:22 PM, David Miller :
>> From: Nicolas Ferre <nicolas.ferre@atmel.com>
>> Date: Mon, 3 Dec 2012 13:15:43 +0100
>> 
>>> Macb Ethernet controller requires a RX buffer of 128 bytes. It is
>>> highly sub-optimal for Gigabit-capable GEM that is able to use
>>> a bigger DMA buffer. Change this constant and associated macros
>>> with data stored in the private structure.
>>> I also kept the result of buffers per page calculation to lower the
>>> impact of this move to a variable rx buffer size on rx hot path.
>>> RX DMA buffer size has to be multiple of 64 bytes as indicated in
>>> DMA Configuration Register specification.
>>>
>>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
>> 
>> This looks like it will waste a couple hundred bytes for 1500 MTU
>> frames, am I right?
> 
> Yep! But buffers get recycled, and with the current memory management by
> pages, it seems that I have to rework some part of it to optimize this
> memory usage (8KB memory blocks split into 5 buffers each as David said...).
> 
> Do you think it is worth digging this way or may I rework the rx buffer
> management in case of the GEM interface. If I implement a different path
> for GEM interface, I will have the possibility to tailor rx DMA buffers
> from 1500 Bytes up to 10KB jumbo frames...

I almost think you have to.

^ permalink raw reply

* [PATCH net-next] ipv6: avoid taking locks at socket dismantle
From: Eric Dumazet @ 2012-12-05 19:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

ipv6_sock_mc_close() is called for ipv6 sockets at close time, and most
of them don't use multicast.

Add a test to avoid contention on a shared spinlock.

Same heuristic applies for ipv6_sock_ac_close(), to avoid contention
on a shared rwlock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/anycast.c |    3 +++
 net/ipv6/mcast.c   |    3 +++
 2 files changed, 6 insertions(+)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 2f4f584..757a810 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -189,6 +189,9 @@ void ipv6_sock_ac_close(struct sock *sk)
 	struct net *net = sock_net(sk);
 	int	prev_index;
 
+	if (!np->ipv6_ac_list)
+		return;
+
 	write_lock_bh(&ipv6_sk_ac_lock);
 	pac = np->ipv6_ac_list;
 	np->ipv6_ac_list = NULL;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index b19ed51..28dfa5f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -284,6 +284,9 @@ void ipv6_sock_mc_close(struct sock *sk)
 	struct ipv6_mc_socklist *mc_lst;
 	struct net *net = sock_net(sk);
 
+	if (!rcu_access_pointer(np->ipv6_mc_list))
+		return;
+
 	spin_lock(&ipv6_sk_mc_lock);
 	while ((mc_lst = rcu_dereference_protected(np->ipv6_mc_list,
 				lockdep_is_held(&ipv6_sk_mc_lock))) != NULL) {

^ permalink raw reply related

* [PATCH rfc] netfilter: two xtables matches
From: Willem de Bruijn @ 2012-12-05 19:22 UTC (permalink / raw)
  To: netfilter-devel, netdev, edumazet, davem, kaber, pablo

The second patch is more speculative and aims to be a more general
workaround, as well as a performance optimization: support
(preferably JIT compiled) BPF programs as iptables match rules.

Potentially, the skb->priority match can be implemented by applying
only the second patch and adding a new BPF_S_ANC ancillary field to
Linux Socket Filters.

I also wrote corresponding userspace patches to iptables. The process
for submitting both kernel and user patches is not 100% clear to me.
Sending the kernel bits to both netdev and netfilter-devel for
initial feedback. Please correct me if you want it another way.

The patches apply to net-next.


^ permalink raw reply

* [PATCH 1/2] netfilter: add xt_priority xtables match
From: Willem de Bruijn @ 2012-12-05 19:22 UTC (permalink / raw)
  To: netfilter-devel, netdev, edumazet, davem, kaber, pablo; +Cc: Willem de Bruijn
In-Reply-To: <1354735339-13402-1-git-send-email-willemb@google.com>

Add an iptables match based on the skb->priority field. This field
can be set by socket option SO_PRIORITY, among others.

The match supports range based matching on packet priority, with
optional inversion. Before matching, a mask can be applied to the
priority field to handle the case where different regions of the
bitfield are reserved for unrelated uses.
---
 include/linux/netfilter/xt_priority.h |   13 ++++++++
 net/netfilter/Kconfig                 |    9 ++++++
 net/netfilter/Makefile                |    1 +
 net/netfilter/xt_priority.c           |   51 +++++++++++++++++++++++++++++++++
 4 files changed, 74 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/netfilter/xt_priority.h
 create mode 100644 net/netfilter/xt_priority.c

diff --git a/include/linux/netfilter/xt_priority.h b/include/linux/netfilter/xt_priority.h
new file mode 100644
index 0000000..da9a288
--- /dev/null
+++ b/include/linux/netfilter/xt_priority.h
@@ -0,0 +1,13 @@
+#ifndef _XT_PRIORITY_H
+#define _XT_PRIORITY_H
+
+#include <linux/types.h>
+
+struct xt_priority_info {
+	__u32 min;
+	__u32 max;
+	__u32 mask;
+	__u8  invert;
+};
+
+#endif /*_XT_PRIORITY_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index fefa514..c9739c6 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -1093,6 +1093,15 @@ config NETFILTER_XT_MATCH_PKTTYPE
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config NETFILTER_XT_MATCH_PRIORITY
+	tristate '"priority" match support'
+	depends on NETFILTER_ADVANCED
+	help
+	  This option adds a match based on the value of the sk_buff
+	  priority field.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_QUOTA
 	tristate '"quota" match support'
 	depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 3259697..8e5602f 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -124,6 +124,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_OWNER) += xt_owner.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PHYSDEV) += xt_physdev.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PKTTYPE) += xt_pkttype.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_POLICY) += xt_policy.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_PRIORITY) += xt_priority.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_RATEEST) += xt_rateest.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
diff --git a/net/netfilter/xt_priority.c b/net/netfilter/xt_priority.c
new file mode 100644
index 0000000..4982eee
--- /dev/null
+++ b/net/netfilter/xt_priority.c
@@ -0,0 +1,51 @@
+/* Xtables module to match packets based on their sk_buff priority field.
+ * Copyright 2012 Google Inc.
+ * Written by Willem de Bruijn <willemb@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+
+#include <linux/netfilter/xt_priority.h>
+#include <linux/netfilter/x_tables.h>
+
+MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
+MODULE_DESCRIPTION("Xtables: priority filter match");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("ipt_priority");
+MODULE_ALIAS("ip6t_priority");
+
+static bool priority_mt(const struct sk_buff *skb,
+			struct xt_action_param *par)
+{
+	const struct xt_priority_info *info = par->matchinfo;
+
+	__u32 priority = skb->priority & info->mask;
+	return (priority >= info->min && priority <= info->max) ^ info->invert;
+}
+
+static struct xt_match priority_mt_reg __read_mostly = {
+	.name		= "priority",
+	.revision	= 0,
+	.family		= NFPROTO_UNSPEC,
+	.match		= priority_mt,
+	.matchsize	= sizeof(struct xt_priority_info),
+	.me		= THIS_MODULE,
+};
+
+static int __init priority_mt_init(void)
+{
+	return xt_register_match(&priority_mt_reg);
+}
+
+static void __exit priority_mt_exit(void)
+{
+	xt_unregister_match(&priority_mt_reg);
+}
+
+module_init(priority_mt_init);
+module_exit(priority_mt_exit);
-- 
1.7.7.3

^ permalink raw reply related

* [PATCH 2/2] netfilter: add xt_bpf xtables match
From: Willem de Bruijn @ 2012-12-05 19:22 UTC (permalink / raw)
  To: netfilter-devel, netdev, edumazet, davem, kaber, pablo; +Cc: Willem de Bruijn
In-Reply-To: <1354735339-13402-1-git-send-email-willemb@google.com>

A new match that executes sk_run_filter on every packet. BPF filters
can access skbuff fields that are out of scope for existing iptables
rules, allow more expressive logic, and on platforms with JIT support
can even be faster.

I have a corresponding iptables patch that takes `tcpdump -ddd`
output, as used in the examples below. The two parts communicate
using a variable length structure. This is similar to ebt_among,
but new for iptables.

Verified functionality by inserting an ip source filter on chain
INPUT and an ip dest filter on chain OUTPUT and noting that ping
failed while a rule was active:

iptables -v -A INPUT -m bpf --bytecode '4,32 0 0 12,21 0 1 $SADDR,6 0 0 96,6 0 0 0,' -j DROP
iptables -v -A OUTPUT -m bpf --bytecode '4,32 0 0 16,21 0 1 $DADDR,6 0 0 96,6 0 0 0,' -j DROP

Evaluated throughput by running netperf TCP_STREAM over loopback on
x86_64. I expected the BPF filter to outperform hardcoded iptables
filters when replacing multiple matches with a single bpf match, but
even a single comparison to u32 appears to do better. Relative to the
benchmark with no filter applied, rate with 100 BPF filters dropped
to 81%. With 100 U32 filters it dropped to 55%. The difference sounds
excessive to me, but was consistent on my hardware. Commands used:

for i in `seq 100`; do iptables -A OUTPUT -m bpf --bytecode '4,48 0 0 9,21 0 1 20,6 0 0 96,6 0 0 0,' -j DROP; done
for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done

iptables -F OUTPUT

for i in `seq 100`; do iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP; done
for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done

FYI: perf top

[bpf]
    33.94%  [kernel]          [k] copy_user_generic_string
     8.92%  [kernel]          [k] sk_run_filter
     7.77%  [ip_tables]       [k] ipt_do_table

[u32]
    22.63%  [kernel]          [k] copy_user_generic_string
    14.46%  [kernel]          [k] memcpy
     9.19%  [ip_tables]       [k] ipt_do_table
     8.47%  [xt_u32]          [k] u32_mt
     5.32%  [kernel]          [k] skb_copy_bits

The big difference appears to be in memory copying. I have not
looked into u32, so cannot explain this right now. More interestingly,
at higher rate, sk_run_filter appears to use as many cycles as u32_mt
(both traces have roughly the same number of events).

One caveat: to work independent of device link layer, the filter
expects DLT_RAW style BPF programs, i.e., those that expect the
packet to start at the IP layer.
---
 include/linux/netfilter/xt_bpf.h |   17 +++++++
 net/netfilter/Kconfig            |    9 ++++
 net/netfilter/Makefile           |    1 +
 net/netfilter/x_tables.c         |    5 +-
 net/netfilter/xt_bpf.c           |   88 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 118 insertions(+), 2 deletions(-)
 create mode 100644 include/linux/netfilter/xt_bpf.h
 create mode 100644 net/netfilter/xt_bpf.c

diff --git a/include/linux/netfilter/xt_bpf.h b/include/linux/netfilter/xt_bpf.h
new file mode 100644
index 0000000..23502c0
--- /dev/null
+++ b/include/linux/netfilter/xt_bpf.h
@@ -0,0 +1,17 @@
+#ifndef _XT_BPF_H
+#define _XT_BPF_H
+
+#include <linux/filter.h>
+#include <linux/types.h>
+
+struct xt_bpf_info {
+	__u16 bpf_program_num_elem;
+
+	/* only used in kernel */
+	struct sk_filter *filter __attribute__((aligned(8)));
+
+	/* variable size, based on program_num_elem */
+	struct sock_filter bpf_program[0];
+};
+
+#endif /*_XT_BPF_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index c9739c6..c7cc0b8 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -798,6 +798,15 @@ config NETFILTER_XT_MATCH_ADDRTYPE
 	  If you want to compile it as a module, say M here and read
 	  <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
 
+config NETFILTER_XT_MATCH_BPF
+	tristate '"bpf" match support'
+	depends on NETFILTER_ADVANCED
+	help
+	  BPF matching applies a linux socket filter to each packet and
+          accepts those for which the filter returns non-zero.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_CLUSTER
 	tristate '"cluster" match support'
 	depends on NF_CONNTRACK
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 8e5602f..9f12eeb 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -98,6 +98,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
 
 # matches
 obj-$(CONFIG_NETFILTER_XT_MATCH_ADDRTYPE) += xt_addrtype.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_BPF) += xt_bpf.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index 8d987c3..26306be 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -379,8 +379,9 @@ int xt_check_match(struct xt_mtchk_param *par,
 	if (XT_ALIGN(par->match->matchsize) != size &&
 	    par->match->matchsize != -1) {
 		/*
-		 * ebt_among is exempt from centralized matchsize checking
-		 * because it uses a dynamic-size data set.
+		 * matches of variable size length, such as ebt_among,
+		 * are exempt from centralized matchsize checking. They
+		 * skip the test by setting xt_match.matchsize to -1.
 		 */
 		pr_err("%s_tables: %s.%u match: invalid size "
 		       "%u (kernel) != (user) %u\n",
diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
new file mode 100644
index 0000000..07077c5
--- /dev/null
+++ b/net/netfilter/xt_bpf.c
@@ -0,0 +1,88 @@
+/* Xtables module to match packets using a BPF filter.
+ * Copyright 2012 Google Inc.
+ * Written by Willem de Bruijn <willemb@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/ipv6.h>
+#include <linux/filter.h>
+#include <net/ip.h>
+
+#include <linux/netfilter/xt_bpf.h>
+#include <linux/netfilter/x_tables.h>
+
+MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
+MODULE_DESCRIPTION("Xtables: BPF filter match");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("ipt_bpf");
+MODULE_ALIAS("ip6t_bpf");
+
+static int bpf_mt_check(const struct xt_mtchk_param *par)
+{
+	struct xt_bpf_info *info = par->matchinfo;
+	const struct xt_entry_match *match;
+	struct sock_fprog program;
+	int expected_len;
+
+	match = container_of(par->matchinfo, const struct xt_entry_match, data);
+	expected_len = sizeof(struct xt_entry_match) +
+		       sizeof(struct xt_bpf_info) +
+		       (sizeof(struct sock_filter) *
+			info->bpf_program_num_elem);
+
+	if (match->u.match_size != expected_len) {
+		pr_info("bpf: check failed: incorrect length\n");
+		return -EINVAL;
+	}
+
+	program.len = info->bpf_program_num_elem;
+	program.filter = info->bpf_program;
+	if (sk_unattached_filter_create(&info->filter, &program)) {
+		pr_info("bpf: check failed: parse error\n");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static bool bpf_mt(const struct sk_buff *skb, struct xt_action_param *par)
+{
+	const struct xt_bpf_info *info = par->matchinfo;
+
+	return SK_RUN_FILTER(info->filter, skb);
+}
+
+static void bpf_mt_destroy(const struct xt_mtdtor_param *par)
+{
+	const struct xt_bpf_info *info = par->matchinfo;
+	sk_unattached_filter_destroy(info->filter);
+}
+
+static struct xt_match bpf_mt_reg __read_mostly = {
+	.name		= "bpf",
+	.revision	= 0,
+	.family		= NFPROTO_UNSPEC,
+	.checkentry	= bpf_mt_check,
+	.match		= bpf_mt,
+	.destroy	= bpf_mt_destroy,
+	.matchsize	= -1, /* skip xt_check_match because of dynamic len */
+	.me		= THIS_MODULE,
+};
+
+static int __init bpf_mt_init(void)
+{
+	return xt_register_match(&bpf_mt_reg);
+}
+
+static void __exit bpf_mt_exit(void)
+{
+	xt_unregister_match(&bpf_mt_reg);
+}
+
+module_init(bpf_mt_init);
+module_exit(bpf_mt_exit);
-- 
1.7.7.3

^ permalink raw reply related

* Re: [PATCH rfc] netfilter: two xtables matches
From: Willem de Bruijn @ 2012-12-05 19:28 UTC (permalink / raw)
  To: netfilter-devel, netdev, Eric Dumazet, David Miller, kaber, pablo
In-Reply-To: <1354735339-13402-1-git-send-email-willemb@google.com>

Somehow, the first part of this email went missing. Not critical,
but for completeness:

These two patches each add an xtables match.

The xt_priority match is a straighforward addition in the style of
xt_mark, adding the option to filter on one more sk_buff field. I
have an immediate application for this. The amount of code (in
kernel + userspace) to add a single check proved quite large.

On Wed, Dec 5, 2012 at 2:22 PM, Willem de Bruijn <willemb@google.com> wrote:
> The second patch is more speculative and aims to be a more general
> workaround, as well as a performance optimization: support
> (preferably JIT compiled) BPF programs as iptables match rules.
>
> Potentially, the skb->priority match can be implemented by applying
> only the second patch and adding a new BPF_S_ANC ancillary field to
> Linux Socket Filters.
>
> I also wrote corresponding userspace patches to iptables. The process
> for submitting both kernel and user patches is not 100% clear to me.
> Sending the kernel bits to both netdev and netfilter-devel for
> initial feedback. Please correct me if you want it another way.
>
> The patches apply to net-next.
>

^ permalink raw reply

* Re: [RFT PATCH] 8139cp: properly support change of MTU values [v2]
From: John Greene @ 2012-12-05 19:41 UTC (permalink / raw)
  To: Francois Romieu; +Cc: David Miller, netdev, dwmw2
In-Reply-To: <20121203204608.GA9815@electric-eye.fr.zoreil.com>

On 12/03/2012 03:46 PM, Francois Romieu wrote:
> David Miller <davem@davemloft.net> :
> [...]
>> I've applied this to net-next, if it triggers any problems we have
>> some time to work it out before 3.8 is released.
>
> I have bounced the messages to David Woodhouse since he authored the
> last 8139cp changes in net-next and owns the hardware to notice
> regressions.
>
> My message of two days ago was wrong : it is not possible for the irq
> handler to process a Tx event after the rings have been freed. Things
> still look racy wrt netpoll though.
>
> Any objection against the patch below ?
>
> (I did not gotoize the dev == NULL test: it is really unlikely and
> should go away).
>
> diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
> index 0da3f5e..57cd542 100644
> --- a/drivers/net/ethernet/realtek/8139cp.c
> +++ b/drivers/net/ethernet/realtek/8139cp.c
> @@ -577,28 +577,30 @@ static irqreturn_t cp_interrupt (int irq, void *dev_instance)
>   {
>   	struct net_device *dev = dev_instance;
>   	struct cp_private *cp;
> +	int handled = 0;
>   	u16 status;
>
>   	if (unlikely(dev == NULL))
>   		return IRQ_NONE;
>   	cp = netdev_priv(dev);
>
> +	spin_lock(&cp->lock);
> +
>   	status = cpr16(IntrStatus);
>   	if (!status || (status == 0xFFFF))
> -		return IRQ_NONE;
> +		goto out_unlock;
> +
> +	handled = 1;
>
>   	netif_dbg(cp, intr, dev, "intr, status %04x cmd %02x cpcmd %04x\n",
>   		  status, cpr8(Cmd), cpr16(CpCmd));
>
>   	cpw16(IntrStatus, status & ~cp_rx_intr_mask);
>
> -	spin_lock(&cp->lock);
> -
>   	/* close possible race's with dev_close */
>   	if (unlikely(!netif_running(dev))) {
>   		cpw16(IntrMask, 0);
> -		spin_unlock(&cp->lock);
> -		return IRQ_HANDLED;
> +		goto out_unlock;
>   	}
>
>   	if (status & (RxOK | RxErr | RxEmpty | RxFIFOOvr))
> @@ -612,8 +614,6 @@ static irqreturn_t cp_interrupt (int irq, void *dev_instance)
>   	if (status & LinkChg)
>   		mii_check_media(&cp->mii_if, netif_msg_link(cp), false);
>
> -	spin_unlock(&cp->lock);
> -
>   	if (status & PciErr) {
>   		u16 pci_status;
>
> @@ -625,7 +625,10 @@ static irqreturn_t cp_interrupt (int irq, void *dev_instance)
>   		/* TODO: reset hardware */
>   	}
>
> -	return IRQ_HANDLED;
> +out_unlock:
> +	spin_unlock(&cp->lock);
> +
> +	return IRQ_RETVAL(handled);
>   }
>
>   #ifdef CONFIG_NET_POLL_CONTROLLER
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
I think this is a good change, interesting it isn't in already or 
causing more issues on multi-processor boxes already. (perhaps it is?).

So do you think these patches need to go together? I could make a case 
either way.

Is this upstream yet?

-- 
John Greene
jogreene@redhat.com

^ permalink raw reply

* Re: [PATCH] 3com: make 3c59x depend on HAS_IOPORT
From: Jan Glauber @ 2012-12-05 19:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20121205.125804.979819753893520607.davem@davemloft.net>

On Wed, 2012-12-05 at 12:58 -0500, David Miller wrote:
> From: Jan Glauber <jang@linux.vnet.ibm.com>
> Date: Wed, 05 Dec 2012 15:04:40 +0100
> 
> > From: Jan Glauber <jang@linux.vnet.ibm.com>
> > 
> > The 3com driver for 3c59x requires ioport_map. Since not all
> > architectures support IO port mapping make 3c59x dependent on HAS_IOPORT.
> > 
> > Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
> 
> Which platforms support PCI or EISA yet do not set HAS_IOPORT?
> 

s390. We wont get support for port I/O in the PCI hardware/firmware
layer. (The patches for PCI on s390 are currently in linux-next).

--Jan

^ permalink raw reply

* Re: [PATCH 2/2] netfilter: add xt_bpf xtables match
From: Pablo Neira Ayuso @ 2012-12-05 19:48 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netfilter-devel, netdev, edumazet, davem, kaber
In-Reply-To: <1354735339-13402-3-git-send-email-willemb@google.com>

Hi Willem,

On Wed, Dec 05, 2012 at 02:22:19PM -0500, Willem de Bruijn wrote:
> A new match that executes sk_run_filter on every packet. BPF filters
> can access skbuff fields that are out of scope for existing iptables
> rules, allow more expressive logic, and on platforms with JIT support
> can even be faster.
> 
> I have a corresponding iptables patch that takes `tcpdump -ddd`
> output, as used in the examples below. The two parts communicate
> using a variable length structure. This is similar to ebt_among,
> but new for iptables.
> 
> Verified functionality by inserting an ip source filter on chain
> INPUT and an ip dest filter on chain OUTPUT and noting that ping
> failed while a rule was active:
> 
> iptables -v -A INPUT -m bpf --bytecode '4,32 0 0 12,21 0 1 $SADDR,6 0 0 96,6 0 0 0,' -j DROP
> iptables -v -A OUTPUT -m bpf --bytecode '4,32 0 0 16,21 0 1 $DADDR,6 0 0 96,6 0 0 0,' -j DROP

I like this BPF idea for iptables.

I made a similar extension time ago, but it was taking a file as
parameter. That file contained in BPF code. I made a simple bison
parser that takes BPF code and put it into the bpf array of
instructions. It would be a bit more intuitive to define a filter and
we can distribute it with iptables.

Let me check on my internal trees, I can put that user-space code
somewhere in case you're interested.

> Evaluated throughput by running netperf TCP_STREAM over loopback on
> x86_64. I expected the BPF filter to outperform hardcoded iptables
> filters when replacing multiple matches with a single bpf match, but
> even a single comparison to u32 appears to do better. Relative to the
> benchmark with no filter applied, rate with 100 BPF filters dropped
> to 81%. With 100 U32 filters it dropped to 55%. The difference sounds
> excessive to me, but was consistent on my hardware. Commands used:
> 
> for i in `seq 100`; do iptables -A OUTPUT -m bpf --bytecode '4,48 0 0 9,21 0 1 20,6 0 0 96,6 0 0 0,' -j DROP; done
> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
> 
> iptables -F OUTPUT
> 
> for i in `seq 100`; do iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP; done
> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
> 
> FYI: perf top
> 
> [bpf]
>     33.94%  [kernel]          [k] copy_user_generic_string
>      8.92%  [kernel]          [k] sk_run_filter
>      7.77%  [ip_tables]       [k] ipt_do_table
> 
> [u32]
>     22.63%  [kernel]          [k] copy_user_generic_string
>     14.46%  [kernel]          [k] memcpy
>      9.19%  [ip_tables]       [k] ipt_do_table
>      8.47%  [xt_u32]          [k] u32_mt
>      5.32%  [kernel]          [k] skb_copy_bits
> 
> The big difference appears to be in memory copying. I have not
> looked into u32, so cannot explain this right now. More interestingly,
> at higher rate, sk_run_filter appears to use as many cycles as u32_mt
> (both traces have roughly the same number of events).
> 
> One caveat: to work independent of device link layer, the filter
> expects DLT_RAW style BPF programs, i.e., those that expect the
> packet to start at the IP layer.
> ---
>  include/linux/netfilter/xt_bpf.h |   17 +++++++
>  net/netfilter/Kconfig            |    9 ++++
>  net/netfilter/Makefile           |    1 +
>  net/netfilter/x_tables.c         |    5 +-
>  net/netfilter/xt_bpf.c           |   88 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 118 insertions(+), 2 deletions(-)
>  create mode 100644 include/linux/netfilter/xt_bpf.h
>  create mode 100644 net/netfilter/xt_bpf.c
> 
> diff --git a/include/linux/netfilter/xt_bpf.h b/include/linux/netfilter/xt_bpf.h
> new file mode 100644
> index 0000000..23502c0
> --- /dev/null
> +++ b/include/linux/netfilter/xt_bpf.h
> @@ -0,0 +1,17 @@
> +#ifndef _XT_BPF_H
> +#define _XT_BPF_H
> +
> +#include <linux/filter.h>
> +#include <linux/types.h>
> +
> +struct xt_bpf_info {
> +	__u16 bpf_program_num_elem;
> +
> +	/* only used in kernel */
> +	struct sk_filter *filter __attribute__((aligned(8)));
> +
> +	/* variable size, based on program_num_elem */
> +	struct sock_filter bpf_program[0];
> +};
> +
> +#endif /*_XT_BPF_H */
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index c9739c6..c7cc0b8 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -798,6 +798,15 @@ config NETFILTER_XT_MATCH_ADDRTYPE
>  	  If you want to compile it as a module, say M here and read
>  	  <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
>  
> +config NETFILTER_XT_MATCH_BPF
> +	tristate '"bpf" match support'
> +	depends on NETFILTER_ADVANCED
> +	help
> +	  BPF matching applies a linux socket filter to each packet and
> +          accepts those for which the filter returns non-zero.
> +
> +	  To compile it as a module, choose M here.  If unsure, say N.
> +
>  config NETFILTER_XT_MATCH_CLUSTER
>  	tristate '"cluster" match support'
>  	depends on NF_CONNTRACK
> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
> index 8e5602f..9f12eeb 100644
> --- a/net/netfilter/Makefile
> +++ b/net/netfilter/Makefile
> @@ -98,6 +98,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
>  
>  # matches
>  obj-$(CONFIG_NETFILTER_XT_MATCH_ADDRTYPE) += xt_addrtype.o
> +obj-$(CONFIG_NETFILTER_XT_MATCH_BPF) += xt_bpf.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
>  obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index 8d987c3..26306be 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -379,8 +379,9 @@ int xt_check_match(struct xt_mtchk_param *par,
>  	if (XT_ALIGN(par->match->matchsize) != size &&
>  	    par->match->matchsize != -1) {
>  		/*
> -		 * ebt_among is exempt from centralized matchsize checking
> -		 * because it uses a dynamic-size data set.
> +		 * matches of variable size length, such as ebt_among,
> +		 * are exempt from centralized matchsize checking. They
> +		 * skip the test by setting xt_match.matchsize to -1.
>  		 */
>  		pr_err("%s_tables: %s.%u match: invalid size "
>  		       "%u (kernel) != (user) %u\n",
> diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
> new file mode 100644
> index 0000000..07077c5
> --- /dev/null
> +++ b/net/netfilter/xt_bpf.c
> @@ -0,0 +1,88 @@
> +/* Xtables module to match packets using a BPF filter.
> + * Copyright 2012 Google Inc.
> + * Written by Willem de Bruijn <willemb@google.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/ipv6.h>
> +#include <linux/filter.h>
> +#include <net/ip.h>
> +
> +#include <linux/netfilter/xt_bpf.h>
> +#include <linux/netfilter/x_tables.h>
> +
> +MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
> +MODULE_DESCRIPTION("Xtables: BPF filter match");
> +MODULE_LICENSE("GPL");
> +MODULE_ALIAS("ipt_bpf");
> +MODULE_ALIAS("ip6t_bpf");
> +
> +static int bpf_mt_check(const struct xt_mtchk_param *par)
> +{
> +	struct xt_bpf_info *info = par->matchinfo;
> +	const struct xt_entry_match *match;
> +	struct sock_fprog program;
> +	int expected_len;
> +
> +	match = container_of(par->matchinfo, const struct xt_entry_match, data);
> +	expected_len = sizeof(struct xt_entry_match) +
> +		       sizeof(struct xt_bpf_info) +
> +		       (sizeof(struct sock_filter) *
> +			info->bpf_program_num_elem);
> +
> +	if (match->u.match_size != expected_len) {
> +		pr_info("bpf: check failed: incorrect length\n");
> +		return -EINVAL;
> +	}
> +
> +	program.len = info->bpf_program_num_elem;
> +	program.filter = info->bpf_program;
> +	if (sk_unattached_filter_create(&info->filter, &program)) {
> +		pr_info("bpf: check failed: parse error\n");
> +		return -EINVAL;
> +	}
> +
> +	return 0;
> +}
> +
> +static bool bpf_mt(const struct sk_buff *skb, struct xt_action_param *par)
> +{
> +	const struct xt_bpf_info *info = par->matchinfo;
> +
> +	return SK_RUN_FILTER(info->filter, skb);
> +}
> +
> +static void bpf_mt_destroy(const struct xt_mtdtor_param *par)
> +{
> +	const struct xt_bpf_info *info = par->matchinfo;
> +	sk_unattached_filter_destroy(info->filter);
> +}
> +
> +static struct xt_match bpf_mt_reg __read_mostly = {
> +	.name		= "bpf",
> +	.revision	= 0,
> +	.family		= NFPROTO_UNSPEC,
> +	.checkentry	= bpf_mt_check,
> +	.match		= bpf_mt,
> +	.destroy	= bpf_mt_destroy,
> +	.matchsize	= -1, /* skip xt_check_match because of dynamic len */
> +	.me		= THIS_MODULE,
> +};
> +
> +static int __init bpf_mt_init(void)
> +{
> +	return xt_register_match(&bpf_mt_reg);
> +}
> +
> +static void __exit bpf_mt_exit(void)
> +{
> +	xt_unregister_match(&bpf_mt_reg);
> +}
> +
> +module_init(bpf_mt_init);
> +module_exit(bpf_mt_exit);
> -- 
> 1.7.7.3
> 

^ permalink raw reply

* Re: [Patch 1/1] net/phy: Add interrupt support for dp83640 phy.
From: Stephan Gatzka @ 2012-12-05 19:52 UTC (permalink / raw)
  To: Richard Cochran; +Cc: netdev, davem
In-Reply-To: <20121205100544.GA2293@netboy.at.omicron.at>


> The patch looks okay to me, but I worry that this might fail on boards
> which have not connected the phyer's PWERDOWN/INTN pin to anything.
> Such designs really need the PHY_POLL working.
> Taking a brief glance at the drivers for two such boards I know of
> (m5234bcc and an IXP), it looks like their MAC drivers set mii_bus irq
> to PHY_POLL, so it might work fine, but this patch still makes me
> nervous that some other board might break.
>
> Maybe this should be a kconfig option?

I don't think so.

Systems using device tree just don't specify the interrupt tag in the 
mdio section.

I have to admit that I don't know how how systems without employing 
device tree get the phy interrupt configured, maybe someone can explain 
that shortly?

Nevertheless, other drivers for very common phys like the lxt971 also 
just set the function pointers to config_intr and ack_interrupt and also 
set the flag PHY_HAS_INTERRUPT. So I don't think that the my patch 
breaks something.

Regards,

Stephan

^ permalink raw reply

* [PATCH 1/2 net-next] cnic: Reset iSCSI EQ during shutdown.
From: Michael Chan @ 2012-12-05 20:10 UTC (permalink / raw)
  To: davem; +Cc: netdev

Without the reset, reloading the cnic driver can cause the iSCSI
Event Queue to be out of sync with the driver and cause intermittent
crash.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Acked-by: Ariel Elior <ariele@broadcom.com>
---
 .../net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h    |    5 +++++
 drivers/net/ethernet/broadcom/cnic.c               |   19 +++++++++++++++++++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h
index 620fe93..60a83ad 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_fw_defs.h
@@ -23,6 +23,11 @@
 	(IRO[159].base + ((funcId) * IRO[159].m1))
 #define CSTORM_FUNC_EN_OFFSET(funcId) \
 	(IRO[149].base + ((funcId) * IRO[149].m1))
+#define CSTORM_HC_SYNC_LINE_INDEX_E1X_OFFSET(hcIndex, sbId) \
+	(IRO[139].base + ((hcIndex) * IRO[139].m1) + ((sbId) * IRO[139].m2))
+#define CSTORM_HC_SYNC_LINE_INDEX_E2_OFFSET(hcIndex, sbId) \
+	(IRO[138].base + (((hcIndex)>>2) * IRO[138].m1) + (((hcIndex)&3) \
+	* IRO[138].m2) + ((sbId) * IRO[138].m3))
 #define CSTORM_IGU_MODE_OFFSET (IRO[157].base)
 #define CSTORM_ISCSI_CQ_SIZE_OFFSET(pfId) \
 	(IRO[316].base + ((pfId) * IRO[316].m1))
diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index cc8434f..2c1f66d 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -5344,8 +5344,27 @@ static void cnic_stop_bnx2_hw(struct cnic_dev *dev)
 static void cnic_stop_bnx2x_hw(struct cnic_dev *dev)
 {
 	struct cnic_local *cp = dev->cnic_priv;
+	u32 hc_index = HC_INDEX_ISCSI_EQ_CONS;
+	u32 sb_id = cp->status_blk_num;
+	u32 idx_off, syn_off;
 
 	cnic_free_irq(dev);
+
+	if (BNX2X_CHIP_IS_E2_PLUS(cp->chip_id)) {
+		idx_off = offsetof(struct hc_status_block_e2, index_values) +
+			  (hc_index * sizeof(u16));
+
+		syn_off = CSTORM_HC_SYNC_LINE_INDEX_E2_OFFSET(hc_index, sb_id);
+	} else {
+		idx_off = offsetof(struct hc_status_block_e1x, index_values) +
+			  (hc_index * sizeof(u16));
+
+		syn_off = CSTORM_HC_SYNC_LINE_INDEX_E1X_OFFSET(hc_index, sb_id);
+	}
+	CNIC_WR16(dev, BAR_CSTRORM_INTMEM + syn_off, 0);
+	CNIC_WR16(dev, BAR_CSTRORM_INTMEM + CSTORM_STATUS_BLOCK_OFFSET(sb_id) +
+		  idx_off, 0);
+
 	*cp->kcq1.hw_prod_idx_ptr = 0;
 	CNIC_WR(dev, BAR_CSTRORM_INTMEM +
 		CSTORM_ISCSI_EQ_CONS_OFFSET(cp->pfid, 0), 0);
-- 
1.7.1

^ permalink raw reply related

* [PATCH 2/2 net-next] cnic: Fix rare race condition during iSCSI disconnect.
From: Michael Chan @ 2012-12-05 20:10 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1354738215-6644-1-git-send-email-mchan@broadcom.com>

From: Eddie Wai <eddie.wai@broadcom.com>

If the initiator and target try to close the connection at about the same
time, there is a race condition in the termination sequence for bnx2x.
Fix the problem by waiting for the remote termination to complete before
deleting the Connection ID.  This will prevent the firmware assert.

Update version to 2.5.15.

Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/cnic.c    |   13 +++++++++++--
 drivers/net/ethernet/broadcom/cnic_if.h |    4 ++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/cnic.c b/drivers/net/ethernet/broadcom/cnic.c
index 2c1f66d..1c2a851 100644
--- a/drivers/net/ethernet/broadcom/cnic.c
+++ b/drivers/net/ethernet/broadcom/cnic.c
@@ -3853,12 +3853,17 @@ static int cnic_cm_abort(struct cnic_sock *csk)
 		return cnic_cm_abort_req(csk);
 
 	/* Getting here means that we haven't started connect, or
-	 * connect was not successful.
+	 * connect was not successful, or it has been reset by the target.
 	 */
 
 	cp->close_conn(csk, opcode);
-	if (csk->state != opcode)
+	if (csk->state != opcode) {
+		/* Wait for remote reset sequence to complete */
+		while (test_bit(SK_F_PG_OFFLD_COMPLETE, &csk->flags))
+			msleep(1);
+
 		return -EALREADY;
+	}
 
 	return 0;
 }
@@ -3872,6 +3877,10 @@ static int cnic_cm_close(struct cnic_sock *csk)
 		csk->state = L4_KCQE_OPCODE_VALUE_CLOSE_COMP;
 		return cnic_cm_close_req(csk);
 	} else {
+		/* Wait for remote reset sequence to complete */
+		while (test_bit(SK_F_PG_OFFLD_COMPLETE, &csk->flags))
+			msleep(1);
+
 		return -EALREADY;
 	}
 	return 0;
diff --git a/drivers/net/ethernet/broadcom/cnic_if.h b/drivers/net/ethernet/broadcom/cnic_if.h
index 865095a..502e11e 100644
--- a/drivers/net/ethernet/broadcom/cnic_if.h
+++ b/drivers/net/ethernet/broadcom/cnic_if.h
@@ -14,8 +14,8 @@
 
 #include "bnx2x/bnx2x_mfw_req.h"
 
-#define CNIC_MODULE_VERSION	"2.5.14"
-#define CNIC_MODULE_RELDATE	"Sep 30, 2012"
+#define CNIC_MODULE_VERSION	"2.5.15"
+#define CNIC_MODULE_RELDATE	"Dec 04, 2012"
 
 #define CNIC_ULP_RDMA		0
 #define CNIC_ULP_ISCSI		1
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH rfc] netfilter: two xtables matches
From: Jan Engelhardt @ 2012-12-05 20:00 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: netfilter-devel, netdev, Eric Dumazet, David Miller, kaber, pablo
In-Reply-To: <CA+FuTSeYvcdtNf7N=POD+RzsE01+WVsub6F_nqbn06RZyoWx8w@mail.gmail.com>

On Wednesday 2012-12-05 20:28, Willem de Bruijn wrote:

>Somehow, the first part of this email went missing. Not critical,
>but for completeness:
>
>These two patches each add an xtables match.
>
>The xt_priority match is a straighforward addition in the style of
>xt_mark, adding the option to filter on one more sk_buff field. I
>have an immediate application for this. The amount of code (in
>kernel + userspace) to add a single check proved quite large.

Hm so yeah, can't we just place this in xt_mark.c?

^ permalink raw reply

* Re: [PATCH 2/2] netfilter: add xt_bpf xtables match
From: Willem de Bruijn @ 2012-12-05 20:10 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, netdev, Eric Dumazet, David Miller, kaber
In-Reply-To: <20121205194854.GB28730@1984>

On Wed, Dec 5, 2012 at 2:48 PM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> Hi Willem,
>
> On Wed, Dec 05, 2012 at 02:22:19PM -0500, Willem de Bruijn wrote:
>> A new match that executes sk_run_filter on every packet. BPF filters
>> can access skbuff fields that are out of scope for existing iptables
>> rules, allow more expressive logic, and on platforms with JIT support
>> can even be faster.
>>
>> I have a corresponding iptables patch that takes `tcpdump -ddd`
>> output, as used in the examples below. The two parts communicate
>> using a variable length structure. This is similar to ebt_among,
>> but new for iptables.
>>
>> Verified functionality by inserting an ip source filter on chain
>> INPUT and an ip dest filter on chain OUTPUT and noting that ping
>> failed while a rule was active:
>>
>> iptables -v -A INPUT -m bpf --bytecode '4,32 0 0 12,21 0 1 $SADDR,6 0 0 96,6 0 0 0,' -j DROP
>> iptables -v -A OUTPUT -m bpf --bytecode '4,32 0 0 16,21 0 1 $DADDR,6 0 0 96,6 0 0 0,' -j DROP
>
> I like this BPF idea for iptables.
>
> I made a similar extension time ago, but it was taking a file as
> parameter. That file contained in BPF code. I made a simple bison
> parser that takes BPF code and put it into the bpf array of
> instructions. It would be a bit more intuitive to define a filter and
> we can distribute it with iptables.

That's cleaner, indeed. I actually like how tcpdump operates as a
code generator if you pass -ddd. Unfortunately, it generates code only
for link layer types of its supported devices, such as DLT_EN10MB and
DLT_LINUX_SLL. The network layer interface of basic iptables
(forgetting device dependent mechanisms as used in xt_mac) is DLT_RAW,
but that is rarely supported.

> Let me check on my internal trees, I can put that user-space code
> somewhere in case you're interested.

Absolutely. I'll be happy to revise to get it in. I'm also considering
sending a patch to tcpdump to make it generate code independent of the
installed hardware when specifying -y.

>> Evaluated throughput by running netperf TCP_STREAM over loopback on
>> x86_64. I expected the BPF filter to outperform hardcoded iptables
>> filters when replacing multiple matches with a single bpf match, but
>> even a single comparison to u32 appears to do better. Relative to the
>> benchmark with no filter applied, rate with 100 BPF filters dropped
>> to 81%. With 100 U32 filters it dropped to 55%. The difference sounds
>> excessive to me, but was consistent on my hardware. Commands used:
>>
>> for i in `seq 100`; do iptables -A OUTPUT -m bpf --bytecode '4,48 0 0 9,21 0 1 20,6 0 0 96,6 0 0 0,' -j DROP; done
>> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
>>
>> iptables -F OUTPUT
>>
>> for i in `seq 100`; do iptables -A OUTPUT -m u32 --u32 '6&0xFF=0x20' -j DROP; done
>> for i in `seq 3`; do netperf -t TCP_STREAM -I 99 -H localhost; done
>>
>> FYI: perf top
>>
>> [bpf]
>>     33.94%  [kernel]          [k] copy_user_generic_string
>>      8.92%  [kernel]          [k] sk_run_filter
>>      7.77%  [ip_tables]       [k] ipt_do_table
>>
>> [u32]
>>     22.63%  [kernel]          [k] copy_user_generic_string
>>     14.46%  [kernel]          [k] memcpy
>>      9.19%  [ip_tables]       [k] ipt_do_table
>>      8.47%  [xt_u32]          [k] u32_mt
>>      5.32%  [kernel]          [k] skb_copy_bits
>>
>> The big difference appears to be in memory copying. I have not
>> looked into u32, so cannot explain this right now. More interestingly,
>> at higher rate, sk_run_filter appears to use as many cycles as u32_mt
>> (both traces have roughly the same number of events).
>>
>> One caveat: to work independent of device link layer, the filter
>> expects DLT_RAW style BPF programs, i.e., those that expect the
>> packet to start at the IP layer.
>> ---
>>  include/linux/netfilter/xt_bpf.h |   17 +++++++
>>  net/netfilter/Kconfig            |    9 ++++
>>  net/netfilter/Makefile           |    1 +
>>  net/netfilter/x_tables.c         |    5 +-
>>  net/netfilter/xt_bpf.c           |   88 ++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 118 insertions(+), 2 deletions(-)
>>  create mode 100644 include/linux/netfilter/xt_bpf.h
>>  create mode 100644 net/netfilter/xt_bpf.c
>>
>> diff --git a/include/linux/netfilter/xt_bpf.h b/include/linux/netfilter/xt_bpf.h
>> new file mode 100644
>> index 0000000..23502c0
>> --- /dev/null
>> +++ b/include/linux/netfilter/xt_bpf.h
>> @@ -0,0 +1,17 @@
>> +#ifndef _XT_BPF_H
>> +#define _XT_BPF_H
>> +
>> +#include <linux/filter.h>
>> +#include <linux/types.h>
>> +
>> +struct xt_bpf_info {
>> +     __u16 bpf_program_num_elem;
>> +
>> +     /* only used in kernel */
>> +     struct sk_filter *filter __attribute__((aligned(8)));
>> +
>> +     /* variable size, based on program_num_elem */
>> +     struct sock_filter bpf_program[0];
>> +};
>> +
>> +#endif /*_XT_BPF_H */
>> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
>> index c9739c6..c7cc0b8 100644
>> --- a/net/netfilter/Kconfig
>> +++ b/net/netfilter/Kconfig
>> @@ -798,6 +798,15 @@ config NETFILTER_XT_MATCH_ADDRTYPE
>>         If you want to compile it as a module, say M here and read
>>         <file:Documentation/kbuild/modules.txt>.  If unsure, say `N'.
>>
>> +config NETFILTER_XT_MATCH_BPF
>> +     tristate '"bpf" match support'
>> +     depends on NETFILTER_ADVANCED
>> +     help
>> +       BPF matching applies a linux socket filter to each packet and
>> +          accepts those for which the filter returns non-zero.
>> +
>> +       To compile it as a module, choose M here.  If unsure, say N.
>> +
>>  config NETFILTER_XT_MATCH_CLUSTER
>>       tristate '"cluster" match support'
>>       depends on NF_CONNTRACK
>> diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
>> index 8e5602f..9f12eeb 100644
>> --- a/net/netfilter/Makefile
>> +++ b/net/netfilter/Makefile
>> @@ -98,6 +98,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_IDLETIMER) += xt_IDLETIMER.o
>>
>>  # matches
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_ADDRTYPE) += xt_addrtype.o
>> +obj-$(CONFIG_NETFILTER_XT_MATCH_BPF) += xt_bpf.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_CLUSTER) += xt_cluster.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_COMMENT) += xt_comment.o
>>  obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o
>> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
>> index 8d987c3..26306be 100644
>> --- a/net/netfilter/x_tables.c
>> +++ b/net/netfilter/x_tables.c
>> @@ -379,8 +379,9 @@ int xt_check_match(struct xt_mtchk_param *par,
>>       if (XT_ALIGN(par->match->matchsize) != size &&
>>           par->match->matchsize != -1) {
>>               /*
>> -              * ebt_among is exempt from centralized matchsize checking
>> -              * because it uses a dynamic-size data set.
>> +              * matches of variable size length, such as ebt_among,
>> +              * are exempt from centralized matchsize checking. They
>> +              * skip the test by setting xt_match.matchsize to -1.
>>                */
>>               pr_err("%s_tables: %s.%u match: invalid size "
>>                      "%u (kernel) != (user) %u\n",
>> diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
>> new file mode 100644
>> index 0000000..07077c5
>> --- /dev/null
>> +++ b/net/netfilter/xt_bpf.c
>> @@ -0,0 +1,88 @@
>> +/* Xtables module to match packets using a BPF filter.
>> + * Copyright 2012 Google Inc.
>> + * Written by Willem de Bruijn <willemb@google.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + */
>> +
>> +#include <linux/module.h>
>> +#include <linux/skbuff.h>
>> +#include <linux/ipv6.h>
>> +#include <linux/filter.h>
>> +#include <net/ip.h>
>> +
>> +#include <linux/netfilter/xt_bpf.h>
>> +#include <linux/netfilter/x_tables.h>
>> +
>> +MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
>> +MODULE_DESCRIPTION("Xtables: BPF filter match");
>> +MODULE_LICENSE("GPL");
>> +MODULE_ALIAS("ipt_bpf");
>> +MODULE_ALIAS("ip6t_bpf");
>> +
>> +static int bpf_mt_check(const struct xt_mtchk_param *par)
>> +{
>> +     struct xt_bpf_info *info = par->matchinfo;
>> +     const struct xt_entry_match *match;
>> +     struct sock_fprog program;
>> +     int expected_len;
>> +
>> +     match = container_of(par->matchinfo, const struct xt_entry_match, data);
>> +     expected_len = sizeof(struct xt_entry_match) +
>> +                    sizeof(struct xt_bpf_info) +
>> +                    (sizeof(struct sock_filter) *
>> +                     info->bpf_program_num_elem);
>> +
>> +     if (match->u.match_size != expected_len) {
>> +             pr_info("bpf: check failed: incorrect length\n");
>> +             return -EINVAL;
>> +     }
>> +
>> +     program.len = info->bpf_program_num_elem;
>> +     program.filter = info->bpf_program;
>> +     if (sk_unattached_filter_create(&info->filter, &program)) {
>> +             pr_info("bpf: check failed: parse error\n");
>> +             return -EINVAL;
>> +     }
>> +
>> +     return 0;
>> +}
>> +
>> +static bool bpf_mt(const struct sk_buff *skb, struct xt_action_param *par)
>> +{
>> +     const struct xt_bpf_info *info = par->matchinfo;
>> +
>> +     return SK_RUN_FILTER(info->filter, skb);
>> +}
>> +
>> +static void bpf_mt_destroy(const struct xt_mtdtor_param *par)
>> +{
>> +     const struct xt_bpf_info *info = par->matchinfo;
>> +     sk_unattached_filter_destroy(info->filter);
>> +}
>> +
>> +static struct xt_match bpf_mt_reg __read_mostly = {
>> +     .name           = "bpf",
>> +     .revision       = 0,
>> +     .family         = NFPROTO_UNSPEC,
>> +     .checkentry     = bpf_mt_check,
>> +     .match          = bpf_mt,
>> +     .destroy        = bpf_mt_destroy,
>> +     .matchsize      = -1, /* skip xt_check_match because of dynamic len */
>> +     .me             = THIS_MODULE,
>> +};
>> +
>> +static int __init bpf_mt_init(void)
>> +{
>> +     return xt_register_match(&bpf_mt_reg);
>> +}
>> +
>> +static void __exit bpf_mt_exit(void)
>> +{
>> +     xt_unregister_match(&bpf_mt_reg);
>> +}
>> +
>> +module_init(bpf_mt_init);
>> +module_exit(bpf_mt_exit);
>> --
>> 1.7.7.3
>>

^ permalink raw reply

* Re: [PATCH] 3com: make 3c59x depend on HAS_IOPORT
From: David Miller @ 2012-12-05 20:22 UTC (permalink / raw)
  To: jang; +Cc: netdev
In-Reply-To: <1354736666.14042.7.camel@localhost.localdomain>

From: Jan Glauber <jang@linux.vnet.ibm.com>
Date: Wed, 05 Dec 2012 20:44:26 +0100

> On Wed, 2012-12-05 at 12:58 -0500, David Miller wrote:
>> From: Jan Glauber <jang@linux.vnet.ibm.com>
>> Date: Wed, 05 Dec 2012 15:04:40 +0100
>> 
>> > From: Jan Glauber <jang@linux.vnet.ibm.com>
>> > 
>> > The 3com driver for 3c59x requires ioport_map. Since not all
>> > architectures support IO port mapping make 3c59x dependent on HAS_IOPORT.
>> > 
>> > Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
>> 
>> Which platforms support PCI or EISA yet do not set HAS_IOPORT?
>> 
> 
> s390. We wont get support for port I/O in the PCI hardware/firmware
> layer. (The patches for PCI on s390 are currently in linux-next).

Ok, I'll apply this to net-next, thanks.

^ permalink raw reply

* [RFC PATCH v2 0/3] Fix some multiqueue TUN problems
From: Paul Moore @ 2012-12-05 20:25 UTC (permalink / raw)
  To: netdev, linux-security-module, selinux; +Cc: jasowang, mst

Second draft of the LSM/SELinux fixes to the upcoming multiqueue TUN
functionality.  This draft incorporates all the comments/decisions
from the first draft, notably the new LSM and SELinux hook for the
TUNSETQUEUE operation.  Other LSMs do not provide TUN controls so they
are not affected.

Once we decide this is the right approach I'll push the associated
SELinux policy FLASK definitions upstream; for those who are interested
the SELinux policy diff in included in the description of patch 1/2.

I don't expect this to be the final patch, just a starting point for
further discussion so I didn't really do any testing, simply making
sure that it compiled cleanly.

---

Paul Moore (3):
      tun: correctly report an error in tun_flow_init()
      selinux: add the "create_queue" permission to the "tun_socket" class
      tun: fix LSM/SELinux labeling of tun/tap devices


 drivers/net/tun.c                   |   29 +++++++++++++----
 include/linux/security.h            |   59 +++++++++++++++++++++++++++--------
 security/capability.c               |   24 ++++++++++++--
 security/security.c                 |   28 ++++++++++++++---
 security/selinux/hooks.c            |   50 +++++++++++++++++++++++-------
 security/selinux/include/classmap.h |    2 +
 security/selinux/include/objsec.h   |    4 ++
 7 files changed, 156 insertions(+), 40 deletions(-)

^ permalink raw reply

* [RFC PATCH v2 1/3] tun: correctly report an error in tun_flow_init()
From: Paul Moore @ 2012-12-05 20:26 UTC (permalink / raw)
  To: netdev, linux-security-module, selinux; +Cc: jasowang, mst
In-Reply-To: <20121205202144.18626.61966.stgit@localhost>

On error, the error code from tun_flow_init() is lost inside
tun_set_iff(), this patch fixes this by assigning the tun_flow_init()
error code to the "err" variable which is returned by
the tun_flow_init() function on error.

Signed-off-by: Paul Moore <pmoore@redhat.com>
---
 drivers/net/tun.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index a1b2389..14a0454 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1591,7 +1591,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
 
 		tun_net_init(dev);
 
-		if (tun_flow_init(tun))
+		err = tun_flow_init(tun);
+		if (err < 0)
 			goto err_free_dev;
 
 		dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST |

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox