Netdev List


* [PATCH] cxgb3: Set vlan_feature on net_device
From: brenohl @ 2012-07-18 19:29 UTC (permalink / raw)
  To: divy; +Cc: netdev, Breno Leitao

The cxgb3 interface performs poorly when a VLAN is configured. On my
current setup, a PowerLinux 7R2, I get around 7 Gbps on a TCP_STREAM
test (8 instances, 4k message size).
With this patch, I am able to reach 9.5 Gbps.

Signed-off-by: Breno Leitao <brenohl@br.ibm.com>

diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index abb6ce7..fcf4b31 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -3173,6 +3173,9 @@ static void __devinit cxgb3_init_iscsi_mac(struct net_device *dev)
 	pi->iscsic.mac_addr[3] |= 0x80;
 }
 
+#define TSO_FLAGS (NETIF_F_TSO | NETIF_F_TSO6 | NETIF_F_TSO_ECN)
+#define VLAN_FEAT (NETIF_F_SG | NETIF_F_IP_CSUM | TSO_FLAGS | \
+			NETIF_F_IPV6_CSUM | NETIF_F_HIGHDMA)
 static int __devinit init_one(struct pci_dev *pdev,
 			      const struct pci_device_id *ent)
 {
@@ -3293,6 +3296,7 @@ static int __devinit init_one(struct pci_dev *pdev,
 		netdev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM |
 			NETIF_F_TSO | NETIF_F_RXCSUM | NETIF_F_HW_VLAN_RX;
 		netdev->features |= netdev->hw_features | NETIF_F_HW_VLAN_TX;
+		netdev->vlan_features |= netdev->features & VLAN_FEAT;
 		if (pci_using_dac)
 			netdev->features |= NETIF_F_HIGHDMA;
 
-- 
1.7.1
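
Note (illustrative sketch, not part of the patch): the added line keeps
only those offload bits that are both enabled on the device and listed in
VLAN_FEAT. The standalone model below shows that masking step. The F_*
names and bit positions are placeholders invented for this example; they
are not the kernel's NETIF_F_* definitions.

```c
#include <stdint.h>

/* Placeholder feature bits for illustration only; not the real NETIF_F_* values. */
#define F_SG        (1u << 0)
#define F_IP_CSUM   (1u << 1)
#define F_IPV6_CSUM (1u << 2)
#define F_HIGHDMA   (1u << 3)
#define F_TSO       (1u << 4)
#define F_TSO6      (1u << 5)
#define F_TSO_ECN   (1u << 6)
#define F_RXCSUM    (1u << 7)
#define F_VLAN_RX   (1u << 8)
#define F_VLAN_TX   (1u << 9)

#define F_TSO_FLAGS (F_TSO | F_TSO6 | F_TSO_ECN)
#define F_VLAN_FEAT (F_SG | F_IP_CSUM | F_TSO_FLAGS | \
		     F_IPV6_CSUM | F_HIGHDMA)

/* Offloads a VLAN device may inherit: enabled on the parent AND in the mask. */
static uint32_t vlan_features_from(uint32_t features)
{
	return features & F_VLAN_FEAT;
}
```

With the bits enabled in the hunk above (SG, IP_CSUM, TSO, RXCSUM and the
two VLAN bits), the mask would keep SG, IP_CSUM and TSO. In this model,
HIGHDMA is carried over only if it is already set in the feature word at
the moment the mask is applied.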

^ permalink raw reply related

* Re: [PATCH 10/15] ipv4: Cache input routes in fib_info nexthops.
From: Joe Perches @ 2012-07-18 19:27 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120718.112413.1969496621247659288.davem@davemloft.net>

On Wed, 2012-07-18 at 11:24 -0700, David Miller wrote:
> Caching input routes is slightly simpler than output routes, since we
> don't need to be concerned with nexthop exceptions.  (locally
> destined, and routed packets, never trigger PMTU events or redirects
> that will be processed by us).
[]
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
[]
> @@ -1355,11 +1357,11 @@ static int __mkroute_input(struct sk_buff *skb,
[]
> +	do_cache = false;
> +	if (res->fi) {
> +		if (!(flags & RTCF_DIRECTSRC) && !itag) {
> +			rth = FIB_RES_NH(*res).nh_rth_input;
> +			if (rth) {
> +				dst_use(&rth->dst, jiffies);
> +				goto out;
> +			}
> +			do_cache = true;
> +		}
> +	}
[]
> @@ -1568,8 +1580,20 @@ brd_input:
[]
> +	do_cache = false;
> +	if (res.fi) {
> +		if (!(flags & RTCF_DIRECTSRC) && !itag) {
> +			rth = FIB_RES_NH(res).nh_rth_input;
> +			if (rth) {
> +				dst_use(&rth->dst, jiffies);
> +				goto set_and_out;
> +			}
> +			do_cache = true;
> +		}
> +	}

Maybe a helper like:

	if (some_do_cache_name(rth, res, itag, flags, &do_cache))
		goto foo;
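
Note (illustrative sketch, not from the thread): one possible shape for
such a helper, written against simplified stand-in types so that it
compiles outside the kernel. The name rt_input_cache_lookup(), the
stand-in structs, the use_count field and the flag value are all
assumptions made for this example. It returns the cached route rather
than a bool, which is a variation on the suggestion above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the check needs. */
struct rtable {
	int use_count;
};

struct fib_nh {
	struct rtable *nh_rth_input;
};

struct fib_result {
	struct fib_nh *nh;	/* NULL models the "res->fi == NULL" case */
};

#define RTCF_DIRECTSRC 0x04000000u	/* placeholder value for this sketch */

/*
 * Returns the cached input route if one can be reused, otherwise NULL.
 * *do_cache is set when the caller should store the route it builds.
 */
static struct rtable *rt_input_cache_lookup(const struct fib_result *res,
					    unsigned int itag,
					    unsigned int flags,
					    bool *do_cache)
{
	struct rtable *rth;

	*do_cache = false;
	if (!res->nh || (flags & RTCF_DIRECTSRC) || itag)
		return NULL;

	rth = res->nh->nh_rth_input;
	if (rth) {
		rth->use_count++;	/* stands in for dst_use(&rth->dst, jiffies) */
		return rth;
	}

	*do_cache = true;
	return NULL;
}
```

Each of the two call sites would then reduce to a lookup followed by a
jump to its own label when a route comes back.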

^ permalink raw reply

* Re: [PATCH 0/3 v2] net: various tilegx networking fixes
From: Chris Metcalf @ 2012-07-18 19:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel
In-Reply-To: <20120718.113623.984635805289135415.davem@davemloft.net>

On 7/18/2012 2:36 PM, David Miller wrote:
> From: Chris Metcalf <cmetcalf@tilera.com>
> Date: Sun, 1 Jul 2012 14:43:47 -0400
>
>> The tree is at:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git net
>>
>> Chris Metcalf (3):
>>       net: tilegx driver bugfix (be explicit about percpu queue number)
>>       tilegx net driver: handle payload data not in frags
>>       tilegx net: use eth_hw_addr_random(), not random_ether_addr()
> These changes look fine, but when I pull from your tree I get tons of
> totally unrelated stuff and a merge conflict in this driver.
>
> Can you put together a clean pull against net-next?

The merge conflict was against Joe Perches' bombing of random_ether_addr()
to eth_random_addr().  I left in my change to convert that again to be
eth_hw_addr_random(), which naively seems like a better API, and sets
NET_ADDR_RANDOM, which is presumably a good thing.

I recreated the tree to be branched off of net-next.  (I had originally
created it off of Linus's tree, which in retrospect doesn't make much
sense.)  Please try to pull again - thanks!

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com

^ permalink raw reply

* Re: [PATCH 08/15] ipv4: Kill routes during PMTU/redirect updates.
From: Joe Perches @ 2012-07-18 19:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120718.112356.1409220904008377845.davem@davemloft.net>

On Wed, 2012-07-18 at 11:23 -0700, David Miller wrote:
> Mark them obsolete so there will be a re-lookup to fetch the
> FIB nexthop exception info.
[]
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
[]
> @@ -716,8 +717,8 @@ static void __ip_do_redirect(struct rtable *rt, struct sk_buff *skb, struct flow
>  					fnhe->fnhe_gw = new_gw;
>  				spin_unlock_bh(&fnhe_lock);
>  			}
> -			rt->rt_gateway = new_gw;
> -			rt->rt_flags |= RTCF_REDIRECTED;
> +			if (kill_route)
> +				rt->dst.obsolete = -2;

Perhaps -2 should be a #define?

Perhaps struct dst_entry.obsolete could be a char instead of
a short and a pad byte could be added for some future use.

Maybe:

 include/net/dst.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 5161046..6c40490 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -65,7 +65,8 @@ struct dst_entry {
 	unsigned short		pending_confirm;
 
 	short			error;
-	short			obsolete;
+	char			obsolete;
+	char			__pad3;
 	unsigned short		header_len;	/* more space at head required */
 	unsigned short		trailer_len;	/* space to reserve at tail */
 #ifdef CONFIG_IP_ROUTE_CLASSID
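
Note (illustrative sketch, not from the thread): the two suggestions
above combined in a standalone form. DST_OBSOLETE_KILL is a hypothetical
name for the -2 value, and struct dst_entry_sketch is a cut-down
stand-in, not the real struct dst_entry. The sketch uses signed char
rather than plain char because the signedness of plain char is
implementation-defined in C, so a negative marker stored in a plain char
would not compare as negative on ABIs where char is unsigned.

```c
#include <stdbool.h>

/* Hypothetical name for the magic value used in the quoted hunk. */
#define DST_OBSOLETE_KILL (-2)

/* Cut-down stand-in for struct dst_entry with the narrowed field. */
struct dst_entry_sketch {
	short		error;
	signed char	obsolete;	/* explicit sign: plain char may be unsigned */
	char		__pad3;		/* spare byte for future use */
};

static void dst_mark_killed(struct dst_entry_sketch *dst)
{
	dst->obsolete = DST_OBSOLETE_KILL;
}

static bool dst_is_killed(const struct dst_entry_sketch *dst)
{
	return dst->obsolete == DST_OBSOLETE_KILL;
}
```

Any existing test of the form "obsolete < 0" keeps working with this
layout, since the field stays signed.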

^ permalink raw reply related

* Re: [PATCH] net: Statically initialize init_net.dev_base_head
From: Eric Dumazet @ 2012-07-18 19:13 UTC (permalink / raw)
  To: Mark Rustad; +Cc: netdev, davem, gaofeng, nhorman
In-Reply-To: <20120718190607.22923.77935.stgit@host1-mdrustad.localdomain>

On Wed, 2012-07-18 at 12:06 -0700, Mark Rustad wrote:
> This change eliminates an initialization-order hazard most
> recently seen when netprio_cgroup is built into the kernel.
> 
> With thanks to Eric Dumazet for catching a bug.
> 
> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
> ---
> 
>  net/core/dev.c           |    3 ++-
>  net/core/net_namespace.c |    4 +++-
>  2 files changed, 5 insertions(+), 2 deletions(-)

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* [PATCH net-next] ipx: move peII functions
From: Stephen Hemminger @ 2012-07-18 19:09 UTC (permalink / raw)
  To: David Miller, Arnaldo Carvalho de Melo; +Cc: netdev

The Ethernet II wrapper is only used by the IPX protocol; it may once
have been used by AppleTalk, but not currently. Therefore it makes sense
to move it to the IPX dust bin and drop the exports.

Build tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>


---
 net/ethernet/Makefile |    2 --
 net/ethernet/pe2.c    |   37 -------------------------------------
 net/ipx/Makefile      |    2 +-
 net/ipx/pe2.c         |   35 +++++++++++++++++++++++++++++++++++
 4 files changed, 36 insertions(+), 40 deletions(-)

--- a/net/ethernet/Makefile	2012-07-16 12:08:12.163683266 -0700
+++ b/net/ethernet/Makefile	2012-07-17 09:25:49.009927882 -0700
@@ -3,5 +3,3 @@
 #
 
 obj-y					+= eth.o
-obj-$(subst m,y,$(CONFIG_IPX))		+= pe2.o
-obj-$(subst m,y,$(CONFIG_ATALK))	+= pe2.o
--- a/net/ethernet/pe2.c	2012-07-16 12:08:12.163683266 -0700
+++ /dev/null	1970-01-01 00:00:00.000000000 +0000
@@ -1,37 +0,0 @@
-#include <linux/in.h>
-#include <linux/mm.h>
-#include <linux/module.h>
-#include <linux/netdevice.h>
-#include <linux/skbuff.h>
-#include <linux/slab.h>
-
-#include <net/datalink.h>
-
-static int pEII_request(struct datalink_proto *dl,
-			struct sk_buff *skb, unsigned char *dest_node)
-{
-	struct net_device *dev = skb->dev;
-
-	skb->protocol = htons(ETH_P_IPX);
-	dev_hard_header(skb, dev, ETH_P_IPX, dest_node, NULL, skb->len);
-	return dev_queue_xmit(skb);
-}
-
-struct datalink_proto *make_EII_client(void)
-{
-	struct datalink_proto *proto = kmalloc(sizeof(*proto), GFP_ATOMIC);
-
-	if (proto) {
-		proto->header_length = 0;
-		proto->request = pEII_request;
-	}
-
-	return proto;
-}
-EXPORT_SYMBOL(make_EII_client);
-
-void destroy_EII_client(struct datalink_proto *dl)
-{
-	kfree(dl);
-}
-EXPORT_SYMBOL(destroy_EII_client);
--- a/net/ipx/Makefile	2012-07-16 12:08:12.219683209 -0700
+++ b/net/ipx/Makefile	2012-07-17 09:27:36.218231168 -0700
@@ -4,5 +4,5 @@
 
 obj-$(CONFIG_IPX) += ipx.o
 
-ipx-y			:= af_ipx.o ipx_route.o ipx_proc.o
+ipx-y			:= af_ipx.o ipx_route.o ipx_proc.o pe2.o
 ipx-$(CONFIG_SYSCTL)	+= sysctl_net_ipx.o
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ b/net/ipx/pe2.c	2012-07-17 09:28:21.478359247 -0700
@@ -0,0 +1,35 @@
+#include <linux/in.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/skbuff.h>
+#include <linux/slab.h>
+
+#include <net/datalink.h>
+
+static int pEII_request(struct datalink_proto *dl,
+			struct sk_buff *skb, unsigned char *dest_node)
+{
+	struct net_device *dev = skb->dev;
+
+	skb->protocol = htons(ETH_P_IPX);
+	dev_hard_header(skb, dev, ETH_P_IPX, dest_node, NULL, skb->len);
+	return dev_queue_xmit(skb);
+}
+
+struct datalink_proto *make_EII_client(void)
+{
+	struct datalink_proto *proto = kmalloc(sizeof(*proto), GFP_ATOMIC);
+
+	if (proto) {
+		proto->header_length = 0;
+		proto->request = pEII_request;
+	}
+
+	return proto;
+}
+
+void destroy_EII_client(struct datalink_proto *dl)
+{
+	kfree(dl);
+}

^ permalink raw reply

* [PATCH] net: Statically initialize init_net.dev_base_head
From: Mark Rustad @ 2012-07-18 19:06 UTC (permalink / raw)
  To: netdev, davem, gaofeng, nhorman, eric.dumazet

This change eliminates an initialization-order hazard most
recently seen when netprio_cgroup is built into the kernel.

With thanks to Eric Dumazet for catching a bug.

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
---

 net/core/dev.c           |    3 ++-
 net/core/net_namespace.c |    4 +++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 0f28a9e..1cb0d8a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6283,7 +6283,8 @@ static struct hlist_head *netdev_create_hash(void)
 /* Initialize per network namespace state */
 static int __net_init netdev_init(struct net *net)
 {
-	INIT_LIST_HEAD(&net->dev_base_head);
+	if (net != &init_net)
+		INIT_LIST_HEAD(&net->dev_base_head);
 
 	net->dev_name_head = netdev_create_hash();
 	if (net->dev_name_head == NULL)
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index dddbacb..42f1e1c 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -27,7 +27,9 @@ static DEFINE_MUTEX(net_mutex);
 LIST_HEAD(net_namespace_list);
 EXPORT_SYMBOL_GPL(net_namespace_list);
 
-struct net init_net;
+struct net init_net = {
+	.dev_base_head = LIST_HEAD_INIT(init_net.dev_base_head),
+};
 EXPORT_SYMBOL(init_net);
 
 #define INITIAL_NET_GEN_PTRS	13 /* +1 for len +2 for rcu_head */
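
Note (illustrative sketch, not part of the patch): why a static
initializer removes the ordering hazard. The list_head type and
LIST_HEAD_INIT macro below are minimal copies of the kernel idiom so the
example compiles on its own; struct net_sketch, early_net and late_net
are names made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal copy of the kernel's circular list head, for illustration. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }

static inline bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

struct net_sketch {
	struct list_head dev_base_head;
};

/* Zero-initialized (bss): next/prev stay NULL until some init code runs. */
static struct net_sketch late_net;

/* Statically initialized (data): a valid empty list before any code runs. */
static struct net_sketch early_net = {
	.dev_base_head = LIST_HEAD_INIT(early_net.dev_base_head),
};
```

A walk over early_net.dev_base_head terminates immediately because the
head points at itself, while a walk over late_net.dev_base_head before
initialization would follow a NULL next pointer.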

^ permalink raw reply related

* Re: [PATCH 07/15] ipv4: Adjust semantics of rt->rt_gateway.
From: David Miller @ 2012-07-18 19:00 UTC (permalink / raw)
  To: joe; +Cc: netdev
In-Reply-To: <1342637834.2013.4.camel@joe2Laptop>

From: Joe Perches <joe@perches.com>
Date: Wed, 18 Jul 2012 11:57:14 -0700

> maybe use the moderately common gcc ?: extension

Yes, or even better a helper function.

I'll do something about this before I push it out for real, thanks
Joe.

^ permalink raw reply

* Re: [PATCH 07/15] ipv4: Adjust semantics of rt->rt_gateway.
From: Joe Perches @ 2012-07-18 18:57 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120718.112346.920698833832945112.davem@davemloft.net>

On Wed, 2012-07-18 at 11:23 -0700, David Miller wrote:
> In order to allow prefixed routes, we have to adjust how rt_gateway
> is set an interpreted.

Typo: "an" should be "and". Also a trivial style note:

> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
[]
> @@ -1079,8 +1079,10 @@ void ip_rt_get_source(u8 *addr, struct sk_buff *skb, struct rtable *rt)
>  		if (fib_lookup(dev_net(rt->dst.dev), &fl4, &res) == 0)
>  			src = FIB_RES_PREFSRC(dev_net(rt->dst.dev), res);
>  		else
> -			src = inet_select_addr(rt->dst.dev, rt->rt_gateway,
> -					RT_SCOPE_UNIVERSE);
> +			src = inet_select_addr(rt->dst.dev, (rt->rt_gateway ?
> +							     rt->rt_gateway :
> +							     iph->daddr),
> +					       RT_SCOPE_UNIVERSE);

maybe use the moderately common gcc ?: extension

			src = inet_select_addr(rt->dst.dev,
					       rt->rt_gateway ?: iph->daddr,
					       RT_SCOPE_UNIVERSE);
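
Note (illustrative sketch, not from the thread): the GNU conditional
with an omitted middle operand next to its standard C spelling. The
function names are made up for this example, and the first function
needs a compiler that accepts the extension, such as gcc or clang.

```c
#include <stdint.h>

/*
 * GNU extension: "a ?: b" yields a if a is nonzero, otherwise b,
 * and evaluates a only once.
 */
static uint32_t pick_gateway_gnu(uint32_t rt_gateway, uint32_t daddr)
{
	return rt_gateway ?: daddr;
}

/* Standard C spelling of the same selection. */
static uint32_t pick_gateway_std(uint32_t rt_gateway, uint32_t daddr)
{
	return rt_gateway ? rt_gateway : daddr;
}
```

Both forms pick the gateway when it is set and fall back to the
destination address when it is zero.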

^ permalink raw reply

* [PATCH v2] net: cgroup: null ptr dereference in netprio cgroup during init
From: John Fastabend @ 2012-07-18 18:34 UTC (permalink / raw)
  To: davem, gaofeng, nhorman; +Cc: mark.d.rustad, netdev, eric.dumazet

When the netprio cgroup is built into the kernel, cgroup_init will call
cgrp_create, which eventually calls update_netdev_tables. This happens
before do_initcalls(), so a null ptr dereference occurs on init_net.

This patch adds a check on init_net.count to verify that the structure
has been initialized. The failure was introduced here:

commit ef209f15980360f6945873df3cd710c5f62f2a3e
Author: Gao feng <gaofeng@cn.fujitsu.com>
Date:   Wed Jul 11 21:50:15 2012 +0000

    net: cgroup: fix access the unallocated memory in netprio cgroup

Tested with ping with netprio_cgroup as a module and built in.

[    0.256451] Initializing cgroup subsys net_prio
[    0.269948] BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
[    0.293303] IP: [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[    0.310175] PGD 0
[    0.316157] Oops: 0000 [#1] SMP
[    0.325775] CPU 0
[    0.331227] Modules linked in:
[    0.340846]
[    0.345264] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7+ #1 AMD Dinar/Dinar
[    0.366555] RIP: 0010:[<ffffffff81512e37>]  [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[    0.390681] RSP: 0000:ffffffff81c01ea8  EFLAGS: 00010213
[    0.406501] RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
[    0.427764] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81c9d840
[    0.449026] RBP: ffffffff81c01ed8 R08: 00000000000164e0 R09: 0000000000000000
[    0.470289] R10: ffff8804278303c0 R11: 0000000000000000 R12: 0000000000000001
[    0.491553] R13: ffff8804278303c0 R14: ffff881036fd0700 R15: 0000000000000000
[    0.512819] FS:  0000000000000000(0000) GS:ffff880427c00000(0000) knlGS:0000000000000000
[    0.536932] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.554049] CR2: 0000000000000698 CR3: 0000000001c0b000 CR4: 00000000000406b0
[    0.575311] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.596574] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.617838] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c13420)
[    0.642471] Stack:
[    0.648442]  ffffffff81c01eb8 ffffffff81c9f320 ffffffff81c9f320 ffffffff81c9f320
[    0.670522]  ffffffff81c9f320 ffffffff81d482c0 ffffffff81c01ef8 ffffffff81d10397
[    0.692604]  ffffffff81e99790 0000000000000048 ffffffff81c01f18 ffffffff81d1062e
[    0.714687] Call Trace:
[    0.721960]  [<ffffffff81d10397>] cgroup_init_subsys+0x51/0xdf
[    0.739337]  [<ffffffff81d1062e>] cgroup_init+0x36/0x119
[    0.755160]  [<ffffffff81cf5c02>] start_kernel+0x38f/0x3c4
[    0.771501]  [<ffffffff81cf5672>] ? repair_env_string+0x5e/0x5e
[    0.789138]  [<ffffffff81cf5356>] x86_64_start_reservations+0x131/0x135
[    0.808849]  [<ffffffff81cf545a>] x86_64_start_kernel+0x100/0x10f


Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 net/core/net_namespace.c  |    4 +++-
 net/core/netprio_cgroup.c |    3 +++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index dddbacb..faa33bb 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -27,7 +27,9 @@ static DEFINE_MUTEX(net_mutex);
 LIST_HEAD(net_namespace_list);
 EXPORT_SYMBOL_GPL(net_namespace_list);
 
-struct net init_net;
+struct net init_net = {
+	.count = ATOMIC_INIT(0),
+};
 EXPORT_SYMBOL(init_net);
 
 #define INITIAL_NET_GEN_PTRS	13 /* +1 for len +2 for rcu_head */
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index b2e9caa..e9fd7fd 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -116,6 +116,9 @@ static int update_netdev_tables(void)
 	u32 max_len;
 	struct netprio_map *map;
 
+	if (!atomic_read(&init_net.count))
+		return ret;
+
 	rtnl_lock();
 	max_len = atomic_read(&max_prioidx) + 1;
 	for_each_netdev(&init_net, dev) {
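
Note (illustrative sketch, not part of the patch): the guard pattern in
isolation, using C11 atomics in place of the kernel's atomic_t. A
statically zeroed count acts as a "not set up yet" marker, and the
update routine returns early while it is still zero. struct netns_sketch,
its ndevs field and update_tables_sketch() are names invented for this
example.

```c
#include <stdatomic.h>

/* Cut-down stand-in for struct net: a reference count plus a device count. */
struct netns_sketch {
	atomic_int	count;	/* 0 until the namespace has been set up */
	int		ndevs;
};

/*
 * Returns the number of devices visited, or 0 when the namespace has not
 * been initialized yet and there is nothing safe to walk.
 */
static int update_tables_sketch(struct netns_sketch *net)
{
	if (!atomic_load(&net->count))
		return 0;

	return net->ndevs;	/* stands in for the for_each_netdev() loop */
}
```

The early return is only meaningful if the count is guaranteed to read
as zero before setup, which is what the explicit static initializer in
the first hunk documents.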

^ permalink raw reply related

* Re: [PATCH] net: cgroup: null ptr dereference in netprio cgroup during init
From: David Miller @ 2012-07-18 18:44 UTC (permalink / raw)
  To: john.r.fastabend; +Cc: gaofeng, nhorman, mark.d.rustad, netdev, eric.dumazet
In-Reply-To: <20120718182711.22872.95370.stgit@jf-dev1-dcblab>

From: John Fastabend <john.r.fastabend@intel.com>
Date: Wed, 18 Jul 2012 11:27:11 -0700

> @@ -27,7 +27,9 @@ static DEFINE_MUTEX(net_mutex);
>  LIST_HEAD(net_namespace_list);
>  EXPORT_SYMBOL_GPL(net_namespace_list);
>  
> -struct net init_net;
> +struct net init_net = {
> +			.count = ATOMIC_INIT(0),
> +		      };
>  EXPORT_SYMBOL(init_net);

That's some fancy indentation you've got there cowboy.

^ permalink raw reply

* Re: [PATCH net-next 2/7] sfc: Add channel specific receive_skb handler and post_remove callback
From: David Miller @ 2012-07-18 18:43 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, linux-net-drivers, ajackson, richardcochran
In-Reply-To: <1342636962.2617.68.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 18 Jul 2012 19:42:42 +0100

> On Wed, 2012-07-18 at 11:32 -0700, David Miller wrote:
>> From: Ben Hutchings <bhutchings@solarflare.com>
>> Date: Wed, 18 Jul 2012 19:20:00 +0100
>> 
>> > +	void (*receive_skb)(struct efx_channel *, struct sk_buff *);
>> 
>> This looks to me like a conduit for proprietary features implemented
>> in a binary-only blob.
>> 
>> I understand how you're using it here for PTP, but you're really opening
>> the door for things I really wouldn't be very happy about.
> 
> Through all the functions that, er, aren't exported?
> 
> Even in the out-of-tree version of sfc there is no receive path hook any
> more; I converted the client driver that wanted it (which is under GNU
> GPL, thank you very much) to use netfilter.

Fair enough, in light of that this doesn't bother me so much :)

^ permalink raw reply

* Re: [PATCH net-next 2/7] sfc: Add channel specific receive_skb handler and post_remove callback
From: Ben Hutchings @ 2012-07-18 18:42 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers, ajackson, richardcochran
In-Reply-To: <20120718.113256.279646201702165485.davem@davemloft.net>

On Wed, 2012-07-18 at 11:32 -0700, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Wed, 18 Jul 2012 19:20:00 +0100
> 
> > +	void (*receive_skb)(struct efx_channel *, struct sk_buff *);
> 
> This looks to me like a conduit for proprietary features implemented
> in a binary-only blob.
> 
> I understand how you're using it here for PTP, but you're really opening
> the door for things I really wouldn't be very happy about.

Through all the functions that, er, aren't exported?

Even in the out-of-tree version of sfc there is no receive path hook any
more; I converted the client driver that wanted it (which is under GNU
GPL, thank you very much) to use netfilter.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH] net: cgroup: null ptr dereference in netprio cgroup during init
From: John Fastabend @ 2012-07-18 18:27 UTC (permalink / raw)
  To: davem, gaofeng, nhorman; +Cc: mark.d.rustad, netdev, eric.dumazet

When the netprio cgroup is built into the kernel, cgroup_init will call
cgrp_create, which eventually calls update_netdev_tables. This happens
before do_initcalls(), so a null ptr dereference occurs on init_net.

This patch adds a check on init_net.count to verify that the structure
has been initialized. The failure was introduced here:

commit ef209f15980360f6945873df3cd710c5f62f2a3e
Author: Gao feng <gaofeng@cn.fujitsu.com>
Date:   Wed Jul 11 21:50:15 2012 +0000

    net: cgroup: fix access the unallocated memory in netprio cgroup

Tested with ping with netprio_cgroup as a module and built in.

[    0.256451] Initializing cgroup subsys net_prio
[    0.269948] BUG: unable to handle kernel NULL pointer dereference at 0000000000000698
[    0.293303] IP: [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[    0.310175] PGD 0
[    0.316157] Oops: 0000 [#1] SMP
[    0.325775] CPU 0
[    0.331227] Modules linked in:
[    0.340846]
[    0.345264] Pid: 0, comm: swapper/0 Not tainted 3.5.0-rc7+ #1 AMD Dinar/Dinar
[    0.366555] RIP: 0010:[<ffffffff81512e37>]  [<ffffffff81512e37>] cgrp_create+0x107/0x1c0
[    0.390681] RSP: 0000:ffffffff81c01ea8  EFLAGS: 00010213
[    0.406501] RAX: 0000000000000000 RBX: ffffffffffffff10 RCX: 0000000000000000
[    0.427764] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffff81c9d840
[    0.449026] RBP: ffffffff81c01ed8 R08: 00000000000164e0 R09: 0000000000000000
[    0.470289] R10: ffff8804278303c0 R11: 0000000000000000 R12: 0000000000000001
[    0.491553] R13: ffff8804278303c0 R14: ffff881036fd0700 R15: 0000000000000000
[    0.512819] FS:  0000000000000000(0000) GS:ffff880427c00000(0000) knlGS:0000000000000000
[    0.536932] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.554049] CR2: 0000000000000698 CR3: 0000000001c0b000 CR4: 00000000000406b0
[    0.575311] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.596574] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.617838] Process swapper/0 (pid: 0, threadinfo ffffffff81c00000, task ffffffff81c13420)
[    0.642471] Stack:
[    0.648442]  ffffffff81c01eb8 ffffffff81c9f320 ffffffff81c9f320 ffffffff81c9f320
[    0.670522]  ffffffff81c9f320 ffffffff81d482c0 ffffffff81c01ef8 ffffffff81d10397
[    0.692604]  ffffffff81e99790 0000000000000048 ffffffff81c01f18 ffffffff81d1062e
[    0.714687] Call Trace:
[    0.721960]  [<ffffffff81d10397>] cgroup_init_subsys+0x51/0xdf
[    0.739337]  [<ffffffff81d1062e>] cgroup_init+0x36/0x119
[    0.755160]  [<ffffffff81cf5c02>] start_kernel+0x38f/0x3c4
[    0.771501]  [<ffffffff81cf5672>] ? repair_env_string+0x5e/0x5e
[    0.789138]  [<ffffffff81cf5356>] x86_64_start_reservations+0x131/0x135
[    0.808849]  [<ffffffff81cf545a>] x86_64_start_kernel+0x100/0x10f


Reported-by: Mark Rustad <mark.d.rustad@intel.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 net/core/net_namespace.c  |    4 +++-
 net/core/netprio_cgroup.c |    3 +++
 2 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index dddbacb..0d37c94 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -27,7 +27,9 @@ static DEFINE_MUTEX(net_mutex);
 LIST_HEAD(net_namespace_list);
 EXPORT_SYMBOL_GPL(net_namespace_list);
 
-struct net init_net;
+struct net init_net = {
+			.count = ATOMIC_INIT(0),
+		      };
 EXPORT_SYMBOL(init_net);
 
 #define INITIAL_NET_GEN_PTRS	13 /* +1 for len +2 for rcu_head */
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index b2e9caa..e9fd7fd 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -116,6 +116,9 @@ static int update_netdev_tables(void)
 	u32 max_len;
 	struct netprio_map *map;
 
+	if (!atomic_read(&init_net.count))
+		return ret;
+
 	rtnl_lock();
 	max_len = atomic_read(&max_prioidx) + 1;
 	for_each_netdev(&init_net, dev) {

^ permalink raw reply related

* Re: [PATCH net-next 0/4] net/mlx4_en: Add accelerated RFS support
From: David Miller @ 2012-07-18 18:41 UTC (permalink / raw)
  To: ogerlitz; +Cc: roland, netdev, oren, yevgenyp, amirv
In-Reply-To: <1342621162-18498-1-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Wed, 18 Jul 2012 17:19:18 +0300

> This series from Amir Vadai adds support for Accelerated RFS 
> to the mlx4_en Ethernet driver.
> 
> The code uses the Accelerated RFS infrastructure and HW flow steering 
> to keep CPU affinity of rx interrupts and applications per TCP stream.
> 
> To do so, we had to add little protection to cpu_rmap.h against double 
> inclusion. Also, added linking between CPU to IRQ using rmap in the 
> mlx4_core driver.

Please use CONFIG_RFS_ACCEL consistently to protect this feature
in your driver sources.

Using CPU_RMAP in a few places is inconsistent, and not what other
drivers do.

Thanks.

^ permalink raw reply

* Re: [PATCH net-next V1 1/9] IB/ipoib: Add support for clones / multiple childs on the same partition
From: David Miller @ 2012-07-18 18:38 UTC (permalink / raw)
  To: ogerlitz; +Cc: roland, netdev, ali, sean.hefty, shlomop, erezsh
In-Reply-To: <1342609202-32427-2-git-send-email-ogerlitz@mellanox.com>

From: Or Gerlitz <ogerlitz@mellanox.com>
Date: Wed, 18 Jul 2012 13:59:54 +0300

> All sorts of childs are still created/deleted through sysfs, in a
> similar manner to the way legacy child interfaces are.

Network device instantiation of this type is the domain of
rtnl_link_ops rather than ugly sysfs interfaces.

^ permalink raw reply

* Re: [PATCH net-next V1 5/9] net/eipoib: Add ethtool file support
From: Ben Hutchings @ 2012-07-18 18:37 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: davem, roland, netdev, ali, sean.hefty, shlomop, Erez Shitrit
In-Reply-To: <1342609202-32427-6-git-send-email-ogerlitz@mellanox.com>

On Wed, 2012-07-18 at 13:59 +0300, Or Gerlitz wrote:
> From: Erez Shitrit <erezsh@mellanox.co.il>
> 
> Via ethtool the driver describes its version, ABI version, on what PIF
> interface it runs and various statistics.
[...]
> +static const char parent_strings[][ETH_GSTRING_LEN] = {
> +	/* private statistics */
> +	"tx_parent_dropped",
> +	"tx_vif_miss",
> +	"tx_neigh_miss",
> +	"tx_vlan",
> +	"tx_shared",
> +	"tx_proto_errors",
> +	"tx_skb_errors",
> +	"tx_slave_err",
> +
> +	"rx_parent_dropped",
> +	"rx_vif_miss",
> +	"rx_neigh_miss",
> +	"rx_vlan",
> +	"rx_shared",
> +	"rx_proto_errors",
> +	"rx_skb_errors",
> +	"rx_slave_err",
> +#define PORT_STATS_LEN	(8 * 2)
> +};
> +
> +#define PARENT_STATS_LEN (sizeof(parent_strings) / ETH_GSTRING_LEN)
> +
> +static void parent_get_strings(struct net_device *parent_dev,
> +			       uint32_t stringset, uint8_t *data)
> +{
> +	int index = 0, stats_off = 0, i;
> +
> +	if (stringset != ETH_SS_STATS)
> +		return;
> +
> +	for (i = 0; i < PORT_STATS_LEN; i++)
> +		strcpy(data + (index++) * ETH_GSTRING_LEN,
> +		       parent_strings[i + stats_off]);
> +
> +	stats_off += PORT_STATS_LEN;

This is a very long-winded way to write:
	memcpy(data, parent_strings, sizeof(parent_strings));
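
Note (illustrative sketch, not from the thread): a standalone comparison
of the per-entry strcpy() loop with a single memcpy() over a fixed-width
string table. GSTRING_LEN mirrors ETH_GSTRING_LEN (32); demo_strings and
the two function names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

#define GSTRING_LEN 32	/* same width as ETH_GSTRING_LEN */

static const char demo_strings[][GSTRING_LEN] = {
	"tx_parent_dropped",
	"tx_vif_miss",
	"rx_parent_dropped",
	"rx_vif_miss",
};

#define DEMO_STATS_LEN (sizeof(demo_strings) / GSTRING_LEN)

/* Per-entry copy, in the style of the quoted patch. */
static void get_strings_loop(uint8_t *data)
{
	size_t i;

	for (i = 0; i < DEMO_STATS_LEN; i++)
		strcpy((char *)data + i * GSTRING_LEN, demo_strings[i]);
}

/* Single copy of the whole table, as suggested in the review. */
static void get_strings_memcpy(uint8_t *data)
{
	memcpy(data, demo_strings, sizeof(demo_strings));
}
```

Both fill the buffer with the same names at the same offsets. The
memcpy() form also copies the zero padding after each name, so the
result does not depend on what the destination held beforehand.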

> +
> +}
> +
> +static void parent_get_ethtool_stats(struct net_device *parent_dev,
> +				     struct ethtool_stats *stats,
> +				     uint64_t *data)
> +{
> +	struct parent *parent = netdev_priv(parent_dev);
> +	int index = 0, i;
> +
> +	read_lock_bh(&parent->lock);
> +
> +	for (i = 0; i < PORT_STATS_LEN; i++)
> +		data[index++] = ((unsigned long *) &parent->port_stats)[i];
> +
> +	read_unlock_bh(&parent->lock);
> +}
> +
> +static int parent_get_sset_count(struct net_device *parent_dev, int sset)
> +{
> +	switch (sset) {
> +	case ETH_SS_STATS:
> +		return PARENT_STATS_LEN;
> +	default:
> +		return -EOPNOTSUPP;
> +	}
> +}
[...]

I get the feeling you've removed some code with unifdef; the result
looks really weird, with PORT_STATS_LEN and PARENT_STATS_LEN used
inconsistently.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 0/3 v2] net: various tilegx networking fixes
From: David Miller @ 2012-07-18 18:36 UTC (permalink / raw)
  To: cmetcalf; +Cc: netdev, linux-kernel
In-Reply-To: <201207181650.q6IGodZ7007565@lab-41.internal.tilera.com>

From: Chris Metcalf <cmetcalf@tilera.com>
Date: Sun, 1 Jul 2012 14:43:47 -0400

> The tree is at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git net
> 
> Chris Metcalf (3):
>       net: tilegx driver bugfix (be explicit about percpu queue number)
>       tilegx net driver: handle payload data not in frags
>       tilegx net: use eth_hw_addr_random(), not random_ether_addr()

These changes look fine, but when I pull from your tree I get tons of
totally unrelated stuff and a merge conflict in this driver.

Can you put together a clean pull against net-next?

Thanks.

^ permalink raw reply

* Re: That's pretty much it for 3.5.0
From: David Miller @ 2012-07-18 18:33 UTC (permalink / raw)
  To: mark.d.rustad-ral2JQCrhuEAvxtiuMwx3w
  Cc: eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	nhorman-2XuSBdqkA4R54TAoqtyWWQ,
	john.r.fastabend-ral2JQCrhuEAvxtiuMwx3w, h,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <FEB6B45E-1CCF-4CBC-AEB7-21D2088E175C-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

From: "Rustad, Mark D" <mark.d.rustad-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Date: Wed, 18 Jul 2012 18:31:31 +0000

> If this looks like a good change, I can send the patch. Is there any
> concern about init_net going from bss to data?

There is no such concern, I like your change a lot.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH net-next 2/7] sfc: Add channel specific receive_skb handler and post_remove callback
From: David Miller @ 2012-07-18 18:32 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, linux-net-drivers, ajackson, richardcochran
In-Reply-To: <1342635600.2617.54.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 18 Jul 2012 19:20:00 +0100

> +	void (*receive_skb)(struct efx_channel *, struct sk_buff *);

This looks to me like a conduit for proprietary features implemented
in a binary-only blob.

I understand how you're using it here for PTP, but you're really opening
the door for things I really wouldn't be very happy about.

^ permalink raw reply

* Re: That's pretty much it for 3.5.0
From: Rustad, Mark D @ 2012-07-18 18:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Neil Horman, Fastabend, John R,
	<h@hmsreliant.think-freely.org>, David Miller,
	<netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	<netfilter-devel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>
In-Reply-To: <1342634139.2626.3281.camel@edumazet-glaptop>


On Jul 18, 2012, at 10:55 AM, Eric Dumazet wrote:

> On Wed, 2012-07-18 at 17:36 +0000, Rustad, Mark D wrote:
>> 
>> The following change simply statically initializes init_net.dev_base_head. I copied and pasted it into the email, so this rendering may not work, but I can send it if this approach looks reasonable. I have verified that it resolves the issue above.
>> 
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 0f28a9e..db1ba61 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -6283,8 +6283,6 @@ static struct hlist_head *netdev_create_hash(void)
>> /* Initialize per network namespace state */
>> static int __net_init netdev_init(struct net *net)
>> {
>> -       INIT_LIST_HEAD(&net->dev_base_head);
>> -
> 
> 	if (net != &init_net)
> 		INIT_LIST_HEAD(&net->dev_base_head);

Ooooh. Good catch.

>>        net->dev_name_head = netdev_create_hash();
>>        if (net->dev_name_head == NULL)
>>                goto err_name;
>> diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
>> index dddbacb..42f1e1c 100644
>> --- a/net/core/net_namespace.c
>> +++ b/net/core/net_namespace.c
>> @@ -27,7 +27,9 @@ static DEFINE_MUTEX(net_mutex);
>> LIST_HEAD(net_namespace_list);
>> EXPORT_SYMBOL_GPL(net_namespace_list);
>> 
>> -struct net init_net;
>> +struct net init_net = {
>> +       .dev_base_head = LIST_HEAD_INIT(init_net.dev_base_head),
>> +};
>> EXPORT_SYMBOL(init_net);
>> 
>> #define INITIAL_NET_GEN_PTRS   13 /* +1 for len +2 for rcu_head */


If this looks like a good change, I can send the patch. Is there any concern about init_net going from bss to data?

-- 
Mark Rustad, LAN Access Division, Intel Corporation

^ permalink raw reply
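
The static initialization discussed in the message above, together with the
guard suggested in the reply, can be sketched in userspace C. This is only an
illustration: struct demo_net, demo_init_net and demo_netdev_init() are
hypothetical stand-ins, and the list helpers are minimal re-implementations
rather than the kernel's <linux/list.h>. It shows that a statically initialized
head is usable before any init function runs, and that re-initializing it later
would drop whatever had already been linked onto it.

```c
/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty list head points at itself in both directions. */
#define LIST_HEAD_INIT(name) { &(name), &(name) }

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Hypothetical namespace object, reduced to the one field of interest. */
struct demo_net {
	struct list_head dev_base_head;
};

/* Statically initialized: valid before any init function has run. */
static struct demo_net demo_init_net = {
	.dev_base_head = LIST_HEAD_INIT(demo_init_net.dev_base_head),
};

/* Runtime init that skips the statically initialized instance, so entries
 * linked onto it earlier are not lost. */
static void demo_netdev_init(struct demo_net *net)
{
	if (net != &demo_init_net)
		init_list_head(&net->dev_base_head);
}
```

Without the net != &demo_init_net check, every other namespace would be left
with an uninitialized head once the unconditional runtime init is removed.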

* Re: [PATCH net-next v4] ipv6: add ipv6_addr_hash() helper
From: David Miller @ 2012-07-18 18:29 UTC (permalink / raw)
  To: eric.dumazet; +Cc: joe, netdev, andrewmcgr, dave.taht, therbert
In-Reply-To: <1342635072.2626.3322.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 18 Jul 2012 20:11:12 +0200

> From: Eric Dumazet <edumazet@google.com>
> 
> Introduce ipv6_addr_hash() helper doing a XOR on all bits
> of an IPv6 address, with an optimized x86_64 version.
> 
> Use it in flow dissector, as suggested by Andrew McGregor,
> to reduce hash collision probabilities in fq_codel (and other
> users of flow dissector)
> 
> Use it in ip6_tunnel.c and use more bit shuffling, as suggested
> by David Laight, as existing hash was ignoring most of them.
> 
> Use it in sunrpc and use more bit shuffling, using hash_32().
> 
> Use it in net/ipv6/addrconf.c, using hash_32() as well.
> 
> As a cleanup, use it in net/ipv4/tcp_metrics.c
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Andrew McGregor <andrewmcgr@gmail.com>

Applied, thanks.

> v4: net/ipv6/addrconf.c part, sorry again David

The more you test my routing cache removal patches, the
more you will be forgiven :-))))))

^ permalink raw reply
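
As a rough illustration of the XOR fold described in the ipv6_addr_hash()
commit message above, here is a userspace sketch. It is not the kernel helper:
struct demo_in6_addr and both function names are hypothetical, and the 64-bit
variant only suggests what an optimized version for a 64-bit machine might look
like. Both variants XOR all 128 address bits down to 32, so they return the
same value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct in6_addr: 128 bits as four 32-bit words. */
struct demo_in6_addr {
	uint32_t w[4];
};

/* Portable variant: XOR the four 32-bit words together. */
static uint32_t demo_addr_hash(const struct demo_in6_addr *a)
{
	return a->w[0] ^ a->w[1] ^ a->w[2] ^ a->w[3];
}

/* 64-bit variant: XOR the two 64-bit halves, then fold the upper 32 bits
 * of the result into the lower 32 bits. */
static uint32_t demo_addr_hash64(const struct demo_in6_addr *a)
{
	uint64_t q[2];
	uint64_t x;

	memcpy(q, a->w, sizeof(q));	/* sidesteps alignment and aliasing */
	x = q[0] ^ q[1];
	return (uint32_t)(x ^ (x >> 32));
}
```

Because XOR is commutative, the fold gives the same result regardless of byte
order. Callers that need a bucket index would still mix the result further,
which is what the commit message's mentions of hash_32() refer to.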

* Re: [patch net-next] team: refine IFF_XMIT_DST_RELEASE capability
From: David Miller @ 2012-07-18 18:28 UTC (permalink / raw)
  To: eric.dumazet; +Cc: jiri, netdev, edumazet
In-Reply-To: <1342635454.2626.3337.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 18 Jul 2012 20:17:34 +0200

> On Wed, 2012-07-18 at 19:39 +0200, Jiri Pirko wrote:
>> Cloned patch of Eric Dumazet for bonding.
>> 
>> Some workloads greatly benefit from IFF_XMIT_DST_RELEASE capability
>> on output net device, avoiding dirtying dst refcount.
>> 
>> team currently disables IFF_XMIT_DST_RELEASE unconditionally.
>> 
>> If all ports have the IFF_XMIT_DST_RELEASE bit set, then
>> team dev can also have it in its priv_flags.
>> 
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>> ---
>>  drivers/net/team/team.c |    5 +++++
>>  1 file changed, 5 insertions(+)
> 
> Acked-by: Eric Dumazet <edumazet@google.com>

Applied, thanks.

^ permalink raw reply
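
The rule stated in the commit message above (the team device may carry the
flag only if every port carries it) amounts to AND-ing the bit across all
ports. A minimal userspace sketch follows, assuming a hypothetical
struct demo_port and an illustrative bit value; it is not the code from
drivers/net/team/team.c.

```c
#include <stddef.h>

/* Illustrative bit value; the real flag is defined by the kernel headers. */
#define DEMO_XMIT_DST_RELEASE 0x400u

/* Hypothetical port, reduced to its private flags. */
struct demo_port {
	unsigned int priv_flags;
};

/* Returns DEMO_XMIT_DST_RELEASE if every port has the bit set, else 0. */
static unsigned int demo_team_dst_release(const struct demo_port *ports,
					  size_t n)
{
	unsigned int flag = DEMO_XMIT_DST_RELEASE;
	size_t i;

	for (i = 0; i < n; i++)
		flag &= ports[i].priv_flags;
	return flag;
}

/* Replaces only the dst-release bit in the aggregate device's flags. */
static unsigned int demo_update_priv_flags(unsigned int old_flags,
					   unsigned int dst_release_flag)
{
	return (old_flags & ~DEMO_XMIT_DST_RELEASE) | dst_release_flag;
}
```

In this sketch a device with no ports keeps the bit set, because the
accumulator starts from the flag itself; whether that is the desired behaviour
for an empty team is a design choice the sketch does not settle.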

* [PATCH 15/15] ipv4: Kill rt->fi
From: David Miller @ 2012-07-18 18:24 UTC (permalink / raw)
  To: netdev


It's not really needed.

We only grabbed a reference to the fib_info for the sake of fib_info
local metrics.

However, fib_info objects are freed using RCU, as are therefore their
private metrics (if any).

We would have triggered a route cache flush if we eliminated a
reference to a fib_info object in the routing tables.

Therefore, any existing cached routes will first check and see that
they have been invalidated before an errant reference to these
metric values would occur.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/route.h |    1 -
 net/ipv4/route.c    |   32 +-------------------------------
 2 files changed, 1 insertion(+), 32 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index f3ef18a..665c9ce 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -56,7 +56,6 @@ struct rtable {
 
 	/* Miscellaneous cached information */
 	u32			rt_pmtu;
-	struct fib_info		*fi; /* for client ref to shared metrics */
 };
 
 static inline bool rt_is_input_route(const struct rtable *rt)
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index ee35047..34be3f2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -141,7 +141,6 @@ static int ip_rt_min_advmss __read_mostly	= 256;
 static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie);
 static unsigned int	 ipv4_default_advmss(const struct dst_entry *dst);
 static unsigned int	 ipv4_mtu(const struct dst_entry *dst);
-static void		 ipv4_dst_destroy(struct dst_entry *dst);
 static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst);
 static void		 ipv4_link_failure(struct sk_buff *skb);
 static void		 ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
@@ -171,7 +170,6 @@ static struct dst_ops ipv4_dst_ops = {
 	.default_advmss =	ipv4_default_advmss,
 	.mtu =			ipv4_mtu,
 	.cow_metrics =		ipv4_cow_metrics,
-	.destroy =		ipv4_dst_destroy,
 	.ifdown =		ipv4_dst_ifdown,
 	.negative_advice =	ipv4_negative_advice,
 	.link_failure =		ipv4_link_failure,
@@ -1026,17 +1024,6 @@ static struct dst_entry *ipv4_dst_check(struct dst_entry *dst, u32 cookie)
 	return dst;
 }
 
-static void ipv4_dst_destroy(struct dst_entry *dst)
-{
-	struct rtable *rt = (struct rtable *) dst;
-
-	if (rt->fi) {
-		fib_info_put(rt->fi);
-		rt->fi = NULL;
-	}
-}
-
-
 static void ipv4_link_failure(struct sk_buff *skb)
 {
 	struct rtable *rt;
@@ -1151,15 +1138,6 @@ static unsigned int ipv4_mtu(const struct dst_entry *dst)
 	return mtu;
 }
 
-static void rt_init_metrics(struct rtable *rt, struct fib_info *fi)
-{
-	if (fi->fib_metrics != (u32 *) dst_default_metrics) {
-		rt->fi = fi;
-		atomic_inc(&fi->fib_clntref);
-	}
-	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
-}
-
 static struct fib_nh_exception *find_exception(struct fib_nh *nh, __be32 daddr)
 {
 	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
@@ -1227,7 +1205,7 @@ static void rt_set_nexthop(struct rtable *rt, const struct fib_result *res,
 			rt->rt_gateway = nh->nh_gw;
 		if (unlikely(fnhe))
 			rt_bind_exception(rt, fnhe);
-		rt_init_metrics(rt, fi);
+		dst_init_metrics(&rt->dst, fi->fib_metrics, true);
 #ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = nh->nh_tclassid;
 #endif
@@ -1300,7 +1278,6 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	rth->rt_iif	= dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
-	rth->fi = NULL;
 	if (our) {
 		rth->dst.input= ip_local_deliver;
 		rth->rt_flags |= RTCF_LOCAL;
@@ -1430,7 +1407,6 @@ static int __mkroute_input(struct sk_buff *skb,
 	rth->rt_iif 	= in_dev->dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
-	rth->fi = NULL;
 
 	rth->dst.input = ip_forward;
 	rth->dst.output = ip_output;
@@ -1608,7 +1584,6 @@ local_input:
 	rth->rt_iif	= dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
-	rth->fi = NULL;
 	if (res.type == RTN_UNREACHABLE) {
 		rth->dst.input= ip_error;
 		rth->dst.error= -err;
@@ -1773,7 +1748,6 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
 	rth->rt_iif	= orig_oif ? : dev_out->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway = 0;
-	rth->fi = NULL;
 
 	RT_CACHE_STAT_INC(out_slow_tot);
 
@@ -2018,7 +1992,6 @@ static u32 *ipv4_rt_blackhole_cow_metrics(struct dst_entry *dst,
 static struct dst_ops ipv4_dst_blackhole_ops = {
 	.family			=	AF_INET,
 	.protocol		=	cpu_to_be16(ETH_P_IP),
-	.destroy		=	ipv4_dst_destroy,
 	.check			=	ipv4_blackhole_dst_check,
 	.mtu			=	ipv4_blackhole_mtu,
 	.default_advmss		=	ipv4_default_advmss,
@@ -2052,9 +2025,6 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
 		rt->rt_flags = ort->rt_flags;
 		rt->rt_type = ort->rt_type;
 		rt->rt_gateway = ort->rt_gateway;
-		rt->fi = ort->fi;
-		if (rt->fi)
-			atomic_inc(&rt->fi->fib_clntref);
 
 		dst_free(new);
 	}
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 14/15] ipv4: Turn rt->rt_route_iif into rt->rt_is_input.
From: David Miller @ 2012-07-18 18:24 UTC (permalink / raw)
  To: netdev


That is this value's only use, as a boolean to indicate whether
a route is an input route or not.

So implement it that way, using a u16 gap present in the struct
already.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/route.h     |    6 +++---
 net/ipv4/route.c        |   10 +++++-----
 net/ipv4/xfrm4_policy.c |    2 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/net/route.h b/include/net/route.h
index ee3bf84..f3ef18a 100644
--- a/include/net/route.h
+++ b/include/net/route.h
@@ -47,8 +47,8 @@ struct rtable {
 	int			rt_genid;
 	unsigned int		rt_flags;
 	__u16			rt_type;
+	__u16			rt_is_input;
 
-	int			rt_route_iif;
 	int			rt_iif;
 
 	/* Info on neighbour */
@@ -61,12 +61,12 @@ struct rtable {
 
 static inline bool rt_is_input_route(const struct rtable *rt)
 {
-	return rt->rt_route_iif != 0;
+	return rt->rt_is_input != 0;
 }
 
 static inline bool rt_is_output_route(const struct rtable *rt)
 {
-	return rt->rt_route_iif == 0;
+	return rt->rt_is_input == 0;
 }
 
 struct ip_rt_acct {
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 4da374c..ee35047 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1296,7 +1296,7 @@ static int ip_route_input_mc(struct sk_buff *skb, __be32 daddr, __be32 saddr,
 	rth->rt_genid	= rt_genid(dev_net(dev));
 	rth->rt_flags	= RTCF_MULTICAST;
 	rth->rt_type	= RTN_MULTICAST;
-	rth->rt_route_iif = dev->ifindex;
+	rth->rt_is_input= 1;
 	rth->rt_iif	= dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
@@ -1426,7 +1426,7 @@ static int __mkroute_input(struct sk_buff *skb,
 	rth->rt_genid = rt_genid(dev_net(rth->dst.dev));
 	rth->rt_flags = flags;
 	rth->rt_type = res->type;
-	rth->rt_route_iif = in_dev->dev->ifindex;
+	rth->rt_is_input = 1;
 	rth->rt_iif 	= in_dev->dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
@@ -1604,7 +1604,7 @@ local_input:
 	rth->rt_genid = rt_genid(net);
 	rth->rt_flags 	= flags|RTCF_LOCAL;
 	rth->rt_type	= res.type;
-	rth->rt_route_iif = dev->ifindex;
+	rth->rt_is_input = 1;
 	rth->rt_iif	= dev->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway	= 0;
@@ -1769,7 +1769,7 @@ static struct rtable *__mkroute_output(const struct fib_result *res,
 	rth->rt_genid = rt_genid(dev_net(dev_out));
 	rth->rt_flags	= flags;
 	rth->rt_type	= type;
-	rth->rt_route_iif = 0;
+	rth->rt_is_input = 0;
 	rth->rt_iif	= orig_oif ? : dev_out->ifindex;
 	rth->rt_pmtu	= 0;
 	rth->rt_gateway = 0;
@@ -2044,7 +2044,7 @@ struct dst_entry *ipv4_blackhole_route(struct net *net, struct dst_entry *dst_or
 		if (new->dev)
 			dev_hold(new->dev);
 
-		rt->rt_route_iif = ort->rt_route_iif;
+		rt->rt_is_input = ort->rt_is_input;
 		rt->rt_iif = ort->rt_iif;
 		rt->rt_pmtu = ort->rt_pmtu;
 
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 3c99b4c..c628184 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -79,7 +79,6 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
 	struct rtable *rt = (struct rtable *)xdst->route;
 	const struct flowi4 *fl4 = &fl->u.ip4;
 
-	xdst->u.rt.rt_route_iif = fl4->flowi4_iif;
 	xdst->u.rt.rt_iif = fl4->flowi4_iif;
 
 	xdst->u.dst.dev = dev;
@@ -87,6 +86,7 @@ static int xfrm4_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
 
 	/* Sheit... I remember I did this right. Apparently,
 	 * it was magically lost, so this code needs audit */
+	xdst->u.rt.rt_is_input = rt->rt_is_input;
 	xdst->u.rt.rt_flags = rt->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST |
 					      RTCF_LOCAL);
 	xdst->u.rt.rt_type = rt->rt_type;
-- 
1.7.10.4

^ permalink raw reply related
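
The "u16 gap" mentioned in the commit message above can be illustrated with
two cut-down structs. These are hypothetical excerpts that keep only the fields
visible in the include/net/route.h hunk, and the layout claims assume a typical
ABI where int is 4 bytes with 4-byte alignment. Under that assumption a 2-byte
hole follows rt_type, so a second 16-bit field fits there for free and dropping
the int shrinks the struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Before: a 16-bit field followed by an int leaves 2 bytes of padding. */
struct demo_rtable_before {
	int		rt_genid;
	unsigned int	rt_flags;
	uint16_t	rt_type;
	/* 2 bytes of padding on typical ABIs */
	int		rt_route_iif;
	int		rt_iif;
};

/* After: the boolean lives in the former padding and the int is gone. */
struct demo_rtable_after {
	int		rt_genid;
	unsigned int	rt_flags;
	uint16_t	rt_type;
	uint16_t	rt_is_input;
	int		rt_iif;
};
```

On such an ABI, offsetof(struct demo_rtable_after, rt_iif) equals the old
offset of rt_route_iif, and sizeof drops by one int.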
