Netdev List

Netdev List
 help / color / mirror / Atom feed

* PATCH net/ipv6/mip6.c destopt corruption
From: András Takács @ 2012-07-17 14:47 UTC (permalink / raw)
  To: netdev
In-Reply-To: <A9CE2E85-2DDC-4182-B494-431A6A62BC95@wakoond.hu>


Dear All,


I have added a lot of debug messages to the kernel source and finally found the problem. When the kernel creates the skb from iovec (ip6_append_data) it sets the pointer of the network header to a wrong position. It will be shifted with 24 bytes (it is the length of the HAO dest. opt. header with paddings).

After this point, the message will be corrupted, the beginning (the first 24 bytes) of the MH part will be truncated. Later, when the kernel adds the dest. opt. header itself, there isn't any issue.

So, back to the wrong network header pointer. It is shifted by exthdrlen (= 24) by the skb_set_network_header() function. This exthdrlen comes from rt->rt6i_nfheader_len, which comes from the dst_entry chain. This nfheader_len value comes from the header_len of the desired xfrm type (in this case hao dest opt):

(net/xfrm/xfrm_policy.c: xfrm_bundle_create)
header_len += xfrm[i]->props.header_len;
if (xfrm[i]->type->flags & XFRM_TYPE_NON_FRAGMENT)
	nfheader_len += xfrm[i]->props.header_len;
...
xfrm_init_path((struct xfrm_dst *)dst0, dst, nfheader_len);

I have run a fast grep on the kernel tree, and this XFRM_TYPE_NON_FRAGMENT does not have any effect, just sets (or not) nfheader_len here. So, the following patch solves the issue:

diff -Nuar linux-3.4.2-orig/net/ipv6/mip6.c linux-3.4.2/net/ipv6/mip6.c
--- linux-3.4.2-orig/net/ipv6/mip6.c	2012-07-17 15:18:30.148777104 +0200
+++ linux-3.4.2/net/ipv6/mip6.c	2012-07-17 15:21:12.104779113 +0200
@@ -338,7 +338,7 @@
 	.description	= "MIP6DESTOPT",
 	.owner		= THIS_MODULE,
 	.proto	     	= IPPROTO_DSTOPTS,
-	.flags		= XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_LOCAL_COADDR,
+	.flags		= XFRM_TYPE_LOCAL_COADDR,
 	.init_state	= mip6_destopt_init_state,
 	.destructor	= mip6_destopt_destroy,
 	.input		= mip6_destopt_input,
@@ -471,7 +471,7 @@
 	.description	= "MIP6RT",
 	.owner		= THIS_MODULE,
 	.proto	     	= IPPROTO_ROUTING,
-	.flags		= XFRM_TYPE_NON_FRAGMENT | XFRM_TYPE_REMOTE_COADDR,
+	.flags		= XFRM_TYPE_REMOTE_COADDR,
 	.init_state	= mip6_rthdr_init_state,
 	.destructor	= mip6_rthdr_destroy,
 	.input		= mip6_rthdr_input,

What do you think about this fix? Does it have any drawback?


Ragards,
András


On Jul 16, 2012, at 3:40 PM, András Takács wrote:

> 
> Dear All,
> 
> 
> I have serious problems with HAO dest opt XFRM processing. In the past few days I have tried to find the problem, and I figured out the following:
> 
> 1. case: No XFRM rules
> It works fine (as it was described in my previous e-mail)
> 
> 2. case: HAO RO XFRM processing
> I have created the following rules manually:
> sudo ip -6 xfrm policy add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto 135 type 5 dir out priority 2 ptype sub tmpl src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro
> sudo ip -6 xfrm state add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro replay-window 0 coa 2001:470:7210:11:20c:29ff:fe46:a0e3 sel src 2001:470:7210:10::11 dst 2001:470:7210:10::1000
> 
> The message format is corrupted, because during the xfrm processing, the beginning of the MH part will be overwritten by the DST OPT header.
> 
> 3. case: ESP TUNNEL XFRM
> I have created ESP TUNNEL XFRM rules manually, and it was worked fine. 
> So the problem has to be somewhere in the net/ipv6/mip6.c or net/ipv6/xfrm_mode_ro.c files.
> 
> -------------
> 
> I added a lot of debug printk statements to the source, and I have figured out the following:
> 
> When the kernel creates the skb from the iovec, it seems to be ok (in ip6_append_data):
> 
> skb->data to skb->tail:
> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 57 39 B1 FB 87 39 B1 FB 64 92 FF FF 18 00 00 00 00 00 00 00 00 00 00 00 3B 03 05 00 00 00 00 01 00 00 00 02 01 00 03 10 20 01 04 70 72 10 00 11 02 0C 29 FF FE 46 A0 E3
> 
> Unfortunately at the beginning of the xfrm6_ro_output function, it seems to be corrupt:
> 
> 60 00 00 00 00 08 87 40 20 01 04 70 72 10 00 10 00 00 00 00 00 00 00 11 20 01 04 70 72 10 00 10 00 00 00 00 00 00 10 00 02 0C 29 FF FE 46 A0 E3
> 
> Here missing the first 24 bytes of the MH part. It is quite suspicious, because the size of the DST OPT header (with the necessary padding) is exactly same long. 
> 
> After this point xfrm6_ro_output and mip6_destopt_output works fine, and insert the DST OPT header to this truncated skb.
> 
> 
> Could you please help me to find the connection ("call - graph" ?) between ip6_append_data and xfrm6_ro_output? I can't find the point where it fails. In ip6_append_data, the beginning of the skb is reserved for the IPv6 header, but where will be this part filled with the right values?
> 
> 
> Thank you very much for your help!
> 
> 
> Regards,
> András
> 
> 
> On Jun 21, 2012, at 10:41 PM, Andras Takacs wrote:
> 
>> Dear All,
>> 
>> I'm working with Mobile IPv6 systems, and I'm setting up a new MIP6 environment. I would like to use the latest stable kernel, so I'm using 3.4.2. Unfortunately I have some serious problems with destination option XFRM processing. I have done the following tests to find the issue:
>> 
>> First case: No XFRM policies and states.
>> Sending MH messages without destopt header.
>> In this case the message format is OK, I have tested it with tcpdump and wireshark.
>> 
>> 21:33:58.817130 IP6 2001:470:7210:10::11 > 2001:470:7210:10::1000: mobility: BU seq#=1 lifetime=8
>> 	0x0000:  6000 0000 0020 8740 2001 0470 7210 0010  `......@...pr...
>> 	0x0010:  0000 0000 0000 0011 2001 0470 7210 0010  ...........pr...
>> 	0x0020:  0000 0000 0000 1000 3b03 0500 1c46 0001  ........;....F..
>> 	0x0030:  0000 0002 0100 0310 2001 0470 7210 0011  ...........pr...
>> 	0x0040:  020c 29ff fe46 a0e3                      ..)..F..
>> 
>> Second case: Adding destopt XFRM policy and state:
>> 
>> ip -6 xfrm policy add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto 135 type 5 dir out priority 2 ptype sub tmpl src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro level use
>> ip -6 xfrm state add src 2001:470:7210:10::11 dst 2001:470:7210:10::1000 proto hao reqid 0 mode ro replay-window 0 coa 2001:470:7210:11:20c:29ff:fe46:a0e3 sel src 2001:470:7210:10::11 dst 2001:470:7210:10::1000
>> 
>> In this case, the message format is corrupted:
>> 
>> 21:30:42.350315 IP6 2001:470:7210:11:20c:29ff:fe46:a0e3 > 2001:470:7210:10::1000: DSTOPT mobility: type-#41 len=12
>> 	0x0000:  6000 0000 0020 3c40 2001 0470 7210 0011  `.....<@...pr...
>> 	0x0010:  020c 29ff fe46 a0e3 2001 0470 7210 0010  ..)..F.....pr...
>> 	0x0020:  0000 0000 0000 1000 8702 0102 0000 c910  ................
>> 	0x0030:  2001 0470 7210 0010 0000 0000 0000 0011  ...pr...........
>> 	0x0040:  020c 29ff fe46 a0e3 
>> 
>> As you can see, the IPv6 header is OK. Next, the destination option header is OK. Finally, the following part of the packet isn't OK. If you compare the two dump carefully, you will see, that the last 8 bytes are identical. The mip6_destopt_output function adds the destination option header correctly, but overwrites the existing MH header, and doesn't shift it after the destopt header.
>> 
>> I'm not familiar with the XFRM framework enough to fix the problem. :(
>> Maybe, could anyone help to me to fix this issue?
>> 
>> The last environment, which worked fine was built on 2.6.35 version. The problem happened between 2.6.35 and 3.4.2. Sorry, I know, it is a quite big interval. :(
>> 
>> Thanks!
>> 
>> 
>> Best regards,
>> András Takács
> 

^ permalink raw reply

* [patch net-next 0/2] team: add netpoll support
From: Jiri Pirko @ 2012-07-17 15:22 UTC (permalink / raw)
  To: netdev; +Cc: davem

Also contains a little change to netpoll core.

Jiri Pirko (2):
  netpoll: move np->dev and np->dev_name init into __netpoll_setup()
  team: add netpoll support

 drivers/net/bonding/bond_main.c           |    4 +-
 drivers/net/team/team.c                   |  113 +++++++++++++++++++++++++++++
 drivers/net/team/team_mode_activebackup.c |    3 +-
 drivers/net/team/team_mode_broadcast.c    |    7 +-
 drivers/net/team/team_mode_loadbalance.c  |    3 +-
 drivers/net/team/team_mode_roundrobin.c   |    3 +-
 include/linux/if_team.h                   |   33 +++++++++
 include/linux/netpoll.h                   |    2 +-
 net/8021q/vlan_dev.c                      |    5 +-
 net/bridge/br_device.c                    |    5 +-
 net/core/netpoll.c                        |   10 +--
 11 files changed, 161 insertions(+), 27 deletions(-)

-- 
1.7.10.4

^ permalink raw reply

* [patch net-next 1/2] netpoll: move np->dev and np->dev_name init into __netpoll_setup()
From: Jiri Pirko @ 2012-07-17 15:22 UTC (permalink / raw)
  To: netdev; +Cc: davem
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/bonding/bond_main.c |    4 +---
 include/linux/netpoll.h         |    2 +-
 net/8021q/vlan_dev.c            |    5 +----
 net/bridge/br_device.c          |    5 +----
 net/core/netpoll.c              |   10 +++++-----
 5 files changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 4ddcc3e..1eb3979 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1240,9 +1240,7 @@ static inline int slave_enable_netpoll(struct slave *slave)
 	if (!np)
 		goto out;
 
-	np->dev = slave->dev;
-	strlcpy(np->dev_name, slave->dev->name, IFNAMSIZ);
-	err = __netpoll_setup(np);
+	err = __netpoll_setup(np, slave->dev);
 	if (err) {
 		kfree(np);
 		goto out;
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 5dfa091..28f5389 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -43,7 +43,7 @@ struct netpoll_info {
 void netpoll_send_udp(struct netpoll *np, const char *msg, int len);
 void netpoll_print_options(struct netpoll *np);
 int netpoll_parse_options(struct netpoll *np, char *opt);
-int __netpoll_setup(struct netpoll *np);
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev);
 int netpoll_setup(struct netpoll *np);
 int netpoll_trap(void);
 void netpoll_set_trap(int trap);
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index da1bc9c..73a2a83 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -681,10 +681,7 @@ static int vlan_dev_netpoll_setup(struct net_device *dev, struct netpoll_info *n
 	if (!netpoll)
 		goto out;
 
-	netpoll->dev = real_dev;
-	strlcpy(netpoll->dev_name, real_dev->name, IFNAMSIZ);
-
-	err = __netpoll_setup(netpoll);
+	err = __netpoll_setup(netpoll, real_dev);
 	if (err) {
 		kfree(netpoll);
 		goto out;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 929e48aed..f4be1bb 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -246,10 +246,7 @@ int br_netpoll_enable(struct net_bridge_port *p)
 	if (!np)
 		goto out;
 
-	np->dev = p->dev;
-	strlcpy(np->dev_name, p->dev->name, IFNAMSIZ);
-
-	err = __netpoll_setup(np);
+	err = __netpoll_setup(np, p->dev);
 	if (err) {
 		kfree(np);
 		goto out;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index f9f40b9..b4c90e4 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -715,14 +715,16 @@ int netpoll_parse_options(struct netpoll *np, char *opt)
 }
 EXPORT_SYMBOL(netpoll_parse_options);
 
-int __netpoll_setup(struct netpoll *np)
+int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
 {
-	struct net_device *ndev = np->dev;
 	struct netpoll_info *npinfo;
 	const struct net_device_ops *ops;
 	unsigned long flags;
 	int err;
 
+	np->dev = ndev;
+	strlcpy(np->dev_name, ndev->name, IFNAMSIZ);
+
 	if ((ndev->priv_flags & IFF_DISABLE_NETPOLL) ||
 	    !ndev->netdev_ops->ndo_poll_controller) {
 		np_err(np, "%s doesn't support polling, aborting\n",
@@ -851,13 +853,11 @@ int netpoll_setup(struct netpoll *np)
 		np_info(np, "local IP %pI4\n", &np->local_ip);
 	}
 
-	np->dev = ndev;
-
 	/* fill up the skb queue */
 	refill_skbs();
 
 	rtnl_lock();
-	err = __netpoll_setup(np);
+	err = __netpoll_setup(np, ndev);
 	rtnl_unlock();
 
 	if (err)
-- 
1.7.10.4

^ permalink raw reply related

* [patch net-next 2/2] team: add netpoll support
From: Jiri Pirko @ 2012-07-17 15:22 UTC (permalink / raw)
  To: netdev; +Cc: davem
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>

It's done in very similar way this is done in bonding and bridge.

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 drivers/net/team/team.c                   |  113 +++++++++++++++++++++++++++++
 drivers/net/team/team_mode_activebackup.c |    3 +-
 drivers/net/team/team_mode_broadcast.c    |    7 +-
 drivers/net/team/team_mode_loadbalance.c  |    3 +-
 drivers/net/team/team_mode_roundrobin.c   |    3 +-
 include/linux/if_team.h                   |   33 +++++++++
 6 files changed, 152 insertions(+), 10 deletions(-)

diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
index 3620c63..1a13470 100644
--- a/drivers/net/team/team.c
+++ b/drivers/net/team/team.c
@@ -18,6 +18,7 @@
 #include <linux/ctype.h>
 #include <linux/notifier.h>
 #include <linux/netdevice.h>
+#include <linux/netpoll.h>
 #include <linux/if_vlan.h>
 #include <linux/if_arp.h>
 #include <linux/socket.h>
@@ -787,6 +788,58 @@ static void team_port_leave(struct team *team, struct team_port *port)
 	dev_put(team->dev);
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
+{
+	struct netpoll *np;
+	int err;
+
+	np = kzalloc(sizeof(*np), GFP_KERNEL);
+	if (!np)
+		return -ENOMEM;
+
+	err = __netpoll_setup(np, port->dev);
+	if (err) {
+		kfree(np);
+		return err;
+	}
+	port->np = np;
+	return err;
+}
+
+static void team_port_disable_netpoll(struct team_port *port)
+{
+	struct netpoll *np = port->np;
+
+	if (!np)
+		return;
+	port->np = NULL;
+
+	/* Wait for transmitting packets to finish before freeing. */
+	synchronize_rcu_bh();
+	__netpoll_cleanup(np);
+	kfree(np);
+}
+
+static struct netpoll_info *team_netpoll_info(struct team *team)
+{
+	return team->dev->npinfo;
+}
+
+#else
+static int team_port_enable_netpoll(struct team *team, struct team_port *port)
+{
+	return 0;
+}
+static void team_port_disable_netpoll(struct team_port *port)
+{
+}
+static struct netpoll_info *team_netpoll_info(struct team *team)
+{
+	return NULL;
+}
+#endif
+
 static void __team_port_change_check(struct team_port *port, bool linkup);
 
 static int team_port_add(struct team *team, struct net_device *port_dev)
@@ -853,6 +906,15 @@ static int team_port_add(struct team *team, struct net_device *port_dev)
 		goto err_vids_add;
 	}
 
+	if (team_netpoll_info(team)) {
+		err = team_port_enable_netpoll(team, port);
+		if (err) {
+			netdev_err(dev, "Failed to enable netpoll on device %s\n",
+				   portname);
+			goto err_enable_netpoll;
+		}
+	}
+
 	err = netdev_set_master(port_dev, dev);
 	if (err) {
 		netdev_err(dev, "Device %s failed to set master\n", portname);
@@ -892,6 +954,9 @@ err_handler_register:
 	netdev_set_master(port_dev, NULL);
 
 err_set_master:
+	team_port_disable_netpoll(port);
+
+err_enable_netpoll:
 	vlan_vids_del_by_dev(port_dev, dev);
 
 err_vids_add:
@@ -932,6 +997,7 @@ static int team_port_del(struct team *team, struct net_device *port_dev)
 	list_del_rcu(&port->list);
 	netdev_rx_handler_unregister(port_dev);
 	netdev_set_master(port_dev, NULL);
+	team_port_disable_netpoll(port);
 	vlan_vids_del_by_dev(port_dev, dev);
 	dev_close(port_dev);
 	team_port_leave(team, port);
@@ -1307,6 +1373,48 @@ static int team_vlan_rx_kill_vid(struct net_device *dev, uint16_t vid)
 	return 0;
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void team_poll_controller(struct net_device *dev)
+{
+}
+
+static void __team_netpoll_cleanup(struct team *team)
+{
+	struct team_port *port;
+
+	list_for_each_entry(port, &team->port_list, list)
+		team_port_disable_netpoll(port);
+}
+
+static void team_netpoll_cleanup(struct net_device *dev)
+{
+	struct team *team = netdev_priv(dev);
+
+	mutex_lock(&team->lock);
+	__team_netpoll_cleanup(team);
+	mutex_unlock(&team->lock);
+}
+
+static int team_netpoll_setup(struct net_device *dev,
+			      struct netpoll_info *npifo)
+{
+	struct team *team = netdev_priv(dev);
+	struct team_port *port;
+	int err;
+
+	mutex_lock(&team->lock);
+	list_for_each_entry(port, &team->port_list, list) {
+		err = team_port_enable_netpoll(team, port);
+		if (err) {
+			__team_netpoll_cleanup(team);
+			break;
+		}
+	}
+	mutex_unlock(&team->lock);
+	return err;
+}
+#endif
+
 static int team_add_slave(struct net_device *dev, struct net_device *port_dev)
 {
 	struct team *team = netdev_priv(dev);
@@ -1363,6 +1471,11 @@ static const struct net_device_ops team_netdev_ops = {
 	.ndo_get_stats64	= team_get_stats64,
 	.ndo_vlan_rx_add_vid	= team_vlan_rx_add_vid,
 	.ndo_vlan_rx_kill_vid	= team_vlan_rx_kill_vid,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller	= team_poll_controller,
+	.ndo_netpoll_setup	= team_netpoll_setup,
+	.ndo_netpoll_cleanup	= team_netpoll_cleanup,
+#endif
 	.ndo_add_slave		= team_add_slave,
 	.ndo_del_slave		= team_del_slave,
 	.ndo_fix_features	= team_fix_features,
diff --git a/drivers/net/team/team_mode_activebackup.c b/drivers/net/team/team_mode_activebackup.c
index 253b8a5..6262b4d 100644
--- a/drivers/net/team/team_mode_activebackup.c
+++ b/drivers/net/team/team_mode_activebackup.c
@@ -43,8 +43,7 @@ static bool ab_transmit(struct team *team, struct sk_buff *skb)
 	active_port = rcu_dereference_bh(ab_priv(team)->active_port);
 	if (unlikely(!active_port))
 		goto drop;
-	skb->dev = active_port->dev;
-	if (dev_queue_xmit(skb))
+	if (team_dev_queue_xmit(team, active_port, skb))
 		return false;
 	return true;
 
diff --git a/drivers/net/team/team_mode_broadcast.c b/drivers/net/team/team_mode_broadcast.c
index 5562345..c96e4d2 100644
--- a/drivers/net/team/team_mode_broadcast.c
+++ b/drivers/net/team/team_mode_broadcast.c
@@ -29,8 +29,8 @@ static bool bc_transmit(struct team *team, struct sk_buff *skb)
 			if (last) {
 				skb2 = skb_clone(skb, GFP_ATOMIC);
 				if (skb2) {
-					skb2->dev = last->dev;
-					ret = dev_queue_xmit(skb2);
+					ret = team_dev_queue_xmit(team, last,
+								  skb2);
 					if (!sum_ret)
 						sum_ret = ret;
 				}
@@ -39,8 +39,7 @@ static bool bc_transmit(struct team *team, struct sk_buff *skb)
 		}
 	}
 	if (last) {
-		skb->dev = last->dev;
-		ret = dev_queue_xmit(skb);
+		ret = team_dev_queue_xmit(team, last, skb);
 		if (!sum_ret)
 			sum_ret = ret;
 	}
diff --git a/drivers/net/team/team_mode_loadbalance.c b/drivers/net/team/team_mode_loadbalance.c
index 51a4b19..cdc31b5 100644
--- a/drivers/net/team/team_mode_loadbalance.c
+++ b/drivers/net/team/team_mode_loadbalance.c
@@ -217,8 +217,7 @@ static bool lb_transmit(struct team *team, struct sk_buff *skb)
 	port = select_tx_port_func(team, lb_priv, skb, hash);
 	if (unlikely(!port))
 		goto drop;
-	skb->dev = port->dev;
-	if (dev_queue_xmit(skb))
+	if (team_dev_queue_xmit(team, port, skb))
 		return false;
 	lb_update_tx_stats(tx_bytes, lb_priv, get_lb_port_priv(port), hash);
 	return true;
diff --git a/drivers/net/team/team_mode_roundrobin.c b/drivers/net/team/team_mode_roundrobin.c
index 0cf38e9..ad7ed0e 100644
--- a/drivers/net/team/team_mode_roundrobin.c
+++ b/drivers/net/team/team_mode_roundrobin.c
@@ -55,8 +55,7 @@ static bool rr_transmit(struct team *team, struct sk_buff *skb)
 	port = __get_first_port_up(team, port);
 	if (unlikely(!port))
 		goto drop;
-	skb->dev = port->dev;
-	if (dev_queue_xmit(skb))
+	if (team_dev_queue_xmit(team, port, skb))
 		return false;
 	return true;
 
diff --git a/include/linux/if_team.h b/include/linux/if_team.h
index dfa0c8e..7fd0cde 100644
--- a/include/linux/if_team.h
+++ b/include/linux/if_team.h
@@ -13,6 +13,8 @@
 
 #ifdef __KERNEL__
 
+#include <linux/netpoll.h>
+
 struct team_pcpu_stats {
 	u64			rx_packets;
 	u64			rx_bytes;
@@ -60,6 +62,10 @@ struct team_port {
 		unsigned int mtu;
 	} orig;
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	struct netpoll *np;
+#endif
+
 	long mode_priv[0];
 };
 
@@ -73,6 +79,33 @@ static inline bool team_port_txable(struct team_port *port)
 	return port->linkup && team_port_enabled(port);
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static inline void team_netpoll_send_skb(struct team_port *port,
+					 struct sk_buff *skb)
+{
+	struct netpoll *np = port->np;
+
+	if (np)
+		netpoll_send_skb(np, skb);
+}
+#else
+static inline void team_netpoll_send_skb(struct team_port *port,
+					 struct sk_buff *skb)
+{
+}
+#endif
+
+static inline int team_dev_queue_xmit(struct team *team, struct team_port *port,
+				      struct sk_buff *skb)
+{
+	skb->dev = port->dev;
+	if (unlikely(netpoll_tx_running(port->dev))) {
+		team_netpoll_send_skb(port, skb);
+		return 0;
+	}
+	return dev_queue_xmit(skb);
+}
+
 struct team_mode_ops {
 	int (*init)(struct team *team);
 	void (*exit)(struct team *team);
-- 
1.7.10.4

^ permalink raw reply related

* ethtool 3.4.2 released
From: Ben Hutchings @ 2012-07-17 15:31 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 756 bytes --]

ethtool version 3.4.2 has been released.  This fixes various bugs.

Home page: https://ftp.kernel.org/pub/software/network/ethtool/
Download link:
https://ftp.kernel.org/pub/software/network/ethtool/ethtool-3.4.2.tar.gz

Release notes:

	* Fix: Fix regression in RX NFC rule insertion for drivers that do
	  not select rule locations (-N/-U option)
	* Fix: Remove bogus error message when changing offload settings
	  on Linux < 2.6.39 (-K option)
	* Fix: Use alternate method to check for VLAN tag offload on Linux
	  < 2.6.37 (-k option)

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: [ethtool PATCH] ethtool: Resolve use of uninitialized memory in rxclass_get_dev_info
From: Ben Hutchings @ 2012-07-17 15:32 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: netdev, jeffrey.t.kirsher
In-Reply-To: <5004AD60.9090702@intel.com>

On Mon, 2012-07-16 at 17:10 -0700, Alexander Duyck wrote:
> On 07/16/2012 01:03 PM, Ben Hutchings wrote:
> > On Fri, 2012-07-13 at 09:55 -0700, Alexander Duyck wrote:
> >> The ethtool function for getting the rule count was not zeroing out the
> >> data field before passing it to the kernel.  As a result the value started
> >> uninitialized and was incorrectly returning a result indicating that
> >> devices supported setting new rule indexes.  In order to correct this I am
> >> adding a one line fix that sets data to zero before we pass the command to
> >> the kernel.
> > Right.  For 'get' commands with no parameters (besides the device) the
> > data copied back to userland is normally zero-initialised and then
> > filled out by the driver, and I seem to have worked on that assumption.
> > But because of the odd multiplexing of RX NFC commands
> > ETHTOOL_GRXCLSRLCNT doesn't work like that.  And for 'my' driver that
> > didn't matter.  Sorry about that.
> >
> > (We should really have some explicit documentation of responsibility for
> > structure initialisation.)
> >
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> >> ---
> >>
> >> I am resending this since I didn't see any notification that it had been seen.
> >> I also realized that I had not clearly identified that this is an ethtool user
> >> space patch and not an ethtool kernel space patch.
> > It was perfectly clear and I had queued it up to review but hadn't yet
> > done so.
> >
> > Ben.
> >
> Yeah, that was my mistake.  I thought I hadn't sent it out with the
> ethtool prefix when I actually had.

So, anyway, I've applied it and just done a bug fix release (3.4.2).

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH 5/5 v2] ipv4: Add FIB nexthop exceptions.
From: David Miller @ 2012-07-17 15:58 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet


In a regime where we have subnetted route entries, we need a way to
store persistent storage about destination specific learned values
such as redirects and PMTU values.

This is implemented here via nexthop exceptions.

The initial implementation is a 2048 entry hash table with relaiming
starting at chain length 5.  A more sophisticated scheme can be
devised if that proves necessary.

Signed-off-by: David S. Miller <davem@davemloft.net>
---

Eric, just for you :-)

 include/net/ip_fib.h     |   18 ++++
 net/ipv4/fib_semantics.c |   23 +++++
 net/ipv4/route.c         |  256 ++++++++++++++++++++++++++++++++++++++++------
 3 files changed, 266 insertions(+), 31 deletions(-)

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 5697ace..e9ee1ca 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -18,6 +18,7 @@
 
 #include <net/flow.h>
 #include <linux/seq_file.h>
+#include <linux/rcupdate.h>
 #include <net/fib_rules.h>
 #include <net/inetpeer.h>
 
@@ -46,6 +47,22 @@ struct fib_config {
 
 struct fib_info;
 
+struct fib_nh_exception {
+	struct fib_nh_exception __rcu	*fnhe_next;
+	__be32				fnhe_daddr;
+	u32				fnhe_pmtu;
+	u32				fnhe_gw;
+	unsigned long			fnhe_expires;
+	unsigned long			fnhe_stamp;
+};
+
+struct fnhe_hash_bucket {
+	struct fib_nh_exception __rcu	*chain;
+};
+
+#define FNHE_HASH_SIZE		2048
+#define FNHE_RECLAIM_DEPTH	5
+
 struct fib_nh {
 	struct net_device	*nh_dev;
 	struct hlist_node	nh_hash;
@@ -63,6 +80,7 @@ struct fib_nh {
 	__be32			nh_gw;
 	__be32			nh_saddr;
 	int			nh_saddr_genid;
+	struct fnhe_hash_bucket	*nh_exceptions;
 };
 
 /*
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index d71bfbd..1e09852 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -140,6 +140,27 @@ const struct fib_prop fib_props[RTN_MAX + 1] = {
 	},
 };
 
+static void free_nh_exceptions(struct fib_nh *nh)
+{
+	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+	int i;
+
+	for (i = 0; i < FNHE_HASH_SIZE; i++) {
+		struct fib_nh_exception *fnhe;
+
+		fnhe = rcu_dereference(hash[i].chain);
+		while (fnhe) {
+			struct fib_nh_exception *next;
+			
+			next = rcu_dereference(fnhe->fnhe_next);
+			kfree(fnhe);
+
+			fnhe = next;
+		}
+	}
+	kfree(hash);
+}
+
 /* Release a nexthop info record */
 static void free_fib_info_rcu(struct rcu_head *head)
 {
@@ -148,6 +169,8 @@ static void free_fib_info_rcu(struct rcu_head *head)
 	change_nexthops(fi) {
 		if (nexthop_nh->nh_dev)
 			dev_put(nexthop_nh->nh_dev);
+		if (nexthop_nh->nh_exceptions)
+			free_nh_exceptions(nexthop_nh);
 	} endfor_nexthops(fi);
 
 	release_net(fi->fib_net);
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index b35d3bf..a5bd0b4 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1275,14 +1275,130 @@ static void rt_del(unsigned int hash, struct rtable *rt)
 	spin_unlock_bh(rt_hash_lock_addr(hash));
 }
 
-static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb)
+static void __build_flow_key(struct flowi4 *fl4, struct sock *sk,
+			     const struct iphdr *iph,
+			     int oif, u8 tos,
+			     u8 prot, u32 mark, int flow_flags)
+{
+	if (sk) {
+		const struct inet_sock *inet = inet_sk(sk);
+
+		oif = sk->sk_bound_dev_if;
+		mark = sk->sk_mark;
+		tos = RT_CONN_FLAGS(sk);
+		prot = inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol;
+	}
+	flowi4_init_output(fl4, oif, mark, tos,
+			   RT_SCOPE_UNIVERSE, prot,
+			   flow_flags,
+			   iph->daddr, iph->saddr, 0, 0);
+}
+
+static void build_skb_flow_key(struct flowi4 *fl4, struct sk_buff *skb, struct sock *sk)
+{
+	const struct iphdr *iph = ip_hdr(skb);
+	int oif = skb->dev->ifindex;
+	u8 tos = RT_TOS(iph->tos);
+	u8 prot = iph->protocol;
+	u32 mark = skb->mark;
+
+	__build_flow_key(fl4, sk, iph, oif, tos, prot, mark, 0);
+}
+
+static void build_sk_flow_key(struct flowi4 *fl4, struct sock *sk)
+{
+	const struct inet_sock *inet = inet_sk(sk);
+	struct ip_options_rcu *inet_opt;
+	__be32 daddr = inet->inet_daddr;
+
+	rcu_read_lock();
+	inet_opt = rcu_dereference(inet->inet_opt);
+	if (inet_opt && inet_opt->opt.srr)
+		daddr = inet_opt->opt.faddr;
+	flowi4_init_output(fl4, sk->sk_bound_dev_if, sk->sk_mark,
+			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
+			   inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
+			   inet_sk_flowi_flags(sk),
+			   daddr, inet->inet_saddr, 0, 0);
+	rcu_read_unlock();
+}
+
+static void ip_rt_build_flow_key(struct flowi4 *fl4, struct sock *sk,
+				 struct sk_buff *skb)
+{
+	if (skb)
+		build_skb_flow_key(fl4, skb, sk);
+	else
+		build_sk_flow_key(fl4, sk);
+}
+
+static DEFINE_SPINLOCK(fnhe_lock);
+
+static struct fib_nh_exception *fnhe_oldest(struct fnhe_hash_bucket *hash, __be32 daddr)
+{
+	struct fib_nh_exception *fnhe, *oldest;
+
+	oldest = rcu_dereference(hash->chain);
+	for (fnhe = rcu_dereference(oldest->fnhe_next); fnhe;
+	     fnhe = rcu_dereference(fnhe->fnhe_next)) {
+		if (time_before(fnhe->fnhe_stamp, oldest->fnhe_stamp))
+			oldest = fnhe;
+	}
+	return oldest;
+}
+
+static struct fib_nh_exception *find_or_create_fnhe(struct fib_nh *nh, __be32 daddr)
+{
+	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+	struct fib_nh_exception *fnhe;
+	int depth;
+	u32 hval;
+
+	if (!hash) {
+		hash = nh->nh_exceptions = kzalloc(FNHE_HASH_SIZE * sizeof(*hash),
+						   GFP_ATOMIC);
+		if (!hash)
+			return NULL;
+	}
+
+	hval = (__force u32) daddr;
+	hval ^= (hval >> 11) ^ (hval >> 22);
+	hash += hval;
+
+	depth = 0;
+	for (fnhe = rcu_dereference(hash->chain); fnhe;
+	     fnhe = rcu_dereference(fnhe->fnhe_next)) {
+		if (fnhe->fnhe_daddr == daddr)
+			goto out;
+		depth++;
+	}
+
+	if (depth > FNHE_RECLAIM_DEPTH) {
+		fnhe = fnhe_oldest(hash + hval, daddr);
+		goto out_daddr;
+	}
+	fnhe = kzalloc(sizeof(*fnhe), GFP_ATOMIC);
+	if (!fnhe)
+		return NULL;
+
+	fnhe->fnhe_next = hash->chain;
+	rcu_assign_pointer(hash->chain, fnhe);
+
+out_daddr:
+	fnhe->fnhe_daddr = daddr;
+out:
+	fnhe->fnhe_stamp = jiffies;
+	return fnhe;
+}
+
+static void __ip_do_redirect(struct rtable *rt, struct sk_buff *skb, struct flowi4 *fl4)
 {
 	__be32 new_gw = icmp_hdr(skb)->un.gateway;
 	__be32 old_gw = ip_hdr(skb)->saddr;
 	struct net_device *dev = skb->dev;
 	struct in_device *in_dev;
+	struct fib_result res;
 	struct neighbour *n;
-	struct rtable *rt;
 	struct net *net;
 
 	switch (icmp_hdr(skb)->code & 7) {
@@ -1296,7 +1412,6 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf
 		return;
 	}
 
-	rt = (struct rtable *) dst;
 	if (rt->rt_gateway != old_gw)
 		return;
 
@@ -1320,11 +1435,21 @@ static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buf
 			goto reject_redirect;
 	}
 
-	n = ipv4_neigh_lookup(dst, NULL, &new_gw);
+	n = ipv4_neigh_lookup(&rt->dst, NULL, &new_gw);
 	if (n) {
 		if (!(n->nud_state & NUD_VALID)) {
 			neigh_event_send(n, NULL);
 		} else {
+			if (fib_lookup(net, fl4, &res) == 0) {
+				struct fib_nh *nh = &FIB_RES_NH(res);
+				struct fib_nh_exception *fnhe;
+
+				spin_lock_bh(&fnhe_lock);
+				fnhe = find_or_create_fnhe(nh, fl4->daddr);
+				if (fnhe)
+					fnhe->fnhe_gw = new_gw;
+				spin_unlock_bh(&fnhe_lock);
+			}
 			rt->rt_gateway = new_gw;
 			rt->rt_flags |= RTCF_REDIRECTED;
 			call_netevent_notifiers(NETEVENT_NEIGH_UPDATE, n);
@@ -1349,6 +1474,17 @@ reject_redirect:
 	;
 }
 
+static void ip_do_redirect(struct dst_entry *dst, struct sock *sk, struct sk_buff *skb)
+{
+	struct rtable *rt;
+	struct flowi4 fl4;
+
+	rt = (struct rtable *) dst;
+
+	ip_rt_build_flow_key(&fl4, sk, skb);
+	__ip_do_redirect(rt, skb, &fl4);
+}
+
 static struct dst_entry *ipv4_negative_advice(struct dst_entry *dst)
 {
 	struct rtable *rt = (struct rtable *)dst;
@@ -1508,33 +1644,51 @@ out:	kfree_skb(skb);
 	return 0;
 }
 
-static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
-			      struct sk_buff *skb, u32 mtu)
+static void __ip_rt_update_pmtu(struct rtable *rt, struct flowi4 *fl4, u32 mtu)
 {
-	struct rtable *rt = (struct rtable *) dst;
-
-	dst_confirm(dst);
+	struct fib_result res;
 
 	if (mtu < ip_rt_min_pmtu)
 		mtu = ip_rt_min_pmtu;
 
+	if (fib_lookup(dev_net(rt->dst.dev), fl4, &res) == 0) {
+		struct fib_nh *nh = &FIB_RES_NH(res);
+		struct fib_nh_exception *fnhe;
+
+		spin_lock_bh(&fnhe_lock);
+		fnhe = find_or_create_fnhe(nh, fl4->daddr);
+		if (fnhe) {
+			fnhe->fnhe_pmtu = mtu;
+			fnhe->fnhe_expires = jiffies + ip_rt_mtu_expires;
+		}
+		spin_unlock_bh(&fnhe_lock);
+	}
 	rt->rt_pmtu = mtu;
 	dst_set_expires(&rt->dst, ip_rt_mtu_expires);
 }
 
+static void ip_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
+			      struct sk_buff *skb, u32 mtu)
+{
+	struct rtable *rt = (struct rtable *) dst;
+	struct flowi4 fl4;
+
+	ip_rt_build_flow_key(&fl4, sk, skb);
+	__ip_rt_update_pmtu(rt, &fl4, mtu);
+}
+
 void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu,
 		      int oif, u32 mark, u8 protocol, int flow_flags)
 {
-	const struct iphdr *iph = (const struct iphdr *)skb->data;
+	const struct iphdr *iph = (const struct iphdr *) skb->data;
 	struct flowi4 fl4;
 	struct rtable *rt;
 
-	flowi4_init_output(&fl4, oif, mark, RT_TOS(iph->tos), RT_SCOPE_UNIVERSE,
-			   protocol, flow_flags,
-			   iph->daddr, iph->saddr, 0, 0);
+	__build_flow_key(&fl4, NULL, iph, oif,
+			 RT_TOS(iph->tos), protocol, mark, flow_flags);
 	rt = __ip_route_output_key(net, &fl4);
 	if (!IS_ERR(rt)) {
-		ip_rt_update_pmtu(&rt->dst, NULL, skb, mtu);
+		__ip_rt_update_pmtu(rt, &fl4, mtu);
 		ip_rt_put(rt);
 	}
 }
@@ -1542,27 +1696,31 @@ EXPORT_SYMBOL_GPL(ipv4_update_pmtu);
 
 void ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu)
 {
-	const struct inet_sock *inet = inet_sk(sk);
+	const struct iphdr *iph = (const struct iphdr *) skb->data;
+	struct flowi4 fl4;
+	struct rtable *rt;
 
-	return ipv4_update_pmtu(skb, sock_net(sk), mtu,
-				sk->sk_bound_dev_if, sk->sk_mark,
-				inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
-				inet_sk_flowi_flags(sk));
+	__build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0);
+	rt = __ip_route_output_key(sock_net(sk), &fl4);
+	if (!IS_ERR(rt)) {
+		__ip_rt_update_pmtu(rt, &fl4, mtu);
+		ip_rt_put(rt);
+	}
 }
 EXPORT_SYMBOL_GPL(ipv4_sk_update_pmtu);
 
 void ipv4_redirect(struct sk_buff *skb, struct net *net,
 		   int oif, u32 mark, u8 protocol, int flow_flags)
 {
-	const struct iphdr *iph = (const struct iphdr *)skb->data;
+	const struct iphdr *iph = (const struct iphdr *) skb->data;
 	struct flowi4 fl4;
 	struct rtable *rt;
 
-	flowi4_init_output(&fl4, oif, mark, RT_TOS(iph->tos), RT_SCOPE_UNIVERSE,
-			   protocol, flow_flags, iph->daddr, iph->saddr, 0, 0);
+	__build_flow_key(&fl4, NULL, iph, oif,
+			 RT_TOS(iph->tos), protocol, mark, flow_flags);
 	rt = __ip_route_output_key(net, &fl4);
 	if (!IS_ERR(rt)) {
-		ip_do_redirect(&rt->dst, NULL, skb);
+		__ip_do_redirect(rt, skb, &fl4);
 		ip_rt_put(rt);
 	}
 }
@@ -1570,12 +1728,16 @@ EXPORT_SYMBOL_GPL(ipv4_redirect);
 
 void ipv4_sk_redirect(struct sk_buff *skb, struct sock *sk)
 {
-	const struct inet_sock *inet = inet_sk(sk);
+	const struct iphdr *iph = (const struct iphdr *) skb->data;
+	struct flowi4 fl4;
+	struct rtable *rt;
 
-	return ipv4_redirect(skb, sock_net(sk), sk->sk_bound_dev_if,
-			     sk->sk_mark,
-			     inet->hdrincl ? IPPROTO_RAW : sk->sk_protocol,
-			     inet_sk_flowi_flags(sk));
+	__build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0);
+	rt = __ip_route_output_key(sock_net(sk), &fl4);
+	if (!IS_ERR(rt)) {
+		__ip_do_redirect(rt, skb, &fl4);
+		ip_rt_put(rt);
+	}
 }
 EXPORT_SYMBOL_GPL(ipv4_sk_redirect);
 
@@ -1722,14 +1884,46 @@ static void rt_init_metrics(struct rtable *rt, const struct flowi4 *fl4,
 	dst_init_metrics(&rt->dst, fi->fib_metrics, true);
 }
 
+static void rt_bind_exception(struct rtable *rt, struct fib_nh *nh, __be32 daddr)
+{
+	struct fnhe_hash_bucket *hash = nh->nh_exceptions;
+	struct fib_nh_exception *fnhe;
+	u32 hval;
+
+	hval = (__force u32) daddr;
+	hval ^= (hval >> 11) ^ (hval >> 22);
+
+	for (fnhe = rcu_dereference(hash[hval].chain); fnhe;
+	     fnhe = rcu_dereference(fnhe->fnhe_next)) {
+		if (fnhe->fnhe_daddr == daddr) {
+			if (fnhe->fnhe_pmtu) {
+				unsigned long expires = fnhe->fnhe_expires;
+				unsigned long diff = jiffies - expires;
+
+				if (time_before(jiffies, expires)) {
+					rt->rt_pmtu = fnhe->fnhe_pmtu;
+					dst_set_expires(&rt->dst, diff);
+				}
+			}
+			if (fnhe->fnhe_gw)
+				rt->rt_gateway = fnhe->fnhe_gw;
+			fnhe->fnhe_stamp = jiffies;
+			break;
+		}
+	}
+}
+
 static void rt_set_nexthop(struct rtable *rt, const struct flowi4 *fl4,
 			   const struct fib_result *res,
 			   struct fib_info *fi, u16 type, u32 itag)
 {
 	if (fi) {
-		if (FIB_RES_GW(*res) &&
-		    FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK)
-			rt->rt_gateway = FIB_RES_GW(*res);
+		struct fib_nh *nh = &FIB_RES_NH(*res);
+
+		if (nh->nh_gw && nh->nh_scope == RT_SCOPE_LINK)
+			rt->rt_gateway = nh->nh_gw;
+		if (unlikely(nh->nh_exceptions))
+			rt_bind_exception(rt, nh, fl4->daddr);
 		rt_init_metrics(rt, fl4, fi);
 #ifdef CONFIG_IP_ROUTE_CLASSID
 		rt->dst.tclassid = FIB_RES_NH(*res).nh_tclassid;
-- 
1.7.10.4

^ permalink raw reply related

* That's pretty much it for 3.5.0
From: David Miller @ 2012-07-17 16:01 UTC (permalink / raw)
  To: netdev; +Cc: linux-wireless, netfilter-devel

Linus was _extremely_ generous and took in all the stuff that was
pending in the net tree just now.

Besides very serious issues, I'm not willing to consider any more bug
fixes for the 'net' tree at this time.

Only one pending known bug qualifies, and that's the CIPSO ip option
processing OOPS'er.  And I'll work on that myself if Paul Moore
doesn't show a sign of life in the next day.

Thanks.

^ permalink raw reply

* Re: [patch net-next 0/2] team: add netpoll support
From: David Miller @ 2012-07-17 16:02 UTC (permalink / raw)
  To: jiri; +Cc: netdev
In-Reply-To: <1342538556-22601-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue, 17 Jul 2012 17:22:34 +0200

> Also contains a little change to netpoll core.
> 
> Jiri Pirko (2):
>   netpoll: move np->dev and np->dev_name init into __netpoll_setup()
>   team: add netpoll support

Both applied, thanks Jiri.

^ permalink raw reply

* [PATCH ethtool 0/3] Cleanup for the RPM
From: Ben Hutchings @ 2012-07-17 16:21 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers

I haven't tried building RPMs in a while, and I don't know whether
anyone actually uses the bundled spec file.  Anyway, this should freshen
it up a bit.

Ben.

Ben Hutchings (3):
  ethtool.spec: Update summary and description, based on Fedora package
  ethtool.spec: Update URL to the current home page
  ethtool.spec: Do not include ChangeLog or INSTALL

 ethtool.spec.in |   14 ++++++--------
 1 files changed, 6 insertions(+), 8 deletions(-)

-- 
1.7.7.6


-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH ethtool 1/3] ethtool.spec: Update summary and description, based on Fedora package
From: Ben Hutchings @ 2012-07-17 16:23 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
It would probably be better to come up with a single concise description
to use in all the various places it's needed: RPMs, debs, web site,
manual page...

Ben.

 ethtool.spec.in |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/ethtool.spec.in b/ethtool.spec.in
index 4ff736a..879555d 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -3,7 +3,7 @@ Version		: @VERSION@
 Release		: 1
 Group		: Utilities
 
-Summary		: A tool for setting ethernet parameters
+Summary		: Settings tool for Ethernet and other network devices
 
 License		: GPL
 URL		: http://sourceforge.net/projects/gkernel/
@@ -13,11 +13,9 @@ Source		: %{name}-%{version}.tar.gz
 

 %description
-Ethtool is a small utility to get and set values from your your ethernet 
-controllers.  Not all ethernet drivers support ethtool, but it is getting 
-better.  If your ethernet driver doesn't support it, ask the maintainer to 
-write support - it's not hard!
-
+This utility allows querying and changing settings such as speed,
+port, auto-negotiation, PCI locations and checksum offload on many
+network devices, especially Ethernet devices.
 
 %prep
 %setup -q
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH ethtool 2/3] ethtool.spec: Update URL to the current home page
From: Ben Hutchings @ 2012-07-17 16:24 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.spec.in |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ethtool.spec.in b/ethtool.spec.in
index 879555d..863dfd4 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -6,7 +6,7 @@ Group		: Utilities
 Summary		: Settings tool for Ethernet and other network devices
 
 License		: GPL
-URL		: http://sourceforge.net/projects/gkernel/
+URL		: https://ftp.kernel.org/pub/software/network/ethtool/
 
 Buildroot	: %{_tmppath}/%{name}-%{version}
 Source		: %{name}-%{version}.tar.gz
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH ethtool 3/3] ethtool.spec: Do not include ChangeLog or INSTALL
From: Ben Hutchings @ 2012-07-17 16:24 UTC (permalink / raw)
  To: netdev; +Cc: linux-net-drivers
In-Reply-To: <1342542068.2698.7.camel@bwh-desktop.uk.solarflarecom.com>

The ChangeLog is ancient history, replaced by the version control
changelog.

INSTALL is redundant in a binary package.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 ethtool.spec.in |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ethtool.spec.in b/ethtool.spec.in
index 863dfd4..6e9e1f5 100644
--- a/ethtool.spec.in
+++ b/ethtool.spec.in
@@ -34,7 +34,7 @@ make install DESTDIR=${RPM_BUILD_ROOT}
 %defattr(-,root,root)
 /usr/sbin/ethtool
 %{_mandir}/man8/ethtool.8*
-%doc AUTHORS COPYING INSTALL NEWS README ChangeLog
+%doc AUTHORS COPYING NEWS README
 

 %changelog
-- 
1.7.7.6


-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* Re: Crash in CIPSO_V4_TAG_LOCAL handling
From: Paul Moore @ 2012-07-17 16:25 UTC (permalink / raw)
  To: David Miller; +Cc: mlin, alan, netdev
In-Reply-To: <20120714.130817.394766887121758073.davem@davemloft.net>

On Sat, Jul 14, 2012 at 4:08 PM, David Miller <davem@davemloft.net> wrote:
> From: Lin Ming <mlin@ss.pku.edu.cn>
> Date: Sun, 15 Jul 2012 01:22:30 +0800
>
>> It's caused by below code added in commit 15c45f7b.
>>
>>                 case CIPSO_V4_TAG_LOCAL:
>>                         /* This is a non-standard tag that we only allow for
>>                          * local connections, so if the incoming interface is
>>                          * not the loopback device drop the packet. */
>>                         if (!(skb->dev->flags & IFF_LOOPBACK)) {
>>                                 err_offset = opt_iter;
>>                                 goto validate_return_locked;
>>                         }
>
> Paul please fix this, as shown 'skb' can easily be NULL in this
> code path.

Just saw this ... I'll start looking into this today.

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* [PATCH] jme: netpoll support
From: Lekensteyn @ 2012-07-17 16:29 UTC (permalink / raw)
  To: Guo-Fu Tseng; +Cc: netdev

From: Peter Wu <lekensteyn@gmail.com>

This patch adds the netpoll function to support netconsole. Tested and works
fine on my "JMC250 PCI Express Gigabit Ethernet Controller" (PCI ID 0250).

Signed-off-by: Peter Wu <lekensteyn@gmail.com>
---
 drivers/net/ethernet/jme.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index 4ea6580..c911d88 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -2743,6 +2743,17 @@ jme_set_features(struct net_device *netdev, netdev_features_t features)
 	return 0;
 }
 
+#ifdef CONFIG_NET_POLL_CONTROLLER
+static void jme_netpoll(struct net_device *dev)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	jme_intr(dev->irq, dev);
+	local_irq_restore(flags);
+}
+#endif
+
 static int
 jme_nway_reset(struct net_device *netdev)
 {
@@ -2944,6 +2955,9 @@ static const struct net_device_ops jme_netdev_ops = {
 	.ndo_tx_timeout		= jme_tx_timeout,
 	.ndo_fix_features       = jme_fix_features,
 	.ndo_set_features       = jme_set_features,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller	= jme_netpoll,
+#endif
 };
 
 static int __devinit
-- 
1.7.9.5

^ permalink raw reply related

* pull request: sfc-next 2012-07-17
From: Ben Hutchings @ 2012-07-17 17:05 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev

[-- Attachment #1: Type: text/plain, Size: 2182 bytes --]

The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:

  xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem

(commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)

1. Fix potential badness when running a self-test with SR-IOV enabled.
2. Fix calculation of some interface statistics that could run backward.
3. Miscellaneous cleanup.

Ben.

Ben Hutchings (10):
      sfc: Work around bogus 'uninitialised variable' warning
      sfc: Use generic DMA API, not PCI-DMA API
      sfc: Remove dead write to tso_state::packet_space
      sfc: Stop changing header offsets on TX
      sfc: Use strlcpy() to copy ethtool stats names
      sfc: Use dev_kfree_skb() in efx_end_loopback()
      sfc: Explain why efx_mcdi_exit_assertion() ignores result of efx_mcdi_rpc()
      sfc: Disable VF queues during register self-test
      sfc: Fix interface statistics running backward
      sfc: Correct some comments on enum reset_type

 drivers/net/ethernet/sfc/efx.c         |   10 ++--
 drivers/net/ethernet/sfc/enum.h        |    8 ++--
 drivers/net/ethernet/sfc/ethtool.c     |    2 +-
 drivers/net/ethernet/sfc/falcon.c      |   35 +++++++++++--
 drivers/net/ethernet/sfc/falcon_xmac.c |   12 ++--
 drivers/net/ethernet/sfc/filter.c      |    2 +-
 drivers/net/ethernet/sfc/mcdi.c        |   11 +++-
 drivers/net/ethernet/sfc/net_driver.h  |    9 ++-
 drivers/net/ethernet/sfc/nic.c         |   11 ++---
 drivers/net/ethernet/sfc/nic.h         |   18 ++++++
 drivers/net/ethernet/sfc/rx.c          |   22 ++++----
 drivers/net/ethernet/sfc/selftest.c    |   64 ++++++----------------
 drivers/net/ethernet/sfc/siena.c       |   37 ++++++++++---
 drivers/net/ethernet/sfc/tx.c          |   93 ++++++++++++++------------------
 14 files changed, 181 insertions(+), 153 deletions(-)

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 490 bytes --]

^ permalink raw reply

* Re: Crash in CIPSO_V4_TAG_LOCAL handling
From: David Miller @ 2012-07-17 17:28 UTC (permalink / raw)
  To: paul; +Cc: mlin, alan, netdev
In-Reply-To: <CAHC9VhQKhoRt8yD4WAgD5tyqjY4bbp4Z4hr5cWfs-d6GJt0pjQ@mail.gmail.com>

From: Paul Moore <paul@paul-moore.com>
Date: Tue, 17 Jul 2012 12:25:28 -0400

> On Sat, Jul 14, 2012 at 4:08 PM, David Miller <davem@davemloft.net> wrote:
>> From: Lin Ming <mlin@ss.pku.edu.cn>
>> Date: Sun, 15 Jul 2012 01:22:30 +0800
>>
>>> It's caused by below code added in commit 15c45f7b.
>>>
>>>                 case CIPSO_V4_TAG_LOCAL:
>>>                         /* This is a non-standard tag that we only allow for
>>>                          * local connections, so if the incoming interface is
>>>                          * not the loopback device drop the packet. */
>>>                         if (!(skb->dev->flags & IFF_LOOPBACK)) {
>>>                                 err_offset = opt_iter;
>>>                                 goto validate_return_locked;
>>>                         }
>>
>> Paul please fix this, as shown 'skb' can easily be NULL in this
>> code path.
> 
> Just saw this ... I'll start looking into this today.

Thanks, sorry I messed up your email, I should have checked MAINTAINERS :)

^ permalink raw reply

* Re: pull request: sfc-next 2012-07-17
From: David Miller @ 2012-07-17 17:31 UTC (permalink / raw)
  To: bhutchings; +Cc: linux-net-drivers, netdev
In-Reply-To: <1342544740.2698.13.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Tue, 17 Jul 2012 18:05:40 +0100

> The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:
> 
>   xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)
> 
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem
> 
> (commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)
> 
> 1. Fix potential badness when running a self-test with SR-IOV enabled.
> 2. Fix calculation of some interface statistics that could run backward.
> 3. Miscellaneous cleanup.

Please post the patches so that at least in theory people have the
opportunity to review them.

^ permalink raw reply

* Re: [PATCH] netem: fix rate extension and drop accounting
From: Eric Dumazet @ 2012-07-17 17:39 UTC (permalink / raw)
  To: Mark Gordon
  Cc: Hagen Paul Pfeifer, David Miller, netdev, Yuchung Cheng,
	Andreas Terzis
In-Reply-To: <CAPVr9VMCYFO-7uEzO6ft2vpPhVvRgHB3EWJJG62OqGqux1LsZQ@mail.gmail.com>

Please Mark :

1) Dont top post on netdev

2) Dont write HTML mails on netdev (your mail never went to netdev,
   only to CCed people). Only text mails are allowed.

On Tue, 2012-07-17 at 10:20 -0700, Mark Gordon wrote:
> Even the static delay case seems wrong with the new patch.  Assume all
> packets have the same sched_time.  Then if you spam packets that get
> processed at the same time by netem they will all get scheduled with
> the same time_to_send because the first packet will get time_to_send
> of [1] = clock_time + sched_time.  Then packet n compute 'now' as
> [n-1] and delay as sched_time - (clock_time - [1]) = 0 so that [n] =
> [n-1].  Therefore every packet gets scheduled at the same time.
> 
> 
> The above modification seems to fix the issue when latency/jitter is 0
> but suffers from a missing non-linearity when delay is present.  Is
> there a technical reason I'm missing that prevents us from doing rate
> and latency here?  Why wouldn't the 'official' patch have correct
> rate?

Because delay is variable (jitter)

netem as is is not working correctly if you have both a rate limit and
delay.

Hagen is working on a solution, but there is no easy fix.

The right solution is to have :

1) A rate stage, using a child qdisc (that you can graft to install your
own qdisc hierarchy if needed, say if you want codel or fq_codel ;))

  Thats basically a TBF...

2) skb orphan

3) drops/reorders/corrupt/additional delay (variable delay)
   using an internal tfifo, to mimic real networks behavior.

Thats the reverse of how its currently done.

Alternatively, this could be implemented as a special network device,
like bonding, instead of a qdisc.

^ permalink raw reply

* Re: That's pretty much it for 3.5.0
From: Rustad, Mark D @ 2012-07-17 17:41 UTC (permalink / raw)
  To: David Miller
  Cc: <netdev@vger.kernel.org>,
	<linux-wireless@vger.kernel.org>,
	<netfilter-devel@vger.kernel.org>
In-Reply-To: <20120717.090142.125145009944045241.davem@davemloft.net>

On Jul 17, 2012, at 9:01 AM, David Miller wrote:

> Linus was _extremely_ generous and took in all the stuff that was
> pending in the net tree just now.

Maybe *too* generous. :-) I just updated and when I boot I get an early crash in update_netdev_tables which is in netprio_cgroup.c.

> Besides very serious issues, I'm not willing to consider any more bug
> fixes for the 'net' tree at this time.

I think the above issue will have to be fixed, as it completely prevents booting for any kernel that includes the netprio_cgroup option.

> Only one pending known bug qualifies, and that's the CIPSO ip option
> processing OOPS'er.  And I'll work on that myself if Paul Moore
> doesn't show a sign of life in the next day.
> 
> Thanks.


I can start taking a look at this if you like, but I see that Gao feng has two patches in the last set of patches that may be related.

To give you an idea how early the crash is, here are a few log messages leading up to it:

[    0.003455] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.005550] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
[    0.007165] Mount-cache hash table entries: 256
[    0.010289] Initializing cgroup subsys net_cls
[    0.010947] Initializing cgroup subsys net_prio
[    0.011039] BUG: unable to handle kernel NULL pointer dereference at 0000000000000828
[    0.011998] IP: [<ffffffff814202c8>] update_netdev_tables+0x68/0xe0

-- 
Mark Rustad, LAN Access Division, Intel Corporation


^ permalink raw reply

* Re: [PATCH 0/5] Long term PMTU/redirect storage in ipv4.
From: David Miller @ 2012-07-17 18:03 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20120717.061418.1893307699868826531.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Tue, 17 Jul 2012 06:14:18 -0700 (PDT)

> These patches implement the final mechanism necessary to really allow
> us to go without the route cache in ipv4.

Ok I pushed this out to net-next with the v2 of patch #5 and the merge
commit message adjusted to suit.

I think the routing cache will die in net-next for real some time
later this week.

I'll start respinning those patches.

^ permalink raw reply

* Re: pull request: sfc-next 2012-07-17
From: Ben Hutchings @ 2012-07-17 18:03 UTC (permalink / raw)
  To: David Miller; +Cc: linux-net-drivers, netdev
In-Reply-To: <20120717.103103.655643871226631461.davem@davemloft.net>

On Tue, 2012-07-17 at 10:31 -0700, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Tue, 17 Jul 2012 18:05:40 +0100
> 
> > The following changes since commit 141e369de698f2e17bf716b83fcc647ddcb2220c:
> > 
> >   xfrm: Initialize the struct xfrm_dst behind the dst_enty field (2012-07-14 00:29:12 -0700)
> > 
> > are available in the git repository at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next.git for-davem
> > 
> > (commit c2dbab39db1c3c2ccbdbb2c6bac6f07cc7a7c1f6)
> > 
> > 1. Fix potential badness when running a self-test with SR-IOV enabled.
> > 2. Fix calculation of some interface statistics that could run backward.
> > 3. Miscellaneous cleanup.
> 
> Please post the patches so that at least in theory people have the
> opportunity to review them.

Sorry, yes.  They're the same as last time modulo the MMIO, but I
suppose they may have been ignored after the objections to that.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH net-next 01/10] sfc: Work around bogus 'uninitialised variable' warning
From: Ben Hutchings @ 2012-07-17 18:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1342544740.2698.13.camel@bwh-desktop.uk.solarflarecom.com>

With some gcc versions & optimisations, the compiler will warn that
'depth' in efx_filter_insert_filter() may be used without being
initialised, although this is not the case.

This is related to inlining of efx_filter_search(), which only has
one caller since commit 8db182f4a8a6e2dcb8b65905ea4af56210e65430
('sfc: Remove now-unused filter function').

Shut the compiler up by initialising it to 0.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/filter.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/sfc/filter.c b/drivers/net/ethernet/sfc/filter.c
index fea7f73..c3fd61f 100644
--- a/drivers/net/ethernet/sfc/filter.c
+++ b/drivers/net/ethernet/sfc/filter.c
@@ -662,7 +662,7 @@ s32 efx_filter_insert_filter(struct efx_nic *efx, struct efx_filter_spec *spec,
 	struct efx_filter_table *table = efx_filter_spec_table(state, spec);
 	struct efx_filter_spec *saved_spec;
 	efx_oword_t filter;
-	unsigned int filter_idx, depth;
+	unsigned int filter_idx, depth = 0;
 	u32 key;
 	int rc;
 
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [PATCH net-next 02/10] sfc: Use generic DMA API, not PCI-DMA API
From: Ben Hutchings @ 2012-07-17 18:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1342544740.2698.13.camel@bwh-desktop.uk.solarflarecom.com>

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/sfc/efx.c        |   10 ++--
 drivers/net/ethernet/sfc/net_driver.h |    2 +-
 drivers/net/ethernet/sfc/nic.c        |    8 ++--
 drivers/net/ethernet/sfc/rx.c         |   22 ++++----
 drivers/net/ethernet/sfc/tx.c         |   83 ++++++++++++++++-----------------
 5 files changed, 62 insertions(+), 63 deletions(-)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index b95f2e1..70554a1 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1103,8 +1103,8 @@ static int efx_init_io(struct efx_nic *efx)
 	 * masks event though they reject 46 bit masks.
 	 */
 	while (dma_mask > 0x7fffffffUL) {
-		if (pci_dma_supported(pci_dev, dma_mask)) {
-			rc = pci_set_dma_mask(pci_dev, dma_mask);
+		if (dma_supported(&pci_dev->dev, dma_mask)) {
+			rc = dma_set_mask(&pci_dev->dev, dma_mask);
 			if (rc == 0)
 				break;
 		}
@@ -1117,10 +1117,10 @@ static int efx_init_io(struct efx_nic *efx)
 	}
 	netif_dbg(efx, probe, efx->net_dev,
 		  "using DMA mask %llx\n", (unsigned long long) dma_mask);
-	rc = pci_set_consistent_dma_mask(pci_dev, dma_mask);
+	rc = dma_set_coherent_mask(&pci_dev->dev, dma_mask);
 	if (rc) {
-		/* pci_set_consistent_dma_mask() is not *allowed* to
-		 * fail with a mask that pci_set_dma_mask() accepted,
+		/* dma_set_coherent_mask() is not *allowed* to
+		 * fail with a mask that dma_set_mask() accepted,
 		 * but just in case...
 		 */
 		netif_err(efx, probe, efx->net_dev,
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 0e57535..8a9f6d4 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -100,7 +100,7 @@ struct efx_special_buffer {
  * @len: Length of this fragment.
  *	This field is zero when the queue slot is empty.
  * @continuation: True if this fragment is not the end of a packet.
- * @unmap_single: True if pci_unmap_single should be used.
+ * @unmap_single: True if dma_unmap_single should be used.
  * @unmap_len: Length of this fragment to unmap
  */
 struct efx_tx_buffer {
diff --git a/drivers/net/ethernet/sfc/nic.c b/drivers/net/ethernet/sfc/nic.c
index 4a9a5be..287738d 100644
--- a/drivers/net/ethernet/sfc/nic.c
+++ b/drivers/net/ethernet/sfc/nic.c
@@ -308,8 +308,8 @@ efx_free_special_buffer(struct efx_nic *efx, struct efx_special_buffer *buffer)
 int efx_nic_alloc_buffer(struct efx_nic *efx, struct efx_buffer *buffer,
 			 unsigned int len)
 {
-	buffer->addr = pci_alloc_consistent(efx->pci_dev, len,
-					    &buffer->dma_addr);
+	buffer->addr = dma_alloc_coherent(&efx->pci_dev->dev, len,
+					  &buffer->dma_addr, GFP_ATOMIC);
 	if (!buffer->addr)
 		return -ENOMEM;
 	buffer->len = len;
@@ -320,8 +320,8 @@ int efx_nic_alloc_buffer(struct efx_nic *efx, struct efx_buffer *buffer,
 void efx_nic_free_buffer(struct efx_nic *efx, struct efx_buffer *buffer)
 {
 	if (buffer->addr) {
-		pci_free_consistent(efx->pci_dev, buffer->len,
-				    buffer->addr, buffer->dma_addr);
+		dma_free_coherent(&efx->pci_dev->dev, buffer->len,
+				  buffer->addr, buffer->dma_addr);
 		buffer->addr = NULL;
 	}
 }
diff --git a/drivers/net/ethernet/sfc/rx.c b/drivers/net/ethernet/sfc/rx.c
index 243e91f..6d1c6cf 100644
--- a/drivers/net/ethernet/sfc/rx.c
+++ b/drivers/net/ethernet/sfc/rx.c
@@ -155,11 +155,11 @@ static int efx_init_rx_buffers_skb(struct efx_rx_queue *rx_queue)
 		rx_buf->len = skb_len - NET_IP_ALIGN;
 		rx_buf->flags = 0;
 
-		rx_buf->dma_addr = pci_map_single(efx->pci_dev,
+		rx_buf->dma_addr = dma_map_single(&efx->pci_dev->dev,
 						  skb->data, rx_buf->len,
-						  PCI_DMA_FROMDEVICE);
-		if (unlikely(pci_dma_mapping_error(efx->pci_dev,
-						   rx_buf->dma_addr))) {
+						  DMA_FROM_DEVICE);
+		if (unlikely(dma_mapping_error(&efx->pci_dev->dev,
+					       rx_buf->dma_addr))) {
 			dev_kfree_skb_any(skb);
 			rx_buf->u.skb = NULL;
 			return -EIO;
@@ -200,10 +200,10 @@ static int efx_init_rx_buffers_page(struct efx_rx_queue *rx_queue)
 				   efx->rx_buffer_order);
 		if (unlikely(page == NULL))
 			return -ENOMEM;
-		dma_addr = pci_map_page(efx->pci_dev, page, 0,
+		dma_addr = dma_map_page(&efx->pci_dev->dev, page, 0,
 					efx_rx_buf_size(efx),
-					PCI_DMA_FROMDEVICE);
-		if (unlikely(pci_dma_mapping_error(efx->pci_dev, dma_addr))) {
+					DMA_FROM_DEVICE);
+		if (unlikely(dma_mapping_error(&efx->pci_dev->dev, dma_addr))) {
 			__free_pages(page, efx->rx_buffer_order);
 			return -EIO;
 		}
@@ -247,14 +247,14 @@ static void efx_unmap_rx_buffer(struct efx_nic *efx,
 
 		state = page_address(rx_buf->u.page);
 		if (--state->refcnt == 0) {
-			pci_unmap_page(efx->pci_dev,
+			dma_unmap_page(&efx->pci_dev->dev,
 				       state->dma_addr,
 				       efx_rx_buf_size(efx),
-				       PCI_DMA_FROMDEVICE);
+				       DMA_FROM_DEVICE);
 		}
 	} else if (!(rx_buf->flags & EFX_RX_BUF_PAGE) && rx_buf->u.skb) {
-		pci_unmap_single(efx->pci_dev, rx_buf->dma_addr,
-				 rx_buf->len, PCI_DMA_FROMDEVICE);
+		dma_unmap_single(&efx->pci_dev->dev, rx_buf->dma_addr,
+				 rx_buf->len, DMA_FROM_DEVICE);
 	}
 }
 
diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c
index 94d0365..18860f2 100644
--- a/drivers/net/ethernet/sfc/tx.c
+++ b/drivers/net/ethernet/sfc/tx.c
@@ -36,15 +36,15 @@ static void efx_dequeue_buffer(struct efx_tx_queue *tx_queue,
 			       unsigned int *bytes_compl)
 {
 	if (buffer->unmap_len) {
-		struct pci_dev *pci_dev = tx_queue->efx->pci_dev;
+		struct device *dma_dev = &tx_queue->efx->pci_dev->dev;
 		dma_addr_t unmap_addr = (buffer->dma_addr + buffer->len -
 					 buffer->unmap_len);
 		if (buffer->unmap_single)
-			pci_unmap_single(pci_dev, unmap_addr, buffer->unmap_len,
-					 PCI_DMA_TODEVICE);
+			dma_unmap_single(dma_dev, unmap_addr, buffer->unmap_len,
+					 DMA_TO_DEVICE);
 		else
-			pci_unmap_page(pci_dev, unmap_addr, buffer->unmap_len,
-				       PCI_DMA_TODEVICE);
+			dma_unmap_page(dma_dev, unmap_addr, buffer->unmap_len,
+				       DMA_TO_DEVICE);
 		buffer->unmap_len = 0;
 		buffer->unmap_single = false;
 	}
@@ -138,7 +138,7 @@ efx_max_tx_len(struct efx_nic *efx, dma_addr_t dma_addr)
 netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 {
 	struct efx_nic *efx = tx_queue->efx;
-	struct pci_dev *pci_dev = efx->pci_dev;
+	struct device *dma_dev = &efx->pci_dev->dev;
 	struct efx_tx_buffer *buffer;
 	skb_frag_t *fragment;
 	unsigned int len, unmap_len = 0, fill_level, insert_ptr;
@@ -167,17 +167,17 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	fill_level = tx_queue->insert_count - tx_queue->old_read_count;
 	q_space = efx->txq_entries - 1 - fill_level;
 
-	/* Map for DMA.  Use pci_map_single rather than pci_map_page
+	/* Map for DMA.  Use dma_map_single rather than dma_map_page
 	 * since this is more efficient on machines with sparse
 	 * memory.
 	 */
 	unmap_single = true;
-	dma_addr = pci_map_single(pci_dev, skb->data, len, PCI_DMA_TODEVICE);
+	dma_addr = dma_map_single(dma_dev, skb->data, len, PCI_DMA_TODEVICE);
 
 	/* Process all fragments */
 	while (1) {
-		if (unlikely(pci_dma_mapping_error(pci_dev, dma_addr)))
-			goto pci_err;
+		if (unlikely(dma_mapping_error(dma_dev, dma_addr)))
+			goto dma_err;
 
 		/* Store fields for marking in the per-fragment final
 		 * descriptor */
@@ -246,7 +246,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 		i++;
 		/* Map for DMA */
 		unmap_single = false;
-		dma_addr = skb_frag_dma_map(&pci_dev->dev, fragment, 0, len,
+		dma_addr = skb_frag_dma_map(dma_dev, fragment, 0, len,
 					    DMA_TO_DEVICE);
 	}
 
@@ -261,7 +261,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 
 	return NETDEV_TX_OK;
 
- pci_err:
+ dma_err:
 	netif_err(efx, tx_err, efx->net_dev,
 		  " TX queue %d could not map skb with %d bytes %d "
 		  "fragments for DMA\n", tx_queue->queue, skb->len,
@@ -284,11 +284,11 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	/* Free the fragment we were mid-way through pushing */
 	if (unmap_len) {
 		if (unmap_single)
-			pci_unmap_single(pci_dev, unmap_addr, unmap_len,
-					 PCI_DMA_TODEVICE);
+			dma_unmap_single(dma_dev, unmap_addr, unmap_len,
+					 DMA_TO_DEVICE);
 		else
-			pci_unmap_page(pci_dev, unmap_addr, unmap_len,
-				       PCI_DMA_TODEVICE);
+			dma_unmap_page(dma_dev, unmap_addr, unmap_len,
+				       DMA_TO_DEVICE);
 	}
 
 	return rc;
@@ -684,20 +684,19 @@ static __be16 efx_tso_check_protocol(struct sk_buff *skb)
  */
 static int efx_tsoh_block_alloc(struct efx_tx_queue *tx_queue)
 {
-
-	struct pci_dev *pci_dev = tx_queue->efx->pci_dev;
+	struct device *dma_dev = &tx_queue->efx->pci_dev->dev;
 	struct efx_tso_header *tsoh;
 	dma_addr_t dma_addr;
 	u8 *base_kva, *kva;
 
-	base_kva = pci_alloc_consistent(pci_dev, PAGE_SIZE, &dma_addr);
+	base_kva = dma_alloc_coherent(dma_dev, PAGE_SIZE, &dma_addr, GFP_ATOMIC);
 	if (base_kva == NULL) {
 		netif_err(tx_queue->efx, tx_err, tx_queue->efx->net_dev,
 			  "Unable to allocate page for TSO headers\n");
 		return -ENOMEM;
 	}
 
-	/* pci_alloc_consistent() allocates pages. */
+	/* dma_alloc_coherent() allocates pages. */
 	EFX_BUG_ON_PARANOID(dma_addr & (PAGE_SIZE - 1u));
 
 	for (kva = base_kva; kva < base_kva + PAGE_SIZE; kva += TSOH_STD_SIZE) {
@@ -714,7 +713,7 @@ static int efx_tsoh_block_alloc(struct efx_tx_queue *tx_queue)
 /* Free up a TSO header, and all others in the same page. */
 static void efx_tsoh_block_free(struct efx_tx_queue *tx_queue,
 				struct efx_tso_header *tsoh,
-				struct pci_dev *pci_dev)
+				struct device *dma_dev)
 {
 	struct efx_tso_header **p;
 	unsigned long base_kva;
@@ -731,7 +730,7 @@ static void efx_tsoh_block_free(struct efx_tx_queue *tx_queue,
 			p = &(*p)->next;
 	}
 
-	pci_free_consistent(pci_dev, PAGE_SIZE, (void *)base_kva, base_dma);
+	dma_free_coherent(dma_dev, PAGE_SIZE, (void *)base_kva, base_dma);
 }
 
 static struct efx_tso_header *
@@ -743,11 +742,11 @@ efx_tsoh_heap_alloc(struct efx_tx_queue *tx_queue, size_t header_len)
 	if (unlikely(!tsoh))
 		return NULL;
 
-	tsoh->dma_addr = pci_map_single(tx_queue->efx->pci_dev,
+	tsoh->dma_addr = dma_map_single(&tx_queue->efx->pci_dev->dev,
 					TSOH_BUFFER(tsoh), header_len,
-					PCI_DMA_TODEVICE);
-	if (unlikely(pci_dma_mapping_error(tx_queue->efx->pci_dev,
-					   tsoh->dma_addr))) {
+					DMA_TO_DEVICE);
+	if (unlikely(dma_mapping_error(&tx_queue->efx->pci_dev->dev,
+				       tsoh->dma_addr))) {
 		kfree(tsoh);
 		return NULL;
 	}
@@ -759,9 +758,9 @@ efx_tsoh_heap_alloc(struct efx_tx_queue *tx_queue, size_t header_len)
 static void
 efx_tsoh_heap_free(struct efx_tx_queue *tx_queue, struct efx_tso_header *tsoh)
 {
-	pci_unmap_single(tx_queue->efx->pci_dev,
+	dma_unmap_single(&tx_queue->efx->pci_dev->dev,
 			 tsoh->dma_addr, tsoh->unmap_len,
-			 PCI_DMA_TODEVICE);
+			 DMA_TO_DEVICE);
 	kfree(tsoh);
 }
 
@@ -892,13 +891,13 @@ static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue)
 			unmap_addr = (buffer->dma_addr + buffer->len -
 				      buffer->unmap_len);
 			if (buffer->unmap_single)
-				pci_unmap_single(tx_queue->efx->pci_dev,
+				dma_unmap_single(&tx_queue->efx->pci_dev->dev,
 						 unmap_addr, buffer->unmap_len,
-						 PCI_DMA_TODEVICE);
+						 DMA_TO_DEVICE);
 			else
-				pci_unmap_page(tx_queue->efx->pci_dev,
+				dma_unmap_page(&tx_queue->efx->pci_dev->dev,
 					       unmap_addr, buffer->unmap_len,
-					       PCI_DMA_TODEVICE);
+					       DMA_TO_DEVICE);
 			buffer->unmap_len = 0;
 		}
 		buffer->len = 0;
@@ -954,9 +953,9 @@ static int tso_get_head_fragment(struct tso_state *st, struct efx_nic *efx,
 	int hl = st->header_len;
 	int len = skb_headlen(skb) - hl;
 
-	st->unmap_addr = pci_map_single(efx->pci_dev, skb->data + hl,
-					len, PCI_DMA_TODEVICE);
-	if (likely(!pci_dma_mapping_error(efx->pci_dev, st->unmap_addr))) {
+	st->unmap_addr = dma_map_single(&efx->pci_dev->dev, skb->data + hl,
+					len, DMA_TO_DEVICE);
+	if (likely(!dma_mapping_error(&efx->pci_dev->dev, st->unmap_addr))) {
 		st->unmap_single = true;
 		st->unmap_len = len;
 		st->in_len = len;
@@ -1008,7 +1007,7 @@ static int tso_fill_packet_with_fragment(struct efx_tx_queue *tx_queue,
 		buffer->continuation = !end_of_packet;
 
 		if (st->in_len == 0) {
-			/* Transfer ownership of the pci mapping */
+			/* Transfer ownership of the DMA mapping */
 			buffer->unmap_len = st->unmap_len;
 			buffer->unmap_single = st->unmap_single;
 			st->unmap_len = 0;
@@ -1181,18 +1180,18 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue,
 
  mem_err:
 	netif_err(efx, tx_err, efx->net_dev,
-		  "Out of memory for TSO headers, or PCI mapping error\n");
+		  "Out of memory for TSO headers, or DMA mapping error\n");
 	dev_kfree_skb_any(skb);
 
  unwind:
 	/* Free the DMA mapping we were in the process of writing out */
 	if (state.unmap_len) {
 		if (state.unmap_single)
-			pci_unmap_single(efx->pci_dev, state.unmap_addr,
-					 state.unmap_len, PCI_DMA_TODEVICE);
+			dma_unmap_single(&efx->pci_dev->dev, state.unmap_addr,
+					 state.unmap_len, DMA_TO_DEVICE);
 		else
-			pci_unmap_page(efx->pci_dev, state.unmap_addr,
-				       state.unmap_len, PCI_DMA_TODEVICE);
+			dma_unmap_page(&efx->pci_dev->dev, state.unmap_addr,
+				       state.unmap_len, DMA_TO_DEVICE);
 	}
 
 	efx_enqueue_unwind(tx_queue);
@@ -1216,5 +1215,5 @@ static void efx_fini_tso(struct efx_tx_queue *tx_queue)
 
 	while (tx_queue->tso_headers_free != NULL)
 		efx_tsoh_block_free(tx_queue, tx_queue->tso_headers_free,
-				    tx_queue->efx->pci_dev);
+				    &tx_queue->efx->pci_dev->dev);
 }
-- 
1.7.7.6



-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply related

* [Upstream PATCH 1/2] driver: net: ethernet: davinci_mdio: runtime PM support
From: Mugunthan V N @ 2012-07-17 18:05 UTC (permalink / raw)
  To: netdev; +Cc: davem, Mugunthan V N
In-Reply-To: <1342548353-12153-1-git-send-email-mugunthanvnm@ti.com>

Enabling runtime PM support for davinci mdio device

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
---
 drivers/net/ethernet/ti/davinci_mdio.c |   25 ++++++++++++-------------
 1 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/ti/davinci_mdio.c b/drivers/net/ethernet/ti/davinci_mdio.c
index e4e4708..cd7ee20 100644
--- a/drivers/net/ethernet/ti/davinci_mdio.c
+++ b/drivers/net/ethernet/ti/davinci_mdio.c
@@ -34,6 +34,7 @@
 #include <linux/clk.h>
 #include <linux/err.h>
 #include <linux/io.h>
+#include <linux/pm_runtime.h>
 #include <linux/davinci_emac.h>
 
 /*
@@ -321,7 +322,9 @@ static int __devinit davinci_mdio_probe(struct platform_device *pdev)
 	snprintf(data->bus->id, MII_BUS_ID_SIZE, "%s-%x",
 		pdev->name, pdev->id);
 
-	data->clk = clk_get(dev, NULL);
+	pm_runtime_enable(&pdev->dev);
+	pm_runtime_get_sync(&pdev->dev);
+	data->clk = clk_get(&pdev->dev, "fck");
 	if (IS_ERR(data->clk)) {
 		dev_err(dev, "failed to get device clock\n");
 		ret = PTR_ERR(data->clk);
@@ -329,8 +332,6 @@ static int __devinit davinci_mdio_probe(struct platform_device *pdev)
 		goto bail_out;
 	}
 
-	clk_enable(data->clk);
-
 	dev_set_drvdata(dev, data);
 	data->dev = dev;
 	spin_lock_init(&data->lock);
@@ -378,10 +379,10 @@ bail_out:
 	if (data->bus)
 		mdiobus_free(data->bus);
 
-	if (data->clk) {
-		clk_disable(data->clk);
+	if (data->clk)
 		clk_put(data->clk);
-	}
+	pm_runtime_put_sync(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
 
 	kfree(data);
 
@@ -396,10 +397,10 @@ static int __devexit davinci_mdio_remove(struct platform_device *pdev)
 	if (data->bus)
 		mdiobus_free(data->bus);
 
-	if (data->clk) {
-		clk_disable(data->clk);
+	if (data->clk)
 		clk_put(data->clk);
-	}
+	pm_runtime_put_sync(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
 
 	dev_set_drvdata(dev, NULL);
 
@@ -421,8 +422,7 @@ static int davinci_mdio_suspend(struct device *dev)
 	__raw_writel(ctrl, &data->regs->control);
 	wait_for_idle(data);
 
-	if (data->clk)
-		clk_disable(data->clk);
+	pm_runtime_put_sync(data->dev);
 
 	data->suspended = true;
 	spin_unlock(&data->lock);
@@ -436,8 +436,7 @@ static int davinci_mdio_resume(struct device *dev)
 	u32 ctrl;
 
 	spin_lock(&data->lock);
-	if (data->clk)
-		clk_enable(data->clk);
+	pm_runtime_put_sync(data->dev);
 
 	/* restart the scan state machine */
 	ctrl = __raw_readl(&data->regs->control);
-- 
1.7.0.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox