Netdev List
 help / color / mirror / Atom feed
* [PATCH 1/2] net: Add function to get SW rxhash
From: Tom Herbert @ 2013-09-24 16:20 UTC (permalink / raw)
  To: davem; +Cc: netdev

Some uses of skb_get_rxhash expect that the function will return
a consistent value whether it is called on RX or TX paths. On RX
path, we will use the rxhash if provided by the NIC, so this
would not normally be the same result computed in TX path which is
a software calculation.

This patch adds skb_get_sw_rxhash to explicitly request a hash
calculated by the stack, disregarding the hash provided by NIC.

Signed-off-by: Tom Herbert <therbert@google.com>
---
 include/linux/skbuff.h    | 11 ++++++++++-
 net/core/flow_dissector.c |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..917a590 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -488,6 +488,7 @@ struct sk_buff {
 	__u8			pfmemalloc:1;
 	__u8			ooo_okay:1;
 	__u8			l4_rxhash:1;
+	__u8			sw_rxhash:1;
 	__u8			wifi_acked_valid:1;
 	__u8			wifi_acked:1;
 	__u8			no_fcs:1;
@@ -498,7 +499,7 @@ struct sk_buff {
 	 * headers if needed
 	 */
 	__u8			encapsulation:1;
-	/* 7/9 bit hole (depending on ndisc_nodetype presence) */
+	/* 6/8 bit hole (depending on ndisc_nodetype presence) */
 	kmemcheck_bitfield_end(flags2);
 
 #if defined CONFIG_NET_DMA || defined CONFIG_NET_RX_BUSY_POLL
@@ -726,6 +727,14 @@ static inline __u32 skb_get_rxhash(struct sk_buff *skb)
 	return skb->rxhash;
 }
 
+static inline __u32 skb_get_sw_rxhash(struct sk_buff *skb)
+{
+	if (!skb->l4_rxhash && !skb->sw_rxhash)
+		__skb_get_rxhash(skb);
+
+	return skb->rxhash;
+}
+
 #ifdef NET_SKBUFF_DATA_USES_OFFSET
 static inline unsigned char *skb_end_pointer(const struct sk_buff *skb)
 {
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 1929af8..8979121 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -200,6 +200,7 @@ void __skb_get_rxhash(struct sk_buff *skb)
 		hash = 1;
 
 	skb->rxhash = hash;
+	skb->sw_rxhash = 1;
 }
 EXPORT_SYMBOL(__skb_get_rxhash);
 
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH net] vxlan: Use RCU apis to access sk_user_data.
From: Stephen Hemminger @ 2013-09-24 16:20 UTC (permalink / raw)
  To: Pravin B Shelar; +Cc: netdev, Jesse Gross
In-Reply-To: <1380000501-13969-1-git-send-email-pshelar@nicira.com>

On Mon, 23 Sep 2013 22:28:21 -0700
Pravin B Shelar <pshelar@nicira.com> wrote:

>  
> +#define sk_user_data_rcu(sk) ((*((void __rcu **)&(sk)->sk_user_data)))

I don't like the macro, is it just to keep sparse happy?
The name also implies that rcu_dereference() is built in.

^ permalink raw reply

* [PATCH net-next v2] dev: always advertise rx_flags changes
From: Nicolas Dichtel @ 2013-09-24 16:21 UTC (permalink / raw)
  To: stephen; +Cc: netdev, davem, Nicolas Dichtel
In-Reply-To: <20130923122715.1d1b2005@nehalam.linuxnetplumber.net>

Depending on the code path, there is/isn't netlink message/call to notifier
chains when rx_flags are updated, let's send advertisement for all cases.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---

v2: rework the patch to avoid double notification

 net/core/dev.c | 64 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 5c713f2239cc..067bfbe70c4c 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4822,7 +4822,7 @@ static void dev_change_rx_flags(struct net_device *dev, int flags)
 		ops->ndo_change_rx_flags(dev, flags);
 }
 
-static int __dev_set_promiscuity(struct net_device *dev, int inc)
+static int __dev_set_promiscuity(struct net_device *dev, int inc, bool notif)
 {
 	unsigned int old_flags = dev->flags;
 	kuid_t uid;
@@ -4865,6 +4865,10 @@ static int __dev_set_promiscuity(struct net_device *dev, int inc)
 
 		dev_change_rx_flags(dev, IFF_PROMISC);
 	}
+	if (notif) {
+		call_netdevice_notifiers(NETDEV_CHANGE, dev);
+		rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_PROMISC);
+	}
 	return 0;
 }
 
@@ -4884,7 +4888,7 @@ int dev_set_promiscuity(struct net_device *dev, int inc)
 	unsigned int old_flags = dev->flags;
 	int err;
 
-	err = __dev_set_promiscuity(dev, inc);
+	err = __dev_set_promiscuity(dev, inc, true);
 	if (err < 0)
 		return err;
 	if (dev->flags != old_flags)
@@ -4893,22 +4897,9 @@ int dev_set_promiscuity(struct net_device *dev, int inc)
 }
 EXPORT_SYMBOL(dev_set_promiscuity);
 
-/**
- *	dev_set_allmulti	- update allmulti count on a device
- *	@dev: device
- *	@inc: modifier
- *
- *	Add or remove reception of all multicast frames to a device. While the
- *	count in the device remains above zero the interface remains listening
- *	to all interfaces. Once it hits zero the device reverts back to normal
- *	filtering operation. A negative @inc value is used to drop the counter
- *	when releasing a resource needing all multicasts.
- *	Return 0 if successful or a negative errno code on error.
- */
-
-int dev_set_allmulti(struct net_device *dev, int inc)
+static int __dev_set_allmulti(struct net_device *dev, int inc, bool notif)
 {
-	unsigned int old_flags = dev->flags;
+	unsigned int old_flags = dev->flags, old_gflags = dev->gflags;
 
 	ASSERT_RTNL();
 
@@ -4931,9 +4922,32 @@ int dev_set_allmulti(struct net_device *dev, int inc)
 	if (dev->flags ^ old_flags) {
 		dev_change_rx_flags(dev, IFF_ALLMULTI);
 		dev_set_rx_mode(dev);
+		if (notif) {
+			call_netdevice_notifiers(NETDEV_CHANGE, dev);
+			if (dev->gflags ^ old_gflags)
+				rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_ALLMULTI);
+		}
 	}
 	return 0;
 }
+
+/**
+ *	dev_set_allmulti	- update allmulti count on a device
+ *	@dev: device
+ *	@inc: modifier
+ *
+ *	Add or remove reception of all multicast frames to a device. While the
+ *	count in the device remains above zero the interface remains listening
+ *	to all interfaces. Once it hits zero the device reverts back to normal
+ *	filtering operation. A negative @inc value is used to drop the counter
+ *	when releasing a resource needing all multicasts.
+ *	Return 0 if successful or a negative errno code on error.
+ */
+
+int dev_set_allmulti(struct net_device *dev, int inc)
+{
+	return __dev_set_allmulti(dev, inc, true);
+}
 EXPORT_SYMBOL(dev_set_allmulti);
 
 /*
@@ -4958,10 +4972,10 @@ void __dev_set_rx_mode(struct net_device *dev)
 		 * therefore calling __dev_set_promiscuity here is safe.
 		 */
 		if (!netdev_uc_empty(dev) && !dev->uc_promisc) {
-			__dev_set_promiscuity(dev, 1);
+			__dev_set_promiscuity(dev, 1, false);
 			dev->uc_promisc = true;
 		} else if (netdev_uc_empty(dev) && dev->uc_promisc) {
-			__dev_set_promiscuity(dev, -1);
+			__dev_set_promiscuity(dev, -1, false);
 			dev->uc_promisc = false;
 		}
 	}
@@ -5050,9 +5064,13 @@ int __dev_change_flags(struct net_device *dev, unsigned int flags)
 
 	if ((flags ^ dev->gflags) & IFF_PROMISC) {
 		int inc = (flags & IFF_PROMISC) ? 1 : -1;
+		unsigned int old_flags = dev->flags;
 
 		dev->gflags ^= IFF_PROMISC;
-		dev_set_promiscuity(dev, inc);
+
+		if (__dev_set_promiscuity(dev, inc, false) >= 0)
+			if (dev->flags != old_flags)
+				dev_set_rx_mode(dev);
 	}
 
 	/* NOTE: order of synchronization of IFF_PROMISC and IFF_ALLMULTI
@@ -5063,7 +5081,7 @@ int __dev_change_flags(struct net_device *dev, unsigned int flags)
 		int inc = (flags & IFF_ALLMULTI) ? 1 : -1;
 
 		dev->gflags ^= IFF_ALLMULTI;
-		dev_set_allmulti(dev, inc);
+		__dev_set_allmulti(dev, inc, false);
 	}
 
 	return ret;
@@ -5101,13 +5119,13 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags)
 int dev_change_flags(struct net_device *dev, unsigned int flags)
 {
 	int ret;
-	unsigned int changes, old_flags = dev->flags;
+	unsigned int changes, old_flags = dev->flags, old_gflags = dev->gflags;
 
 	ret = __dev_change_flags(dev, flags);
 	if (ret < 0)
 		return ret;
 
-	changes = old_flags ^ dev->flags;
+	changes = (old_flags ^ dev->flags) | (old_gflags ^ dev->gflags);
 	if (changes)
 		rtmsg_ifinfo(RTM_NEWLINK, dev, changes);
 
-- 
1.8.2.1

^ permalink raw reply related

* Re: [PATCH net v2] vxlan: Use RCU apis to access sk_user_data.
From: Pravin Shelar @ 2013-09-24 16:25 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Jesse Gross
In-Reply-To: <1380039366.3165.92.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 9:16 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 09:09 -0700, Pravin B Shelar wrote:
>> Use of RCU api makes vxlan code easier to understand.  It also
>> fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
>> In rare case without ACCESS_ONCE() compiler might omit vs on
>> sk_user_data dereference. Compiler can use vs as alias for
>> sk->sk_user_data, resulting in multiple sk_user_data dereference
>> in rcu read context which could change.
>>
>> CC: Jesse Gross <jesse@nicira.com>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>> ---
>> v2:
>> Use RCU_INIT_POINTER
>> ---
>>  drivers/net/vxlan.c |   11 ++++-------
>>  include/net/sock.h  |    2 ++
>>  2 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> index d1292fe..1fae0d6 100644
>> --- a/drivers/net/vxlan.c
>> +++ b/drivers/net/vxlan.c
>> @@ -1,4 +1,4 @@
>> -/*
>> + /*
>>   * VXLAN: Virtual eXtensible Local Area Network
>>   *
>
>
> extra space ?
>
right, I missed that.

^ permalink raw reply

* [RFC PATCH 0/2] l2tp: add vlan pseudowire support
From: James Chapman @ 2013-09-24 16:27 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman

This RFC patch series adds L2TP vlan pseudowire support by creating a
vlan netdev in addition to the l2tpeth netdev when the user creates
vlan pseudowires. To do so, new symbols are exported in 8021q to allow
l2tp to create and destroy vlan netdevs. The existing L2TP netlink API
already contains attributes for vlan pseudowires but they have been
unused to date.

L2TP vlan pseudowires are similar to the existing L2TP ethernet
pseudowires except that vlan pseudowires always carry only one vlan
and never carry untagged frames, while ethernet pseudowires can carry
any ethernet frame (tagged or untagged). It is possible to create vlan
interfaces on top of l2tpeth interfaces (ethernet pseudowires) today
using standard tools. But for VLAN pseudowires, there must be one and
only one vlan.

The changes in this patch series create a pair of netdevs per vlan
pseudowire: a vlan netdev and an l2tpeth netdev; the l2tpeth netdev is
used as the master for the vlan netdev. The master netdev must not be
configured since untagged frames are not allowed in vlan pseudowires.

Some L2TP setups have thousands of L2TP pseudowires so creating two
netdevs per vlan pseudowire isn't ideal when one might suffice. Should I
investigate changing 8021q to allow for the case where there is no
master netdev?


James Chapman (2):
  vlan: export functions to register/unregister vlan devices
  l2tp: add vlan pseudowire support

 include/linux/if_vlan.h |   11 ++++++
 net/8021q/vlan.c        |   82 ++++++++++++++++++++++++++++------------------
 net/l2tp/l2tp_eth.c     |   73 ++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 132 insertions(+), 34 deletions(-)

^ permalink raw reply

* [RFC PATCH 2/2] l2tp: add vlan pseudowire support
From: James Chapman @ 2013-09-24 16:27 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman
In-Reply-To: <1380040030-6648-1-git-send-email-jchapman@katalix.com>

Register the l2tp_eth driver for netlink ops using the vlan pseudowire
type. Add code to create/destroy a VLAN netdevice when the pseudowire
type is VLAN. This requires new exports in the vlan code.

This results in two netdevices per vlan pseudowire:
1. a master, which should never be used
2. a vlan device, which is enslaved to the master device

The session's ifname value is set to the VLAN netdevice name such
that it is the name returned in session_get netlink requests. For vlan
pseudowires, this should always be the interface that userspace
configures.
---
 net/l2tp/l2tp_eth.c |   73 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/net/l2tp/l2tp_eth.c b/net/l2tp/l2tp_eth.c
index 76125c5..aae38d9 100644
--- a/net/l2tp/l2tp_eth.c
+++ b/net/l2tp/l2tp_eth.c
@@ -17,6 +17,7 @@
 #include <linux/hash.h>
 #include <linux/l2tp.h>
 #include <linux/in.h>
+#include <linux/if_vlan.h>
 #include <linux/etherdevice.h>
 #include <linux/spinlock.h>
 #include <net/sock.h>
@@ -53,6 +54,7 @@ struct l2tp_eth {
 /* via l2tp_session_priv() */
 struct l2tp_eth_sess {
 	struct net_device	*dev;
+	struct net_device	*vlan_dev;
 };
 
 /* per-net private data for this module */
@@ -178,6 +180,50 @@ error:
 	kfree_skb(skb);
 }
 
+static int l2tp_vlan_create(struct l2tp_session *session, u16 vlan_id)
+{
+	struct l2tp_eth_sess *spriv;
+	struct net_device *dev;
+	struct net_device *vlan_dev;
+	char name[IFNAMSIZ + 1];
+	int rc;
+
+	spriv = l2tp_session_priv(session);
+	dev = spriv->dev;
+
+	if (!vlan_id)
+		return -EINVAL;
+
+	snprintf(name, IFNAMSIZ, "%s.%i", dev->name, vlan_id);
+	rtnl_lock();
+	rc = vlan_register_device(dev, vlan_id, &vlan_dev, name);
+	rtnl_unlock();
+	if (rc < 0)
+		return rc;
+
+	spriv->vlan_dev = vlan_dev;
+	strlcpy(session->ifname, vlan_dev->name, IFNAMSIZ);
+
+	return 0;
+}
+
+static int l2tp_vlan_delete(struct l2tp_session *session)
+{
+	struct l2tp_eth_sess *spriv;
+	struct net_device *dev;
+
+	spriv = l2tp_session_priv(session);
+	dev = spriv->dev;
+
+	rtnl_lock();
+	unregister_vlan_dev(spriv->vlan_dev, NULL);
+	rtnl_unlock();
+	strlcpy(session->ifname, dev->name, IFNAMSIZ);
+	spriv->vlan_dev = NULL;
+
+	return 0;
+}
+
 static void l2tp_eth_delete(struct l2tp_session *session)
 {
 	struct l2tp_eth_sess *spriv;
@@ -186,6 +232,8 @@ static void l2tp_eth_delete(struct l2tp_session *session)
 	if (session) {
 		spriv = l2tp_session_priv(session);
 		dev = spriv->dev;
+		if (spriv->vlan_dev)
+			l2tp_vlan_delete(session);
 		if (dev) {
 			unregister_netdev(dev);
 			spriv->dev = NULL;
@@ -277,10 +325,18 @@ static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 p
 	if (rc < 0)
 		goto out_del_dev;
 
-	__module_get(THIS_MODULE);
 	/* Must be done after register_netdev() */
 	strlcpy(session->ifname, dev->name, IFNAMSIZ);
 
+	if (cfg->pw_type == L2TP_PWTYPE_ETH_VLAN) {
+		/* Create VLAN interface. Replaces session->ifname */
+		rc = l2tp_vlan_create(session, cfg->vlan_id);
+		if (rc < 0)
+			goto out_unreg;
+	}
+
+	__module_get(THIS_MODULE);
+
 	dev_hold(dev);
 	pn = l2tp_eth_pernet(dev_net(dev));
 	spin_lock(&pn->l2tp_eth_lock);
@@ -289,6 +345,8 @@ static int l2tp_eth_create(struct net *net, u32 tunnel_id, u32 session_id, u32 p
 
 	return 0;
 
+out_unreg:
+	unregister_netdev(dev);
 out_del_dev:
 	free_netdev(dev);
 	spriv->dev = NULL;
@@ -328,6 +386,11 @@ static int __init l2tp_eth_init(void)
 	err = l2tp_nl_register_ops(L2TP_PWTYPE_ETH, &l2tp_eth_nl_cmd_ops);
 	if (err)
 		goto out;
+#if IS_ENABLED(CONFIG_VLAN_8021Q)
+	err = l2tp_nl_register_ops(L2TP_PWTYPE_ETH_VLAN, &l2tp_eth_nl_cmd_ops);
+	if (err)
+		goto out;
+#endif
 
 	err = register_pernet_device(&l2tp_eth_net_ops);
 	if (err)
@@ -338,6 +401,9 @@ static int __init l2tp_eth_init(void)
 	return 0;
 
 out_unreg:
+#if IS_ENABLED(CONFIG_VLAN_8021Q)
+	l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH_VLAN);
+#endif
 	l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH);
 out:
 	return err;
@@ -346,6 +412,9 @@ out:
 static void __exit l2tp_eth_exit(void)
 {
 	unregister_pernet_device(&l2tp_eth_net_ops);
+#if IS_ENABLED(CONFIG_VLAN_8021Q)
+	l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH_VLAN);
+#endif
 	l2tp_nl_unregister_ops(L2TP_PWTYPE_ETH);
 }
 
@@ -355,4 +424,4 @@ module_exit(l2tp_eth_exit);
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("James Chapman <jchapman@katalix.com>");
 MODULE_DESCRIPTION("L2TP ethernet pseudowire driver");
-MODULE_VERSION("1.0");
+MODULE_VERSION("1.1");
-- 
1.7.0.4

^ permalink raw reply related

* [RFC PATCH 1/2] vlan: export functions to register/unregister vlan devices
From: James Chapman @ 2013-09-24 16:27 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman
In-Reply-To: <1380040030-6648-1-git-send-email-jchapman@katalix.com>

Export two new symbols to let external code register and unregister
vlan devices. Allow the caller to specify the name template for the
new netdev instance.

The new symbols are:
vlan_register_device
unregister_vlan_dev
---
 include/linux/if_vlan.h |   11 ++++++
 net/8021q/vlan.c        |   82 ++++++++++++++++++++++++++++------------------
 2 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 715c343..c6431d5 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -101,6 +101,8 @@ extern void vlan_vids_del_by_dev(struct net_device *dev,
 				 const struct net_device *by_dev);
 
 extern bool vlan_uses_dev(const struct net_device *dev);
+extern int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name);
+extern void unregister_vlan_dev(struct net_device *dev, struct list_head *head);
 #else
 static inline struct net_device *
 __vlan_find_dev_deep(struct net_device *real_dev,
@@ -155,6 +157,15 @@ static inline bool vlan_uses_dev(const struct net_device *dev)
 {
 	return false;
 }
+
+static inline int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name)
+{
+	return -EPROTONOSUPPORT;
+}
+
+static inline void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
+{
+}
 #endif
 
 static inline bool vlan_hw_offload_capable(netdev_features_t features,
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 61fc573..272157c 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -121,6 +121,7 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 	/* Get rid of the vlan's reference to real_dev */
 	dev_put(real_dev);
 }
+EXPORT_SYMBOL(unregister_vlan_dev);
 
 int vlan_check_real_dev(struct net_device *real_dev,
 			__be16 protocol, u16 vlan_id)
@@ -204,18 +205,16 @@ out_vid_del:
 	return err;
 }
 
-/*  Attach a VLAN device to a mac address (ie Ethernet Card).
- *  Returns 0 if the device was created or a negative error code otherwise.
- */
-static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
+int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name)
 {
 	struct net_device *new_dev;
 	struct vlan_dev_priv *vlan;
 	struct net *net = dev_net(real_dev);
-	struct vlan_net *vn = net_generic(net, vlan_net_id);
-	char name[IFNAMSIZ];
 	int err;
 
+	if (!dev || !name)
+		return -EINVAL;
+
 	if (vlan_id >= VLAN_VID_MASK)
 		return -ERANGE;
 
@@ -223,32 +222,6 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	if (err < 0)
 		return err;
 
-	/* Gotta set up the fields for the device. */
-	switch (vn->name_type) {
-	case VLAN_NAME_TYPE_RAW_PLUS_VID:
-		/* name will look like:	 eth1.0005 */
-		snprintf(name, IFNAMSIZ, "%s.%.4i", real_dev->name, vlan_id);
-		break;
-	case VLAN_NAME_TYPE_PLUS_VID_NO_PAD:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 vlan5
-		 */
-		snprintf(name, IFNAMSIZ, "vlan%i", vlan_id);
-		break;
-	case VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 eth0.5
-		 */
-		snprintf(name, IFNAMSIZ, "%s.%i", real_dev->name, vlan_id);
-		break;
-	case VLAN_NAME_TYPE_PLUS_VID:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 vlan0005
-		 */
-	default:
-		snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
-	}
-
 	new_dev = alloc_netdev(sizeof(struct vlan_dev_priv), name, vlan_setup);
 
 	if (new_dev == NULL)
@@ -273,12 +246,57 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	if (err < 0)
 		goto out_free_newdev;
 
+	*dev = new_dev;
+
 	return 0;
 
 out_free_newdev:
 	free_netdev(new_dev);
 	return err;
 }
+EXPORT_SYMBOL(vlan_register_device);
+
+/*  Attach a VLAN device to a mac address (ie Ethernet Card).
+ *  Returns 0 if the device was created or a negative error code otherwise.
+ */
+static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
+{
+	struct net_device *new_dev;
+	struct net *net = dev_net(real_dev);
+	struct vlan_net *vn = net_generic(net, vlan_net_id);
+	char name[IFNAMSIZ];
+	int err;
+
+	/* Gotta set up the fields for the device. */
+	switch (vn->name_type) {
+	case VLAN_NAME_TYPE_RAW_PLUS_VID:
+		/* name will look like:	 eth1.0005 */
+		snprintf(name, IFNAMSIZ, "%s.%.4i", real_dev->name, vlan_id);
+		break;
+	case VLAN_NAME_TYPE_PLUS_VID_NO_PAD:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 vlan5
+		 */
+		snprintf(name, IFNAMSIZ, "vlan%i", vlan_id);
+		break;
+	case VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 eth0.5
+		 */
+		snprintf(name, IFNAMSIZ, "%s.%i", real_dev->name, vlan_id);
+		break;
+	case VLAN_NAME_TYPE_PLUS_VID:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 vlan0005
+		 */
+	default:
+		snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
+	}
+
+	err = vlan_register_device(real_dev, vlan_id, &new_dev, name);
+
+	return err;
+}
 
 static void vlan_sync_address(struct net_device *dev,
 			      struct net_device *vlandev)
-- 
1.7.0.4

^ permalink raw reply related

* Re: [PATCH net] vxlan: Use RCU apis to access sk_user_data.
From: Pravin Shelar @ 2013-09-24 16:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Jesse Gross
In-Reply-To: <20130924092050.2c80bc74@nehalam.linuxnetplumber.net>

On Tue, Sep 24, 2013 at 9:20 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 23 Sep 2013 22:28:21 -0700
> Pravin B Shelar <pshelar@nicira.com> wrote:
>
>>
>> +#define sk_user_data_rcu(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
>
> I don't like the macro, is it just to keep sparse happy?
yes, it is for sparse.
> The name also implies that rcu_dereference() is built in.
ok, I can right separate API for rcu-set and rcu-deref.

^ permalink raw reply

* Re: [Xen-devel] [PATCH net-next] xen-netfront: convert to GRO API and advertise this feature
From: Konrad Rzeszutek Wilk @ 2013-09-24 16:30 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, Anirban Chakraborty, Ian Campbell, xen-devel
In-Reply-To: <1379779543-27122-1-git-send-email-wei.liu2@citrix.com>

On Sat, Sep 21, 2013 at 05:05:43PM +0100, Wei Liu wrote:
> Anirban was seeing netfront received MTU size packets, which downgraded
> throughput. The following patch makes netfront use GRO API which
> improves throughput for that case.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
> Cc: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> ---
>  drivers/net/xen-netfront.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 36808bf..5664165 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -952,7 +952,7 @@ static int handle_incoming_queue(struct net_device *dev,
>  		u64_stats_update_end(&stats->syncp);
>  
>  		/* Pass it up. */
> -		netif_receive_skb(skb);
> +		napi_gro_receive(&np->napi, skb);
>  	}
>  
>  	return packets_dropped;
> @@ -1051,6 +1051,8 @@ err:
>  	if (work_done < budget) {
>  		int more_to_do = 0;
>  
> +		napi_gro_flush(napi, false);
> +
>  		local_irq_save(flags);
>  
>  		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
> @@ -1371,7 +1373,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
>  	netif_napi_add(netdev, &np->napi, xennet_poll, 64);
>  	netdev->features        = NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>  				  NETIF_F_GSO_ROBUST;
> -	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO;
> +	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO |
> +				  NETIF_F_GRO;
>  
>  	/*
>           * Assume that all hw features are available for now. This set
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 16:32 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, jesse.brandeburg
In-Reply-To: <alpine.DEB.2.02.1309231535030.23896@tomh.mtv.corp.google.com>

On Mon, 2013-09-23 at 15:41 -0700, Tom Herbert wrote:
> Introduce Toeplitz hash functions. Toeplitz is a hash used primarily in
> NICs to performan RSS flow steering.  This is a software implemenation
> of that. In order to make the hash calculation efficient, we precompute
> the possible hash values for each inidividual byte of input. The input
> length is up to 40 bytes, so we make an array of cache[40][256].

The input length is up to 36 bytes, but we need an extra 31 bits in the
key so that each bit of input is multiplied by 32 bits of the key.  The
first dimension should therefore be 36.

But this is still pretty big. :-/

> The implemenation was verified against MSDN "Verify RSS hash" sample
> values.
[...]
> --- /dev/null
> +++ b/lib/toeplitz.c

I'm not convinced that this is going to be useful outside of net.

> @@ -0,0 +1,66 @@
> +/*
> + * Toeplitz hash implemenation. See include/linux/toeplitz.h
> + *
> + * Copyright (c) 2011, Tom Herbert <therbert@google.com>
> + */
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/slab.h>
> +#include <linux/random.h>
> +#include <linux/toeplitz.h>
> +
> +struct toeplitz *toeplitz_alloc(void)
> +{
> +	return kmalloc(sizeof(struct toeplitz), GFP_KERNEL);
> +}
> +
> +static u32 toeplitz_get_kval(unsigned char *key, int idx)

unsigned int idx

> +{
> +	u32 v, r;
> +	int off, rem;
> +
> +	off = idx / 8;
> +	rem = idx % 8;

Otherwise this requires a 'real' division if the compiler doesn't work
out that idx >= 0.

> +	v = (((unsigned int)key[off]) << 24) +
> +	    (((unsigned int)key[off + 1]) << 16) +
> +	    (((unsigned int)key[off + 2]) << 8) +
> +	    (((unsigned int)key[off + 3]));
> +
> +	r = v << rem | (unsigned int)key[off + 4] >> (8 - rem);
> +	return r;
> +}
> +
> +static inline int idx8(int idx)
> +{
> +#ifdef __LITTLE_ENDIAN
> +        idx = (idx / 8) * 8 + (8 - (idx % 8 + 1));

Or in short: idx ^= 7

> +#endif
> +        return idx;
> +}
> +
> +void toeplitz_init(struct toeplitz *toeplitz, u8 *key_vals)
> +{
> +	int i;
> +	unsigned long a, j;
> +	unsigned int result = 0;
> +
> +	/* Set up key val table */
> +	if (key_vals)
> +		for (i = 0; i < TOEPLITZ_KEY_LEN; i++)
> +			toeplitz->key_vals[i] = key_vals[i];
> +	else
> +		prandom_bytes(toeplitz->key_vals, TOEPLITZ_KEY_LEN);

Should have braces around the if and else bodies.

Is it worth sacrificing a little randomness here by avoiding long runs
of zeroes in the key?

> +       /* Set up key cache table */
> +       for (i = 0; i < TOEPLITZ_KEY_LEN; i++) {
[...]

As I said above, the outer dimension of key_cache should be 36.  The
current loop causes toeplitz_get_kval() to over-run the bounds of the
key_vals array.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 16:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <1380039299.3165.91.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 9:14 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 18:01 +0200, Hannes Frederic Sowa wrote:
>
>> inet(6)_ehashfn does get used by udp btw. But I do not understand the code,
>> yet. ;)
>
> Oh well, thats the SO_REUSEPORT thing. We do not really care, thats only
> used to try to steer udp flows on a given socket, but its a hint.
>
> Presumably it should use a separate secret key.
>
>
We should really be using rxhash for that anyway, eliminate this
ehashfn.  This would entail adding rxhash argument in the various
udp_lookup functions.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 16:38 UTC (permalink / raw)
  To: David Laight; +Cc: Tom Herbert, davem, netdev, jesse.brandeburg
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7355@saturn3.aculab.com>

On Tue, 2013-09-24 at 09:32 +0100, David Laight wrote:
> > +static inline unsigned int
> > +toeplitz_hash(const unsigned char *bytes,
> > +	      struct toeplitz *toeplitz, int n)
> > +{
> > +	int i;
> > +	unsigned int result = 0;
> > +
> > +	for (i = 0; i < n; i++)
> > +		result ^= toeplitz->key_cache[i][bytes[i]];
> > +
> > +        return result;
> > +};
> 
> That is a horrid hash function to be calculating in software.
> 
> The code looks very much like a simple 32bit CRC.
> It isn't entirely clears exactly where the 'key' gets included,
> but I suspect it is just xored with the data bytes.

Each input bit is multiplied by a 32-bit slice of the key and the
products are then bitwise-summed (i.e. xored).

[...]
> I also thought the hash was arranged so that tx and rx packets
> for a single connection hash to the same value?

Not in general.  I think I saw that one of the BSDs (DragonflyBSD?)
generates Toeplitz keys with this property and has an efficient software
implementation that works for those keys.  But that significantly
weakens the hash.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* [PATCH 2/2] tun: call skb_get_sw_rxhash
From: Tom Herbert @ 2013-09-24 16:21 UTC (permalink / raw)
  To: davem; +Cc: netdev

tun calls skb_get_rxhash on both transmit and receive side with the
expectation that the calculation in symmetric. To ensure this
property call skb_get_sw_rxhash instead (ignores hash provided by NIC
on receive).

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/tun.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 807815f..0619fdf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -358,7 +358,7 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb)
 	rcu_read_lock();
 	numqueues = ACCESS_ONCE(tun->numqueues);
 
-	txq = skb_get_rxhash(skb);
+	txq = skb_get_sw_rxhash(skb);
 	if (txq) {
 		e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq);
 		if (e)
@@ -1138,7 +1138,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	skb_reset_network_header(skb);
 	skb_probe_transport_header(skb, 0);
 
-	rxhash = skb_get_rxhash(skb);
+	rxhash = skb_get_sw_rxhash(skb);
 	netif_rx_ni(skb);
 
 	tun->dev->stats.rx_packets++;
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 16:46 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx9u3-mnULxTwyWAZA4gZ6Dg-PeYYVGZibwQi7toinHJ-A@mail.gmail.com>

On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:

> We should really be using rxhash for that anyway, eliminate this
> ehashfn.  This would entail adding rxhash argument in the various
> udp_lookup functions.

Nope : Some NICs provide UDP rxhash only using L3  (source IP,
destination IP), not L4 (adding source & destination ports)

This is again a case where we want our own hashing.

^ permalink raw reply

* Re: [PATCH net-next v2] dev: always advertise rx_flags changes
From: Stephen Hemminger @ 2013-09-24 16:47 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: netdev, davem
In-Reply-To: <1380039692-8157-1-git-send-email-nicolas.dichtel@6wind.com>

On Tue, 24 Sep 2013 18:21:32 +0200
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5c713f2239cc..067bfbe70c4c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4822,7 +4822,7 @@ static void dev_change_rx_flags(struct net_device *dev, int flags)
>  		ops->ndo_change_rx_flags(dev, flags);
>  }
>  
> -static int __dev_set_promiscuity(struct net_device *dev, int inc)
> +static int __dev_set_promiscuity(struct net_device *dev, int inc, bool notif)
>  {
>  	unsigned int old_flags = dev->flags;
>  	kuid_t uid;
> @@ -4865,6 +4865,10 @@ static int __dev_set_promiscuity(struct net_device *dev, int inc)
>  
>  		dev_change_rx_flags(dev, IFF_PROMISC);
>  	}
> +	if (notif) {
> +		call_netdevice_notifiers(NETDEV_CHANGE, dev);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_PROMISC);
> +	}
>  	return 0;
>  }

This is getting uglier as the conditional is added to each
function. Why not create a helper function and change callers.

static void __dev_notify_flag(dev, int flags)
{
        call_netdevice_notifiers(NETDEV_CHANGE, dev);
	rtmsg_ifinfo(RTM_NEWLINK, dev, flags);
}

^ permalink raw reply

* Re: SMSC 9303 support
From: Ben Hutchings @ 2013-09-24 16:51 UTC (permalink / raw)
  To: Gary Thomas; +Cc: netdev
In-Reply-To: <524183D6.6040801@mlbassoc.com>

On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
> I need to support the SMSC9303 in an embedded system.  I'm not
> finding any [explicit] support for this device in the latest
> mainline kernel.  Did I miss something?
> 
> To be clear, the SMSC9303 is a 3-port managed ethernet switch
> capable of supporting 802.1D/802.1Q directly. This switch is
> driven by a single MAC via MII/RMII and exposes the other two
> ports via physical PHYs.  What I need it to do is behave like
> two external, separate devices.  I was thinking that what I need
> to do is treat these as VLAN devices since the switch can manage
> the routing.
> 
> Does this seem like a reasonable approach?

Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
of packets to indicate which switch port they are sent or received
through.  This was originally added to support some Marvell switch chips
and I don't know whether it would be suitable or extensible for this
one.

Ben.

> How do I "hook up" my normal ethernet driver to it?  To the hardware
> it just looks like any other MII/RMII PHY.  The device is managed
> separately via I2C.  I can have that set up separately if necessary.
> 
> Thanks for any pointers/ideas
> 

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 04/10] amd/7990: Remove extern from function prototypes
From: Joe Perches @ 2013-09-24 16:55 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-kernel
In-Reply-To: <65ef2eba58fddc268ba0afb4de7757f073d3bb49.1379974101.git.joe@perches.com>

Hey David.

This ended up in your tree as
commit 44da5c2f523876a02336c18dbcce93449fdbf0ca
without your signoff.

Is there something I need to do differently so
your scripts add your sign-off properly or did
this manage to just slip through via some manual
git am without the -s?

^ permalink raw reply

* Re: SMSC 9303 support
From: Gary Thomas @ 2013-09-24 16:59 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev
In-Reply-To: <1380041488.2736.41.camel@bwh-desktop.uk.level5networks.com>

On 2013-09-24 10:51, Ben Hutchings wrote:
> On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
>> I need to support the SMSC9303 in an embedded system.  I'm not
>> finding any [explicit] support for this device in the latest
>> mainline kernel.  Did I miss something?
>>
>> To be clear, the SMSC9303 is a 3-port managed ethernet switch
>> capable of supporting 802.1D/802.1Q directly. This switch is
>> driven by a single MAC via MII/RMII and exposes the other two
>> ports via physical PHYs.  What I need it to do is behave like
>> two external, separate devices.  I was thinking that what I need
>> to do is treat these as VLAN devices since the switch can manage
>> the routing.
>>
>> Does this seem like a reasonable approach?
>
> Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
> of packets to indicate which switch port they are sent or received
> through.  This was originally added to support some Marvell switch chips
> and I don't know whether it would be suitable or extensible for this
> one.

I've used the DSA stuff for years (worked directly with the Marvell folks
when it was being developed).  It might work for this device, I'll think
some more about using it although I was hoping for a lighter weight solution.

>> How do I "hook up" my normal ethernet driver to it?  To the hardware
>> it just looks like any other MII/RMII PHY.  The device is managed
>> separately via I2C.  I can have that set up separately if necessary.
>>
>> Thanks for any pointers/ideas
>>
>

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 17:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <1380041185.3165.97.camel@edumazet-glaptop>

On Tue, 2013-09-24 at 09:46 -0700, Eric Dumazet wrote:
> On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
> 
> > We should really be using rxhash for that anyway, eliminate this
> > ehashfn.  This would entail adding rxhash argument in the various
> > udp_lookup functions.
> 
> Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> destination IP), not L4 (adding source & destination ports)

Indeed the Microsoft RSS spec doesn't define an L4 hash for UDP/IP,
since it can't work with fragmented datagrams (which TCP avoids).  Using
the TCP/IP hash function for UDP is a non-standard option - presumably
useful where no fragmentation is expected.

Ben.

> This is again a case where we want our own hashing.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 1/2] net: Add function to get SW rxhash
From: Eric Dumazet @ 2013-09-24 17:03 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.2.02.1309240915001.24581@tomh.mtv.corp.google.com>

On Tue, 2013-09-24 at 09:20 -0700, Tom Herbert wrote:
> Some uses of skb_get_rxhash expect that the function will return
> a consistent value whether it is called on RX or TX paths. On RX
> path, we will use the rxhash if provided by the NIC, so this
> would not normally be the same result computed in TX path which is
> a software calculation.
> 
> This patch adds skb_get_sw_rxhash to explicitly request a hash
> calculated by the stack, disregarding the hash provided by NIC.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---

1) Adding another skb field for this corner case, is it worth it ?
   Computing rxhash in software should be real fast for the
   cases where we really care of symmetry.

  ie ignore skb->rxhash and use __skb_get_rxhash()

2) Not copying sw_rxhash in __copy_skb_header()

   -> value will be undefined after a clone()

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 17:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <1380041185.3165.97.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
>
>> We should really be using rxhash for that anyway, eliminate this
>> ehashfn.  This would entail adding rxhash argument in the various
>> udp_lookup functions.
>
> Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> destination IP), not L4 (adding source & destination ports)
>
Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
skb_get_rxhash is called.

> This is again a case where we want our own hashing.
>
>
>

^ permalink raw reply

* Re: [PATCH] powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
From: Richard Cochran @ 2013-09-24 17:14 UTC (permalink / raw)
  To: Aida Mynzhasova; +Cc: linuxppc-dev, devicetree, netdev
In-Reply-To: <1380008397-22980-1-git-send-email-aida.mynzhasova@skitlab.ru>

On Tue, Sep 24, 2013 at 11:39:57AM +0400, Aida Mynzhasova wrote:
> Currently IEEE 1588 timer reference clock source is determined through
> hard-coded value in gianfar_ptp driver. This patch allows to select ptp
> clock source by means of device tree file node.
> 
> For instance:
> 
> 	fsl,cksel = <0>;
> 
> for using external (TSEC_TMR_CLK input) high precision timer
> reference clock.
> 
> Other acceptable values:
> 
> 	<1> : eTSEC system clock
> 	<2> : eTSEC1 transmit clock
> 	<3> : RTC clock input

I think it would be useful to have this table in the binding document
as well.

Thanks,
Richard

^ permalink raw reply

* [PATCH net v3] vxlan: Use RCU apis to access sk_user_data.
From: Pravin B Shelar @ 2013-09-24 17:25 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, stephen, Pravin B Shelar, Jesse Gross

Use of RCU api makes vxlan code easier to understand.  It also
fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
In rare case without ACCESS_ONCE() compiler might omit vs on
sk_user_data dereference.
Compiler can use vs as alias for sk->sk_user_data, resulting in
multiple sk_user_data dereference in rcu read context which
could change.

CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
v3:
remove extra space
User rcu_deref and rcu_assign api for accessing sk_user_data
v2:
Use RCU_INIT_POINTER
---
 drivers/net/vxlan.c |    9 +++------
 include/net/sock.h  |    5 +++++
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index d1292fe..2ef5b62 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -952,8 +952,7 @@ void vxlan_sock_release(struct vxlan_sock *vs)
 
 	spin_lock(&vn->sock_lock);
 	hlist_del_rcu(&vs->hlist);
-	smp_wmb();
-	vs->sock->sk->sk_user_data = NULL;
+	rcu_assign_sk_user_data(vs->sock->sk, NULL);
 	vxlan_notify_del_rx_port(sk);
 	spin_unlock(&vn->sock_lock);
 
@@ -1048,8 +1047,7 @@ static int vxlan_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 
 	port = inet_sk(sk)->inet_sport;
 
-	smp_read_barrier_depends();
-	vs = (struct vxlan_sock *)sk->sk_user_data;
+	vs = rcu_dereference_sk_user_data(sk);
 	if (!vs)
 		goto drop;
 
@@ -2302,8 +2300,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
 	atomic_set(&vs->refcnt, 1);
 	vs->rcv = rcv;
 	vs->data = data;
-	smp_wmb();
-	vs->sock->sk->sk_user_data = vs;
+	rcu_assign_sk_user_data(vs->sock->sk, vs);
 
 	spin_lock(&vn->sock_lock);
 	hlist_add_head_rcu(&vs->hlist, vs_head(net, port));
diff --git a/include/net/sock.h b/include/net/sock.h
index 6ba2e7b..1d37a80 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -409,6 +409,11 @@ struct sock {
 	void                    (*sk_destruct)(struct sock *sk);
 };
 
+#define __sk_user_data(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
+
+#define rcu_dereference_sk_user_data(sk)	rcu_dereference(__sk_user_data((sk)))
+#define rcu_assign_sk_user_data(sk, ptr)	rcu_assign_pointer(__sk_user_data((sk)), ptr)
+
 /*
  * SK_CAN_REUSE and SK_NO_REUSE on a socket mean that the socket is OK
  * or not whether his port will be reused by someone else. SK_FORCE_REUSE
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH net 0/4] bridge: Fix problems around the PVID
From: Toshiaki Makita @ 2013-09-24 17:30 UTC (permalink / raw)
  To: vyasevic
  Cc: Toshiaki Makita, David Miller, netdev, Fernando Luis Vazquez Cao,
	Patrick McHardy
In-Reply-To: <52419509.1020103@redhat.com>

On Tue, 2013-09-24 at 09:35 -0400, Vlad Yasevich wrote:
> On 09/24/2013 07:45 AM, Toshiaki Makita wrote:
> > On Mon, 2013-09-23 at 10:41 -0400, Vlad Yasevich wrote:
> >> On 09/17/2013 04:12 AM, Toshiaki Makita wrote:
> >>> On Mon, 2013-09-16 at 13:49 -0400, Vlad Yasevich wrote:
> >>>> On 09/13/2013 08:06 AM, Toshiaki Makita wrote:
> >>>>> On Thu, 2013-09-12 at 16:00 -0400, David Miller wrote:
> >>>>>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> >>>>>> Date: Tue, 10 Sep 2013 19:27:54 +0900
> >>>>>>
> >>>>>>> There seem to be some undesirable behaviors related with PVID.
> >>>>>>> 1. It has no effect assigning PVID to a port. PVID cannot be applied
> >>>>>>> to any frame regardless of whether we set it or not.
> >>>>>>> 2. FDB entries learned via frames applied PVID are registered with
> >>>>>>> VID 0 rather than VID value of PVID.
> >>>>>>> 3. We can set 0 or 4095 as a PVID that are not allowed in IEEE 802.1Q.
> >>>>>>> This leads interoperational problems such as sending frames with VID
> >>>>>>> 4095, which is not allowed in IEEE 802.1Q, and treating frames with VID
> >>>>>>> 0 as they belong to VLAN 0, which is expected to be handled as they have
> >>>>>>> no VID according to IEEE 802.1Q.
> >>>>>>>
> >>>>>>> Note: 2nd and 3rd problems are potential and not exposed unless 1st problem
> >>>>>>> is fixed, because we cannot activate PVID due to it.
> >>>>>>
> >>>>>> Please work out the issues in patch #2 with Vlad and resubmit this
> >>>>>> series.
> >>>>>>
> >>>>>> Thank you.
> >>>>>
> >>>>> I'm hovering between whether we should fix the issue by changing vlan 0
> >>>>> interface behavior in 8021q module or enabling a bridge port to sending
> >>>>> priority-tagged frames, or another better way.
> >>>>>
> >>>>> If you could comment it, I'd appreciate it :)
> >>>>>
> >>>>>
> >>>>> BTW, I think what is discussed in patch #2 is another problem about
> >>>>> handling priority-tags, and it exists without this patch set applied.
> >>>>> It looks like that we should prepare another patch set than this to fix
> >>>>> that problem.
> >>>>>
> >>>>> Should I include patches that fix the priority-tags problem in this
> >>>>> patch set and resubmit them all together?
> >>>>>
> >>>>
> >>>> I am thinking that we might need to do it in bridge and it looks like
> >>>> the simplest way to do it is to have default priority regeneration table
> >>>> (table 6-5 from 802.1Q doc).
> >>>>
> >>>> That way I think we would conform to the spec.
> >>>>
> >>>> -vlad
> >>>
> >>> Unfortunately I don't think the default priority regeneration table
> >>> resolves the problem because IEEE 802.1Q says that a VLAN-aware bridge
> >>> can transmit untagged or VLAN-tagged frames only (the end of section 7.5
> >>> and 8.1.7).
> >>>
> >>> No mechanism to send priority-tagged frames is found as far as I can see
> >>> the standard. I think the regenerated priority is used for outgoing PCP
> >>> field only if egress policy is not untagged (i.e. transmitting as
> >>> VLAN-tagged), and unused if untagged (Section 6.9.2 3rd/4th Paragraph).
> >>>
> >>> If we want to transmit priority-tagged frames from a bridge port, I
> >>> think we need to implement a new (optional) feature that is above the
> >>> standard, as I stated previously.
> >>>
> >>> How do you feel about adding a per-port policy that enables a bridge to
> >>> send priority-tagged frames instead of untagged frames when egress
> >>> policy for the port is untagged?
> >>> With this change, we can transmit frames for a given vlan as either all
> >>> untagged, all priority-tagged or all VLAN-tagged.
> >>
> >> That would work.  What I am thinking is that we do it by special casing
> >> the vid 0 egress policy specification.  Let it be untagged by default
> >> and if it is tagged, then we preserve the priority field and forward
> >> it on.
> >>
> >> This keeps the API stable and doesn't require user/admin from knowing
> >> exactly what happens.  Default operation conforms to the spec and allows
> >> simple change to make it backward-compatible.
> >>
> >> What do you think.  I've done a simple prototype of this an it seems to
> >> work with the VMs I am testing with.
> >
> > Are you saying that
> > - by default, set the 0th bit of untagged_bitmap; and
> > - if we unset the 0th bit and set the "vid"th bit, we transmit frames
> > classified as belonging to VLAN "vid" as priority-tagged?
> >
> > If so, though it's attractive to keep current API, I'm worried about if
> > it could be a bit confusing and not intuitive for kernel/iproute2
> > developers that VID 0 has a special meaning only in the egress policy.
> > Wouldn't it be better to adding a new member to struct net_port_vlans
> > instead of using VID 0 of untagged_bitmap?
> >
> > Or are you saying that we use a new flag in struct net_port_vlans but
> > use the BRIDGE_VLAN_INFO_UNTAGGED bit with VID 0 in netlink to set the
> > flag?
> >
> > Even in that case, I'm afraid that it might be confusing for developers
> > for the same reason. We are going to prohibit to specify VID with 0 (and
> > 4095) in adding/deleting a FDB entry or a vlan filtering entry, but it
> > would allow us to use VID 0 only when a vlan filtering entry is
> > configured.
> > I am thinking a new nlattr is a straightforward approach to configure
> > it.
> 
> By making this an explicit attribute it makes vid 0 a special case for
> any automatic tool that would provision such filtering.  Seeing vid 0
> would mean that these tools would have to know that this would have to
> be translated to a different attribute instead of setting the policy
> values.

Yes, I agree with you that we can do it by the way you explained.
What I don't understand is the advantage of using vid 0 over another way
such as adding a new nlattr.
I think we can indicate transmitting priority-tags explicitly by such a
nlattr. Using vid 0 seems to be easier to implement than a new nlattr,
but, for me, it looks less intuitive and more difficult to maintain
because we have to care about vid 0 instead of simply ignoring it.

Thanks,

Toshiaki Makita

> 
> How it is implemented internally in the kernel isn't as big of an issue.
> We can do it as a separate flag or as part of existing policy.
> 
> -vlad
> 
> >
> > Thanks,
> >
> > Toshiaki Makita
> >
> >>
> >> -vlad
> >>
> >>>
> >>> Thanks,
> >>>
> >>> Toshiaki Makita
> >>>
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Toshiaki Makita
> >>>>>
> >>>>>>
> >>>>>> --
> >>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >
> >
> 

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 17:34 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx_tWS44q8jER+QtR5k4k2ieq99He6YdPF8KKBF4Z2+32w@mail.gmail.com>

On Tue, 2013-09-24 at 10:03 -0700, Tom Herbert wrote:
> On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
> >
> >> We should really be using rxhash for that anyway, eliminate this
> >> ehashfn.  This would entail adding rxhash argument in the various
> >> udp_lookup functions.
> >
> > Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> > destination IP), not L4 (adding source & destination ports)
> >
> Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
> skb_get_rxhash is called.

Yes, but then in this case you add cpu cycles for no reason.

If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
be 0

hash = inet_ehashfn(net, daddr, hnum, saddr, sport);

is faster than the whole flow dissection game.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox