Netdev List

* [PATCH net-next V3 00/13] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 17:30 UTC
  To: netdev; +Cc: shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri

This series of patches provides the ability to add VLANs to bridge
ports.  This is similar to what can be found in most switches.  A bridge
port may have any number of VLANs added to it, including vlan 0 for priority
tagged traffic.  When vlans are added to a port, only traffic tagged with one
of those vlans will be forwarded over this port.  Additionally, vlan ids are
added to FDB entries and become part of the lookup.  This way we correctly
identify the FDB entry.

A single vlan per port may also be designated as untagged.  Any untagged
traffic received by the port will be assigned to this vlan.  Any traffic
exiting the port with a VID matching the untagged vlan will exit untagged (the
bridge will strip the vlan header).  This is similar to "Native Vlan" support
available in most switches.  This is also configurable on the bridge master
interface.

The default behavior of the bridge is unchanged if no vlans have been
configured.  Default behavior of each port is also unchanged if no
vlans are configured on that port (i.e. there are no ingress/egress checks
or vlan header manipulation).
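
The semantics described above can be sketched in user space roughly as
follows.  This is only an illustrative model, not the code from the series:
struct port_model, VID_NONE and the port_* helpers are hypothetical names, and
dropping an untagged frame when no untagged vlan is configured is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VID_MAX  4096    /* 12-bit VLAN ID space */
#define VID_NONE 0xffff  /* hypothetical marker: no tag / no untagged vlan */

/* Hypothetical per-port state; the series itself keeps RCU-protected lists. */
struct port_model {
	bool allowed[VID_MAX];   /* vlans configured on this port */
	unsigned int nr_vlans;   /* 0 means "no filtering at all" */
	uint16_t untagged_vid;   /* VID_NONE when no untagged vlan is set */
};

static void port_init(struct port_model *p)
{
	memset(p, 0, sizeof(*p));
	p->untagged_vid = VID_NONE;
}

static void port_add_vlan(struct port_model *p, uint16_t vid, bool untagged)
{
	assert(vid < VID_MAX);
	if (!p->allowed[vid]) {
		p->allowed[vid] = true;
		p->nr_vlans++;
	}
	if (untagged)
		p->untagged_vid = vid;  /* only one untagged vlan per port */
}

/* Returns true if the frame may enter; *vid_out is the vlan it belongs to. */
static bool port_ingress(const struct port_model *p, uint16_t frame_vid,
			 uint16_t *vid_out)
{
	*vid_out = frame_vid;
	if (p->nr_vlans == 0)
		return true;            /* default behavior: no checks */

	if (frame_vid == VID_NONE) {
		if (p->untagged_vid == VID_NONE)
			return false;   /* assumption: nothing to assign, drop */
		*vid_out = p->untagged_vid;
		return true;
	}
	return frame_vid < VID_MAX && p->allowed[frame_vid];
}

/* Returns true if a frame in vlan @vid leaves the port with its tag. */
static bool port_egress_tagged(const struct port_model *p, uint16_t vid)
{
	return vid != p->untagged_vid;
}
```

With vlans 10 (tagged) and 20 (untagged) on a port, a frame tagged 30 is
refused, an untagged frame is assigned to vlan 20, and vlan 20 traffic leaves
the port without a tag.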

Changes since v2:
 - Added inline functions to manipulate vlan hw filters and re-use in 8021q
   and bridge code.
 - Use rtnl_dereference (Michael Tsirkin)
 - Remove synchronize_net() call (Eric Dumazet)
 - Fix NULL ptr deref bug I introduced in br_ifinfo_notify.

Changes since v1:
 - Fixed some forwarding bugs.
 - Add vlan to local fdb entries.  New local entries are created per vlan
   to facilitate correct forwarding to bridge interface.
 - Allow configuration of vlans directly on the bridge master device
   in addition to ports.

Changes since rfc v2:
 - Per-port vlan bitmap is gone and is replaced with a vlan list.
 - Added bridge vlan list, which is referenced by each port.  Entries in
   the bridge vlan list have a port bitmap that shows which ports are part
   of which vlan.
 - Netlink API changes.
 - Dropped sysfs support for now.  If people think this is really useful,
   I can add it back.
 - Support for native/untagged vlans.

Changes since rfc v1:
 - Comments addressed regarding formatting and RCU usage
 - ioctls have been removed and changed over to the netlink interface.
 - Added support of user added ndb entries.
 - changed sysfs interface to export a bitmap.  Also added a write interface.
   I am not sure how much I like it, but it made my testing easier/faster.  I
   might change the write interface to take text instead of binary.

Vlad Yasevich (12):
  bridge: Add vlan filtering infrastructure
  bridge: Validate that vlan is permitted on ingress
  bridge: Verify that a vlan is allowed to egress on give port
  bridge: Cache vlan in the cb for faster egress lookup.
  bridge: Add vlan to unicast fdb entries
  bridge: Add vlan id to multicast groups
  bridge: Add netlink interface to configure vlans on bridge ports
  bridge: Add vlan support to static neighbors
  bridge: Add the ability to configure untagged vlans
  bridge: Implement untagged vlan handling
  bridge: Dump vlan information from a bridge port
  bridge: Add vlan support for local fdb entries

 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    5 +-
 drivers/net/macvlan.c                         |    2 +-
 drivers/net/vxlan.c                           |    3 +-
 include/linux/netdevice.h                     |    4 +-
 include/uapi/linux/if_bridge.h                |   23 ++-
 include/uapi/linux/neighbour.h                |    1 +
 include/uapi/linux/rtnetlink.h                |    1 +
 net/bridge/br_device.c                        |   34 ++-
 net/bridge/br_fdb.c                           |  253 ++++++++++++---
 net/bridge/br_forward.c                       |  160 ++++++++++
 net/bridge/br_if.c                            |  404 ++++++++++++++++++++++++-
 net/bridge/br_input.c                         |   65 ++++-
 net/bridge/br_multicast.c                     |   71 +++--
 net/bridge/br_netlink.c                       |  178 ++++++++++--
 net/bridge/br_private.h                       |   71 ++++-
 net/core/rtnetlink.c                          |   40 ++-
 16 files changed, 1190 insertions(+), 125 deletions(-)

-- 
1.7.7.6

* [PATCH net-next V3 01/13] vlan: wrap hw-acceleration calls in separate functions.
From: Vlad Yasevich @ 2012-12-19 17:30 UTC
  To: netdev; +Cc: shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri
In-Reply-To: <1355938248-8407-1-git-send-email-vyasevic@redhat.com>

Wrap VLAN hardware acceleration calls into separate functions.  This way
other code can re-use it.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 include/linux/if_vlan.h |   57 +++++++++++++++++++++++++++++++++++++++++++++++
 net/8021q/vlan.c        |    4 +--
 net/8021q/vlan_core.c   |   22 ++++++-----------
 3 files changed, 66 insertions(+), 17 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index d06cc5c..5fc6a02 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -158,6 +158,63 @@ static inline bool vlan_uses_dev(const struct net_device *dev)
 #endif
 
 /**
+ * vlan_hw_buggy - Check to see if VLAN hw acceleration is misreported.
+ * @dev: netdevice of the lowerdev/hw nic
+ *
+ * Returns true if the device claims HW VLAN filtering but lacks the vid ops.
+ */
+static inline bool vlan_hw_buggy(const struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+
+	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
+	    (!ops->ndo_vlan_rx_add_vid || !ops->ndo_vlan_rx_kill_vid))
+		return true;
+
+	return false;
+}
+
+/**
+ * vlan_vid_add_hw - Add the VLAN vid to the HW filter
+ * @dev: netdevice of the lowerdev/hw nic
+ * @vid: vlan id.
+ *
+ * Inserts the vid into the HW vlan filter table if hw supports it.
+ */
+static inline int vlan_vid_add_hw(struct net_device *dev,
+				  unsigned short vid)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int err = 0;
+
+	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
+	    ops->ndo_vlan_rx_add_vid)
+		err = ops->ndo_vlan_rx_add_vid(dev, vid);
+
+	return err;
+}
+
+/**
+ * vlan_vid_del_hw - Delete the VLAN vid from the HW filter
+ * @dev: netdevice of the lowerdev/hw nic
+ * @vid: vlan id.
+ *
+ * Delete the vid from the HW vlan filter table if hw supports it.
+ */
+static inline int vlan_vid_del_hw(struct net_device *dev,
+				  unsigned short vid)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	int err = 0;
+
+	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
+	    ops->ndo_vlan_rx_kill_vid)
+		err = ops->ndo_vlan_rx_kill_vid(dev, vid);
+
+	return err;
+}
+
+/**
  * vlan_insert_tag - regular VLAN tag inserting
  * @skb: skbuff to tag
  * @vlan_tci: VLAN TCI to insert
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index a292e80..d1ac63f 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -115,15 +115,13 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 int vlan_check_real_dev(struct net_device *real_dev, u16 vlan_id)
 {
 	const char *name = real_dev->name;
-	const struct net_device_ops *ops = real_dev->netdev_ops;
 
 	if (real_dev->features & NETIF_F_VLAN_CHALLENGED) {
 		pr_info("VLANs not supported on %s\n", name);
 		return -EOPNOTSUPP;
 	}
 
-	if ((real_dev->features & NETIF_F_HW_VLAN_FILTER) &&
-	    (!ops->ndo_vlan_rx_add_vid || !ops->ndo_vlan_rx_kill_vid)) {
+	if (vlan_hw_buggy(real_dev)) {
 		pr_info("Device %s has buggy VLAN hw accel\n", name);
 		return -EOPNOTSUPP;
 	}
diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 65e06ab..52d83be 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -220,13 +220,10 @@ static int __vlan_vid_add(struct vlan_info *vlan_info, unsigned short vid,
 	if (!vid_info)
 		return -ENOMEM;
 
-	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
-	    ops->ndo_vlan_rx_add_vid) {
-		err =  ops->ndo_vlan_rx_add_vid(dev, vid);
-		if (err) {
-			kfree(vid_info);
-			return err;
-		}
+	err = vlan_vid_add_hw(dev, vid);
+	if (err) {
+		kfree(vid_info);
+		return err;
 	}
 	list_add(&vid_info->list, &vlan_info->vid_list);
 	vlan_info->nr_vids++;
@@ -278,13 +275,10 @@ static void __vlan_vid_del(struct vlan_info *vlan_info,
 	unsigned short vid = vid_info->vid;
 	int err;
 
-	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
-	     ops->ndo_vlan_rx_kill_vid) {
-		err = ops->ndo_vlan_rx_kill_vid(dev, vid);
-		if (err) {
-			pr_warn("failed to kill vid %d for device %s\n",
-				vid, dev->name);
-		}
+	err = vlan_vid_del_hw(dev, vid);
+	if (err) {
+		pr_warn("failed to kill vid %d for device %s\n",
+			vid, dev->name);
 	}
 	list_del(&vid_info->list);
 	kfree(vid_info);
-- 
1.7.7.6

* [PATCH net-next V3 03/13] bridge: Validate that vlan is permitted on ingress
From: Vlad Yasevich @ 2012-12-19 17:30 UTC
  To: netdev; +Cc: shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri
In-Reply-To: <1355938248-8407-1-git-send-email-vyasevic@redhat.com>

When a frame arrives on a port, if we have VLANs configured,
validate that a given VLAN is allowed to ingress on a given
port.
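
As a rough illustration of the VID that the ingress check operates on, the
sketch below extracts the 12-bit VLAN ID from a raw Ethernet frame in user
space.  It is an assumption-laden model, not the kernel path: frame_vid() is a
hypothetical helper, whereas the patch itself reads the tag from the skb via
br_get_vlan().  Like br_get_vlan(), it reports 0 when no tag is found.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TPID_8021Q 0x8100  /* EtherType announcing an 802.1Q tag */
#define VID_MASK   0x0fff  /* low 12 bits of the TCI hold the VLAN ID */

/* Hypothetical helper: return the VID of a frame, or 0 if it is untagged. */
static uint16_t frame_vid(const uint8_t *frame, size_t len)
{
	uint16_t tpid, tci;

	assert(frame != NULL);
	if (len < 16)
		return 0;       /* too short to carry a TCI */

	/* Bytes 12-13 follow the destination and source MAC addresses. */
	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != TPID_8021Q)
		return 0;

	/* TCI layout: 3 bits priority, 1 bit DEI/CFI, 12 bits VID. */
	tci = (uint16_t)((frame[14] << 8) | frame[15]);
	return tci & VID_MASK;
}
```

A priority-tagged frame (VID 0) and an untagged frame both yield 0 here, which
is why vlan 0 has to be added to a port for such traffic to pass the check.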

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 net/bridge/br_input.c   |   23 +++++++++++++++++++++++
 net/bridge/br_private.h |   15 +++++++++++++--
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 4b34207..54c0894 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -17,6 +17,7 @@
 #include <linux/etherdevice.h>
 #include <linux/netfilter_bridge.h>
 #include <linux/export.h>
+#include <linux/rculist.h>
 #include "br_private.h"
 
 /* Hook for brouter */
@@ -41,6 +42,25 @@ static int br_pass_frame_up(struct sk_buff *skb)
 		       netif_receive_skb);
 }
 
+static bool br_allowed_ingress(struct net_bridge_port *p, struct sk_buff *skb)
+{
+	struct net_port_vlan *pve;
+	u16 vid;
+
+	/* If there are no vlans in the permitted list, all packets are
+	 * permitted.
+	 */
+	if (list_empty(&p->vlan_list))
+		return true;
+
+	vid = br_get_vlan(skb);
+	pve = nbp_vlan_find(p, vid);
+	if (pve)
+		return true;
+
+	return false;
+}
+
 /* note: already called with rcu_read_lock */
 int br_handle_frame_finish(struct sk_buff *skb)
 {
@@ -54,6 +74,9 @@ int br_handle_frame_finish(struct sk_buff *skb)
 	if (!p || p->state == BR_STATE_DISABLED)
 		goto drop;
 
+	if (!br_allowed_ingress(p, skb))
+		goto drop;
+
 	/* insert into forwarding database after filtering to avoid spoofing */
 	br = p->br;
 	br_fdb_update(br, p, eth_hdr(skb)->h_source);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 76d9fbc..1ba76b4 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -66,8 +66,6 @@ struct br_ip
 };
 
 #define BR_INVALID_VID	(1<<15)
-#define BR_UNTAGGED_VID (1<<14)
-
 #define BR_VID_HASH_SIZE (1<<6)
 #define br_vlan_hash(vid) ((vid) % (BR_VID_HASH_SIZE - 1))
 
@@ -197,6 +195,19 @@ static inline struct net_bridge_port *br_port_get_rtnl(struct net_device *dev)
 		rtnl_dereference(dev->rx_handler_data) : NULL;
 }
 
+static inline u16 br_get_vlan(const struct sk_buff *skb)
+{
+	u16 tag;
+
+	if (vlan_tx_tag_present(skb))
+		return vlan_tx_tag_get(skb) & VLAN_VID_MASK;
+
+	if (vlan_get_tag(skb, &tag))
+		return 0;
+
+	return tag & VLAN_VID_MASK;
+}
+
 struct br_cpu_netstats {
 	u64			rx_packets;
 	u64			rx_bytes;
-- 
1.7.7.6

* [PATCH net-next V3 02/13] bridge: Add vlan filtering infrastructure
From: Vlad Yasevich @ 2012-12-19 17:30 UTC
  To: netdev; +Cc: shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri
In-Reply-To: <1355938248-8407-1-git-send-email-vyasevic@redhat.com>

This is an infrastructure patch.  It adds 2 structure types:
  net_bridge_vlan - list element of all vlans that have been configured
                    on the bridge.
  net_port_vlan - list element of all vlans configured on a specific port.
                  references net_bridge_vlan.

In this implementation, bridge has a hash list of all vlans that have
been added to the bridge.  Each vlan element holds a vid and port_bitmap
where each port sets its bit if a given vlan is added to the port.

Each port has its own list of vlans.  Each element here references a vlan
from the bridge list.

Write access to both lists is protected by RTNL, and read access is
protected by RCU.
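
The bridge-side bookkeeping can be pictured with the user-space sketch below.
It is a simplified model under stated assumptions: there is no RCU, RTNL or
reference counting, the hash is a plain modulo by the bucket count, and
vlan_model, bridge_model and the vlan_* helpers are hypothetical names rather
than the ones in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_PORTS     1024
#define LONG_BITS     (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_LONGS  ((MAX_PORTS + LONG_BITS - 1) / LONG_BITS)
#define HASH_SIZE     64

struct vlan_model {
	struct vlan_model *next;                  /* hash chain */
	uint16_t vid;
	unsigned long port_bitmap[BITMAP_LONGS];  /* one bit per member port */
};

struct bridge_model {
	struct vlan_model *buckets[HASH_SIZE];
};

static struct vlan_model *vlan_find(const struct bridge_model *br, uint16_t vid)
{
	struct vlan_model *v;

	for (v = br->buckets[vid % HASH_SIZE]; v; v = v->next)
		if (v->vid == vid)
			return v;
	return NULL;
}

static bool vlan_has_port(const struct bridge_model *br, uint16_t vid,
			  unsigned int port)
{
	const struct vlan_model *v = vlan_find(br, vid);

	assert(port < MAX_PORTS);
	return v && (v->port_bitmap[port / LONG_BITS] & (1UL << (port % LONG_BITS)));
}

/* Find the vlan or create it, then set the port's bit.  Returns 0 or -1. */
static int vlan_add_port(struct bridge_model *br, uint16_t vid, unsigned int port)
{
	struct vlan_model *v = vlan_find(br, vid);

	assert(port < MAX_PORTS);
	if (!v) {
		v = calloc(1, sizeof(*v));
		if (!v)
			return -1;
		v->vid = vid;
		v->next = br->buckets[vid % HASH_SIZE];
		br->buckets[vid % HASH_SIZE] = v;
	}
	v->port_bitmap[port / LONG_BITS] |= 1UL << (port % LONG_BITS);
	return 0;
}

/* Clear the port's bit and free the vlan once no port references it. */
static void vlan_del_port(struct bridge_model *br, uint16_t vid, unsigned int port)
{
	struct vlan_model **pp = &br->buckets[vid % HASH_SIZE];
	struct vlan_model *dead;
	size_t i;

	assert(port < MAX_PORTS);
	while (*pp && (*pp)->vid != vid)
		pp = &(*pp)->next;
	if (!*pp)
		return;

	(*pp)->port_bitmap[port / LONG_BITS] &= ~(1UL << (port % LONG_BITS));
	for (i = 0; i < BITMAP_LONGS; i++)
		if ((*pp)->port_bitmap[i])
			return;         /* still used by another port */

	dead = *pp;
	*pp = dead->next;
	free(dead);
}
```

In this model, as in the description above, a vlan entry disappears from the
bridge only after the last port has cleared its bit.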

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 net/bridge/br_device.c  |    3 +
 net/bridge/br_if.c      |  243 +++++++++++++++++++++++++++++++++++++++++++++++
 net/bridge/br_private.h |   33 +++++++
 3 files changed, 279 insertions(+), 0 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 7c78e26..9546742 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -332,6 +332,7 @@ static struct device_type br_type = {
 void br_dev_setup(struct net_device *dev)
 {
 	struct net_bridge *br = netdev_priv(dev);
+	int i;
 
 	eth_hw_addr_random(dev);
 	ether_setup(dev);
@@ -354,6 +355,8 @@ void br_dev_setup(struct net_device *dev)
 	spin_lock_init(&br->lock);
 	INIT_LIST_HEAD(&br->port_list);
 	spin_lock_init(&br->hash_lock);
+	for (i = 0; i < BR_VID_HASH_SIZE; i++)
+		INIT_HLIST_HEAD(&br->vlan_hlist[i]);
 
 	br->bridge_id.prio[0] = 0x80;
 	br->bridge_id.prio[1] = 0x00;
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 1c8fdc3..f7641dd6 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -83,6 +83,246 @@ void br_port_carrier_check(struct net_bridge_port *p)
 	spin_unlock_bh(&br->lock);
 }
 
+static void br_vlan_destroy(struct net_bridge_vlan *vlan)
+{
+	if (!bitmap_empty(vlan->port_bitmap, PORT_BITMAP_LEN)) {
+		pr_err("Attempt to delete a VLAN %d from the bridge with "
+		       "non-empty port bitmap (%p)\n", vlan->vid, vlan);
+		BUG();
+	}
+
+	hlist_del_rcu(&vlan->hlist);
+	kfree_rcu(vlan, rcu);
+}
+
+static void br_vlan_hold(struct net_bridge_vlan *vlan)
+{
+	atomic_inc(&vlan->refcnt);
+}
+
+static void br_vlan_put(struct net_bridge_vlan *vlan)
+{
+	if (atomic_dec_and_test(&vlan->refcnt))
+		br_vlan_destroy(vlan);
+}
+
+struct net_bridge_vlan *br_vlan_find(struct net_bridge *br, u16 vid)
+{
+	struct net_bridge_vlan *vlan;
+	struct hlist_node *node;
+
+	hlist_for_each_entry_rcu(vlan, node,
+				 &br->vlan_hlist[br_vlan_hash(vid)], hlist) {
+		if (vlan->vid == vid)
+			return vlan;
+	}
+
+	return NULL;
+}
+
+/* Must be protected by RTNL */
+struct net_bridge_vlan *br_vlan_add(struct net_bridge *br, u16 vid,
+				    u16 flags)
+{
+	struct net_bridge_vlan *vlan;
+
+	ASSERT_RTNL();
+
+	vlan = br_vlan_find(br, vid);
+	if (vlan)
+		return vlan;
+
+	vlan = kzalloc(sizeof(struct net_bridge_vlan), GFP_KERNEL);
+	if (!vlan)
+		return NULL;
+
+	vlan->vid = vid;
+	atomic_set(&vlan->refcnt, 1);
+
+	if (flags & BRIDGE_FLAGS_SELF) {
+		/* Set bit 0 that is associated with the bridge master
+		 * device.  Port numbers start with 1.
+		 */
+		set_bit(0, vlan->port_bitmap);
+	}
+
+	hlist_add_head_rcu(&vlan->hlist, &br->vlan_hlist[br_vlan_hash(vid)]);
+	return vlan;
+}
+
+/* Must be protected by RTNL */
+static void br_vlan_del(struct net_bridge_vlan *vlan, u16 flags)
+{
+	ASSERT_RTNL();
+
+	if (flags & BRIDGE_FLAGS_SELF) {
+		/* Clear bit 0 that is associated with the bridge master
+		 * device.
+		 */
+		clear_bit(0, vlan->port_bitmap);
+	}
+
+	/* Try to remove the vlan, but only once all the ports have
+	 * been removed from the port bitmap
+	 */
+	if (!bitmap_empty(vlan->port_bitmap, PORT_BITMAP_LEN))
+		return;
+
+	vlan->vid = BR_INVALID_VID;
+
+	/* Drop the self-ref to trigger destruction. */
+	br_vlan_put(vlan);
+}
+
+/* Must be protected by RTNL */
+int br_vlan_delete(struct net_bridge *br, u16 vid, u16 flags)
+{
+	struct net_bridge_vlan *vlan;
+
+	ASSERT_RTNL();
+
+	vlan = br_vlan_find(br, vid);
+	if (!vlan)
+		return -ENOENT;
+
+	br_vlan_del(vlan, flags);
+	return 0;
+}
+
+static void br_vlan_flush(struct net_bridge *br)
+{
+	struct net_bridge_vlan *vlan;
+	struct hlist_node *node;
+	struct hlist_node *tmp;
+	int i;
+
+	/* Make sure that there are no vlans left in the bridge after
+	 * all the ports have been removed.
+	 */
+	for (i = 0; i < BR_VID_HASH_SIZE; i++) {
+		hlist_for_each_entry_safe(vlan, node, tmp,
+					  &br->vlan_hlist[i], hlist) {
+			br_vlan_del(vlan, BRIDGE_FLAGS_SELF);
+		}
+	}
+}
+
+struct net_port_vlan *nbp_vlan_find(const struct net_bridge_port *p, u16 vid)
+{
+	struct net_port_vlan *pve;
+
+	/* Must be done either in rcu critical section or with RTNL held */
+	WARN_ON_ONCE(!rcu_read_lock_held() && !rtnl_is_locked());
+
+	list_for_each_entry_rcu(pve, &p->vlan_list, list) {
+		if (pve->vid == vid)
+			return pve;
+	}
+
+	return NULL;
+}
+
+/* Must be protected by RTNL */
+int nbp_vlan_add(struct net_bridge_port *p, u16 vid, u16 flags)
+{
+	struct net_port_vlan *pve;
+	struct net_bridge_vlan *vlan;
+	struct net_device *dev = p->dev;
+	int err;
+
+	ASSERT_RTNL();
+
+	/* Find a vlan in the bridge vlan list.  If it isn't there,
+	 * create it
+	 */
+	vlan = br_vlan_add(p->br, vid, flags);
+	if (!vlan)
+		return -ENOMEM;
+
+	/* Check to see if this port is already part of the vlan.  If
+	 * it is, there is nothing more to do.
+	 */
+	if (test_bit(p->port_no, vlan->port_bitmap))
+		return -EEXIST;
+
+	/* Create port vlan, link it to bridge vlan list, and add the port to
+	 * the vlan's port bitmap.
+	 */
+	pve = kmalloc(sizeof(*pve), GFP_KERNEL);
+	if (!pve) {
+		err = -ENOMEM;
+		goto clean_up;
+	}
+
+	/* Add VLAN to the device filter if it is supported.
+	 * Strictly speaking, this is not necessary now, since devices
+	 * are made promiscuous by the bridge, but if that ever changes
+	 * this code will allow tagged traffic to enter the bridge.
+	 */
+	if (!vlan_hw_buggy(dev)) {
+		err = vlan_vid_add_hw(dev, vid);
+		if (err)
+			goto clean_up;
+	}
+
+	pve->vid = vid;
+	pve->vlan = vlan;
+	br_vlan_hold(vlan);
+	set_bit(p->port_no, vlan->port_bitmap);
+
+	list_add_tail_rcu(&pve->list, &p->vlan_list);
+	return 0;
+
+clean_up:
+	kfree(pve);
+	br_vlan_del(vlan, flags);
+	return err;
+}
+
+/* Must be protected by RTNL */
+int nbp_vlan_delete(struct net_bridge_port *p, u16 vid, u16 flags)
+{
+	struct net_device *dev = p->dev;
+	struct net_port_vlan *pve;
+	struct net_bridge_vlan *vlan;
+
+	ASSERT_RTNL();
+
+	pve = nbp_vlan_find(p, vid);
+	if (!pve)
+		return -ENOENT;
+
+	/* Remove VLAN from the device filter if it is supported. */
+	if (vlan_vid_del_hw(dev, vid))
+		pr_warn("failed to kill vid %d for device %s\n",
+			vid, dev->name);
+
+	pve->vid = BR_INVALID_VID;
+
+	vlan = pve->vlan;
+	pve->vlan = NULL;
+	clear_bit(p->port_no, vlan->port_bitmap);
+	br_vlan_put(vlan);
+
+	list_del_rcu(&pve->list);
+	kfree_rcu(pve, rcu);
+
+	br_vlan_del(vlan, flags);
+
+	return 0;
+}
+
+static void nbp_vlan_flush(struct net_bridge_port *p)
+{
+	struct net_port_vlan *pve;
+	struct net_port_vlan *tmp;
+
+	ASSERT_RTNL();
+
+	list_for_each_entry_safe(pve, tmp, &p->vlan_list, list)
+		nbp_vlan_delete(p, pve->vid, BRIDGE_FLAGS_SELF);
+}
+
 static void release_nbp(struct kobject *kobj)
 {
 	struct net_bridge_port *p
@@ -139,6 +379,7 @@ static void del_nbp(struct net_bridge_port *p)
 
 	br_ifinfo_notify(RTM_DELLINK, p);
 
+	nbp_vlan_flush(p);
 	br_fdb_delete_by_port(br, p, 1);
 
 	list_del_rcu(&p->list);
@@ -170,6 +411,7 @@ void br_dev_delete(struct net_device *dev, struct list_head *head)
 		del_nbp(p);
 	}
 
+	br_vlan_flush(br);
 	del_timer_sync(&br->gc_timer);
 
 	br_sysfs_delbr(br->dev);
@@ -222,6 +464,7 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	p->flags = 0;
 	br_init_port(p);
 	p->state = BR_STATE_DISABLED;
+	INIT_LIST_HEAD(&p->vlan_list);
 	br_stp_port_timer_init(p);
 	br_multicast_add_port(p);
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index ae0a6ec..76d9fbc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -18,6 +18,7 @@
 #include <linux/netpoll.h>
 #include <linux/u64_stats_sync.h>
 #include <net/route.h>
+#include <linux/if_vlan.h>
 
 #define BR_HASH_BITS 8
 #define BR_HASH_SIZE (1 << BR_HASH_BITS)
@@ -26,6 +27,7 @@
 
 #define BR_PORT_BITS	10
 #define BR_MAX_PORTS	(1<<BR_PORT_BITS)
+#define PORT_BITMAP_LEN	BITS_TO_LONGS(BR_MAX_PORTS)
 
 #define BR_VERSION	"2.3"
 
@@ -63,6 +65,27 @@ struct br_ip
 	__be16		proto;
 };
 
+#define BR_INVALID_VID	(1<<15)
+#define BR_UNTAGGED_VID (1<<14)
+
+#define BR_VID_HASH_SIZE (1<<6)
+#define br_vlan_hash(vid) ((vid) % (BR_VID_HASH_SIZE - 1))
+
+struct net_bridge_vlan {
+	struct hlist_node		hlist;
+	atomic_t			refcnt;
+	struct rcu_head			rcu;
+	u16				vid;
+	unsigned long			port_bitmap[PORT_BITMAP_LEN];
+};
+
+struct net_port_vlan {
+	struct list_head		list;
+	struct net_bridge_vlan		*vlan;
+	struct rcu_head			rcu;
+	u16				vid;
+};
+
 struct net_bridge_fdb_entry
 {
 	struct hlist_node		hlist;
@@ -155,6 +178,7 @@ struct net_bridge_port
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll			*np;
 #endif
+	struct list_head		vlan_list;
 };
 
 #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
@@ -259,6 +283,7 @@ struct net_bridge
 	struct timer_list		topology_change_timer;
 	struct timer_list		gc_timer;
 	struct kobject			*ifobj;
+	struct hlist_head		vlan_hlist[BR_VID_HASH_SIZE];
 };
 
 struct br_input_skb_cb {
@@ -400,6 +425,14 @@ extern int br_del_if(struct net_bridge *br,
 extern int br_min_mtu(const struct net_bridge *br);
 extern netdev_features_t br_features_recompute(struct net_bridge *br,
 	netdev_features_t features);
+extern struct net_bridge_vlan *br_vlan_add(struct net_bridge *br, u16 vid,
+					   u16 flags);
+extern int br_vlan_delete(struct net_bridge *br, u16 vid, u16 flags);
+extern struct net_bridge_vlan *br_vlan_find(struct net_bridge *br, u16 vid);
+extern int nbp_vlan_add(struct net_bridge_port *p, u16 vid, u16 flags);
+extern int nbp_vlan_delete(struct net_bridge_port *p, u16 vid, u16 flags);
+extern struct net_port_vlan *nbp_vlan_find(const struct net_bridge_port *p,
+					   u16 vid);
 
 /* br_input.c */
 extern int br_handle_frame_finish(struct sk_buff *skb);
-- 
1.7.7.6

* Re: [GIT PULL net-next 04/17] ndisc: Introduce ndisc_fill_redirect_hdr_option().
From: YOSHIFUJI Hideaki @ 2012-12-19 17:27 UTC
  To: Bjørn Mork; +Cc: YOSHIFUJI Hideaki, davem, netdev
In-Reply-To: <50D1EA8F.7070504@linux-ipv6.org>

(2012-12-20 01:25), YOSHIFUJI Hideaki wrote:
> Bjørn Mork wrote:
>> YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> writes:

>>> +static u8 *ndisc_fill_redirect_hdr_option(u8 *opt, struct sk_buff *orig_skb,
>>> +					  int rd_len)
>>> +{
>>> +	memset(opt, 0, 8);
>>> +	*(opt++) = ND_OPT_REDIRECT_HDR;
>>> +	*(opt++) = (rd_len >> 3);
>>> +	opt += 6;
>>> +
>>> +	memcpy(opt, ipv6_hdr(orig_skb), rd_len - 8);
>>> +
>>> +	return opt;
>>> +}
:
>> I understand that opt isn't currently used after this, but if it ever is
>> then it is going to come as big a surprise that this implies opt += 8;
>>
>> This was previously quite clear when the code was inline, but it becomes
>> problematic when it is factored out.
> 
> I understand your concern.  opt will be disappeared by following
> changeset (12 of 17).

Argh, I now notice return value was not quite right; it should
return opt + rd_len - 8.

Fixed in my local tree.  Thanks.
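
For readers following along, a user-space sketch of the corrected contract
(the helper returns a pointer just past the whole option) might look like
this.  It is not the kernel code: fill_redirect_hdr_option() here works on
plain byte buffers instead of an skb, and the buffer-based signature is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OPT_REDIRECT_HDR 4  /* Redirected Header option type, RFC 4861 */

/*
 * Write an rd_len-byte Redirected Header option at @opt: an 8-byte option
 * header followed by rd_len - 8 bytes copied from the original packet.
 * Returns the first byte after the option.
 */
static uint8_t *fill_redirect_hdr_option(uint8_t *opt, const uint8_t *orig,
					 size_t rd_len)
{
	/* The length field counts units of 8 octets and is one byte wide. */
	assert(rd_len >= 8 && rd_len % 8 == 0 && rd_len <= 255 * 8);

	memset(opt, 0, 8);                /* type, length, 6 reserved bytes */
	opt[0] = OPT_REDIRECT_HDR;
	opt[1] = (uint8_t)(rd_len >> 3);
	opt += 8;

	memcpy(opt, orig, rd_len - 8);
	return opt + rd_len - 8;          /* not just opt */
}
```

Returning the end of the option lets a caller append further options without
having to know the internal layout.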

--yoshfuji

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Thomas Graf @ 2012-12-19 17:20 UTC
  To: Vlad Yasevich; +Cc: Jiri Pirko, netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <50D1F551.6010301@redhat.com>

On 12/19/12 at 12:11pm, Vlad Yasevich wrote:
> Could we consolidate the code after this is accepted and all the parties
> can agree on the consolidation?  I'd really like to keep this series
> as minimally invasive as possible.

Sure, this was just a general remark on general future direction :)

* Re: [PATCH v2] netlink: align attributes on 64-bits
From: Thomas Graf @ 2012-12-19 17:20 UTC
  To: David Laight; +Cc: nicolas.dichtel, bhutchings, netdev, davem
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70F6@saturn3.aculab.com>

On 12/19/12 at 09:17am, David Laight wrote:
> You can't use memcpy() to copy a pointer to a misaligned
> structure into an aligned buffer. The compiler assumes
> the pointer is aligned and will use instructions that
> depend on the alignment.

I am not sure I understand this correctly. Are you saying
that the following does not work on i386?

struct foo {
  uint32_t a;
  uint64_t b;
};

struct foo buf;

memcpy(&buf, nla_data(attr), nla_len(attr));
printf([...], buf.b);


> I think:
> 1) Alignment is only needed on systems that have 'strict alignment'
>    requirements (maybe disable for testing?)

Right, what about mixed 32bit/64bit environments?

> 2) Alignment is only needed for parameters whose size is a
>    multiple of the alignment (a structure containing a
>    field that needs 8 byte alignment will always be a multiple
>    of 8 bytes long).

Good point. I'll fix this in the next iteration of the patch.

> 3) You need to add NA_HDR_LEN to the write pointer before
>    determining the size of the pad.

Right, I'm doing this in the patch I proposed. Or are you referring
to something else?
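
To make the memcpy() question concrete, here is a self-contained user-space
version of the pattern.  It is a sketch under assumptions: read_foo() is a
hypothetical helper, and the payload is handled as untyped bytes rather than
through a struct foo pointer, so the compiler has no alignment to assume for
the source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct foo {
	uint32_t a;
	uint64_t b;
};

/*
 * Copy a struct foo out of a payload that may sit at any byte offset.
 * The source is a void pointer, so memcpy() must treat it as plain bytes;
 * only the destination is known to be suitably aligned.
 */
static struct foo read_foo(const void *payload)
{
	struct foo out;

	assert(payload != NULL);
	memcpy(&out, payload, sizeof(out));
	return out;
}
```

As far as I can tell, the case David describes would arise if the payload
were first cast to a struct foo pointer and copied or dereferenced through
that type.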

* [PATCH 4/4] net/smsc911x: Provide common clock functionality
From: Lee Jones @ 2012-12-19 17:19 UTC
  To: linux-arm-kernel, linux-kernel
  Cc: arnd, linus.walleij, Lee Jones, Steve Glendinning, netdev
In-Reply-To: <1355937587-31730-1-git-send-email-lee.jones@linaro.org>

Some platforms provide clocks which require enabling before the
SMSC911x chip will power on. This patch uses the new common clk
framework to do just that. If no clock is provided, it will just
be ignored and the driver will continue to assume that no clock
is required for the chip to run successfully.

Cc: Steve Glendinning <steve.glendinning@shawell.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
---
 drivers/net/ethernet/smsc/smsc911x.c |   31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index 4616bf2..f6196cd 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -33,6 +33,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/crc32.h>
+#include <linux/clk.h>
 #include <linux/delay.h>
 #include <linux/errno.h>
 #include <linux/etherdevice.h>
@@ -144,6 +145,9 @@ struct smsc911x_data {
 
 	/* regulators */
 	struct regulator_bulk_data supplies[SMSC911X_NUM_SUPPLIES];
+
+	/* clock */
+	struct clk *clk;
 };
 
 /* Easy access to information */
@@ -369,7 +373,7 @@ out:
 }
 
 /*
- * enable resources, currently just regulators.
+ * enable regulator and clock resources.
  */
 static int smsc911x_enable_resources(struct platform_device *pdev)
 {
@@ -382,6 +386,13 @@ static int smsc911x_enable_resources(struct platform_device *pdev)
 	if (ret)
 		netdev_err(ndev, "failed to enable regulators %d\n",
 				ret);
+
+	if (pdata->clk) {
+		ret = clk_prepare_enable(pdata->clk);
+		if (ret < 0)
+			netdev_err(ndev, "failed to enable clock %d\n", ret);
+	}
+
 	return ret;
 }
 
@@ -396,6 +407,10 @@ static int smsc911x_disable_resources(struct platform_device *pdev)
 
 	ret = regulator_bulk_disable(ARRAY_SIZE(pdata->supplies),
 			pdata->supplies);
+
+	if (pdata->clk)
+		clk_disable_unprepare(pdata->clk);
+
 	return ret;
 }
 
@@ -421,6 +436,14 @@ static int smsc911x_request_resources(struct platform_device *pdev)
 	if (ret)
 		netdev_err(ndev, "couldn't get regulators %d\n",
 				ret);
+
+	/* Request clock */
+	pdata->clk = clk_get(&pdev->dev, NULL);
+	if (IS_ERR(pdata->clk)) {
+		netdev_warn(ndev, "couldn't get clock %li\n", PTR_ERR(pdata->clk));
+		pdata->clk = NULL;
+	}
+
 	return ret;
 }
 
@@ -436,6 +459,12 @@ static void smsc911x_free_resources(struct platform_device *pdev)
 	/* Free regulators */
 	regulator_bulk_free(ARRAY_SIZE(pdata->supplies),
 			pdata->supplies);
+
+	/* Free clock */
+	if (pdata->clk) {
+		clk_put(pdata->clk);
+		pdata->clk = NULL;
+	}
 }
 
 /* waits for MAC not busy, with timeout.  Only called by smsc911x_mac_read
-- 
1.7.9.5

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Jiri Pirko @ 2012-12-19 17:19 UTC
  To: Vlad Yasevich
  Cc: Thomas Graf, netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <50D1F551.6010301@redhat.com>

Wed, Dec 19, 2012 at 06:11:45PM CET, vyasevic@redhat.com wrote:
>On 12/19/2012 12:04 PM, Thomas Graf wrote:
>>On 12/19/12 at 09:27am, Jiri Pirko wrote:
>>>Tue, Dec 18, 2012 at 11:46:21PM CET, vyasevic@redhat.com wrote:
>>>>On 12/18/2012 05:32 PM, Jiri Pirko wrote:
>>>>>
>>>>>
>>>>>I see that this patchset replicates a lot of code which is already
>>>>>present in net/8021q/ or include/linux/if_vlan.h. I think it would
>>>>>be nice to move this code into some "common" place, wouldn't it?
>>>>>
>>>>
>>>>The only replication that I am aware of is in br_vlan_untag().  I
>>>>thought about pulling that piece out, but I think there is a reason
>>>>why it's not available when 801q support isn't turned on.  I noted that
>>>>openvswitch implemented its own vlan header manipulation functions as well.
>>>
>>>openvswitch should use the "common" code as well.
>>
>>I was just about to mention this. This overlaps with openvswitch
>>in functionality which I have absoluetely no objections against
>>but code reuse should come to focus in order to avoid having to
>>fix bugs twice.
>>
>
>Could we consolidate the code after this is accepted and all the parties
>can agree on the consolidation?  I'd really like to keep this series
>as minimally invasive as possible.

That sounds good to me.

>
>Thanks
>-vlad

* [PATCH] IPoIB: Call skb_dst_drop() once skb is enqueued for sending
From: Roland Dreier @ 2012-12-19 17:17 UTC
  To: linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: Roland Dreier

From: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>

Currently, IPoIB delays collecting send completions for TX packets in
order to batch work more efficiently.  It does skb_orphan() right after
queuing the packets so that destructors run early, to avoid problems
like holding socket send buffers for too long (since we might not
collect a send completion until a long time after the packet is
actually sent).

However, IPoIB clears IFF_XMIT_DST_RELEASE because it actually looks
at skb_dst() to update the PMTU when it gets a too-long packet.  This
means that the packets sitting in the TX ring with uncollected send
completions are holding a reference on the dst.  We've seen this lead
to pathological behavior with respect to route and neighbour GC.  The
easy fix for this is to call skb_dst_drop() when we call skb_orphan().

Also, give packets sent via connected mode (CM) the same skb_orphan()
/ skb_dst_drop() treatment that packets sent via datagram mode get.

Signed-off-by: Roland Dreier <roland-BHEL68pLQRGGvPXPguhicg@public.gmane.org>
---
Planning to merge this for 3.8 unless someone objects.

 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 3 +++
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 72ae63f..03103d2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -752,6 +752,9 @@ void ipoib_cm_send(struct net_device *dev, struct sk_buff *skb, struct ipoib_cm_
 		dev->trans_start = jiffies;
 		++tx->tx_head;

+		skb_orphan(skb);
+		skb_dst_drop(skb);
+
 		if (++priv->tx_outstanding == ipoib_sendq_size) {
 			ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
 				  tx->qp->qp_num);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index f10221f..a1bca70 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -615,8 +615,9 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,

 		address->last_send = priv->tx_head;
 		++priv->tx_head;
-		skb_orphan(skb);

+		skb_orphan(skb);
+		skb_dst_drop(skb);
 	}

 	if (unlikely(priv->tx_outstanding > MAX_SEND_CQE))
-- 
1.8.0

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related
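
The reference-holding problem described in the changelog above can be
illustrated with a small userspace model.  This is only a sketch: the
toy_dst and toy_skb types and every toy_* helper are hypothetical
stand-ins, not kernel structures, and the ring size of 64 is arbitrary.
It counts how many references stay pinned on a route while packets sit
in a TX ring waiting for their send completions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's dst_entry and sk_buff. */
struct toy_dst { int refcnt; };
struct toy_skb { struct toy_dst *dst; };

/* Attach a route to a packet, taking a reference on it. */
static void toy_skb_set_dst(struct toy_skb *skb, struct toy_dst *dst)
{
	dst->refcnt++;
	skb->dst = dst;
}

/* Analogue of skb_dst_drop(): release the reference, clear the pointer. */
static void toy_skb_dst_drop(struct toy_skb *skb)
{
	if (skb->dst) {
		assert(skb->dst->refcnt > 0);
		skb->dst->refcnt--;
		skb->dst = NULL;
	}
}

/*
 * Queue n packets on a toy TX ring and return how many references are
 * still pinned on the route while all completions are outstanding.
 */
static int toy_pinned_refs(int n, int drop_at_enqueue)
{
	struct toy_dst dst = { 0 };
	struct toy_skb ring[64];
	int i;

	assert(n >= 0 && n <= 64);
	for (i = 0; i < n; i++) {
		ring[i].dst = NULL;
		toy_skb_set_dst(&ring[i], &dst);
		if (drop_at_enqueue)
			toy_skb_dst_drop(&ring[i]);
	}
	return dst.refcnt;
}
```

With 16 packets queued, toy_pinned_refs(16, 0) leaves 16 references on
the route until completions are reaped, while toy_pinned_refs(16, 1)
leaves none, which is the effect the patch is after.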

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 17:11 UTC (permalink / raw)
  To: Thomas Graf; +Cc: Jiri Pirko, netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <20121219170431.GA6975@casper.infradead.org>

On 12/19/2012 12:04 PM, Thomas Graf wrote:
> On 12/19/12 at 09:27am, Jiri Pirko wrote:
>> Tue, Dec 18, 2012 at 11:46:21PM CET, vyasevic@redhat.com wrote:
>>> On 12/18/2012 05:32 PM, Jiri Pirko wrote:
>>>>
>>>>
>>>> I see that this patchset replicates a lot of code which is already
>>>> present in net/8021q/ or include/linux/if_vlan.h. I think it would
>>>> be nice to move this code into some "common" place, wouldn't it?
>>>>
>>>
>>> The only replication that I am aware of is in br_vlan_untag().  I
>>> thought about pulling that piece out, but I think there is a reason
>>> why it's not available when 8021q support isn't turned on.  I noted that
>>> openvswitch implemented its own vlan header manipulation functions as well.
>>
>> openvswitch should use the "common" code as well.
>
> I was just about to mention this. This overlaps with openvswitch
> in functionality, which I have absolutely no objections to,
> but code reuse should come into focus in order to avoid having to
> fix bugs twice.
>

Could we consolidate the code after this is accepted and all the parties
can agree on the consolidation?  I'd really like to keep this series
as minimally invasive as possible.

Thanks
-vlad

^ permalink raw reply

* Re: [PATCH v2] netlink: align attributes on 64-bits
From: Thomas Graf @ 2012-12-19 17:09 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: bhutchings, netdev, davem, David.Laight
In-Reply-To: <50D1A37C.8090705@6wind.com>

On 12/19/12 at 12:22pm, Nicolas Dichtel wrote:
> Here padlen will return 4, which is wrong: padlen + NLA_HDRLEN = 8,
> alignment is the same as before. Here is a proposed fix:
> 
> diff --git a/lib/nlattr.c b/lib/nlattr.c
> index e4f0329..1556313 100644
> --- a/lib/nlattr.c
> +++ b/lib/nlattr.c
> @@ -338,7 +338,10 @@ struct nlattr *__nla_reserve(struct sk_buff
> *skb, int attrtype, int attrlen)
>  		struct nlattr *pad;
>  		size_t padlen;
> 
> -		padlen = nla_total_size(offset) - offset -  NLA_HDRLEN;
> +		/* We need to remove NLA_HDRLEN two times: one time for the
> +		 * attribute hdr and one time for the pad attribute hdr.
> +		 */
> +		padlen = nla_total_size(offset) - offset -  2 * NLA_HDRLEN;
>  		pad = (struct nlattr *) skb_put(skb, nla_attr_size(padlen));
>  		pad->nla_type = 0;
>  		pad->nla_len = nla_attr_size(padlen);
> 
> With this patch, it seems good: attributes are always aligned on 8 bytes. Also,
> I did not notice any problem with size calculation (I tried some ip
> link, ip xfrm, ip [m]route).
> 
> Do you want to run more tests? Or will you repost the full patch?
> I can do it if you don't have time.

Thanks.

I would like to do some testing as well. I do expect some fallout from
this: there is likely some interface abuse that will now be exposed.

We'll have to wait for the next merge window to open anyway. I'd
consider this a new feature and not a bugfix based on the possible
regression impact it could have.

I'll post a new version of the patch integrating your fix above so
others (especially subsystem maintainers depending on netlink) can run
the patch as well.

^ permalink raw reply
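
The arithmetic behind Nicolas's observation can be checked in userspace.
The sketch below assumes a 4-byte attribute header and a tail offset that
is already 4-byte aligned; the toy_* names and the example offset of 8
are hypothetical and do not come from lib/nlattr.c.  It computes where the
payload of the next attribute lands once a pad attribute carrying padlen
bytes of payload has been emitted in front of it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NLA_HDRLEN 4	/* assumed size of an attribute header */

/*
 * Offset at which the payload of the next attribute starts when a pad
 * attribute (header plus padlen bytes) is emitted first at `offset`.
 */
static size_t toy_payload_offset_after_pad(size_t offset, size_t padlen)
{
	assert(offset % 4 == 0);	/* assumption: tail is 4-byte aligned */
	assert(padlen % 4 == 0);
	return offset + (TOY_NLA_HDRLEN + padlen) + TOY_NLA_HDRLEN;
}

/* True when `off` sits on an 8-byte boundary. */
static int toy_is_aligned8(size_t off)
{
	return off % 8 == 0;
}
```

Starting from offset 8, a pad payload of 4 makes the pad attribute 8
bytes in total, so the next payload still lands at 20 and stays
misaligned.  A pad payload of 0 (header only) moves it to 16, which is
what subtracting NLA_HDRLEN twice achieves in this case.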

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Thomas Graf @ 2012-12-19 17:04 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Vlad Yasevich, netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <20121219082727.GB1637@minipsycho.orion>

On 12/19/12 at 09:27am, Jiri Pirko wrote:
> Tue, Dec 18, 2012 at 11:46:21PM CET, vyasevic@redhat.com wrote:
> >On 12/18/2012 05:32 PM, Jiri Pirko wrote:
> >>
> >>
> >>I see that this patchset replicates a lot of code which is already
> >>present in net/8021q/ or include/linux/if_vlan.h. I think it would
> >>be nice to move this code into some "common" place, wouldn't it?
> >>
> >
> >The only replication that I am aware of is in br_vlan_untag().  I
> >thought about pulling that piece out, but I think there is a reason
> >why it's not available when 8021q support isn't turned on.  I noted that
> >openvswitch implemented its own vlan header manipulation functions as well.
> 
> openvswitch should use the "common" code as well.

I was just about to mention this. This overlaps with openvswitch
in functionality, which I have absolutely no objections to,
but code reuse should come into focus in order to avoid having to
fix bugs twice.

^ permalink raw reply

* Re: [RFC PATCH v3 0/2] Fix some multiqueue TUN problems
From: Paul Moore @ 2012-12-19 16:59 UTC (permalink / raw)
  To: linux-security-module, selinux, eparis; +Cc: netdev, jasowang, mst
In-Reply-To: <20121218225001.16104.34454.stgit@localhost>

On Tuesday, December 18, 2012 05:53:37 PM Paul Moore wrote:
> A refresh/respin of the LSM/SELinux fixes to work on top of Jason's
> latest API tweak (now living in DaveM's net tree).  In general, I
> believe the hooks and thinking behind the v2 patchset still make sense
> so no changes there, although I did change the SELinux permission from
> "create_queue" to "attach_queue" to match the API changes.
> 
> Comments are welcome and encouraged; we need to get this fixed before
> 3.8 is released.

SELinux (I'm looking at you Eric) and LSM folks - any comments/objections to 
these changes?

> ---
> 
> Paul Moore (2):
>       selinux: add the "attach_queue" permission to the "tun_socket" class
>       tun: fix LSM/SELinux labeling of tun/tap devices

-- 
paul moore
security and virtualization @ redhat


^ permalink raw reply

* Re: [RFC PATCH v3 2/2] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-19 16:58 UTC (permalink / raw)
  To: Jason Wang; +Cc: Michael S. Tsirkin, netdev, linux-security-module, selinux
In-Reply-To: <50D154B1.4010909@redhat.com>

On Wednesday, December 19, 2012 01:46:25 PM Jason Wang wrote:
> On 12/19/2012 07:08 AM, Michael S. Tsirkin wrote:
> > On Tue, Dec 18, 2012 at 05:53:52PM -0500, Paul Moore wrote:
> >> This patch corrects some problems with LSM/SELinux that were introduced
> >> with the multiqueue patchset.  The problem stems from the fact that the
> >> multiqueue work changed the relationship between the tun device and its
> >> associated socket; before the socket persisted for the life of the
> >> device, however after the multiqueue changes the socket only persisted
> >> for the life of the userspace connection (fd open).  For non-persistent
> >> devices this is not an issue, but for persistent devices this can cause
> >> the tun device to lose its SELinux label.
> >> 
> >> We correct this problem by adding an opaque LSM security blob to the
> >> tun device struct which allows us to have the LSM security state, e.g.
> >> SELinux labeling information, persist for the lifetime of the tun
> >> device.  In the process we tweak the LSM hooks to work with this new
> >> approach to TUN device/socket labeling and introduce a new LSM hook,
> >> security_tun_dev_attach_queue(), to approve requests to attach to a
> >> TUN queue via TUNSETQUEUE.
> >> 
> >> The SELinux code has been adjusted to match the new LSM hooks, the
> >> other LSMs do not make use of the LSM TUN controls.  This patch makes
> >> use of the recently added "tun_socket:attach_queue" permission to
> >> restrict access to the TUNSETQUEUE operation.  On older SELinux
> >> policies which do not define the "tun_socket:attach_queue" permission
> >> the access control decision for TUNSETQUEUE will be handled according
> >> to the SELinux policy's unknown permission setting.
> >> 
> >> Signed-off-by: Paul Moore <pmoore@redhat.com>
> > 
> > Looks good to me. A comment not directly related to this patch, below.
> 
> Looks good to me too; I will do some testing on this.

Great.  I'll do some more testing and make sure the LSM and SELinux crowd are 
okay with the changes.

-- 
paul moore
security and virtualization @ redhat


^ permalink raw reply

* Re: [GIT PULL net-next 04/17] ndisc: Introduce ndisc_fill_redirect_hdr_option().
From: YOSHIFUJI Hideaki @ 2012-12-19 16:25 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: davem, netdev, YOSHIFUJI Hideaki
In-Reply-To: <87txrib6wa.fsf@nemi.mork.no>

Bjørn Mork wrote:
> YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> writes:
> 
>> Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
>> ---
>>  net/ipv6/ndisc.c |   21 +++++++++++++++------
>>  1 file changed, 15 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
>> index a181113..0a4f3a9 100644
>> --- a/net/ipv6/ndisc.c
>> +++ b/net/ipv6/ndisc.c
>> @@ -1332,6 +1332,19 @@ static void ndisc_redirect_rcv(struct sk_buff *skb)
>>  	icmpv6_notify(skb, NDISC_REDIRECT, 0, 0);
>>  }
>>  
>> +static u8 *ndisc_fill_redirect_hdr_option(u8 *opt, struct sk_buff *orig_skb,
>> +					  int rd_len)
>> +{
>> +	memset(opt, 0, 8);
>> +	*(opt++) = ND_OPT_REDIRECT_HDR;
>> +	*(opt++) = (rd_len >> 3);
>> +	opt += 6;
>> +
>> +	memcpy(opt, ipv6_hdr(orig_skb), rd_len - 8);
>> +
>> +	return opt;
>> +}
>> +
> 
> I realize that it doesn't currently matter, but the above modification
> of "opt" looks like a bug-waiting-to-happen to me.
> 
>>  void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target)
>>  {
>>  	struct net_device *dev = skb->dev;
>> @@ -1461,12 +1474,8 @@ void ndisc_send_redirect(struct sk_buff *skb, const struct in6_addr *target)
>>  	 *	build redirect option and copy skb over to the new packet.
>>  	 */
>>  
>> -	memset(opt, 0, 8);
>> -	*(opt++) = ND_OPT_REDIRECT_HDR;
>> -	*(opt++) = (rd_len >> 3);
>> -	opt += 6;
>> -
>> -	memcpy(opt, ipv6_hdr(skb), rd_len - 8);
>> +	if (rd_len)
>> +		opt = ndisc_fill_redirect_hdr_option(opt, skb, rd_len);
> 
> 
> I understand that opt isn't currently used after this, but if it ever is
> then it is going to come as a big surprise that this implies opt += 8;
> 
> This was previously quite clear when the code was inline, but it becomes
> problematic when it is factored out.

I understand your concern.  opt will disappear in a following
changeset (12 of 17).

--yoshfuji

^ permalink raw reply
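
The side effect Bjørn points out is easy to see in a standalone version
of the helper.  This is a userspace sketch with hypothetical toy_* names,
operating on a plain byte buffer rather than an sk_buff; the option type
value 4 is the Redirected Header type from RFC 4861.  As in the quoted
patch, the returned pointer is the start of the option's data area, that
is, 8 bytes past the pointer that was passed in.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_ND_OPT_REDIRECT_HDR 4	/* Redirected Header option type */

/*
 * Write an 8-byte option header (type, length in units of 8 octets,
 * 6 reserved bytes) followed by rd_len - 8 bytes of the original packet.
 * Returns a pointer to the copied data, i.e. opt + 8.
 */
static uint8_t *toy_fill_redirect_hdr(uint8_t *opt, const uint8_t *orig,
				      int rd_len)
{
	assert(rd_len >= 8 && rd_len % 8 == 0);

	memset(opt, 0, 8);
	opt[0] = TOY_ND_OPT_REDIRECT_HDR;
	opt[1] = (uint8_t)(rd_len >> 3);
	memcpy(opt + 8, orig, (size_t)rd_len - 8);

	return opt + 8;
}
```

A caller that writes "opt = toy_fill_redirect_hdr(opt, ...)" therefore
ends up with opt advanced by 8, not by the full option length, which is
the surprise the review comment warns about.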

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 16:25 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <20121219082727.GB1637@minipsycho.orion>

On 12/19/2012 03:27 AM, Jiri Pirko wrote:
> Tue, Dec 18, 2012 at 11:46:21PM CET, vyasevic@redhat.com wrote:
>> On 12/18/2012 05:32 PM, Jiri Pirko wrote:
>>>
>>>
>>> I see that this patchset replicates a lot of code which is already
>>> present in net/8021q/ or include/linux/if_vlan.h. I think it would
>>> be nice to move this code into some "common" place, wouldn't it?
>>>
>>
>> The only replication that I am aware of is in br_vlan_untag().  I
>> thought about pulling that piece out, but I think there is a reason
>> why it's not available when 8021q support isn't turned on.  I noted that
>> openvswitch implemented its own vlan header manipulation functions as well.
>
> openvswitch should use the "common" code as well.
>
>>
>> What else are you seeing that's duplicate?
>
> For example I spotted check of ndo_vlan_rx_[add/kill]_vid and
> NETIF_F_HW_VLAN_FILTER and ndo_vlan_rx_[add/kill]_vid call

Ahh yes....  I can make that generic.  Thanks

-vlad

>
>
>>
>> -vlad
>>
>>> Jiri
>>>
>>> Tue, Dec 18, 2012 at 08:00:51PM CET, vyasevic@redhat.com wrote:
>>>> This series of patches provides an ability to add VLANs to the bridge
>>>> ports.  This is similar to what can be found in most switches.  The bridge
>>>> port may have any number of VLANs added to it including vlan 0 priority tagged
>>>> traffic.  When vlans are added to the port, only traffic tagged with a particular
>>>> vlan will be forwarded over this port.  Additionally, vlan ids are added to FDB
>>>> entries and become part of the lookup.  This way we correctly identify the FDB
>>>> entry.
>>>>
>>>> A single vlan may also be designated as untagged.  Any untagged traffic
>>>> received by the port will be assigned to this vlan.  Any traffic exiting
>>>> the port with a VID matching the untagged vlan will exit untagged (the
>>>> bridge will strip the vlan header).  This is similar to "Native Vlan" support
>>>> available in most switches.
>>>>
>>>> The default behavior of the bridge is unchanged if no vlans have been
>>>> configured.
>>>>
>>>> Changes since v1:
>>>> - Fixed some forwarding bugs.
>>>> - Add vlan to local fdb entries.  New local entries are created per vlan
>>>>    to facilitate correct forwarding to bridge interface.
>>>> - Allow configuration of vlans directly on the bridge master device
>>>>    in addition to ports.
>>>>
>>>> Changes since rfc v2:
>>>> - Per-port vlan bitmap is gone and is replaced with a vlan list.
>>>> - Added bridge vlan list, which is referenced by each port.  Entries in
>>>>    the bridge vlan list have a port bitmap that shows which ports are part
>>>>    of which vlan.
>>>> - Netlink API changes.
>>>> - Dropped sysfs support for now.  If people think this is really useful,
>>>>    we can add it back.
>>>> - Support for native/untagged vlans.
>>>>
>>>> Changes since rfc v1:
>>>> - Comments addressed regarding formatting and RCU usage
>>>> - ioctls have been removed and changed over to the netlink interface.
>>>> - Added support of user added ndb entries.
>>>> - changed sysfs interface to export a bitmap.  Also added a write interface.
>>>>    I am not sure how much I like it, but it made my testing easier/faster.  I
>>>>    might change the write interface to take text instead of binary.
>>>>
>>>>
>>>> Vlad Yasevich (12):
>>>>   bridge: Add vlan filtering infrastructure
>>>>   bridge: Validate that vlan is permitted on ingress
>>>>   bridge: Verify that a vlan is allowed to egress on give port
>>>>   bridge: Cache vlan in the cb for faster egress lookup.
>>>>   bridge: Add vlan to unicast fdb entries
>>>>   bridge: Add vlan id to multicast groups
>>>>   bridge: Add netlink interface to configure vlans on bridge ports
>>>>   bridge: Add vlan support to static neighbors
>>>>   bridge: Add the ability to configure untagged vlans
>>>>   bridge: Implement untagged vlan handling
>>>>   bridge: Dump vlan information from a bridge port
>>>>   bridge: Add vlan support for local fdb entries
>>>>
>>>> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    5 +-
>>>> drivers/net/macvlan.c                         |    2 +-
>>>> drivers/net/vxlan.c                           |    3 +-
>>>> include/linux/netdevice.h                     |    4 +-
>>>> include/uapi/linux/if_bridge.h                |   23 ++-
>>>> include/uapi/linux/neighbour.h                |    1 +
>>>> include/uapi/linux/rtnetlink.h                |    1 +
>>>> net/bridge/br_device.c                        |   34 ++-
>>>> net/bridge/br_fdb.c                           |  253 ++++++++++++---
>>>> net/bridge/br_forward.c                       |  160 ++++++++++
>>>> net/bridge/br_if.c                            |  404 ++++++++++++++++++++++++-
>>>> net/bridge/br_input.c                         |   65 ++++-
>>>> net/bridge/br_multicast.c                     |   71 +++--
>>>> net/bridge/br_netlink.c                       |  178 ++++++++++--
>>>> net/bridge/br_private.h                       |   71 ++++-
>>>> net/core/rtnetlink.c                          |   40 ++-
>>>> 16 files changed, 1190 insertions(+), 125 deletions(-)
>>>>
>>>> --
>>>> 1.7.7.6
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>

^ permalink raw reply

* Re: [PATCH] xen/netfront: improve truesize tracking
From: Eric Dumazet @ 2012-12-19 16:17 UTC (permalink / raw)
  To: Sander Eikelenboom
  Cc: Ian Campbell, netdev@vger.kernel.org, Konrad Rzeszutek Wilk,
	annie li, xen-devel@lists.xensource.com
In-Reply-To: <55633610.20121219123427@eikelenboom.it>

On Wed, 2012-12-19 at 12:34 +0100, Sander Eikelenboom wrote:

> Hi Ian,
> 
> It ran overnight and i haven't seen the warn_once trigger.
> (but i also didn't with the previous patch)
> 

As I said, the minimum value needed to avoid triggering the warning was
what Ian's patch was doing, but it was still not an accurate estimate.

Doing the real accounting might trigger slow transfers, or dropped
packets because of socket limits (SNDBUF / RCVBUF) being hit sooner.

So the real question was: when accounting for full pages, do your
applications run as smoothly as before, with no huge performance
regression?

^ permalink raw reply

* Re: [RFC PATCH] fix IP_ECN_set_ce
From: Eric Dumazet @ 2012-12-19 16:14 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev
In-Reply-To: <1355898095-7444-1-git-send-email-roy.qing.li@gmail.com>

On Wed, 2012-12-19 at 14:21 +0800, roy.qing.li@gmail.com wrote:
> From: Li RongQing <roy.qing.li@gmail.com>
> 
> 1. ECN uses the two least significant (right-most) bits of the DiffServ
> field in the IPv4 header, so it should be in iph->tos, not in (iph->tos+1)
> 
> 2. When setting CE, we should check whether ECN Capable Transport is supported;
> both 10 and 01 mean ECN Capable Transport, so checking only 10 is not enough
>     00: Non ECN-Capable Transport — Non-ECT
>     10: ECN Capable Transport — ECT(0)
>     01: ECN Capable Transport — ECT(1)
>     11: Congestion Encountered — CE
> 
> 3. Remove the misleading comment
> 
> 4. Fix the checksum computation
> 
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

This is total crap.

It's perfectly clear to me, and the compiler generates fast code.

If you don't understand this code, please don't touch it.

^ permalink raw reply
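
For readers following along, the (tos + 1) form that the proposed patch
objects to can be tried out in userspace.  The sketch below is a
reconstruction of the idea only, with a hypothetical toy_ecn_set_ce()
helper operating on a bare TOS byte; it is not a copy of the kernel
function and it skips the header checksum update entirely.  Adding one
to the two ECN bits maps Not-ECT (00) to 01, ECT(1) (01) to 10,
ECT(0) (10) to 11 and CE (11) to 00, so a single test of bit 1 separates
both ECT codepoints from the other two.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ECN_MASK 0x3	/* two low-order bits of the TOS byte */

/*
 * Try to mark a packet as Congestion Encountered.
 * Returns 1 if CE is set on return, 0 if the packet is Not-ECT
 * (in which case the TOS byte is left untouched).
 */
static int toy_ecn_set_ce(uint8_t *tos)
{
	uint8_t ecn = (uint8_t)((*tos + 1) & TOY_ECN_MASK);

	/* After the increment: 01 = Not-ECT, 10 = ECT(1), 11 = ECT(0), 00 = CE */
	if (!(ecn & 2))
		return !ecn;	/* CE is already set (1), Not-ECT is refused (0) */

	*tos |= TOY_ECN_MASK;	/* either ECT codepoint becomes CE */
	return 1;
}
```

In this model both ECT(0) and ECT(1) are accepted, and the DSCP bits in
the upper part of the byte are preserved.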

* Re: [PATCH 4/4] FEC: Add time stamping code and a PTP hardware clock
From: Ben Hutchings @ 2012-12-19 15:53 UTC (permalink / raw)
  To: Frank Li, Sascha Hauer
  Cc: Richard Cochran, Shawn Guo, Frank Li, lznua, linux-arm-kernel,
	netdev, davem
In-Reply-To: <20121218070420.GA2946@netboy.at.omicron.at>

On Tue, Dec 18, 2012 at 08:04:20AM +0100, Richard Cochran wrote:
> On Mon, Dec 17, 2012 at 09:02:32PM +0100, Sascha Hauer wrote:
> > This leaves an option in the tree which can be used to break FEC on
> > i.MX3/5.
> > 
> > 	depends on !SOC_IMX31 && !SOC_IMX35 && !SOC_IMX5
> > 
> > might be an option, but given that this patch seems to have bypassed any
> > review I feel more like reverting it.
> 
> Instead of reverting, I suggest finding a solution (Frank) to let the
> code work when it can work and to prevent it when it cannot. This
> could be kconfig, DT, or run time probing of silicon revisions, but I
> don't have access to this hardware, and so I can't really say how to
> fix it.
[...]

Please implement run-time probing.  A different configuration for
each SoC is just not sustainable for distributions.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus

^ permalink raw reply

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jan Engelhardt @ 2012-12-19 15:52 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Hasan Chowdhury, Stephen Hemminger, Yury Stankevich,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <50D1AB7E.5060000@mojatatu.com>


On Wednesday 2012-12-19 12:56, Jamal Hadi Salim wrote:
>
> To be applied pending more testing.
>
> Attached. Sorry, I thought I had sent this out over the weekend.
> I have done basic testing with a single mark and sending pings to
> update stats which can then be displayed for the mark.
>
> diffstat xt-p1
> Kconfig  |   15 ++
> Makefile |    1 
> act_xt.c |  324 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 339 insertions(+), 1 deletion(-)

Humm... that's a huge patch for what seems to be equal to act_ipt.c
Let's do a cross-diff:

--- act_ipt.c	2012-10-25 19:49:25.372191795 +0200
+++ act_xt.c	2012-12-19 16:48:22.052419730 +0100
@@ -2 +2 @@
- * net/sched/ipt.c     iptables target interface
+ * net/sched/act_xt.c     iptables target interface
@@ -11 +11 @@
- * Copyright:	Jamal Hadi Salim (2002-4)
+ * Copyright:	Jamal Hadi Salim (2002-12)
@@ -30 +29,0 @@
-
@@ -42 +41,2 @@ static struct tcf_hashinfo ipt_hash_info
-static int ipt_init_target(struct xt_entry_target *t, char *table, unsigned int hook)
+static int ipt_init_target(struct xt_entry_target *t, char *table,
+			   unsigned int hook)
@@ -243,2 +243,2 @@ static int tcf_ipt(struct sk_buff *skb,
-		net_notice_ratelimited("tc filter: Bogus netfilter code %d assume ACCEPT\n",
-				       ret);
+		net_notice_ratelimited
+		    ("tc filter: Bogus netfilter code %d assume ACCEPT\n", ret);
@@ -253 +253,2 @@ static int tcf_ipt(struct sk_buff *skb,
-static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref)
+static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind,
+			int ref)
@@ -295 +296 @@ static struct tc_action_ops act_ipt_ops
-	.kind		=	"ipt",
+	.kind = "xt",
@@ -308,2 +309,2 @@ static struct tc_action_ops act_ipt_ops
-MODULE_AUTHOR("Jamal Hadi Salim(2002-4)");
-MODULE_DESCRIPTION("Iptables target actions");
+MODULE_AUTHOR("Jamal Hadi Salim(2002-12)");
+MODULE_DESCRIPTION("New Iptables target actions");


Is that [the set of hunks] all? Then I would instead suggest
something like:


diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c
index 58fb3c7..f92a007 100644
--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -305,18 +305,43 @@ static struct tc_action_ops act_ipt_ops = {
 	.walk		=	tcf_generic_walker
 };
 
+static struct tc_action_ops act_xt_ops = {
+	.kind		=	"xt",
+	.hinfo		=	&ipt_hash_info,
+	.type		=	TCA_ACT_IPT,
+	.capab		=	TCA_CAP_NONE,
+	.owner		=	THIS_MODULE,
+	.act		=	tcf_ipt,
+	.dump		=	tcf_ipt_dump,
+	.cleanup	=	tcf_ipt_cleanup,
+	.lookup		=	tcf_hash_search,
+	.init		=	tcf_ipt_init,
+	.walk		=	tcf_generic_walker
+};
+
 MODULE_AUTHOR("Jamal Hadi Salim(2002-4)");
 MODULE_DESCRIPTION("Iptables target actions");
 MODULE_LICENSE("GPL");
+MODULE_ALIAS("act_xt");
 
 static int __init ipt_init_module(void)
 {
-	return tcf_register_action(&act_ipt_ops);
+	int ret;
+	ret = tcf_register_action(&act_ipt_ops);
+	if (ret < 0)
+		return ret;
+	ret = tcf_register_action(&act_xt_ops);
+	if (ret < 0) {
+		tcf_unregister_action(&act_ipt_ops);
+		return ret;
+	}
+	return 0;
 }
 
 static void __exit ipt_cleanup_module(void)
 {
 	tcf_unregister_action(&act_ipt_ops);
+	tcf_unregister_action(&act_xt_ops);
 }
 
 module_init(ipt_init_module);

^ permalink raw reply related

* Re: [PATCH V2 09/12] bridge: Add the ability to configure untagged vlans
From: Vlad Yasevich @ 2012-12-19 14:50 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, shemminger, davem, or.gerlitz, jhs
In-Reply-To: <20121218231049.GD1135@redhat.com>

On 12/18/2012 06:10 PM, Michael S. Tsirkin wrote:
> On Tue, Dec 18, 2012 at 06:03:25PM -0500, Vlad Yasevich wrote:
>> On 12/18/2012 06:01 PM, Michael S. Tsirkin wrote:
>>> On Tue, Dec 18, 2012 at 02:01:00PM -0500, Vlad Yasevich wrote:
>>>> A user may designate a certain vlan as untagged.  This means that
>>>> any ingress frame is assigned to this vlan and any forwarding decisions
>>>> are made with this vlan in mind.  On egress, any frames tagged/labeled
>>>> with untagged vlan have the vlan tag removed and are sent as regular
>>>> ethernet frames.
>>>>
>>>> Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
>>>> ---
>>>>   include/uapi/linux/if_bridge.h |    3 +
>>>>   net/bridge/br_if.c             |  146 +++++++++++++++++++++++++++++++++++++---
>>>>   net/bridge/br_netlink.c        |    6 +-
>>>>   net/bridge/br_private.h        |    2 +
>>>>   4 files changed, 144 insertions(+), 13 deletions(-)
>>>>
>>>> diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
>>>> index d0b4f5c..988d858 100644
>>>> --- a/include/uapi/linux/if_bridge.h
>>>> +++ b/include/uapi/linux/if_bridge.h
>>>> @@ -127,6 +127,9 @@ enum {
>>>>   	BR_VLAN_DEL,
>>>>   };
>>>>
>>>> +#define BRIDGE_VLAN_INFO_MASTER		1
>>>> +#define BRIDGE_VLAN_INFO_UNTAGGED	2
>>>> +
>>>>   struct bridge_vlan_info {
>>>>   	u16 op_code;
>>>>   	u16 flags;
>>>> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
>>>> index 57bbb35..14563fb 100644
>>>> --- a/net/bridge/br_if.c
>>>> +++ b/net/bridge/br_if.c
>>>> @@ -108,6 +108,34 @@ static void br_vlan_put(struct net_bridge_vlan *vlan)
>>>>   		br_vlan_destroy(vlan);
>>>>   }
>>>>
>>>> +/* Must be protected by RTNL */
>>>> +static void br_vlan_add_untagged(struct net_bridge *br,
>>>> +				 struct net_bridge_vlan *vlan)
>>>> +{
>>>> +	ASSERT_RTNL();
>>>> +	if (br->untagged == vlan)
>>>> +		return;
>>>> +	else if (br->untagged) {
>>>> +		/* Untagged vlan is already set on the master,
>>>> +		 * so drop the ref since we'll be replacing it.
>>>> +		 */
>>>> +		br_vlan_put(br->untagged);
>>>> +	}
>>>> +	br_vlan_hold(vlan);
>>>> +	rcu_assign_pointer(br->untagged, vlan);
>>>
>>> Is there a reason for rcu here but not elsewhere? If all users are under
>>> rtnl you can just assign in a simple way.
>>> If not then rcu_dereference_protected would be more appropriate.
>>
>> Everywhere that the pointer changes rcu_assign_pointer is used.
>>
>> Now, if we hold an RTNL, we can technically read the pointer with
>> rcu since it's guaranteed not to change since it only changes under
>> RTNL.
>> I'll check that this is consistent.
>
> Check what rcu_dereference_protected does. It's really just
> an explicit way to say "this is accessed without rcu because I have
> this lock".

Looks like the helper rtnl_dereference() already does what I need.  I'll 
use that.

Thanks
-vlad
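
The reference handling around the untagged pointer in the quoted patch
can be modelled without any RCU machinery.  The sketch below is a
single-threaded userspace approximation, on the assumption that all
writers are serialised (the role RTNL plays in the patch); the toy_vlan
and toy_bridge types and the toy_* helpers are hypothetical names.
Replacing the untagged vlan drops the reference held on the old one and
takes a reference on the new one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for net_bridge_vlan and net_bridge. */
struct toy_vlan { int refcnt; };
struct toy_bridge { struct toy_vlan *untagged; };

/* Make `vlan` the untagged vlan, swapping the held reference. */
static void toy_set_untagged(struct toy_bridge *br, struct toy_vlan *vlan)
{
	if (br->untagged == vlan)
		return;			/* already set, no ref change */
	if (br->untagged)
		br->untagged->refcnt--;	/* drop the ref on the old vlan */
	vlan->refcnt++;			/* hold the new one */
	br->untagged = vlan;
}

/* Clear the untagged vlan, but only if it is `vlan`. */
static void toy_clear_untagged(struct toy_bridge *br, struct toy_vlan *vlan)
{
	if (br->untagged == vlan) {
		vlan->refcnt--;
		br->untagged = NULL;
	}
}
```

Readers outside the serialising lock are not modelled here; in the
kernel they would go through rcu_dereference() inside an RCU read-side
critical section, as discussed in the thread.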

>
>> If I access the pointer without rtnl, it's always inside rcu
>> critical section and with rcu_dereference().
>>
>> I thought those were the basic rules of rcu.  Did that change?
>>
>> -vlad
>
>
>
>>>
>>>> +}
>>>> +
>>>> +/* Must be protected by RTNL */
>>>> +static void br_vlan_del_untagged(struct net_bridge *br,
>>>> +				 struct net_bridge_vlan *vlan)
>>>> +{
>>>> +	ASSERT_RTNL();
>>>> +	if (br->untagged == vlan) {
>>>> +		br_vlan_put(vlan);
>>>> +		rcu_assign_pointer(br->untagged, NULL);
>>>> +	}
>>>> +}
>>>> +
>>>>   struct net_bridge_vlan *br_vlan_find(struct net_bridge *br, u16 vid)
>>>>   {
>>>>   	struct net_bridge_vlan *vlan;
>>>> @@ -132,7 +160,7 @@ struct net_bridge_vlan *br_vlan_add(struct net_bridge *br, u16 vid,
>>>>
>>>>   	vlan = br_vlan_find(br, vid);
>>>>   	if (vlan)
>>>> -		return vlan;
>>>> +		goto untagged;
>>>>
>>>>   	vlan = kzalloc(sizeof(struct net_bridge_vlan), GFP_KERNEL);
>>>>   	if (!vlan)
>>>> @@ -141,7 +169,7 @@ struct net_bridge_vlan *br_vlan_add(struct net_bridge *br, u16 vid,
>>>>   	vlan->vid = vid;
>>>>   	atomic_set(&vlan->refcnt, 1);
>>>>
>>>> -	if (flags & BRIDGE_FLAGS_SELF) {
>>>> +	if (flags & BRIDGE_VLAN_INFO_MASTER) {
>>>>   		/* Set bit 0 that is associated with the bridge master
>>>>   		 * device.  Port numbers start with 1.
>>>>   		 */
>>>> @@ -149,15 +177,24 @@ struct net_bridge_vlan *br_vlan_add(struct net_bridge *br, u16 vid,
>>>>   	}
>>>>
>>>>   	hlist_add_head_rcu(&vlan->hlist, &br->vlan_hlist[br_vlan_hash(vid)]);
>>>> +
>>>> +untagged:
>>>> +	if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
>>>> +		br_vlan_add_untagged(br, vlan);
>>>> +
>>>>   	return vlan;
>>>>   }
>>>>
>>>>   /* Must be protected by RTNL */
>>>> -static void br_vlan_del(struct net_bridge_vlan *vlan, u16 flags)
>>>> +static void br_vlan_del(struct net_bridge *br, struct net_bridge_vlan *vlan,
>>>> +			u16 flags)
>>>>   {
>>>>   	ASSERT_RTNL();
>>>>
>>>> -	if (flags & BRIDGE_FLAGS_SELF) {
>>>> +	if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
>>>> +		br_vlan_del_untagged(br, vlan);
>>>> +
>>>> +	if (flags & BRIDGE_VLAN_INFO_MASTER) {
>>>>   		/* Clear bit 0 that is associated with the bridge master
>>>>   		 * device.
>>>>   		 */
>>>> @@ -172,6 +209,14 @@ static void br_vlan_del(struct net_bridge_vlan *vlan, u16 flags)
>>>>
>>>>   	vlan->vid = BR_INVALID_VID;
>>>>
>>>> +	/* If, for whatever reason, bridge still has a ref on this vlan
>>>> +	 * through the @untagged pointer, drop that ref and clear untagged.
>>>> +	 */
>>>> +	if (br->untagged == vlan) {
>>>> +		br_vlan_put(vlan);
>>>> +		rcu_assign_pointer(br->untagged, NULL);
>>>> +	}
>>>> +
>>>>   	/* Drop the self-ref to trigger descrution. */
>>>>   	br_vlan_put(vlan);
>>>>   }
>>>> @@ -187,7 +232,7 @@ int br_vlan_delete(struct net_bridge *br, u16 vid, u16 flags)
>>>>   	if (!vlan)
>>>>   		return -ENOENT;
>>>>
>>>> -	br_vlan_del(vlan, flags);
>>>> +	br_vlan_del(br, vlan, flags);
>>>>   	return 0;
>>>>   }
>>>>
>>>> @@ -204,7 +249,9 @@ static void br_vlan_flush(struct net_bridge *br)
>>>>   	for (i = 0; i < BR_VID_HASH_SIZE; i++) {
>>>>   		hlist_for_each_entry_safe(vlan, node, tmp,
>>>>   					  &br->vlan_hlist[i], hlist) {
>>>> -			br_vlan_del(vlan, BRIDGE_FLAGS_SELF);
>>>> +			br_vlan_del(br, vlan,
>>>> +				    (BRIDGE_VLAN_INFO_MASTER |
>>>> +				     BRIDGE_VLAN_INFO_UNTAGGED));
>>>>   		}
>>>>   	}
>>>>   }
>>>> @@ -224,10 +271,70 @@ struct net_port_vlan *nbp_vlan_find(const struct net_bridge_port *p, u16 vid)
>>>>   	return NULL;
>>>>   }
>>>>
>>>> +static int nbp_vlan_add_untagged(struct net_bridge_port *p,
>>>> +			  struct net_bridge_vlan *vlan,
>>>> +			  u16 flags)
>>>> +{
>>>> +	struct net_device *dev = p->dev;
>>>> +
>>>> +	if (p->untagged) {
>>>> +		/* Port already has untagged vlan set.  Drop the ref
>>>> +		 * to the old one since we'll be replacing it.
>>>> +		 */
>>>> +		br_vlan_put(p->untagged);
>>>> +	} else {
>>>> +		int err;
>>>> +
>>>> +		/* Add vid 0 to filter if filter is available. */
>>>> +		if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
>>>> +		    dev->netdev_ops->ndo_vlan_rx_add_vid &&
>>>> +		    dev->netdev_ops->ndo_vlan_rx_kill_vid) {
>>>> +			err = dev->netdev_ops->ndo_vlan_rx_add_vid(dev, 0);
>>>> +			if (err)
>>>> +				return err;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	/* This VLAN is handled as untagged/native. Save an
>>>> +	 * additional ref.
>>>> +	 */
>>>> +	br_vlan_hold(vlan);
>>>> +	rcu_assign_pointer(p->untagged, vlan);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void nbp_vlan_delete_untagged(struct net_bridge_port *p,
>>>> +				     struct net_bridge_vlan *vlan)
>>>> +{
>>>> +	if (p->untagged != vlan)
>>>> +		return;
>>>> +
>>>> +	/* Remove VLAN from the device filter if it is supported. */
>>>> +	if ((p->dev->features & NETIF_F_HW_VLAN_FILTER) &&
>>>> +	    p->dev->netdev_ops->ndo_vlan_rx_kill_vid) {
>>>> +		int err;
>>>> +
>>>> +		err = p->dev->netdev_ops->ndo_vlan_rx_kill_vid(p->dev, 0);
>>>> +		if (err) {
>>>> +			pr_warn("failed to kill vid %d for device %s\n",
>>>> +				vlan->vid, p->dev->name);
>>>> +		}
>>>> +	}
>>>> +
>>>> +	/* If this VLAN is currently functioning as untagged, clear it.
>>>> +	 * It's safe to drop the refcount, since the vlan is still held
>>>> +	 * by the port.
>>>> +	 */
>>>> +	br_vlan_put(vlan);
>>>> +	rcu_assign_pointer(p->untagged, NULL);
>>>> +
>>>> +}
>>>> +
>>>>   /* Must be protected by RTNL */
>>>>   int nbp_vlan_add(struct net_bridge_port *p, u16 vid, u16 flags)
>>>>   {
>>>> -	struct net_port_vlan *pve;
>>>> +	struct net_port_vlan *pve = NULL;
>>>>   	struct net_bridge_vlan *vlan;
>>>>   	struct net_device *dev = p->dev;
>>>>   	int err;
>>>> @@ -275,11 +382,21 @@ int nbp_vlan_add(struct net_bridge_port *p, u16 vid, u16 flags)
>>>>   	set_bit(p->port_no, vlan->port_bitmap);
>>>>
>>>>   	list_add_tail_rcu(&pve->list, &p->vlan_list);
>>>> +
>>>> +	if (flags & BRIDGE_VLAN_INFO_UNTAGGED) {
>>>> +		err = nbp_vlan_add_untagged(p, vlan, flags);
>>>> +		if (err)
>>>> +			goto del_vlan;
>>>> +	}
>>>> +
>>>>   	return 0;
>>>>
>>>>   clean_up:
>>>>   	kfree(pve);
>>>> -	br_vlan_del(vlan, flags);
>>>> +	br_vlan_del(p->br, vlan, flags);
>>>> +	return err;
>>>> +del_vlan:
>>>> +	nbp_vlan_delete(p, vid, flags);
>>>>   	return err;
>>>>   }
>>>>
>>>> @@ -296,6 +413,9 @@ int nbp_vlan_delete(struct net_bridge_port *p, u16 vid, u16 flags)
>>>>   	if (!pve)
>>>>   		return -ENOENT;
>>>>
>>>> +	if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
>>>> +		nbp_vlan_delete_untagged(p, pve->vlan);
>>>> +
>>>>   	/* Remove VLAN from the device filter if it is supported. */
>>>>   	if ((dev->features & NETIF_F_HW_VLAN_FILTER) &&
>>>>   	    dev->netdev_ops->ndo_vlan_rx_kill_vid) {
>>>> @@ -306,6 +426,7 @@ int nbp_vlan_delete(struct net_bridge_port *p, u16 vid, u16 flags)
>>>>   			pr_warn("failed to kill vid %d for device %s\n",
>>>>   				vid, dev->name);
>>>>   	}
>>>> +
>>>>   	pve->vid = BR_INVALID_VID;
>>>>
>>>>   	vlan = pve->vlan;
>>>> @@ -316,7 +437,7 @@ int nbp_vlan_delete(struct net_bridge_port *p, u16 vid, u16 flags)
>>>>   	list_del_rcu(&pve->list);
>>>>   	kfree_rcu(pve, rcu);
>>>>
>>>> -	br_vlan_del(vlan, flags);
>>>> +	br_vlan_del(p->br, vlan, flags);
>>>>
>>>>   	return 0;
>>>>   }
>>>> @@ -328,8 +449,11 @@ static void nbp_vlan_flush(struct net_bridge_port *p)
>>>>
>>>>   	ASSERT_RTNL();
>>>>
>>>> -	list_for_each_entry_safe(pve, tmp, &p->vlan_list, list)
>>>> -		nbp_vlan_delete(p, pve->vid, BRIDGE_FLAGS_SELF);
>>>> +	list_for_each_entry_safe(pve, tmp, &p->vlan_list, list)  {
>>>> +		nbp_vlan_delete(p, pve->vid,
>>>> +				(BRIDGE_VLAN_INFO_MASTER |
>>>> +				 BRIDGE_VLAN_INFO_UNTAGGED));
>>>> +	}
>>>>   }
>>>>
>>>>   static void release_nbp(struct kobject *kobj)
>>>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>>>> index 9cf2879..1b302ce 100644
>>>> --- a/net/bridge/br_netlink.c
>>>> +++ b/net/bridge/br_netlink.c
>>>> @@ -199,7 +199,8 @@ static int br_afspec(struct net_bridge *br, struct net_bridge_port *p,
>>>>   			if (p)
>>>>   				err = nbp_vlan_add(p, vinfo->vid, vinfo->flags);
>>>>   			else {
>>>> -				u16 flags = vinfo->flags | BRIDGE_FLAGS_SELF;
>>>> +				u16 flags = vinfo->flags |
>>>> +					    BRIDGE_VLAN_INFO_MASTER;
>>>>   				if (!br_vlan_add(br, vinfo->vid, flags))
>>>>   					err = -ENOMEM;
>>>>   			}
>>>> @@ -210,7 +211,8 @@ static int br_afspec(struct net_bridge *br, struct net_bridge_port *p,
>>>>   				err = nbp_vlan_delete(p, vinfo->vid,
>>>>   						      vinfo->flags);
>>>>   			else {
>>>> -				u16 flags = vinfo->flags | BRIDGE_FLAGS_SELF;
>>>> +				u16 flags = vinfo->flags |
>>>> +					    BRIDGE_VLAN_INFO_MASTER;
>>>>   				err = br_vlan_delete(br, vinfo->vid, flags);
>>>>   			}
>>>>   			break;
>>>> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
>>>> index cc75212..9328463 100644
>>>> --- a/net/bridge/br_private.h
>>>> +++ b/net/bridge/br_private.h
>>>> @@ -179,6 +179,7 @@ struct net_bridge_port
>>>>   	struct netpoll			*np;
>>>>   #endif
>>>>   	struct list_head		vlan_list;
>>>> +	struct net_bridge_vlan __rcu	*untagged;
>>>>   };
>>>>
>>>>   #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
>>>> @@ -298,6 +299,7 @@ struct net_bridge
>>>>   	struct timer_list		gc_timer;
>>>>   	struct kobject			*ifobj;
>>>>   	struct hlist_head		vlan_hlist[BR_VID_HASH_SIZE];
>>>> +	struct net_bridge_vlan __rcu	*untagged;
>>>>   };
>>>>
>>>>   struct br_input_skb_cb {
>>>> --
>>>> 1.7.7.6


* Re: PMTU discovery is broken on kernel 3.7.1 for UDP sockets
From: Yurij M. Plotnikov @ 2012-12-19 14:27 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Alexandra N. Kossovsky
In-Reply-To: <1355924119.2676.6.camel@bwh-desktop.uk.solarflarecom.com>

On 12/19/12 17:35, Ben Hutchings wrote:
> On Wed, 2012-12-19 at 17:10 +0400, Yurij M. Plotnikov wrote:
>    
>> On kernel 3.7.1 I get strange behaviour of IP_MTU_DISCOVER socket
>> option. The behaviour in case of IP_PMTUDISC_DO and IP_PMTUDISC_WANT
>> values of IP_MTU_DISCOVER socket option on SOCK_DGRAM socket are the
>> same and packet is always sent with "Don't Fragment" bit in case of
>> IP_PMTUDISC_WANT. Also, the value of IP_MTU socket option is not updated.
>>      
> You could try reverting:
>
> commit ee9a8f7ab2edf801b8b514c310455c94acc232f6
> Author: Steffen Klassert <steffen.klassert@secunet.com>
> Date:   Mon Oct 8 00:56:54 2012 +0000
>
>      ipv4: Don't report stale pmtu values to userspace
>
>      We report cached pmtu values even if they are already expired.
>      Change this to not report these values after they are expired
>      and fix a race in the expire time calculation, as suggested by
>      Eric Dumazet.
>
> Still, PMTU information is not supposed to expire for 10 minutes...
>
>    
With that commit reverted, the problem does not occur on 3.7.1: IP_MTU is
updated, and with IP_PMTUDISC_WANT the DF bit is set only on the first
packet.
> [...]
>    
>> On eth2 on host_B and on eth1 on host_C change MTU from 1500 to 750.
>> Wait for a while.
>>
>> 9. send(6, length=1400) ->  1400 // the packet is sent with "Don't
>> Fragment" bit, tcpdump on eth1 on host_B shows it
>> 10. sleep(5);
>> 11. send(6, length=1400) ->  -1 with EMSGSIZE
>> 12. sleep(5);
>> 13. getsockopt(6,IP_MTU) ->  0 // Returns that MTU is 1500 once again. So
>> value is not updated.
>>      
> [...]
>
> What if you read this option immediately before the sleep(5)?
>    
It still returns that MTU is 1500.

Yurij.
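
For anyone trying to reproduce this, the socket-option sequence discussed
above can be sketched roughly as follows.  This is only a minimal sketch,
assuming Linux and the standard IP_MTU_DISCOVER and IP_MTU options; the
helper names (open_udp_pmtud, get_pmtud_mode, connect_udp, get_path_mtu)
are made up for illustration and this is not the test program from the
original report.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket and select a PMTU discovery
 * mode such as IP_PMTUDISC_WANT or IP_PMTUDISC_DO.
 * Returns the fd, or -1 on error.
 */
int open_udp_pmtud(int mode)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER,
		       &mode, sizeof(mode)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Read back the discovery mode currently set on the socket, or -1. */
int get_pmtud_mode(int fd)
{
	int mode = -1;
	socklen_t len = sizeof(mode);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, &len) < 0)
		return -1;
	return mode;
}

/* Connect the UDP socket to ip:port.  IP_MTU can only be queried on a
 * connected socket.  Returns 0 on success, -1 on error.
 */
int connect_udp(int fd, const char *ip, unsigned short port)
{
	struct sockaddr_in dst;

	memset(&dst, 0, sizeof(dst));
	dst.sin_family = AF_INET;
	dst.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
		return -1;
	return connect(fd, (struct sockaddr *)&dst, sizeof(dst));
}

/* Return the path MTU the kernel currently reports for this socket,
 * or -1 on error (for example if the socket is not connected).
 */
int get_path_mtu(int fd)
{
	int mtu = 0;
	socklen_t len = sizeof(mtu);

	if (getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		return -1;
	return mtu;
}
```

A reproduction along the lines of the steps quoted above would open the
socket with IP_PMTUDISC_WANT, connect it to the remote host, send a 1400
byte datagram, and then call get_path_mtu() after the send that fails
with EMSGSIZE to see whether the reported value has dropped below 1500.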


* [PATCH net] net: qmi_wwan: add ZTE MF880
From: Bjørn Mork @ 2012-12-19 14:15 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, Bjørn Mork

The driver description files give these names to the vendor specific
functions on this modem:

 diag: VID_19D2&PID_0284&MI_00
 nmea: VID_19D2&PID_0284&MI_01
 at:   VID_19D2&PID_0284&MI_02
 mdm:  VID_19D2&PID_0284&MI_03
 net:  VID_19D2&PID_0284&MI_04

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/net/usb/qmi_wwan.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 9b950f5..91d7cb9 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -433,6 +433,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x19d2, 0x0199, 1)},	/* ZTE MF820S */
 	{QMI_FIXED_INTF(0x19d2, 0x0200, 1)},
 	{QMI_FIXED_INTF(0x19d2, 0x0257, 3)},	/* ZTE MF821 */
+	{QMI_FIXED_INTF(0x19d2, 0x0284, 4)},	/* ZTE MF880 */
 	{QMI_FIXED_INTF(0x19d2, 0x0326, 4)},	/* ZTE MF821D */
 	{QMI_FIXED_INTF(0x19d2, 0x1008, 4)},	/* ZTE (Vodafone) K3570-Z */
 	{QMI_FIXED_INTF(0x19d2, 0x1010, 4)},	/* ZTE (Vodafone) K3571-Z */
-- 
1.7.10.4


* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 14:13 UTC (permalink / raw)
  To: Shmulik Ladkani; +Cc: netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <20121219101006.7086faef@pixies.home.jungo.com>

On 12/19/2012 03:10 AM, Shmulik Ladkani wrote:
> Thanks Vlad,
>
> On Tue, 18 Dec 2012 14:00:51 -0500 Vlad Yasevich <vyasevic@redhat.com> wrote:
>> A single vlan may also be designated as untagged.  Any untagged traffic
>> received by the port will be assigned to this vlan.
>
> Why is the "untagged vlan" per-bridge global?
> Usually, 802.1q switches define the PVID (port's VID) which controls
> the value of VID, in case ingress frame is either untagged or
> priority-tagged (per port configuration).
> This gives greater flexibility.

It's not.  There is a per-port untagged pointer where you can designate
which VLAN is untagged/native on a port.  The bridge interface itself
can also function as a port, so it gets its own untagged pointer and
can behave like a port.

>
>> Any traffic exiting
>> the port with a VID matching the untagged vlan will exit untagged (the
>> bridge will strip the vlan header).  This is similar to "Native Vlan" support
>> available in most switches.
>
> 802.1q switches usually allow configuring per-vlan, per-port
> tagged/untagged egress policy: each vid has its port membership map and
> an accompanying port egress-policy map.
> This gives great flexibility defining all sorts of configurations.

Right, and that's what's provided here.
  * Each VLAN has a port membership map (net_bridge_vlan.portgroup).
  * Each port has a list of the vlans configured on it
    (net_bridge_port.vlan_list).
  * Each port also has a single vlan that can be untagged
    (net_bridge_port.untagged).
  * The bridge also has a single untagged vlan (net_bridge.untagged).

The limitation (in switches as well) is that only a single VLAN
may be untagged on any one port.  If you have more than one, you don't know
which VLAN the untagged traffic belongs to.
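
To make the layout above concrete, here is a toy userspace model of it.
It is only a sketch: the struct and function names (toy_vlan, toy_port,
toy_vlan_add_port, toy_ingress_vid, toy_egress_tagged) are invented for
illustration and are not the kernel structures from the patch.  It mirrors
the idea of a per-VLAN port membership bitmap plus at most one untagged
VLAN per port.  It also assumes that untagged frames are dropped when the
port has no untagged VLAN, and it does not model priority-tagged (vid 0)
frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PORTS 32
#define TOY_NO_VID    0xFFFF	/* sentinel: frame is not accepted */

struct toy_vlan {
	uint16_t vid;
	uint32_t port_bitmap;	/* bit N set => port N is a member */
};

struct toy_port {
	unsigned int port_no;
	const struct toy_vlan *untagged;	/* at most one, may be NULL */
};

static bool toy_vlan_has_port(const struct toy_vlan *v, unsigned int port_no)
{
	assert(port_no < TOY_MAX_PORTS);
	return (v->port_bitmap >> port_no) & 1u;
}

/* Make port p a member of vlan v.  If untagged is true, v replaces
 * whatever untagged vlan the port had before.
 */
void toy_vlan_add_port(struct toy_vlan *v, struct toy_port *p, bool untagged)
{
	assert(p->port_no < TOY_MAX_PORTS);
	v->port_bitmap |= 1u << p->port_no;
	if (untagged)
		p->untagged = v;
}

/* Ingress: decide which vlan a frame received on port p belongs to.
 * Untagged frames are assigned to the port's untagged vlan; tagged
 * frames are accepted only if the port is a member of that vlan.
 * Returns the vid, or TOY_NO_VID if the frame is not accepted.
 */
uint16_t toy_ingress_vid(const struct toy_port *p,
			 const struct toy_vlan *vlans, size_t nvlans,
			 bool tagged, uint16_t frame_vid)
{
	size_t i;

	if (!tagged)
		return p->untagged ? p->untagged->vid : TOY_NO_VID;

	for (i = 0; i < nvlans; i++) {
		if (vlans[i].vid == frame_vid &&
		    toy_vlan_has_port(&vlans[i], p->port_no))
			return frame_vid;
	}
	return TOY_NO_VID;
}

/* Egress: true if a frame in vlan vid should leave port p with its
 * vlan header, false if the header should be stripped.
 */
bool toy_egress_tagged(const struct toy_port *p, uint16_t vid)
{
	return !(p->untagged && p->untagged->vid == vid);
}
```

With a port that is a member of vlans 10 and 20 and has vlan 10 marked
untagged, untagged ingress frames are classified into vlan 10, frames
tagged with 20 are accepted as-is, and only vlan 10 traffic leaves the
port without a tag.  Allowing a second untagged vlan on the same port
would make the first branch of toy_ingress_vid() ambiguous, which is the
limitation described above.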

>
> Personally, I'd prefer a fully flexible vlan bridge allowing all sorts
> of configurations (as available in 802.1q switches).
>
> What's the reason limiting such configurations?

So, what do you see that's missing?

-vlad

>
> Regards,
> Shmulik

