Netdev List

* RE: [PATCH net-next] mlx4: use dev_kfree_skb() instead of dev_kfree_skb_any()
From: Eric Dumazet @ 2012-09-19 12:12 UTC (permalink / raw)
  To: Yevgeny Petrilin; +Cc: David Miller, netdev, Or Gerlitz, Ying Cai
In-Reply-To: <953B660C027164448AE903364AC447D28721B46E@MTLDAG01.mtl.com>

On Wed, 2012-09-19 at 07:58 +0000, Yevgeny Petrilin wrote:
> > 
> > Since commit e22979d96a5 (mlx4_en: Moving to Interrupts for TX
> > completions), we no longer can free TX skb from hard IRQ, but only from
> > normal softirq or process context.
> > 
> > Therefore, we can directly call dev_kfree_skb() from
> > mlx4_en_free_tx_desc() like other conventional NAPI drivers.
> > 
> 
> Hi Eric,
> At the moment the TX completion processing is done from IRQ context.
> So I think we need to change the driver to work with NAPI for TX completions
> before making this change.
> 
> I'll send the patch in a few days.

Oops you're right, it seems I misread e22979d96 commit.

irq term is a bit generic, you might add soft/hard qualifiers to
distinguish the variant.

Thanks


* Regarding ethernet directory between IP and SoC chip vendor.
From: byungho an @ 2012-09-19 12:39 UTC (permalink / raw)
  To: netdev
  Cc: davem, peppe.cavallaro, deepak.silki, francesco.virlinzi,
	jeffrey.t.kirsher, eilong, alexander.h.duyck, bhutchings,
	linville, wey-ty.w.guy, coelho, e.wahlig, aditya.ps, ihlee215

Hi all,

I have one suggestion for the ethernet directory.
Currently it is well-defined and easy to manage.

But if the IP vendor is different from the SoC vendor, it is a bit confusing
to guess the directory name.
For example, stmmac uses the Synopsys dwmac IP.
In this case, if other SoC vendors want to use the Synopsys IP, should
they make their own directory under their own name, even though the IP is the same?

If there were a common directory per IP vendor, it would be clearer.
Then other SoC vendors that use the IP could create their own
directories and drivers intuitively.

What do you think about it?
I would like to exchange opinions and find a reasonable and rational way.

Thank you.
Andy


* Re: [PATCH 1/2] Added information about which firmware file is being requested.
From: Julian Calaby @ 2012-09-19 12:39 UTC (permalink / raw)
  To: Jarl Friis
  Cc: Stefano Brivio, Gábor Stefanik,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	b43-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	netdev-u79uwXL29TY76Z2rM5mHXA, John W. Linville
In-Reply-To: <1348053493-22955-1-git-send-email-jarl-bE7lSbLpGj1/SzgSGea1oA@public.gmane.org>

Hi Jarl,

You should really mention which driver these are for in the subject line, say:

[PATCH] b43: Added information about which firmware file is being requested

On Wed, Sep 19, 2012 at 9:18 PM, Jarl Friis <jarl-bE7lSbLpGj1/SzgSGea1oA@public.gmane.org> wrote:
> This is informative information to provide about which actual firmware
> file is being used.

Also, this patch is so small and obvious that you could arguably get
away with something like:

Subject: [PATCH] b43: Log firmware filename

Log the name of the firmware file requested.


Or something along those lines.

Thanks,

-- 
Julian Calaby

Email: julian.calaby-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
Profile: http://www.google.com/profiles/julian.calaby/
.Plan: http://sites.google.com/site/juliancalaby/


* [RFC PATCHv2 bridge 0/7] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich

This series of patches provides the ability to add VLAN IDs to bridge
ports.  This is similar to what can be found in most switches.  A bridge
port may have any number of VLANs added to it, including vlan 0 for untagged
traffic.  When vlans are added to a port, only traffic tagged with one of those
vlans will be forwarded over the port.  Additionally, vlan ids are added to FDB
entries and become part of the lookup.  This way we correctly identify the FDB
entry.

The default behavior of the bridge is unchanged if no vlans have been
configured.

Changes since v1:
 - Addressed comments regarding formatting and RCU usage.
 - ioctls have been removed and replaced with a netlink interface.
 - Added support for user-added ndb entries.
 - Changed the sysfs interface to export a bitmap.  Also added a write interface.
   I am not sure how much I like it, but it made my testing easier/faster.  I
   might change the write interface to take text instead of binary.

Vlad Yasevich (7):
  bridge: Add vlan check to forwarding path
  bridge: Add vlan to unicast fdb entries
  bridge: Add vlan id to multicast groups
  bridge: Add netlink interface to configure vlans on bridge ports
  bridge: Add vlan support to static neighbors
  bridge: Add sysfs interface to display VLANS
  bridge:  Add the ability to show dump the vlan map from a bridge port

 include/linux/if_bridge.h |    3 +-
 include/linux/if_link.h   |   23 ++++++
 include/linux/neighbour.h |    2 +-
 net/bridge/br_device.c    |    2 +-
 net/bridge/br_fdb.c       |   77 +++++++++++--------
 net/bridge/br_forward.c   |   15 ++++-
 net/bridge/br_if.c        |   74 +++++++++++++++++
 net/bridge/br_input.c     |   19 ++++-
 net/bridge/br_multicast.c |   64 +++++++++++-----
 net/bridge/br_netlink.c   |  190 ++++++++++++++++++++++++++++++++++++++------
 net/bridge/br_private.h   |   29 +++++++-
 net/bridge/br_sysfs_if.c  |   70 +++++++++++++++++
 12 files changed, 481 insertions(+), 87 deletions(-)

-- 
1.7.7.6
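
The per-port VLAN map described above can be illustrated with a minimal
user-space sketch.  It is not code from the series: struct port,
port_add_vlan() and port_allows() are hypothetical names, plain calloc()
stands in for the kernel allocator, and there is no RCU.  It assumes the
"vid + 1" indexing used by the patches and treats a port without a map as
unfiltered, which is the default behavior the cover letter promises.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define VLAN_N_VID    4096                      /* VLAN IDs 0..4095 */
#define MAP_BITS      (VLAN_N_VID + 1)          /* index is vid + 1, bit 0 unused */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS     ((MAP_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for a bridge port; map == NULL means "no filtering". */
struct port {
	unsigned long *map;
};

/* The map stores VLAN n in bit n + 1. */
static size_t map_index(unsigned vid)
{
	return (size_t)vid + 1;
}

/* Allow a VLAN on the port, allocating the map on first use.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int port_add_vlan(struct port *p, unsigned vid)
{
	size_t bit = map_index(vid);

	assert(vid < VLAN_N_VID);
	if (!p->map) {
		/* Size in words times bytes per word, not a bare word count. */
		p->map = calloc(MAP_WORDS, sizeof(unsigned long));
		if (!p->map)
			return -1;
	}
	p->map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
	return 0;
}

/* A port with no map accepts everything; otherwise the bit must be set. */
static bool port_allows(const struct port *p, unsigned vid)
{
	size_t bit = map_index(vid);

	if (!p->map)
		return true;
	return (p->map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL;
}
```

Once the first VLAN is added, the port stops accepting untagged traffic
(vlan 0) unless vlan 0 is added explicitly.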


* [RFC PATCHv2 bridge 1/7] bridge: Add vlan check to forwarding path
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

When forwarding packets make sure vlan matches any configured vlan for
the port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 net/bridge/br_forward.c |   15 ++++++++++++++-
 net/bridge/br_input.c   |   12 ++++++++++++
 net/bridge/br_private.h |   17 +++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletions(-)

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 02015a5..f917cb8 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -26,11 +26,24 @@ static int deliver_clone(const struct net_bridge_port *prev,
 			 void (*__packet_hook)(const struct net_bridge_port *p,
 					       struct sk_buff *skb));
 
-/* Don't forward packets to originating port or forwarding diasabled */
+/* check to see that the vlan is allowed to be forwarded on this interface */
+static inline int vlan_match(const struct net_bridge_port *p,
+			     const struct sk_buff *skb)
+{
+	unsigned long *vlan_map = rcu_dereference(p->vlan_map);
+	unsigned short vid = br_get_vlan(skb);
+
+	/* The map keeps the vlans off by 1 so adjust for that */
+	return !vlan_map || test_bit(br_vid(vid), vlan_map);
+}
+
+/* Don't forward packets to originating port or forwarding disabled.
+ */
 static inline int should_deliver(const struct net_bridge_port *p,
 				 const struct sk_buff *skb)
 {
 	return (((p->flags & BR_HAIRPIN_MODE) || skb->dev != p->dev) &&
+		vlan_match(p, skb) &&
 		p->state == BR_STATE_FORWARDING);
 }
 
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 76f15fd..44f352d 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -53,10 +53,22 @@ int br_handle_frame_finish(struct sk_buff *skb)
 	struct net_bridge_fdb_entry *dst;
 	struct net_bridge_mdb_entry *mdst;
 	struct sk_buff *skb2;
+	unsigned long *vlan_map;
+	u16 vid = 0;
 
 	if (!p || p->state == BR_STATE_DISABLED)
 		goto drop;
 
+	/* If VLAN filter is configured on the port, make sure we accept
+	 * only traffic matching the VLAN filter.
+	 */
+	vlan_map = rcu_dereference(p->vlan_map);
+	if (vlan_map) {
+		vid = br_get_vlan(skb);
+		if (!test_bit(br_vid(vid), vlan_map))
+			goto drop;
+	}
+
 	/* insert into forwarding database after filtering to avoid spoofing */
 	br = p->br;
 	br_fdb_update(br, p, eth_hdr(skb)->h_source);
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index f507d2a..baf1835 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -18,6 +18,7 @@
 #include <linux/netpoll.h>
 #include <linux/u64_stats_sync.h>
 #include <net/route.h>
+#include <linux/if_vlan.h>
 
 #define BR_HASH_BITS 8
 #define BR_HASH_SIZE (1 << BR_HASH_BITS)
@@ -152,10 +153,17 @@ struct net_bridge_port
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	struct netpoll			*np;
 #endif
+	/* VLAN map of all vlans allowed on this port.  Stored off by 1,
+	 * such that VLAN 0 (untagged) is stored in bit 1.
+	 */
+	unsigned long __rcu		*vlan_map;
 };
 
 #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
 
+/* Use this macro to get the correct VLAN id */
+#define br_vid(vid)		((vid) + 1)
+
 static inline struct net_bridge_port *br_port_get_rcu(const struct net_device *dev)
 {
 	struct net_bridge_port *port = rcu_dereference(dev->rx_handler_data);
@@ -168,6 +176,15 @@ static inline struct net_bridge_port *br_port_get_rtnl(struct net_device *dev)
 		rtnl_dereference(dev->rx_handler_data) : NULL;
 }
 
+static inline u16 br_get_vlan(const struct sk_buff *skb)
+{
+	u16 uninitialized_var(tag);
+
+	if (vlan_get_tag(skb, &tag))
+		return 0;
+	return tag & VLAN_VID_MASK;
+}
+
 struct br_cpu_netstats {
 	u64			rx_packets;
 	u64			rx_bytes;
-- 
1.7.7.6
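
The masking done by br_get_vlan() follows the 802.1Q tag layout: the 16-bit
TCI carries a 3-bit priority, one DEI/CFI bit and a 12-bit VLAN ID.  The
sketch below is a user-space illustration only; make_tci(), tci_to_vid() and
frame_vid() are hypothetical helpers, and the "tagged" flag stands in for the
result of vlan_get_tag().

```c
#include <assert.h>
#include <stdint.h>

#define VLAN_VID_MASK   0x0fffu  /* low 12 bits of the TCI hold the VLAN ID */
#define VLAN_PRIO_SHIFT 13       /* top 3 bits hold the priority */

/* Hypothetical helper: build a TCI from a priority and a VLAN ID. */
static uint16_t make_tci(unsigned prio, unsigned vid)
{
	assert(prio < 8 && vid <= VLAN_VID_MASK);
	return (uint16_t)((prio << VLAN_PRIO_SHIFT) | vid);
}

/* Strip the priority and DEI bits, leaving only the VLAN ID. */
static uint16_t tci_to_vid(uint16_t tci)
{
	return tci & VLAN_VID_MASK;
}

/* tagged == 0 models a frame with no VLAN header: it is treated as VID 0. */
static uint16_t frame_vid(int tagged, uint16_t tci)
{
	return tagged ? tci_to_vid(tci) : 0;
}
```

Because only the VID takes part in the map lookup, two frames on the same
vlan with different priorities hit the same bit.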


* [RFC PATCHv2 bridge 2/7] bridge: Add vlan to unicast fdb entries
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

This patch adds vlan to unicast fdb entries that are created for
learned addresses (not the manually configured ones).  It adds the
vlan id into the hash mix and uses vlan as an additional parameter
for an entry match.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 include/linux/if_bridge.h |    2 +-
 net/bridge/br_device.c    |    2 +-
 net/bridge/br_fdb.c       |   71 ++++++++++++++++++++++++++------------------
 net/bridge/br_input.c     |    7 ++--
 net/bridge/br_private.h   |    7 +++-
 5 files changed, 53 insertions(+), 36 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index dd3f201..8476d6f 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -94,7 +94,7 @@ struct __fdb_entry {
 	__u32 ageing_timer_value;
 	__u8 port_hi;
 	__u8 pad0;
-	__u16 unused;
+	__u16 fdb_vid;
 };
 
 #ifdef __KERNEL__
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 070e8a6..bf08c09 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -67,7 +67,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 			br_multicast_deliver(mdst, skb);
 		else
 			br_flood_deliver(br, skb);
-	} else if ((dst = __br_fdb_get(br, dest)) != NULL)
+	} else if ((dst = __br_fdb_get(br, dest, br_get_vlan(skb))) != NULL)
 		br_deliver(dst->dst, skb);
 	else
 		br_flood_deliver(br, skb);
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 9ce430b..e17f9f2 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -23,6 +23,7 @@
 #include <linux/slab.h>
 #include <linux/atomic.h>
 #include <asm/unaligned.h>
+#include <linux/if_vlan.h>
 #include "br_private.h"
 
 static struct kmem_cache *br_fdb_cache __read_mostly;
@@ -67,11 +68,12 @@ static inline int has_expired(const struct net_bridge *br,
 		time_before_eq(fdb->updated + hold_time(br), jiffies);
 }
 
-static inline int br_mac_hash(const unsigned char *mac)
+static inline int br_mac_hash(const unsigned char *mac, __u16 vlan_tci)
 {
-	/* use 1 byte of OUI cnd 3 bytes of NIC */
+	/* use 1 byte of OUI and 3 bytes of NIC */
 	u32 key = get_unaligned((u32 *)(mac + 2));
-	return jhash_1word(key, fdb_salt) & (BR_HASH_SIZE - 1);
+	return jhash_2words(key, (vlan_tci & VLAN_VID_MASK),
+				fdb_salt) & (BR_HASH_SIZE - 1);
 }
 
 static void fdb_rcu_free(struct rcu_head *head)
@@ -132,7 +134,7 @@ void br_fdb_change_mac_address(struct net_bridge *br, const u8 *newaddr)
 	struct net_bridge_fdb_entry *f;
 
 	/* If old entry was unassociated with any port, then delete it. */
-	f = __br_fdb_get(br, br->dev->dev_addr);
+	f = __br_fdb_get(br, br->dev->dev_addr, 0);
 	if (f && f->is_local && !f->dst)
 		fdb_delete(br, f);
 
@@ -231,13 +233,16 @@ void br_fdb_delete_by_port(struct net_bridge *br,
 
 /* No locking or refcounting, assumes caller has rcu_read_lock */
 struct net_bridge_fdb_entry *__br_fdb_get(struct net_bridge *br,
-					  const unsigned char *addr)
+					  const unsigned char *addr,
+					  __u16 vlan_tci)
 {
 	struct hlist_node *h;
 	struct net_bridge_fdb_entry *fdb;
 
-	hlist_for_each_entry_rcu(fdb, h, &br->hash[br_mac_hash(addr)], hlist) {
-		if (ether_addr_equal(fdb->addr.addr, addr)) {
+	hlist_for_each_entry_rcu(fdb, h,
+				&br->hash[br_mac_hash(addr, vlan_tci)], hlist) {
+		if (ether_addr_equal(fdb->addr.addr, addr) &&
+		    fdb->vlan_id == (vlan_tci & VLAN_VID_MASK) ) {
 			if (unlikely(has_expired(br, fdb)))
 				break;
 			return fdb;
@@ -261,7 +266,7 @@ int br_fdb_test_addr(struct net_device *dev, unsigned char *addr)
 	if (!port)
 		ret = 0;
 	else {
-		fdb = __br_fdb_get(port->br, addr);
+		fdb = __br_fdb_get(port->br, addr, 0);
 		ret = fdb && fdb->dst && fdb->dst->dev != dev &&
 			fdb->dst->state == BR_STATE_FORWARDING;
 	}
@@ -313,6 +318,7 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf,
 			fe->is_local = f->is_local;
 			if (!f->is_static)
 				fe->ageing_timer_value = jiffies_delta_to_clock_t(jiffies - f->updated);
+			fe->fdb_vid = f->vlan_id;
 			++fe;
 			++num;
 		}
@@ -325,26 +331,30 @@ int br_fdb_fillbuf(struct net_bridge *br, void *buf,
 }
 
 static struct net_bridge_fdb_entry *fdb_find(struct hlist_head *head,
-					     const unsigned char *addr)
+					     const unsigned char *addr,
+					     __u16 vlan_tci)
 {
 	struct hlist_node *h;
 	struct net_bridge_fdb_entry *fdb;
 
 	hlist_for_each_entry(fdb, h, head, hlist) {
-		if (ether_addr_equal(fdb->addr.addr, addr))
+		if (ether_addr_equal(fdb->addr.addr, addr) &&
+		    fdb->vlan_id == (vlan_tci & VLAN_VID_MASK))
 			return fdb;
 	}
 	return NULL;
 }
 
 static struct net_bridge_fdb_entry *fdb_find_rcu(struct hlist_head *head,
-						 const unsigned char *addr)
+						 const unsigned char *addr,
+						 __u16 vlan_tci)
 {
 	struct hlist_node *h;
 	struct net_bridge_fdb_entry *fdb;
 
 	hlist_for_each_entry_rcu(fdb, h, head, hlist) {
-		if (ether_addr_equal(fdb->addr.addr, addr))
+		if (ether_addr_equal(fdb->addr.addr, addr) &&
+		    fdb->vlan_id == (vlan_tci & VLAN_VID_MASK))
 			return fdb;
 	}
 	return NULL;
@@ -352,7 +362,8 @@ static struct net_bridge_fdb_entry *fdb_find_rcu(struct hlist_head *head,
 
 static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 					       struct net_bridge_port *source,
-					       const unsigned char *addr)
+					       const unsigned char *addr,
+					       __u16 vlan_tci)
 {
 	struct net_bridge_fdb_entry *fdb;
 
@@ -360,6 +371,7 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 	if (fdb) {
 		memcpy(fdb->addr.addr, addr, ETH_ALEN);
 		fdb->dst = source;
+		fdb->vlan_id = (vlan_tci & VLAN_VID_MASK);
 		fdb->is_local = 0;
 		fdb->is_static = 0;
 		fdb->updated = fdb->used = jiffies;
@@ -371,13 +383,13 @@ static struct net_bridge_fdb_entry *fdb_create(struct hlist_head *head,
 static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		  const unsigned char *addr)
 {
-	struct hlist_head *head = &br->hash[br_mac_hash(addr)];
+	struct hlist_head *head = &br->hash[br_mac_hash(addr, 0)];
 	struct net_bridge_fdb_entry *fdb;
 
 	if (!is_valid_ether_addr(addr))
 		return -EINVAL;
 
-	fdb = fdb_find(head, addr);
+	fdb = fdb_find(head, addr, 0);
 	if (fdb) {
 		/* it is okay to have multiple ports with same
 		 * address, just use the first one.
@@ -390,7 +402,7 @@ static int fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 		fdb_delete(br, fdb);
 	}
 
-	fdb = fdb_create(head, source, addr);
+	fdb = fdb_create(head, source, addr, 0);
 	if (!fdb)
 		return -ENOMEM;
 
@@ -412,9 +424,9 @@ int br_fdb_insert(struct net_bridge *br, struct net_bridge_port *source,
 }
 
 void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
-		   const unsigned char *addr)
+		   const unsigned char *addr, __u16 vlan_tci)
 {
-	struct hlist_head *head = &br->hash[br_mac_hash(addr)];
+	struct hlist_head *head = &br->hash[br_mac_hash(addr, vlan_tci)];
 	struct net_bridge_fdb_entry *fdb;
 
 	/* some users want to always flood. */
@@ -426,7 +438,7 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 	      source->state == BR_STATE_FORWARDING))
 		return;
 
-	fdb = fdb_find_rcu(head, addr);
+	fdb = fdb_find_rcu(head, addr, vlan_tci);
 	if (likely(fdb)) {
 		/* attempt to update an entry for a local interface */
 		if (unlikely(fdb->is_local)) {
@@ -441,8 +453,8 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source,
 		}
 	} else {
 		spin_lock(&br->hash_lock);
-		if (likely(!fdb_find(head, addr))) {
-			fdb = fdb_create(head, source, addr);
+		if (likely(!fdb_find(head, addr, vlan_tci))) {
+			fdb = fdb_create(head, source, addr, vlan_tci);
 			if (fdb)
 				fdb_notify(br, fdb, RTM_NEWNEIGH);
 		}
@@ -571,18 +583,18 @@ out:
 
 /* Update (create or replace) forwarding database entry */
 static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
-			 __u16 state, __u16 flags)
+			 __u16 state, __u16 flags, __u16 vlan_tci)
 {
 	struct net_bridge *br = source->br;
-	struct hlist_head *head = &br->hash[br_mac_hash(addr)];
+	struct hlist_head *head = &br->hash[br_mac_hash(addr, vlan_tci)];
 	struct net_bridge_fdb_entry *fdb;
 
-	fdb = fdb_find(head, addr);
+	fdb = fdb_find(head, addr, vlan_tci);
 	if (fdb == NULL) {
 		if (!(flags & NLM_F_CREATE))
 			return -ENOENT;
 
-		fdb = fdb_create(head, source, addr);
+		fdb = fdb_create(head, source, addr, vlan_tci);
 		if (!fdb)
 			return -ENOMEM;
 		fdb_notify(br, fdb, RTM_NEWNEIGH);
@@ -628,11 +640,12 @@ int br_fdb_add(struct ndmsg *ndm, struct net_device *dev,
 
 	if (ndm->ndm_flags & NTF_USE) {
 		rcu_read_lock();
-		br_fdb_update(p->br, p, addr);
+		br_fdb_update(p->br, p, addr, 0);
 		rcu_read_unlock();
 	} else {
 		spin_lock_bh(&p->br->hash_lock);
-		err = fdb_add_entry(p, addr, ndm->ndm_state, nlh_flags);
+		err = fdb_add_entry(p, addr, ndm->ndm_state, nlh_flags,
+				0);
 		spin_unlock_bh(&p->br->hash_lock);
 	}
 
@@ -642,10 +655,10 @@ int br_fdb_add(struct ndmsg *ndm, struct net_device *dev,
 static int fdb_delete_by_addr(struct net_bridge_port *p, u8 *addr)
 {
 	struct net_bridge *br = p->br;
-	struct hlist_head *head = &br->hash[br_mac_hash(addr)];
+	struct hlist_head *head = &br->hash[br_mac_hash(addr, 0)];
 	struct net_bridge_fdb_entry *fdb;
 
-	fdb = fdb_find(head, addr);
+	fdb = fdb_find(head, addr, 0);
 	if (!fdb)
 		return -ENOENT;
 
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 44f352d..c4f0020 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -71,7 +71,7 @@ int br_handle_frame_finish(struct sk_buff *skb)
 
 	/* insert into forwarding database after filtering to avoid spoofing */
 	br = p->br;
-	br_fdb_update(br, p, eth_hdr(skb)->h_source);
+	br_fdb_update(br, p, eth_hdr(skb)->h_source, vid);
 
 	if (!is_broadcast_ether_addr(dest) && is_multicast_ether_addr(dest) &&
 	    br_multicast_rcv(br, p, skb))
@@ -106,7 +106,8 @@ int br_handle_frame_finish(struct sk_buff *skb)
 			skb2 = skb;
 
 		br->dev->stats.multicast++;
-	} else if ((dst = __br_fdb_get(br, dest)) && dst->is_local) {
+	} else if ((dst = __br_fdb_get(br, dest, vid)) &&
+			dst->is_local) {
 		skb2 = skb;
 		/* Do not forward the packet since it's local. */
 		skb = NULL;
@@ -135,7 +136,7 @@ static int br_handle_local_finish(struct sk_buff *skb)
 {
 	struct net_bridge_port *p = br_port_get_rcu(skb->dev);
 
-	br_fdb_update(p->br, p, eth_hdr(skb)->h_source);
+	br_fdb_update(p->br, p, eth_hdr(skb)->h_source, br_get_vlan(skb));
 	return 0;	 /* process further */
 }
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index baf1835..bb382f1 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -73,6 +73,7 @@ struct net_bridge_fdb_entry
 	unsigned long			updated;
 	unsigned long			used;
 	mac_addr			addr;
+	__u16				vlan_id;
 	unsigned char			is_local;
 	unsigned char			is_static;
 };
@@ -367,7 +368,8 @@ extern void br_fdb_cleanup(unsigned long arg);
 extern void br_fdb_delete_by_port(struct net_bridge *br,
 				  const struct net_bridge_port *p, int do_all);
 extern struct net_bridge_fdb_entry *__br_fdb_get(struct net_bridge *br,
-						 const unsigned char *addr);
+						 const unsigned char *addr,
+						 __u16 vlan_tci);
 extern int br_fdb_test_addr(struct net_device *dev, unsigned char *addr);
 extern int br_fdb_fillbuf(struct net_bridge *br, void *buf,
 			  unsigned long count, unsigned long off);
@@ -376,7 +378,8 @@ extern int br_fdb_insert(struct net_bridge *br,
 			 const unsigned char *addr);
 extern void br_fdb_update(struct net_bridge *br,
 			  struct net_bridge_port *source,
-			  const unsigned char *addr);
+			  const unsigned char *addr,
+			  __u16 vlan_tci);
 
 extern int br_fdb_delete(struct ndmsg *ndm,
 			 struct net_device *dev,
-- 
1.7.7.6
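
The effect of keying the FDB on (MAC, vlan id) can be shown with a small
user-space sketch.  It is only an illustration under stated assumptions:
struct fdb, fdb_hash(), fdb_find() and fdb_learn() here are simplified
stand-ins, FNV-1a replaces the kernel's salted jhash_2words(), and there is
no locking, RCU or ageing.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN      6
#define VLAN_VID_MASK 0x0fffu
#define HASH_BITS     8
#define HASH_SIZE     (1u << HASH_BITS)

struct fdb_entry {
	struct fdb_entry *next;
	uint8_t  addr[ETH_ALEN];
	uint16_t vlan_id;
	int      port;                 /* stand-in for the destination port */
};

struct fdb {
	struct fdb_entry *bucket[HASH_SIZE];
};

/* FNV-1a over the MAC bytes and the VID (not the kernel's jhash). */
static unsigned fdb_hash(const uint8_t *mac, uint16_t tci)
{
	uint16_t vid = tci & VLAN_VID_MASK;
	uint32_t h = 2166136261u;
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		h = (h ^ mac[i]) * 16777619u;
	h = (h ^ (vid & 0xff)) * 16777619u;
	h = (h ^ (vid >> 8)) * 16777619u;
	return h & (HASH_SIZE - 1);
}

/* An entry matches only if both the address and the VLAN ID match. */
static struct fdb_entry *fdb_find(const struct fdb *t, const uint8_t *mac,
				  uint16_t tci)
{
	uint16_t vid = tci & VLAN_VID_MASK;
	struct fdb_entry *e;

	for (e = t->bucket[fdb_hash(mac, tci)]; e; e = e->next)
		if (!memcmp(e->addr, mac, ETH_ALEN) && e->vlan_id == vid)
			return e;
	return NULL;
}

/* Create the entry or refresh its port; returns NULL if out of memory. */
static struct fdb_entry *fdb_learn(struct fdb *t, const uint8_t *mac,
				   uint16_t tci, int port)
{
	struct fdb_entry *e;
	unsigned h;

	assert(t && mac);
	e = fdb_find(t, mac, tci);
	if (e) {
		e->port = port;
		return e;
	}
	e = calloc(1, sizeof(*e));
	if (!e)
		return NULL;
	memcpy(e->addr, mac, ETH_ALEN);
	e->vlan_id = tci & VLAN_VID_MASK;
	e->port = port;
	h = fdb_hash(mac, tci);
	e->next = t->bucket[h];
	t->bucket[h] = e;
	return e;
}
```

The same MAC address learned on two vlans yields two independent entries, so
a station seen on vlan 10 behind one port does not redirect its vlan 20
traffic.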


* [RFC PATCHv2 bridge 3/7] bridge: Add vlan id to multicast groups
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

Add vlan_id to multicast groups so that we know which vlan each group belongs
to and can correctly forward to the appropriate vlan.

Signed-off-by: Vlad Yasevich <vyasevich@redhat.com>
---
 net/bridge/br_multicast.c |   64 +++++++++++++++++++++++++++++++--------------
 net/bridge/br_private.h   |    1 +
 2 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 2417434..888fc08 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -51,6 +51,8 @@ static inline int br_ip_equal(const struct br_ip *a, const struct br_ip *b)
 {
 	if (a->proto != b->proto)
 		return 0;
+	if (a->vid != b->vid)
+		return 0;
 	switch (a->proto) {
 	case htons(ETH_P_IP):
 		return a->u.ip4 == b->u.ip4;
@@ -62,16 +64,19 @@ static inline int br_ip_equal(const struct br_ip *a, const struct br_ip *b)
 	return 0;
 }
 
-static inline int __br_ip4_hash(struct net_bridge_mdb_htable *mdb, __be32 ip)
+static inline int __br_ip4_hash(struct net_bridge_mdb_htable *mdb, __be32 ip,
+				__u16 vid)
 {
-	return jhash_1word(mdb->secret, (__force u32)ip) & (mdb->max - 1);
+	return jhash_2words((__force u32)ip, vid, mdb->secret) & (mdb->max - 1);
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
 static inline int __br_ip6_hash(struct net_bridge_mdb_htable *mdb,
-				const struct in6_addr *ip)
+				const struct in6_addr *ip,
+				__u16 vid)
 {
-	return jhash2((__force u32 *)ip->s6_addr32, 4, mdb->secret) & (mdb->max - 1);
+	return jhash_2words(ipv6_addr_hash(ip), vid,
+			    mdb->secret) & (mdb->max - 1);
 }
 #endif
 
@@ -80,10 +85,10 @@ static inline int br_ip_hash(struct net_bridge_mdb_htable *mdb,
 {
 	switch (ip->proto) {
 	case htons(ETH_P_IP):
-		return __br_ip4_hash(mdb, ip->u.ip4);
+		return __br_ip4_hash(mdb, ip->u.ip4, ip->vid);
 #if IS_ENABLED(CONFIG_IPV6)
 	case htons(ETH_P_IPV6):
-		return __br_ip6_hash(mdb, &ip->u.ip6);
+		return __br_ip6_hash(mdb, &ip->u.ip6, ip->vid);
 #endif
 	}
 	return 0;
@@ -113,24 +118,27 @@ static struct net_bridge_mdb_entry *br_mdb_ip_get(
 }
 
 static struct net_bridge_mdb_entry *br_mdb_ip4_get(
-	struct net_bridge_mdb_htable *mdb, __be32 dst)
+	struct net_bridge_mdb_htable *mdb, __be32 dst, __u16 vlan_tci)
 {
 	struct br_ip br_dst;
 
 	br_dst.u.ip4 = dst;
 	br_dst.proto = htons(ETH_P_IP);
+	br_dst.vid = vlan_tci & VLAN_VID_MASK;
 
 	return br_mdb_ip_get(mdb, &br_dst);
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
 static struct net_bridge_mdb_entry *br_mdb_ip6_get(
-	struct net_bridge_mdb_htable *mdb, const struct in6_addr *dst)
+	struct net_bridge_mdb_htable *mdb, const struct in6_addr *dst,
+	__u16 vlan_tci)
 {
 	struct br_ip br_dst;
 
 	br_dst.u.ip6 = *dst;
 	br_dst.proto = htons(ETH_P_IPV6);
+	br_dst.vid = vlan_tci & VLAN_VID_MASK;
 
 	return br_mdb_ip_get(mdb, &br_dst);
 }
@@ -692,7 +700,8 @@ err:
 
 static int br_ip4_multicast_add_group(struct net_bridge *br,
 				      struct net_bridge_port *port,
-				      __be32 group)
+				      __be32 group,
+				      __u16 vlan_tci)
 {
 	struct br_ip br_group;
 
@@ -701,6 +710,7 @@ static int br_ip4_multicast_add_group(struct net_bridge *br,
 
 	br_group.u.ip4 = group;
 	br_group.proto = htons(ETH_P_IP);
+	br_group.vid = vlan_tci & VLAN_VID_MASK;
 
 	return br_multicast_add_group(br, port, &br_group);
 }
@@ -708,7 +718,8 @@ static int br_ip4_multicast_add_group(struct net_bridge *br,
 #if IS_ENABLED(CONFIG_IPV6)
 static int br_ip6_multicast_add_group(struct net_bridge *br,
 				      struct net_bridge_port *port,
-				      const struct in6_addr *group)
+				      const struct in6_addr *group,
+				      __u16 vlan_tci)
 {
 	struct br_ip br_group;
 
@@ -717,6 +728,7 @@ static int br_ip6_multicast_add_group(struct net_bridge *br,
 
 	br_group.u.ip6 = *group;
 	br_group.proto = htons(ETH_P_IPV6);
+	br_group.vid = vlan_tci & VLAN_VID_MASK;
 
 	return br_multicast_add_group(br, port, &br_group);
 }
@@ -928,7 +940,8 @@ static int br_ip4_multicast_igmp3_report(struct net_bridge *br,
 			continue;
 		}
 
-		err = br_ip4_multicast_add_group(br, port, group);
+		err = br_ip4_multicast_add_group(br, port, group,
+						 br_get_vlan(skb));
 		if (err)
 			break;
 	}
@@ -988,7 +1001,8 @@ static int br_ip6_multicast_mld2_report(struct net_bridge *br,
 			continue;
 		}
 
-		err = br_ip6_multicast_add_group(br, port, &grec->grec_mca);
+		err = br_ip6_multicast_add_group(br, port, &grec->grec_mca,
+						 br_get_vlan(skb));
 		if (!err)
 			break;
 	}
@@ -1106,7 +1120,8 @@ static int br_ip4_multicast_query(struct net_bridge *br,
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip4_get(mlock_dereference(br->mdb, br), group);
+	mp = br_mdb_ip4_get(mlock_dereference(br->mdb, br), group,
+			    br_get_vlan(skb));
 	if (!mp)
 		goto out;
 
@@ -1178,7 +1193,8 @@ static int br_ip6_multicast_query(struct net_bridge *br,
 	if (!group)
 		goto out;
 
-	mp = br_mdb_ip6_get(mlock_dereference(br->mdb, br), group);
+	mp = br_mdb_ip6_get(mlock_dereference(br->mdb, br), group,
+			    br_get_vlan(skb));
 	if (!mp)
 		goto out;
 
@@ -1262,7 +1278,8 @@ out:
 
 static void br_ip4_multicast_leave_group(struct net_bridge *br,
 					 struct net_bridge_port *port,
-					 __be32 group)
+					 __be32 group,
+					 __u16 vlan_tci)
 {
 	struct br_ip br_group;
 
@@ -1271,6 +1288,7 @@ static void br_ip4_multicast_leave_group(struct net_bridge *br,
 
 	br_group.u.ip4 = group;
 	br_group.proto = htons(ETH_P_IP);
+	br_group.vid = vlan_tci & VLAN_VID_MASK;
 
 	br_multicast_leave_group(br, port, &br_group);
 }
@@ -1278,7 +1296,8 @@ static void br_ip4_multicast_leave_group(struct net_bridge *br,
 #if IS_ENABLED(CONFIG_IPV6)
 static void br_ip6_multicast_leave_group(struct net_bridge *br,
 					 struct net_bridge_port *port,
-					 const struct in6_addr *group)
+					 const struct in6_addr *group,
+					 __u16 vlan_tci)
 {
 	struct br_ip br_group;
 
@@ -1287,6 +1306,7 @@ static void br_ip6_multicast_leave_group(struct net_bridge *br,
 
 	br_group.u.ip6 = *group;
 	br_group.proto = htons(ETH_P_IPV6);
+	br_group.vid = vlan_tci & VLAN_VID_MASK;
 
 	br_multicast_leave_group(br, port, &br_group);
 }
@@ -1369,7 +1389,8 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br,
 	case IGMP_HOST_MEMBERSHIP_REPORT:
 	case IGMPV2_HOST_MEMBERSHIP_REPORT:
 		BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
-		err = br_ip4_multicast_add_group(br, port, ih->group);
+		err = br_ip4_multicast_add_group(br, port, ih->group,
+						 br_get_vlan(skb2));
 		break;
 	case IGMPV3_HOST_MEMBERSHIP_REPORT:
 		err = br_ip4_multicast_igmp3_report(br, port, skb2);
@@ -1378,7 +1399,8 @@ static int br_multicast_ipv4_rcv(struct net_bridge *br,
 		err = br_ip4_multicast_query(br, port, skb2);
 		break;
 	case IGMP_HOST_LEAVE_MESSAGE:
-		br_ip4_multicast_leave_group(br, port, ih->group);
+		br_ip4_multicast_leave_group(br, port, ih->group,
+					     br_get_vlan(skb2));
 		break;
 	}
 
@@ -1498,7 +1520,8 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 		}
 		mld = (struct mld_msg *)skb_transport_header(skb2);
 		BR_INPUT_SKB_CB(skb)->mrouters_only = 1;
-		err = br_ip6_multicast_add_group(br, port, &mld->mld_mca);
+		err = br_ip6_multicast_add_group(br, port, &mld->mld_mca,
+						 br_get_vlan(skb2));
 		break;
 	    }
 	case ICMPV6_MLD2_REPORT:
@@ -1515,7 +1538,8 @@ static int br_multicast_ipv6_rcv(struct net_bridge *br,
 			goto out;
 		}
 		mld = (struct mld_msg *)skb_transport_header(skb2);
-		br_ip6_multicast_leave_group(br, port, &mld->mld_mca);
+		br_ip6_multicast_leave_group(br, port, &mld->mld_mca,
+					     br_get_vlan(skb2));
 	    }
 	}
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index bb382f1..166dcf4 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -62,6 +62,7 @@ struct br_ip
 #endif
 	} u;
 	__be16		proto;
+	__u16		vid;
 };
 
 struct net_bridge_fdb_entry
-- 
1.7.7.6
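
The group comparison with the added vlan id can be sketched in user space as
follows.  This is an illustration, not the kernel structure: struct group_key
and group_key_equal() are hypothetical names, and an enum replaces the
network-order ethertype stored in br_ip.proto.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum group_proto { GROUP_IPV4 = 4, GROUP_IPV6 = 6 };

struct group_key {
	union {
		uint32_t ip4;
		uint8_t  ip6[16];
	} u;
	enum group_proto proto;
	uint16_t vid;                  /* VLAN the group was joined on */
};

/* Two keys are equal only if protocol, VLAN ID and address all match. */
static bool group_key_equal(const struct group_key *a,
			    const struct group_key *b)
{
	assert(a && b);
	if (a->proto != b->proto || a->vid != b->vid)
		return false;
	switch (a->proto) {
	case GROUP_IPV4:
		return a->u.ip4 == b->u.ip4;
	case GROUP_IPV6:
		return memcmp(a->u.ip6, b->u.ip6, sizeof(a->u.ip6)) == 0;
	}
	return false;
}
```

The fields are compared one by one rather than with a memcmp() over the whole
structure, since padding and unused union bytes are not guaranteed to match.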


* [RFC PATCHv2 bridge 4/7] bridge: Add netlink interface to configure vlans on bridge ports
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

Add a netlink interface to add and remove vlan configuration on a bridge port.
The interface uses the RTM_SETLINK message and encodes the vlan
configuration inside the IFLA_AF_SPEC.  It is possible to include multiple
vlans to either add or remove in a single message.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 include/linux/if_link.h |   22 +++++++++
 net/bridge/br_if.c      |   74 +++++++++++++++++++++++++++++
 net/bridge/br_netlink.c |  117 ++++++++++++++++++++++++++++++++++++++--------
 net/bridge/br_private.h |    2 +
 4 files changed, 194 insertions(+), 21 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index ac173bd..38dbcff 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -398,4 +398,26 @@ struct ifla_port_vsi {
 	__u8 pad[3];
 };
 
+/* Bridge Section
+ * [IFLA_AF_SPEC] = {
+ *   [AF_BRIDGE] = {
+ *	[IFLA_BR_VLAN_INFO] = ...
+ *   }
+ * }
+ */
+enum {
+	IFLA_BR_UNSPEC,
+	IFLA_BR_VLAN_INFO,
+	__IFLA_BR_MAX,
+};
+#define IFLA_BR_MAX (__IFLA_BR_MAX - 1)
+
+enum {
+	IFLA_BR_VLAN_UNSPEC,
+	IFLA_BR_VLAN_ADD,
+	IFLA_BR_VLAN_DEL,
+	__IFLA_BR_VLAN_MAX,
+};
+#define IFLA_BR_VLAN_MAX (__IFLA_BR_VLAN_MAX - 1)
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 1c8fdc3..c6a66e2 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -23,6 +23,7 @@
 #include <linux/if_ether.h>
 #include <linux/slab.h>
 #include <net/sock.h>
+#include <linux/if_vlan.h>
 
 #include "br_private.h"
 
@@ -445,6 +446,79 @@ int br_del_if(struct net_bridge *br, struct net_device *dev)
 	return 0;
 }
 
+/* Called with RTNL */
+int br_set_port_vlan(struct net_bridge_port *p, unsigned short vlan)
+{
+	unsigned long table_size = BITS_TO_LONGS(br_vid(VLAN_N_VID));
+	unsigned long *vid_map = NULL;
+	__u16 vid = br_vid(vlan);
+	int ret = 0;
+
+	/* The vlan map is indexed by vid+1.  This way we can store
+	 * vid 0 (untagged) into the map as well.
+	 */
+	if (!p->vlan_map) {
+		vid_map = kcalloc(table_size, sizeof(unsigned long), GFP_KERNEL);
+		if (!vid_map) {
+			return -ENOMEM;
+		}
+
+		set_bit(vid, vid_map);
+		rcu_assign_pointer(p->vlan_map, vid_map);
+		synchronize_net();
+	} else {
+		/* Map is already allocated */
+		set_bit(vid, rcu_dereference_rtnl(p->vlan_map));
+	}
+
+	return ret;
+}
+
+
+/* Called with RTNL */
+int br_del_port_vlan(struct net_bridge_port *p, unsigned short vlan)
+{
+	unsigned long first_bit;
+	unsigned long next_bit;
+	__u16 vid = br_vid(vlan);
+	unsigned long tbl_len = BITS_TO_LONGS(br_vid(VLAN_N_VID));
+
+	if (!p->vlan_map) {
+		return -EINVAL;
+	}
+
+	if (!test_bit(vid, p->vlan_map)) {
+		return -EINVAL;
+	}
+
+	/* Check to see if any other vlans are in this table.  If this
+	 * is the last vlan, delete the whole table.  If this is not the
+	 * last vlan, just clear the bit.
+	 */
+	first_bit = find_first_bit(p->vlan_map, tbl_len);
+	next_bit = find_next_bit(p->vlan_map, tbl_len, (tbl_len - vid));
+
+	if (first_bit != vid || next_bit < tbl_len) {
+		/* There are other vlans still configured.  We can simply
+		 * clear our bit and be safe.
+		 */
+		clear_bit(vid, rcu_dereference_rtnl(p->vlan_map));
+	} else {
+		unsigned long *map = NULL;
+
+		/* This is the last vlan we are removing.  Replace the
+		 * map with a NULL pointer and free the old map
+		 */
+		map = rcu_dereference(p->vlan_map);
+
+		rcu_assign_pointer(p->vlan_map, NULL);
+		synchronize_net();
+		kfree(map);
+	}
+
+	return 0;
+}
+
 void __net_exit br_net_exit(struct net *net)
 {
 	struct net_device *dev;
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index fe41260..8a97f93 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -16,6 +16,7 @@
 #include <net/rtnetlink.h>
 #include <net/net_namespace.h>
 #include <net/sock.h>
+#include <linux/if_vlan.h>
 
 #include "br_private.h"
 #include "br_private_stp.h"
@@ -140,6 +141,71 @@ skip:
 	return skb->len;
 }
 
+static int br_validate_vlan_info(struct nlattr *attr)
+{
+	struct nlattr *vinfo;
+	int rem;
+
+	nla_for_each_nested(vinfo, attr, rem) {
+		int type = nla_type(vinfo);
+		unsigned short vid = nla_get_u16(vinfo);
+
+		if (vid > VLAN_N_VID)
+			return -EINVAL;
+
+		if (type < IFLA_BR_VLAN_ADD || type > IFLA_BR_VLAN_DEL)
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+const struct nla_policy ifla_br_policy[IFLA_BR_MAX + 1] = {
+	[IFLA_BR_VLAN_INFO] = { .type = NLA_NESTED },
+};
+
+static int br_afspec(struct net_bridge_port *p, struct nlattr *af_spec)
+{
+	struct nlattr *vinfo;
+	struct nlattr *tb[IFLA_BR_MAX+1];
+	int err;
+	int rem;
+
+	if (nla_type(af_spec) != AF_BRIDGE)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, IFLA_BR_MAX, af_spec, ifla_br_policy);
+	if (err)
+		return err;
+
+	if (tb[IFLA_BR_VLAN_INFO]) {
+		err = br_validate_vlan_info(tb[IFLA_BR_VLAN_INFO]);
+		if (err)
+			return err;
+
+		nla_for_each_nested(vinfo, tb[IFLA_BR_VLAN_INFO], rem) {
+			int type = nla_type(vinfo);
+			unsigned short vid = nla_get_u16(vinfo);
+
+			switch (type) {
+				case IFLA_BR_VLAN_ADD:
+					br_set_port_vlan(p, vid);
+					break;
+				case IFLA_BR_VLAN_DEL:
+					br_del_port_vlan(p, vid);
+					break;
+			}
+		}
+	}
+
+	return 0;
+}
+
+const struct nla_policy ifla_policy[IFLA_MAX+1] = {
+	[IFLA_PROTINFO]	= { .type = NLA_U8 },
+	[IFLA_AF_SPEC]	= { .type = NLA_NESTED },
+};
+
 /*
  * Change state of port (ie from forwarding to blocking etc)
  * Used by spanning tree in user space.
@@ -148,26 +214,23 @@ static int br_rtm_setlink(struct sk_buff *skb,  struct nlmsghdr *nlh, void *arg)
 {
 	struct net *net = sock_net(skb->sk);
 	struct ifinfomsg *ifm;
-	struct nlattr *protinfo;
+	struct nlattr *tb[IFLA_MAX+1];
 	struct net_device *dev;
 	struct net_bridge_port *p;
+	int err = 0;
 	u8 new_state;
 
 	if (nlmsg_len(nlh) < sizeof(*ifm))
 		return -EINVAL;
 
+	err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy);
+	if (err)
+		return err;
+
 	ifm = nlmsg_data(nlh);
 	if (ifm->ifi_family != AF_BRIDGE)
 		return -EPFNOSUPPORT;
 
-	protinfo = nlmsg_find_attr(nlh, sizeof(*ifm), IFLA_PROTINFO);
-	if (!protinfo || nla_len(protinfo) < sizeof(u8))
-		return -EINVAL;
-
-	new_state = nla_get_u8(protinfo);
-	if (new_state > BR_STATE_BLOCKING)
-		return -EINVAL;
-
 	dev = __dev_get_by_index(net, ifm->ifi_index);
 	if (!dev)
 		return -ENODEV;
@@ -176,23 +239,35 @@ static int br_rtm_setlink(struct sk_buff *skb,  struct nlmsghdr *nlh, void *arg)
 	if (!p)
 		return -EINVAL;
 
-	/* if kernel STP is running, don't allow changes */
-	if (p->br->stp_enabled == BR_KERNEL_STP)
-		return -EBUSY;
+	if (tb[IFLA_PROTINFO]) {
+		new_state = nla_get_u8(tb[IFLA_PROTINFO]);
+		if (new_state > BR_STATE_BLOCKING)
+			return -EINVAL;
 
-	if (!netif_running(dev) ||
-	    (!netif_carrier_ok(dev) && new_state != BR_STATE_DISABLED))
-		return -ENETDOWN;
+		/* if kernel STP is running, don't allow changes */
+		if (p->br->stp_enabled == BR_KERNEL_STP)
+			return -EBUSY;
 
-	p->state = new_state;
-	br_log_state(p);
+		if (!netif_running(dev) ||
+		    (!netif_carrier_ok(dev) && new_state != BR_STATE_DISABLED))
+			return -ENETDOWN;
 
-	spin_lock_bh(&p->br->lock);
-	br_port_state_selection(p->br);
-	spin_unlock_bh(&p->br->lock);
+		p->state = new_state;
+		br_log_state(p);
 
-	br_ifinfo_notify(RTM_NEWLINK, p);
+		spin_lock_bh(&p->br->lock);
+		br_port_state_selection(p->br);
+		spin_unlock_bh(&p->br->lock);
 
+	}
+
+	if (tb[IFLA_AF_SPEC]) {
+		err = br_afspec(p, tb[IFLA_AF_SPEC]);
+		if (err)
+			return err;
+	}
+
+	br_ifinfo_notify(RTM_NEWLINK, p);
 	return 0;
 }
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 166dcf4..8eb3ffc 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -417,6 +417,8 @@ extern int br_del_if(struct net_bridge *br,
 extern int br_min_mtu(const struct net_bridge *br);
 extern netdev_features_t br_features_recompute(struct net_bridge *br,
 	netdev_features_t features);
+extern int br_set_port_vlan(struct net_bridge_port *p, unsigned short vid);
+extern int br_del_port_vlan(struct net_bridge_port *p, unsigned short vid);
 
 /* br_input.c */
 extern int br_handle_frame_finish(struct sk_buff *skb);
-- 
1.7.7.6
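
The delete path in this patch has to decide whether the vlan being removed is
the last one in the port's bitmap.  The following userspace sketch shows one
way to make that decision while keeping bit counts, long counts and byte
counts apart.  It is not the patch's code: the ex_/EX_ helpers are
hypothetical, and it assumes the map holds 4097 bits indexed by vid + 1, as
the comment in br_set_port_vlan() suggests.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define EX_BITS_TO_LONGS(n)	(((n) + EX_BITS_PER_LONG - 1) / EX_BITS_PER_LONG)
#define EX_MAP_BITS		(4096u + 1)	/* assumes bit index = vid + 1 */

/* Bytes to allocate: a count of longs must be scaled by sizeof(long). */
static size_t ex_map_bytes(void)
{
	return EX_BITS_TO_LONGS(EX_MAP_BITS) * sizeof(unsigned long);
}

static bool ex_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / EX_BITS_PER_LONG] >> (bit % EX_BITS_PER_LONG)) & 1UL;
}

static void ex_set_bit(unsigned long *map, size_t bit)
{
	map[bit / EX_BITS_PER_LONG] |= 1UL << (bit % EX_BITS_PER_LONG);
}

static void ex_clear_bit(unsigned long *map, size_t bit)
{
	map[bit / EX_BITS_PER_LONG] &= ~(1UL << (bit % EX_BITS_PER_LONG));
}

/* True when 'bit' is set and every other bit in the map is clear. */
static bool ex_only_bit_set(const unsigned long *map, size_t bit)
{
	for (size_t i = 0; i < EX_BITS_TO_LONGS(EX_MAP_BITS); i++) {
		unsigned long want = 0;

		if (i == bit / EX_BITS_PER_LONG)
			want = 1UL << (bit % EX_BITS_PER_LONG);
		if (map[i] != want)
			return false;
	}
	return true;
}

/* Returns -1 if the vlan is not set, 1 if it was the last vlan (the
 * caller should free the map), 0 if the bit was simply cleared. */
static int ex_del_vlan(unsigned long *map, unsigned short vlan)
{
	size_t bit = (size_t)vlan + 1;

	assert(map && bit < EX_MAP_BITS);
	if (!ex_test_bit(map, bit))
		return -1;
	if (ex_only_bit_set(map, bit))
		return 1;
	ex_clear_bit(map, bit);
	return 0;
}
```

In the kernel the same check would more likely be expressed with
find_first_bit() and find_next_bit(), both of which take their size argument
in bits.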

^ permalink raw reply related

* [RFC PATCHv2 bridge 5/7] bridge: Add vlan support to static neighbors
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

---
 include/linux/neighbour.h |    2 +-
 net/bridge/br_fdb.c       |   12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h
index 275e5d6..044df8f 100644
--- a/include/linux/neighbour.h
+++ b/include/linux/neighbour.h
@@ -7,7 +7,7 @@
 struct ndmsg {
 	__u8		ndm_family;
 	__u8		ndm_pad1;
-	__u16		ndm_pad2;
+	__u16		ndm_vlan;
 	__s32		ndm_ifindex;
 	__u16		ndm_state;
 	__u8		ndm_flags;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e17f9f2..3c21a3d 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -493,7 +493,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br,
 	ndm = nlmsg_data(nlh);
 	ndm->ndm_family	 = AF_BRIDGE;
 	ndm->ndm_pad1    = 0;
-	ndm->ndm_pad2    = 0;
+	ndm->ndm_vlan    = fdb->vlan_id;
 	ndm->ndm_flags	 = 0;
 	ndm->ndm_type	 = 0;
 	ndm->ndm_ifindex = fdb->dst ? fdb->dst->dev->ifindex : br->dev->ifindex;
@@ -640,25 +640,25 @@ int br_fdb_add(struct ndmsg *ndm, struct net_device *dev,
 
 	if (ndm->ndm_flags & NTF_USE) {
 		rcu_read_lock();
-		br_fdb_update(p->br, p, addr, 0);
+		br_fdb_update(p->br, p, addr, ndm->ndm_vlan);
 		rcu_read_unlock();
 	} else {
 		spin_lock_bh(&p->br->hash_lock);
 		err = fdb_add_entry(p, addr, ndm->ndm_state, nlh_flags,
-				0);
+				ndm->ndm_vlan);
 		spin_unlock_bh(&p->br->hash_lock);
 	}
 
 	return err;
 }
 
-static int fdb_delete_by_addr(struct net_bridge_port *p, u8 *addr)
+static int fdb_delete_by_addr(struct net_bridge_port *p, u8 *addr, u16 vlan)
 {
 	struct net_bridge *br = p->br;
 	struct hlist_head *head = &br->hash[br_mac_hash(addr, 0)];
 	struct net_bridge_fdb_entry *fdb;
 
-	fdb = fdb_find(head, addr, 0);
+	fdb = fdb_find(head, addr, vlan);
 	if (!fdb)
 		return -ENOENT;
 
@@ -681,7 +681,7 @@ int br_fdb_delete(struct ndmsg *ndm, struct net_device *dev,
 	}
 
 	spin_lock_bh(&p->br->hash_lock);
-	err = fdb_delete_by_addr(p, addr);
+	err = fdb_delete_by_addr(p, addr, ndm->ndm_vlan);
 	spin_unlock_bh(&p->br->hash_lock);
 
 	return err;
-- 
1.7.7.6

^ permalink raw reply related

* [RFC PATCHv2 bridge 6/7] bridge: Add sysfs interface to display VLANS
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

Add a binary sysfs file that will dump out vlans currently configured on the
port.
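
For readers wondering how a tool might interpret the exported bitmap, here is
a small userspace sketch.  It assumes the file contains the raw array of
unsigned longs in host byte order and that bit (vid + 1) marks vlan vid, as
described earlier in this series.  The function name is hypothetical, and
this is not taken from brctl.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Collect vlan ids from a bitmap in which bit (vid + 1) marks vlan vid.
 * Returns the number of ids written to 'out' (at most max_out). */
static size_t ex_decode_vlans(const unsigned long *map, size_t nbits,
			      unsigned short *out, size_t max_out)
{
	const size_t bpl = sizeof(unsigned long) * CHAR_BIT;
	size_t n = 0;

	assert(map && out);
	/* Bit 0 is skipped: under the vid + 1 indexing it maps to no vid. */
	for (size_t bit = 1; bit < nbits && n < max_out; bit++) {
		if (map[bit / bpl] & (1UL << (bit % bpl)))
			out[n++] = (unsigned short)(bit - 1);
	}
	return n;
}
```

A reader of the sysfs file would first load the whole bitmap into an array of
unsigned longs and then call this with nbits set to 4097.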

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 include/linux/if_bridge.h |    1 +
 net/bridge/br_sysfs_if.c  |   70 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index 8476d6f..cc85739 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -20,6 +20,7 @@
 #define SYSFS_BRIDGE_PORT_SUBDIR "brif"
 #define SYSFS_BRIDGE_PORT_ATTR	"brport"
 #define SYSFS_BRIDGE_PORT_LINK	"bridge"
+#define SYSFS_BRIDGE_PORT_VLANS "vlans"
 
 #define BRCTL_VERSION 1
 
diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c
index 13b36bd..b07a743 100644
--- a/net/bridge/br_sysfs_if.c
+++ b/net/bridge/br_sysfs_if.c
@@ -234,6 +234,71 @@ const struct sysfs_ops brport_sysfs_ops = {
 };
 
 /*
+ * Export the vlan table for a given port as a binary file.
+ * The entire bitmap is exported.
+ *
+ * Returns the number of bytes read.
+ */
+static ssize_t brport_vlans_read(struct file *filp, struct kobject *kobj,
+				struct bin_attribute *bin_attr,
+				char *buf, loff_t off, size_t count)
+{
+	struct net_bridge_port *p = to_brport(kobj);
+	unsigned long *map = rcu_dereference(p->vlan_map);
+	ssize_t map_len;
+
+	/* Just write the map back to userspace.  brctl will interpret
+	 * it correctly.
+	 */
+	map_len = BITS_TO_LONGS(br_vid(VLAN_N_VID));
+	memcpy(buf, map, map_len);
+	return map_len;
+}
+
+/* Set a vlan id specified in @buf into the vlan map of the port.  The vlan id
+ * is specified as an unsigned short.
+ */
+static ssize_t brport_vlans_write(struct file *filp, struct kobject *kobj,
+				struct bin_attribute *bin_attr,
+				char *buf, loff_t off, size_t count)
+{
+	struct net_bridge_port *p = to_brport(kobj);
+	unsigned short val;
+	int rc = 0;
+
+	if (!capable(CAP_NET_ADMIN))
+		return -EPERM;
+
+	if (count < sizeof(unsigned short))
+		return -EINVAL;
+
+	val = *(unsigned short *)buf;
+
+	if (val > VLAN_N_VID)
+		return -EINVAL;
+
+	if (!rtnl_trylock())
+		return restart_syscall();
+
+	if (p->dev && p->br) {
+		rc = br_set_port_vlan(p, val);
+
+		if (rc == 0)
+			rc = count;
+	}
+
+	rtnl_unlock();
+	return rc;
+}
+
+static struct bin_attribute port_vlans = {
+	.attr = { .name = SYSFS_BRIDGE_PORT_VLANS,
+		  .mode = S_IRUGO | S_IWUSR, },
+	.read = brport_vlans_read,
+	.write = brport_vlans_write,
+};
+
+/*
  * Add sysfs entries to ethernet device added to a bridge.
  * Creates a brport subdirectory with bridge attributes.
  * Puts symlink in bridge's brif subdirectory
@@ -255,6 +320,11 @@ int br_sysfs_addif(struct net_bridge_port *p)
 			return err;
 	}
 
+	err = sysfs_create_bin_file(&p->kobj, &port_vlans);
+	if (err) {
+		return err;
+	}
+
 	strlcpy(p->sysfs_name, p->dev->name, IFNAMSIZ);
 	return sysfs_create_link(br->ifobj, &p->kobj, p->sysfs_name);
 }
-- 
1.7.7.6

^ permalink raw reply related

* [RFC PATCHv2 bridge 7/7] bridge: Add the ability to dump the vlan map from a bridge port
From: Vlad Yasevich @ 2012-09-19 12:42 UTC (permalink / raw)
  To: netdev; +Cc: shemminger, Vlad Yasevich
In-Reply-To: <1348058536-22607-1-git-send-email-vyasevic@redhat.com>

Use RTM_GETLINK to dump the vlan map of a given bridge port.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
---
 include/linux/if_link.h |    1 +
 net/bridge/br_netlink.c |   73 +++++++++++++++++++++++++++++++++++++++++++---
 net/bridge/br_private.h |    2 +
 3 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 38dbcff..6953233 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -416,6 +416,7 @@ enum {
 	IFLA_BR_VLAN_UNSPEC,
 	IFLA_BR_VLAN_ADD,
 	IFLA_BR_VLAN_DEL,
+	IFLA_BR_VLAN_MAP,
 	__IFLA_BR_VLAN_MAX,
 };
 #define IFLA_BR_VLAN_MAX (__IFLA_BR_VLAN_MAX - 1)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 8a97f93..72fec1b 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -69,9 +69,35 @@ static int br_fill_ifinfo(struct sk_buff *skb, const struct net_bridge_port *por
 	     nla_put(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr)) ||
 	    (dev->ifindex != dev->iflink &&
 	     nla_put_u32(skb, IFLA_LINK, dev->iflink)) ||
-	    (event == RTM_NEWLINK &&
+	    ((event == RTM_NEWLINK || event == RTM_GETLINK) &&
 	     nla_put_u8(skb, IFLA_PROTINFO, port->state)))
 		goto nla_put_failure;
+
+	if (event == RTM_GETLINK) {
+		unsigned long *map = rcu_dereference(port->vlan_map);
+		struct nlattr *af;
+		struct nlattr *vidinfo;
+
+		if (!map)
+			goto done;
+
+		af = nla_nest_start(skb, IFLA_AF_SPEC);
+		if (!af)
+			goto nla_put_failure;
+
+		vidinfo = nla_nest_start(skb, IFLA_BR_VLAN_INFO);
+		if (!vidinfo)
+			goto nla_put_failure;
+
+		if (nla_put(skb, IFLA_BR_VLAN_MAP,
+			    BITS_TO_LONGS(br_vid(VLAN_N_VID)), map))
+			goto nla_put_failure;
+
+		nla_nest_end(skb, vidinfo);
+		nla_nest_end(skb, af);
+	}
+
+done:
 	return nlmsg_end(skb, nlh);
 
 nla_put_failure:
@@ -129,7 +155,7 @@ static int br_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 
 		if (br_fill_ifinfo(skb, port,
 				   NETLINK_CB(cb->skb).pid,
-				   cb->nlh->nlmsg_seq, RTM_NEWLINK,
+				   cb->nlh->nlmsg_seq, RTM_GETLINK,
 				   NLM_F_MULTI) < 0)
 			break;
 skip:
@@ -283,6 +309,36 @@ static int br_validate(struct nlattr *tb[], struct nlattr *data[])
 	return 0;
 }
 
+static size_t br_get_link_af_size(const struct net_device *dev)
+{
+	struct net_bridge_port *p;
+	unsigned long *map;
+
+	rcu_read_lock();
+	p = br_port_get_rcu(dev);
+	if (!p)
+		goto err;
+
+	map = rcu_dereference(p->vlan_map);
+	if (!map)
+		goto err;
+
+	rcu_read_unlock();
+
+	/* Account for the full bitmap length.  We are going to export the
+	 * whole bitmap.
+	 */
+	return nla_total_size(BITS_TO_LONGS(br_vid(VLAN_N_VID)));
+err:
+	rcu_read_unlock();
+	return 0;
+}
+
+struct rtnl_af_ops br_af_ops = {
+	.family			= AF_BRIDGE,
+	.get_link_af_size	= br_get_link_af_size,
+};
+
 struct rtnl_link_ops br_link_ops __read_mostly = {
 	.kind		= "bridge",
 	.priv_size	= sizeof(struct net_bridge),
@@ -299,19 +355,25 @@ int __init br_netlink_init(void)
 	if (err < 0)
 		goto err1;
 
+	err = rtnl_af_register(&br_af_ops);
+	if (err < 0)
+		goto err2;
+
 	err = __rtnl_register(PF_BRIDGE, RTM_GETLINK, NULL,
 			      br_dump_ifinfo, NULL);
 	if (err)
-		goto err2;
+		goto err3;
 	err = __rtnl_register(PF_BRIDGE, RTM_SETLINK,
 			      br_rtm_setlink, NULL, NULL);
 	if (err)
-		goto err3;
+		goto err4;
 
 	return 0;
 
-err3:
+err4:
 	rtnl_unregister_all(PF_BRIDGE);
+err3:
+	rtnl_af_unregister(&br_af_ops);
 err2:
 	rtnl_link_unregister(&br_link_ops);
 err1:
@@ -320,6 +382,7 @@ err1:
 
 void __exit br_netlink_fini(void)
 {
+	rtnl_af_unregister(&br_af_ops);
 	rtnl_link_unregister(&br_link_ops);
 	rtnl_unregister_all(PF_BRIDGE);
 }
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 8eb3ffc..0575edb 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -419,6 +419,8 @@ extern netdev_features_t br_features_recompute(struct net_bridge *br,
 	netdev_features_t features);
 extern int br_set_port_vlan(struct net_bridge_port *p, unsigned short vid);
 extern int br_del_port_vlan(struct net_bridge_port *p, unsigned short vid);
+extern size_t br_port_fill_vlans(struct net_bridge_port *p, char *buf,
+				unsigned long max, unsigned long skip);
 
 /* br_input.c */
 extern int br_handle_frame_finish(struct sk_buff *skb);
-- 
1.7.7.6

^ permalink raw reply related

* Re: Oops with latest (netfilter) nf-next tree, when unloading iptable_nat
From: Jesper Dangaard Brouer @ 2012-09-19 12:46 UTC (permalink / raw)
  To: Patrick McHardy
  Cc: Pablo Neira Ayuso, Florian Westphal, netfilter-devel, netdev,
	yongjun_wei
In-Reply-To: <Pine.GSO.4.63.1209141511020.11622@stinky-local.trash.net>

On Fri, 2012-09-14 at 15:15 +0200, Patrick McHardy wrote:
> On Fri, 14 Sep 2012, Pablo Neira Ayuso wrote:
> 
[...cut...]
> >> Patrick, any other idea?
> >
[...cut...]
> > >
> > We can add nf_nat_iterate_cleanup that can iterate over the NAT
> > hashtable to replace current usage of nf_ct_iterate_cleanup.
> 
> Lets just bail out when IPS_SRC_NAT_DONE is not set, that should also fix
> it. Could you try this patch please?

> diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
> index 29d4452..8b5d220 100644
> --- a/net/netfilter/nf_nat_core.c
> +++ b/net/netfilter/nf_nat_core.c
> @@ -481,6 +481,8 @@ static int nf_nat_proto_clean(struct nf_conn *i, void *data)
>  
>         if (!nat)
>                 return 0;
> +       if (!(i->status & IPS_SRC_NAT_DONE))
> +               return 0;
>         if ((clean->l3proto && nf_ct_l3num(i) != clean->l3proto) ||
>             (clean->l4proto && nf_ct_protonum(i) != clean->l4proto))
>                 return 0;
> 

No it does not work :-(


[ 1216.310146] general protection fault: 0000 [#1] SMP 
[ 1216.311046] Modules linked in: netconsole ip_vs_lblc ip_vs_lc ip_vs_rr ip_vs libcrc32c ipt_MASQUERADE nf_nat_ipv4(-) nf_nat iptable_mangle xt_mark ip6table_mangle xt_LOG ip6table_filter ip6_tables virtio_balloon virtio_net [last unloaded: iptable_nat]
[ 1216.311046] CPU 1 
[ 1216.311046] Pid: 4052, comm: modprobe Not tainted 3.6.0-rc3-test-nat-unload-fix+ #32 Red Hat KVM
[ 1216.311046] RIP: 0010:[<ffffffffa002c303>]  [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046] RSP: 0018:ffff88007808fe18  EFLAGS: 00010246
[ 1216.311046] RAX: 0000000000000000 RBX: ffff8800728550c0 RCX: ffff8800756288b0
[ 1216.311046] RDX: dead000000200200 RSI: ffff88007808fe88 RDI: ffffffffa002f208
[ 1216.311046] RBP: ffff88007808fe28 R08: ffff88007808e000 R09: 0000000000000000
[ 1216.311046] R10: dead000000200200 R11: dead000000100100 R12: ffffffff81c6dc00
[ 1216.311046] R13: ffff8800787582b8 R14: ffff880078758278 R15: ffff88007808fe88
[ 1216.311046] FS:  00007f515985d700(0000) GS:ffff88007cd00000(0000) knlGS:0000000000000000
[ 1216.311046] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1216.311046] CR2: 00007f515986a000 CR3: 000000007867a000 CR4: 00000000000006e0
[ 1216.311046] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1216.311046] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1216.311046] Process modprobe (pid: 4052, threadinfo ffff88007808e000, task ffff8800756288b0)
[ 1216.311046] Stack:
[ 1216.311046]  ffff88007808fe68 ffffffffa002c290 ffff88007808fe78 ffffffff815614e3
[ 1216.311046]  ffffffff00000000 00000aeb00000246 ffff88007808fe68 ffffffff81c6dc00
[ 1216.311046]  ffff88007808fe88 ffffffffa00358a0 0000000000000000 000000000040f5b0
[ 1216.311046] Call Trace:
[ 1216.311046]  [<ffffffffa002c290>] ? nf_nat_net_exit+0x50/0x50 [nf_nat]
[ 1216.311046]  [<ffffffff815614e3>] nf_ct_iterate_cleanup+0xc3/0x170
[ 1216.311046]  [<ffffffffa002c55a>] nf_nat_l3proto_unregister+0x8a/0x100 [nf_nat]
[ 1216.311046]  [<ffffffff812a0303>] ? compat_prepare_timeout+0x13/0xb0
[ 1216.311046]  [<ffffffffa0035848>] nf_nat_l3proto_ipv4_exit+0x10/0x23 [nf_nat_ipv4]
[ 1216.311046]  [<ffffffff8109f4a5>] sys_delete_module+0x235/0x2b0
[ 1216.311046]  [<ffffffff810b8193>] ? __audit_syscall_entry+0x1b3/0x1f0
[ 1216.311046]  [<ffffffff810b8776>] ? __audit_syscall_exit+0x3e6/0x410
[ 1216.311046]  [<ffffffff816679e2>] system_call_fastpath+0x16/0x1b
[ 1216.311046] Code: 75 6e 0f b6 46 01 84 c0 74 05 3a 42 3e 75 61 80 7e 02 00 74 43 48 c7 c7 08 f2 02 a0 e8 37 3b 63 e1 48 8b 03 48 8b 53 08 48 85 c0 <48> 89 02 74 04 48 89 50 08 48 be 00 02 20 00 00 00 ad de 48 c7 
[ 1216.311046] RIP  [<ffffffffa002c303>] nf_nat_proto_clean+0x73/0xd0 [nf_nat]
[ 1216.311046]  RSP <ffff88007808fe18>
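
One possible reading of the trace: RDX and R10 hold 0xdead000000200200 and
R11 holds 0xdead000000100100, which appear to match the kernel's list poison
values on x86_64, and the faulting instruction is a store through RDX.  That
pattern would be consistent with an hlist entry being unlinked a second time.
The userspace sketch below only illustrates that mechanism.  The ex_ names
are hypothetical and this is not the nf_nat code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ex_hnode {
	struct ex_hnode *next, **pprev;
};

struct ex_hhead {
	struct ex_hnode *first;
};

/* Poison values as they appear in the register dump above. */
#define EX_POISON1 ((struct ex_hnode *)(uintptr_t)0xdead000000100100ULL)
#define EX_POISON2 ((struct ex_hnode **)(uintptr_t)0xdead000000200200ULL)

static void ex_hadd_head(struct ex_hnode *n, struct ex_hhead *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink a node, refusing (instead of faulting) if it is already poisoned. */
static bool ex_hdel_checked(struct ex_hnode *n)
{
	assert(n);
	if (n->pprev == EX_POISON2)
		return false;		/* already unlinked once */
	*n->pprev = n->next;		/* this store faults if pprev is poison */
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = EX_POISON1;
	n->pprev = EX_POISON2;
	return true;
}
```

An unchecked delete performs the store through pprev unconditionally, so a
second delete of the same node writes to the poison address.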


-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer



^ permalink raw reply

* Re: [PATCH net-next] net: only run neigh_forced_gc() from one cpu
From: Neil Horman @ 2012-09-19 12:54 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Maciej Żenczykowski, Tom Herbert,
	Lorenzo Colitti
In-Reply-To: <1348052957.26523.680.camel@edumazet-glaptop>

On Wed, Sep 19, 2012 at 01:09:17PM +0200, Eric Dumazet wrote:
> On Wed, 2012-09-19 at 13:07 +0200, Eric Dumazet wrote:
> > On Wed, 2012-09-19 at 06:50 -0400, Neil Horman wrote:
> > 
> > > This is going to cause callers in neigh_alloc to immediately fail their
> > > allocation attempts.  Would it be a good idea to modify that call site so that
> > > instead of returning NULL, instead reread tbl->entries before comparing to
> > > gc_thresh3, on the hope that the cpu in the garbage collecting routine has freed
> > > some entries?
> > 
> > neigh_alloc() fails only if gc_thresh3 is hit, and if it is hit, we are
> > under attack by definition.
> > 
> > (the gc is run every 5 seconds if above gc_thresh2 and below
> > gc_thresh3)
> > 
> > No matter what you try, the attacker is going to be the winner.
> > 
> > The best thing here is to drop packets, not spend several
> > milliseconds serving one packet, as queues are going to tail drop anyway.
> > 
> 
> I meant several hundred milliseconds per packet.
> 
> In our tests we even trigger a softlockup, so that's more than 10 seconds
> waiting for the rwlock, for a single packet.
> 
OK, thanks for the explanation.
Acked-by: Neil Horman <nhorman@tuxdriver.com>
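
To make the idea under discussion concrete, here is a userspace sketch of
"only one CPU runs the forced gc; everyone else backs off immediately".  It
is not the patch itself: struct ex_table and ex_forced_gc() are hypothetical,
and a C11 atomic_flag stands in for whatever primitive the kernel code uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the neighbour table. */
struct ex_table {
	atomic_flag gc_running;	/* set while one caller is collecting */
	int entries;
	int gc_runs;
};

/* Returns true if this caller ran the collection, false if another
 * caller was already collecting and this one backed off. */
static bool ex_forced_gc(struct ex_table *tbl)
{
	assert(tbl);
	if (atomic_flag_test_and_set_explicit(&tbl->gc_running,
					      memory_order_acquire))
		return false;	/* do not queue up behind the collector */

	tbl->entries /= 2;	/* pretend half the entries were reclaimable */
	tbl->gc_runs++;

	atomic_flag_clear_explicit(&tbl->gc_running, memory_order_release);
	return true;
}
```

A caller that gets false back would fail the allocation and drop the packet
rather than spin on the table lock, which is the trade-off Eric describes.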

^ permalink raw reply

* [RFC PATCH] tcp: use of undefined variable
From: Alan Cox @ 2012-09-19 14:46 UTC (permalink / raw)
  To: netdev

From: Alan Cox <alan@linux.intel.com>

Both tcp_timewait_state_process and tcp_check_req use the same basic
construct of

	struct tcp_options_received tmp_opt;
	tmp_opt.saw_tstamp = 0;

then call

	tcp_parse_options

However, if they are fed a frame containing a TCP_SACK then the code's
behaviour is undefined because opt_rx->sack_ok is uninitialised data.

This ought to be documented if it is intentional.
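
For comparison, the alternative to documenting the behaviour would be to zero
the whole structure before parsing.  The sketch below uses a cut-down,
hypothetical struct ex_opts and parser to show the pattern.  It is not the
kernel's struct tcp_options_received or tcp_parse_options().

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, cut-down options structure. */
struct ex_opts {
	unsigned int saw_tstamp : 1;
	unsigned int sack_ok : 4;
	unsigned int used_sack : 1;
};

/* Hypothetical parser: it reads sack_ok when it meets a SACK block, so
 * the caller must have given that field a defined value. */
static void ex_parse(struct ex_opts *o, int has_tstamp, int has_sack_block)
{
	assert(o);
	if (has_tstamp)
		o->saw_tstamp = 1;
	if (has_sack_block && o->sack_ok)
		o->used_sack = 1;
}

/* Zero every field first, so the parser never reads indeterminate data. */
static struct ex_opts ex_parse_zeroed(int has_tstamp, int has_sack_block)
{
	struct ex_opts o;

	memset(&o, 0, sizeof(o));
	ex_parse(&o, has_tstamp, has_sack_block);
	return o;
}
```

Setting only saw_tstamp, as the two call sites do today, is cheaper but
leaves the result dependent on stack contents whenever the parser reads
sack_ok.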

Signed-off-by: Alan Cox <alan@linux.intel.com>
---

 net/ipv4/tcp_minisocks.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e965319..a4ace80 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -85,6 +85,8 @@ static bool tcp_in_window(u32 seq, u32 end_seq, u32 s_win, u32 e_win)
  * spinlock it. I do not want! Well, probability of misbehaviour
  * is ridiculously low and, seems, we could use some mb() tricks
  * to avoid misread sequence numbers, states etc.  --ANK
+ *
+ * We don't need to initialize tmp_opt.sack_ok as we don't use the results
  */
 enum tcp_tw_status
 tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
@@ -96,6 +98,7 @@ tcp_timewait_state_process(struct inet_timewait_sock *tw, struct sk_buff *skb,
 	bool paws_reject = false;
 
 	tmp_opt.saw_tstamp = 0;
+
 	if (th->doff > (sizeof(*th) >> 2) && tcptw->tw_ts_recent_stamp) {
 		tcp_parse_options(skb, &tmp_opt, &hash_location, 0, NULL);
 
@@ -522,6 +525,8 @@ EXPORT_SYMBOL(tcp_create_openreq_child);
  *
  * XXX (TFO) - The current impl contains a special check for ack
  * validation and inside tcp_v4_reqsk_send_ack(). Can we do better?
+ *
+ * We don't need to initialize tmp_opt.sack_ok as we don't use the results
  */
 
 struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,

^ permalink raw reply related

* Re: [PATCH 2/2] Using LP firmware for taking advantage of the low-power capabilities.
From: Larry Finger @ 2012-09-19 15:08 UTC (permalink / raw)
  To: Jarl Friis
  Cc: Stefano Brivio, Gábor Stefanik, netdev, linux-wireless,
	b43-dev
In-Reply-To: <1348053493-22955-2-git-send-email-jarl@softace.dk>

On 09/19/2012 06:18 AM, Jarl Friis wrote:
> This is using the LP specific firmware to better take advantage of the
> Low-Power capabilities.
>
> Signed-off-by: Jarl Friis <jarl@softace.dk>
> ---
>   drivers/net/wireless/b43/main.c |   16 ++++++++++++++--
>   1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/wireless/b43/main.c b/drivers/net/wireless/b43/main.c
> index 202a0eb..9ee6030 100644
> --- a/drivers/net/wireless/b43/main.c
> +++ b/drivers/net/wireless/b43/main.c
> @@ -8,6 +8,7 @@
>     Copyright (c) 2005 Danny van Dyk <kugelfang@gentoo.org>
>     Copyright (c) 2005 Andreas Jaggi <andreas.jaggi@waterwave.ch>
>     Copyright (c) 2010-2011 Rafał Miłecki <zajec5@gmail.com>
> +  Copyright (c) 2012 Jarl Friis <jarl@softace.dk>
>
>     SDIO support
>     Copyright (c) 2009 Albert Herranz <albert_herranz@yahoo.es>
> @@ -72,6 +73,7 @@ MODULE_FIRMWARE("b43/ucode11.fw");
>   MODULE_FIRMWARE("b43/ucode13.fw");
>   MODULE_FIRMWARE("b43/ucode14.fw");
>   MODULE_FIRMWARE("b43/ucode15.fw");
> +MODULE_FIRMWARE("b43/ucode16_lp.fw");
>   MODULE_FIRMWARE("b43/ucode16_mimo.fw");
>   MODULE_FIRMWARE("b43/ucode5.fw");
>   MODULE_FIRMWARE("b43/ucode9.fw");
> @@ -2208,6 +2210,12 @@ static int b43_try_request_fw(struct b43_request_fw_context *ctx)
>   			else
>   				goto err_no_ucode;
>   			break;
> +		case B43_PHYTYPE_LP:
> +			if (rev >= 16)
> +				filename = "ucode16_lp";
> +			else
> +				goto err_no_ucode;
> +			break;
>   		case B43_PHYTYPE_HT:
>   			if (rev == 29)
>   				filename = "ucode29_mimo";
> @@ -2277,8 +2285,10 @@ static int b43_try_request_fw(struct b43_request_fw_context *ctx)
>   			filename = "lp0initvals13";
>   		else if (rev == 14)
>   			filename = "lp0initvals14";
> -		else if (rev >= 15)
> +		else if (rev == 15)
>   			filename = "lp0initvals15";
> +		else if (rev >= 16)
> +			filename = "lp0initvals16";
>   		else
>   			goto err_no_initvals;
>   		break;
> @@ -2336,8 +2346,10 @@ static int b43_try_request_fw(struct b43_request_fw_context *ctx)
>   			filename = "lp0bsinitvals13";
>   		else if (rev == 14)
>   			filename = "lp0bsinitvals14";
> -		else if (rev >= 15)
> +		else if (rev == 15)
>   			filename = "lp0bsinitvals15";
> +		else if (rev >= 16)
> +			filename = "lp0bsinitvals16";
>   		else
>   			goto err_no_initvals;
>   		break;

I have some questions about this patch. Where did you get the information needed 
to make these changes? Did it come from reverse engineering some Broadcom code, 
or did you look at their actual code? There is a great deal of difference 
relative to our "clean-room" status. Anyone that has seen non-GPL Broadcom 
material cannot contribute code to b43.

Have you tested this code on devices with rev>=16?

Now for some comments: This patch also needs the "b43:" added to the subject. In 
addition, you appear to have at least one white-space error in the 
MODULE_FIRMWARE line. Is the addition of your copyright to the driver warranted 
by this change? For example, I have made much larger contributions to b43 over 
the years before I started doing reverse-engineering on this driver, but I never 
added my copyright. Your "Signed-off-by" implies copyright for the patch.

Larry

^ permalink raw reply

* Re: [PATCH v2 4/9] net/macb: remove macb_get_drvinfo()
From: Ben Hutchings @ 2012-09-19 15:10 UTC (permalink / raw)
  To: Nicolas Ferre
  Cc: netdev, davem, havard, linux-arm-kernel, plagnioj,
	patrice.vilchez, linux-kernel
In-Reply-To: <3fbd9b0eb1e255eccd14ad43044e146776baa963.1348055112.git.nicolas.ferre@atmel.com>

On Wed, 2012-09-19 at 13:55 +0200, Nicolas Ferre wrote:
> This function has little meaning so remove it altogether and
> let ethtool core fill in the fields automatically.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
>  drivers/net/ethernet/cadence/macb.c | 11 -----------
>  1 file changed, 11 deletions(-)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index 2948553..31f945c 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -1217,20 +1217,9 @@ static int macb_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
>  	return phy_ethtool_sset(phydev, cmd);
>  }
>  
> -static void macb_get_drvinfo(struct net_device *dev,
> -			     struct ethtool_drvinfo *info)
> -{
> -	struct macb *bp = netdev_priv(dev);
> -
> -	strcpy(info->driver, bp->pdev->dev.driver->name);
> -	strcpy(info->version, "$Revision: 1.14 $");
> -	strcpy(info->bus_info, dev_name(&bp->pdev->dev));
> -}
> -
>  static const struct ethtool_ops macb_ethtool_ops = {
>  	.get_settings		= macb_get_settings,
>  	.set_settings		= macb_set_settings,
> -	.get_drvinfo		= macb_get_drvinfo,
>  	.get_link		= ethtool_op_get_link,
>  };
>  

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 2/4] ipv6: unify conntrack reassembly expire code with standard one
From: Jesper Dangaard Brouer @ 2012-09-19 15:12 UTC (permalink / raw)
  To: Cong Wang
  Cc: netdev, Netfilter Developers, Herbert Xu, Michal Kubeček,
	David Miller, Hideaki YOSHIFUJI, Patrick McHardy,
	Pablo Neira Ayuso
In-Reply-To: <1348023011-16195-3-git-send-email-amwang@redhat.com>

On Wed, 19 Sep 2012, Cong Wang wrote:

[cut]
> With this patch applied, I can see ICMP Time Exceeded sent
> from the receiver when the sender sent out 3/4 fragmented
> IPv6 UPD packet.

Typo "UPD" -> "UDP"

If people want to redo the IPv6 UDP fragment tests, they can use my scapy 
script, and comment out sending the last fragment:
  https://github.com/netoptimizer/network-testing/blob/master/scapy/ipv6_fragment01.py

Another thing, could you please "mark"/put the version of the patch in the 
subject line, like:

  [PATCH V4 2/4] ipv6: ...

This makes it easier to follow which version of the patch people are
replying to.

With git send-email I think you have to do:

   git send-email --subject-prefix="PATCH V4"

And with stg (stacked git) I usually do:

   stg mail --version "V4" --to netdev ...

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------

^ permalink raw reply

* Macvtap bug: contractor wanted
From: Richard Davies, Chris Webb @ 2012-09-19 15:11 UTC (permalink / raw)
  To: netdev, qemu-devel; +Cc: Jason Wang, Arnd Bergmann, Michael S. Tsirkin
In-Reply-To: <20120816153613.GA22326@redhat.com>

Hi. We run a cloud compute provider using qemu-kvm and macvtap and are keen
to find a paid contractor to fix a bug with unusably slow inbound networking
over macvtap.

We originally reported the bug in this thread (report copied below):

  http://marc.info/?t=134511098600002

We have also reproduced using only a Fedora 17 Live CD:

  https://bugzilla.redhat.com/show_bug.cgi?id=855640

This bug is a serious problem for us, since we have built a new version of our
product which suffers from it and did not realise in testing, only once we had
live production installs.

Many thanks to Michael Tsirkin for his initial help. However, we appreciate
that his time is limited and divided among many projects. Given the commercial
time pressure on us to fix this bug, we are keen to hire a contractor to start
work immediately.

If anyone knowledgeable in the area would be interested in being paid to work
on this, or if you know someone who might be, we would be delighted to hear
from you.

Cheers,

Chris and Richard.

P.S. The original report read as follows:

  I'm experiencing a problem with qemu + macvtap which I can reproduce on a
  variety of hardware, with kernels varying from 3.0.4 (the oldest I tried) to
  3.5.1 and with qemu[-kvm] versions 0.14.1, 1.0, and 1.1.

  Large data transfers over TCP into a guest from another machine on the
  network are very slow (often less than 100kB/s) whereas transfers outbound
  from the guest, between two guests on the same host, or between the guest
  and its host run at normal speeds (>= 50MB/s).

  The slow inbound data transfer speeds up substantially when a ping flood is
  aimed either at the host or the guest, or when the qemu process is straced.
  Presumably both of these are ways to wake up something that is otherwise
  sleeping too long?

  For example, I can run

    ip addr add 192.168.1.2/24 dev eth0
    ip link set eth0 up
    ip link add link eth0 name tap0 address 02:02:02:02:02:02 type macvtap mode bridge
    ip link set tap0 up
    qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
      -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
      -net tap,fd=3 3<>/dev/tap$(< /sys/class/net/tap0/ifindex)

  on one physical host which is otherwise completely idle. From a second
  physical host on the same network, I then scp a large (say 50MB) file onto
  the new guest. On a gigabit LAN, speeds consistently drop to less than
  100kB/s as the transfer progresses, within a second of starting.

  The choice of virtio virtual nic in the above isn't significant: the same thing
  happens with e1000 or rtl8139. You can also replace the scp with a straight
  netcat and see the same effect.

  Doing the transfer in the other direction (i.e. copying a large file from the
  guest to an external host) achieves 50MB/s or faster as expected. Copying
  between two guests on the same host (i.e. taking advantage of the 'mode
  bridge') is also fast.

  If I create a macvlan device attached to eth0 and move the host IP address to
  that, I can communicate between the host itself and the guest because of the
  'mode bridge'. Again, this case is fast in both directions.

  Using a bridge and a standard tap interface, transfers in and out are fast
  too:

    ip tuntap add tap0 mode tap
    brctl addbr br0
    brctl addif br0 eth0
    brctl addif br0 tap0
    ip link set eth0 up
    ip link set tap0 up
    ip link set br0 up
    ip addr add 192.168.1.2/24 dev br0
    qemu-kvm -hda debian.img -cpu host -m 512 -vnc :0 \
      -net nic,model=virtio,macaddr=02:02:02:02:02:02 \
      -net tap,script=no,downscript=no,ifname=tap0

  As mentioned in the summary at the beginning of this report, when I strace a
  guest in the original configuration which is receiving data slowly, the data
  rate improves from less than 100kB/s to around 3.1MB/s. Similarly, if I ping
  flood either the guest or the host it is running on from another machine on
  the network, the transfer rate improves to around 1.1MB/s. This seems quite
  suggestive of a problem with delayed wake-up of the guest.

  Two reasonably up-to-date examples of machines I've reproduced this on are
  my laptop with an r8169 gigabit ethernet card, Debian qemu-kvm 1.0 and
  upstream 3.4.8 kernel whose .config and boot dmesg are at

    http://cdw.me.uk/tmp/laptop-config.txt
    http://cdw.me.uk/tmp/laptop-dmesg.txt

  and one of our large servers with an igb gigabit ethernet card, upstream
  qemu-kvm 1.1.1 and upstream 3.5.1 linux:

    http://cdw.me.uk/tmp/server-config.txt
    http://cdw.me.uk/tmp/server-dmesg.txt

  For completeness, I've put the Debian 6 test image I've been using for
  testing at

    http://cdw.me.uk/tmp/test-debian.img.xz

  though I've seen the same problem from a variety of guest operating systems.
  (In fact, I've not yet found any combination of host kernel, guest OS and
  hardware which doesn't show these symptoms, so it seems to be very easy to
  reproduce.)

We later found that

  -CONFIG_INTEL_IDLE=y
  +# CONFIG_INTEL_IDLE is not set

helped the problem on my laptop, but none of the obvious similar things made
any difference on AMD hardware.

The bug appears whether or not vhost-net is used, and irrespective of emulated
NIC in qemu, so is very likely to be a kernel issue rather than a qemu issue.
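
For anyone reproducing the report without scp or netcat, a rough stand-in for the "straight netcat" transfer can be sketched in Python. Everything here (function names, chunk size, the loopback demo) is an assumption for illustration and not part of the original report; in a real test the receiver would run inside the guest and the sender on a second physical host.

```python
import socket
import threading
import time


def receive_and_measure(server_sock):
    """Accept one connection, drain it, return (bytes received, seconds)."""
    conn, _ = server_sock.accept()
    received = 0
    start = time.monotonic()
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:  # sender closed the connection
                break
            received += len(chunk)
    return received, time.monotonic() - start


def send_bytes(host, port, total, chunk_size=65536):
    """Send `total` zero bytes to host:port over TCP."""
    payload = b"\0" * chunk_size
    with socket.create_connection((host, port)) as sock:
        remaining = total
        while remaining > 0:
            n = min(remaining, chunk_size)
            sock.sendall(payload[:n])
            remaining -= n


def loopback_demo(total=8 * 1024 * 1024):
    """Run sender and receiver in one process over 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}

    def run_receiver():
        result["received"], result["elapsed"] = receive_and_measure(server)

    receiver = threading.Thread(target=run_receiver)
    receiver.start()
    send_bytes("127.0.0.1", port, total)
    receiver.join()
    server.close()
    return result["received"], result["elapsed"]


if __name__ == "__main__":
    received, elapsed = loopback_demo()
    rate = received / max(elapsed, 1e-9) / 1e6
    print(f"{received} bytes in {elapsed:.3f}s ({rate:.1f} MB/s)")
```

The loopback run only checks that the measurement works; it says nothing about macvtap, since the traffic never leaves the host.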

^ permalink raw reply

* Re: [PATCH v2 7/9] net/macb: ethtool interface: add register dump feature
From: Ben Hutchings @ 2012-09-19 15:14 UTC (permalink / raw)
  To: Nicolas Ferre
  Cc: netdev, davem, havard, linux-arm-kernel, plagnioj,
	patrice.vilchez, linux-kernel
In-Reply-To: <20ebfb29f6f4a84d8ba20553e2d81cd456f438de.1348055112.git.nicolas.ferre@atmel.com>

On Wed, 2012-09-19 at 13:55 +0200, Nicolas Ferre wrote:
> Add macb_get_regs() ethtool function and its helper function:
> macb_get_regs_len().
> The version field is deduced from the IP revision which gives the
> "MACB or GEM" information. An additional version field is reserved.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
[...]

Please also send the register dump decoder for the ethtool utility once
this series is accepted.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [RFC] tcp: use order-3 pages in tcp_sendmsg()
From: Eric Dumazet @ 2012-09-19 15:14 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120917.130732.1894375657044880827.davem@davemloft.net>

On Mon, 2012-09-17 at 13:07 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Mon, 17 Sep 2012 19:04:53 +0200
> 
> > On Mon, 2012-09-17 at 19:02 +0200, Eric Dumazet wrote:
> > 
> >> A driver already exports a dev->gso_max_size, dev->gso_max_segs, I guess
> >> it could export a dev->max_seg_order (default to 0)
> > 
> > Oh well, if we use a per thread order-3 page, a driver wont define an
> > order, but the max size of a segment (dev->max_seg_size).
> 
> Since you said that your audit showed that most can handle arbitrary
> segment sizes, it's better to default to infinity or similar.
> 
> Otherwise we'll have to annotate almost every single driver with a
> non-zero value, that's not an efficient way to handle this and
> deploy the higher performance quickly.

I did some tests and got no problem so far, even using splice() [ this
one was tricky because it only deals with order-0 pages at this moment ]

NIC tested : ixgbe, igb, bnx2x, tg3, mellanox mlx4

On loopback, performance of netperf goes from 31900 Mb/s to 38500 Mb/s,
that's roughly a 20% increase.
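
As a quick check of that figure, using only the two netperf numbers quoted above (the variable names are mine):

```python
before_mbps = 31900  # loopback netperf result before the change
after_mbps = 38500   # loopback netperf result with order-3 pages

gain = (after_mbps - before_mbps) / before_mbps
print(f"relative gain: {gain:.1%}")
```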

^ permalink raw reply

* Re: [RFC PATCHv2 bridge 5/7] bridge: Add vlan support to static neighbors
From: John Fastabend @ 2012-09-19 15:20 UTC (permalink / raw)
  To: Vlad Yasevich; +Cc: netdev, shemminger
In-Reply-To: <1348058536-22607-6-git-send-email-vyasevic@redhat.com>

On 9/19/2012 5:42 AM, Vlad Yasevich wrote:
> ---
>   include/linux/neighbour.h |    2 +-
>   net/bridge/br_fdb.c       |   12 ++++++------
>   2 files changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h
> index 275e5d6..044df8f 100644
> --- a/include/linux/neighbour.h
> +++ b/include/linux/neighbour.h
> @@ -7,7 +7,7 @@
>   struct ndmsg {
>   	__u8		ndm_family;
>   	__u8		ndm_pad1;
> -	__u16		ndm_pad2;
> +	__u16		ndm_vlan;

But ndm_pad2 is also used in neighbour.c so you'll need to fix that up
as well.

net/core/neighbour.c: In function ‘neigh_fill_info’:
net/core/neighbour.c:2152: error: ‘struct ndmsg’ has no member named 
‘ndm_pad2’
net/core/neighbour.c: In function ‘pneigh_fill_info’:
net/core/neighbour.c:2203: error: ‘struct ndmsg’ has no member named 
‘ndm_pad2’
make[2]: *** [net/core/neighbour.o] Error 1
make[1]: *** [net/core] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [net] Error 2
make: *** Waiting for unfinished jobs....
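
A small sketch of why the rename is source-breaking but not layout-breaking. Only the first three fields appear in the quoted diff; the remaining four (ifindex, state, flags, type) are filled in from memory of the header and should be treated as an assumption, as should the helper name.

```python
import struct

# struct ndmsg in declaration order:
#   __u8 family, __u8 pad1, __u16 pad2 (ndm_vlan in the patch),
#   __s32 ifindex, __u16 state, __u8 flags, __u8 type   <- assumed, not in the diff
NDMSG = struct.Struct("=BBHiHBB")

AF_BRIDGE = 7  # Linux address family number for bridge


def pack_ndmsg(family, vlan, ifindex, state=0, flags=0, ndm_type=0):
    """Pack an ndmsg, placing the VLAN id in the former ndm_pad2 slot."""
    return NDMSG.pack(family, 0, vlan, ifindex, state, flags, ndm_type)


msg = pack_ndmsg(AF_BRIDGE, vlan=100, ifindex=3)
print("size:", NDMSG.size, "bytes 2-3:", msg[2:4].hex())
```

The size and offsets are the same whichever name the 16-bit field carries, so the visible fallout of the rename is the compile error above: any code that still refers to ndm_pad2 by name has to be updated along with the header.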

^ permalink raw reply

* Re: [RFC PATCHv2 bridge 5/7] bridge: Add vlan support to static neighbors
From: Vlad Yasevich @ 2012-09-19 15:24 UTC (permalink / raw)
  To: John Fastabend; +Cc: netdev, shemminger
In-Reply-To: <5059E2BB.8060507@intel.com>

On 09/19/2012 11:20 AM, John Fastabend wrote:
> On 9/19/2012 5:42 AM, Vlad Yasevich wrote:
>> ---
>>   include/linux/neighbour.h |    2 +-
>>   net/bridge/br_fdb.c       |   12 ++++++------
>>   2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h
>> index 275e5d6..044df8f 100644
>> --- a/include/linux/neighbour.h
>> +++ b/include/linux/neighbour.h
>> @@ -7,7 +7,7 @@
>>   struct ndmsg {
>>       __u8        ndm_family;
>>       __u8        ndm_pad1;
>> -    __u16        ndm_pad2;
>> +    __u16        ndm_vlan;
>
> But ndm_pad2 is also used in neighbour.c so you'll need to fix that up
> as well.
>
> net/core/neighbour.c: In function ‘neigh_fill_info’:
> net/core/neighbour.c:2152: error: ‘struct ndmsg’ has no member named
> ‘ndm_pad2’
> net/core/neighbour.c: In function ‘pneigh_fill_info’:
> net/core/neighbour.c:2203: error: ‘struct ndmsg’ has no member named
> ‘ndm_pad2’
> make[2]: *** [net/core/neighbour.o] Error 1
> make[1]: *** [net/core] Error 2
> make[1]: *** Waiting for unfinished jobs....
> make: *** [net] Error 2
> make: *** Waiting for unfinished jobs....


D'oh!!!  Patches from the wrong branch.  Sorry about that.

-vlad

^ permalink raw reply

* Re: Possible networking regression in 3.6.0
From: Chris Clayton @ 2012-09-19 15:26 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1347979239.26523.267.camel@edumazet-glaptop>

>
> It would help to have some traffic sample, maybe.
>
> Especially if the problem is not easily reproducible for us.
>

OK, I've used netsniff-ng to capture the traffic on all interfaces on 
the host (that would be tap0 and eth0, I guess) whilst attempting to 
ping the router from the WinXP KVM client. The result is a pcap file 
that I processed with tcpdump to produce:

reading from file net-trace.pcap, link-type EN10MB (Ethernet)
14:56:31.406336 ARP, Request who-has 192.168.200.254 tell 192.168.200.1, 
length 28
         0x0000:  0001 0800 0604 0001 5254 0c3b 1728 c0a8
         0x0010:  c801 0000 0000 0000 c0a8 c8fe
14:56:31.406357 ARP, Reply 192.168.200.254 is-at 46:83:93:8f:f0:7e, 
length 28
         0x0000:  0001 0800 0604 0002 4683 938f f07e c0a8
         0x0010:  c8fe 5254 0c3b 1728 c0a8 c801
14:56:31.406534 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id 
512, seq 4352, length 40
         0x0000:  4500 003c 0195 0000 8001 efd8 c0a8 c801
         0x0010:  c0a8 0001 0800 3a5c 0200 1100 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:31.406566 ARP, Request who-has 192.168.0.1 tell 192.168.0.40, 
length 28
         0x0000:  0001 0800 0604 0001 5c9a d85c 6331 c0a8
         0x0010:  0028 0000 0000 0000 c0a8 0001
14:56:31.410830 ARP, Reply 192.168.0.1 is-at 00:1f:33:80:09:44, length 46
         0x0000:  0001 0800 0604 0002 001f 3380 0944 c0a8
         0x0010:  0001 5c9a d85c 6331 c0a8 0028 c0a8 0001
         0x0020:  e000 0001 1164 ee9b 0000 0000 4500
14:56:31.410851 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id 
512, seq 4352, length 40
         0x0000:  4500 003c 0195 0000 7f01 b8b2 c0a8 0028
         0x0010:  c0a8 0001 0800 3a5c 0200 1100 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:31.414474 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512, 
seq 4352, length 40
         0x0000:  4500 003c cf4f 0000 ff01 6af7 c0a8 0001
         0x0010:  c0a8 0028 0000 425c 0200 1100 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:36.404781 ARP, Request who-has 192.168.0.40 tell 192.168.0.1, 
length 46
         0x0000:  0001 0800 0604 0001 001f 3380 0944 c0a8
         0x0010:  0001 0000 0000 0000 c0a8 0028 c0a8 0001
         0x0020:  c0a8 0028 0000 425c 0200 1100 6162
14:56:36.404806 ARP, Reply 192.168.0.40 is-at 5c:9a:d8:5c:63:31, length 28
         0x0000:  0001 0800 0604 0002 5c9a d85c 6331 c0a8
         0x0010:  0028 001f 3380 0944 c0a8 0001
14:56:36.689750 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id 
512, seq 4608, length 40
         0x0000:  4500 003c 0196 0000 8001 efd7 c0a8 c801
         0x0010:  c0a8 0001 0800 395c 0200 1200 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:36.689774 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id 
512, seq 4608, length 40
         0x0000:  4500 003c 0196 0000 7f01 b8b1 c0a8 0028
         0x0010:  c0a8 0001 0800 395c 0200 1200 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:36.693330 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512, 
seq 4608, length 40
         0x0000:  4500 003c cf50 0000 ff01 6af6 c0a8 0001
         0x0010:  c0a8 0028 0000 415c 0200 1200 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:42.189424 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id 
512, seq 4864, length 40
         0x0000:  4500 003c 0197 0000 8001 efd6 c0a8 c801
         0x0010:  c0a8 0001 0800 385c 0200 1300 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:42.189447 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id 
512, seq 4864, length 40
         0x0000:  4500 003c 0197 0000 7f01 b8b0 c0a8 0028
         0x0010:  c0a8 0001 0800 385c 0200 1300 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:42.193029 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512, 
seq 4864, length 40
         0x0000:  4500 003c cf51 0000 ff01 6af5 c0a8 0001
         0x0010:  c0a8 0028 0000 405c 0200 1300 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:47.689414 IP 192.168.200.1 > 192.168.0.1: ICMP echo request, id 
512, seq 5120, length 40
         0x0000:  4500 003c 0198 0000 8001 efd5 c0a8 c801
         0x0010:  c0a8 0001 0800 375c 0200 1400 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:47.689439 IP 192.168.0.40 > 192.168.0.1: ICMP echo request, id 
512, seq 5120, length 40
         0x0000:  4500 003c 0198 0000 7f01 b8af c0a8 0028
         0x0010:  c0a8 0001 0800 375c 0200 1400 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869
14:56:47.693661 IP 192.168.0.1 > 192.168.0.40: ICMP echo reply, id 512, 
seq 5120, length 40
         0x0000:  4500 003c cf52 0000 ff01 6af4 c0a8 0001
         0x0010:  c0a8 0028 0000 3f5c 0200 1400 6162 6364
         0x0020:  6566 6768 696a 6b6c 6d6e 6f70 7172 7374
         0x0030:  7576 7761 6263 6465 6667 6869

Is this what you asked for?

Chris
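
One way to read the trace above is to decode the IPv4 headers from the hex dump. The sketch below takes the first 20 bytes of the first echo request as sent by the guest and of the same request as it leaves the host, both copied from the tcpdump output; the function name and dictionary keys are my own.

```python
import ipaddress
import struct

# IPv4 headers copied from the hex dump above (same IP id 0x0195).
FROM_GUEST = "4500 003c 0195 0000 8001 efd8 c0a8 c801 c0a8 0001"
AFTER_HOST = "4500 003c 0195 0000 7f01 b8b2 c0a8 0028 c0a8 0001"


def parse_ipv4_header(hexdump):
    """Decode a 20-byte IPv4 header and verify its checksum."""
    raw = bytes.fromhex(hexdump.replace(" ", ""))
    fields = struct.unpack("!BBHHHBBH4s4s", raw)
    # One's-complement sum over all ten 16-bit words, checksum included;
    # a valid header folds to 0xFFFF.
    total = sum(struct.unpack("!10H", raw))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return {
        "id": fields[3],
        "ttl": fields[5],
        "protocol": fields[6],
        "src": str(ipaddress.IPv4Address(fields[8])),
        "dst": str(ipaddress.IPv4Address(fields[9])),
        "checksum_ok": total == 0xFFFF,
    }


guest = parse_ipv4_header(FROM_GUEST)
host = parse_ipv4_header(AFTER_HOST)
for name, hdr in (("guest", guest), ("host", host)):
    print(name, hdr)
```

Both headers verify, the TTL drops from 128 to 127, and the source changes from 192.168.200.1 to 192.168.0.40, which is consistent with the host forwarding the request and rewriting its source address. In the trace as posted, the echo replies are addressed to 192.168.0.40, and no reply addressed to 192.168.200.1 appears.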

^ permalink raw reply

* Re: New commands to configure IOV features
From: Greg Rose @ 2012-09-19 15:53 UTC (permalink / raw)
  To: Yuval Mintz
  Cc: davem@davemloft.net, netdev@vger.kernel.org, Ariel Elior,
	Eilon Greenstein
In-Reply-To: <5059A767.2090307@broadcom.com>

On Wed, 19 Sep 2012 14:07:19 +0300
Yuval Mintz <yuvalmin@broadcom.com> wrote:

> >>> Back to the original discussion though--has anyone got any ideas
> >>> about the best way to trigger runtime creation of VFs?  I don't
> >>> know what the binary APIs look like, but via sysfs I could see
> >>> something like
> >>>
> >>> echo number_of_new_vfs_to_create >
> >>> /sys/bus/pci/devices/<address>/create_vfs
> >>>
> >>> Something else that occurred to me--is there buy-in from driver
> >>> maintainers?  I know the Intel ethernet drivers (what I'm most
> >>> familiar
> >>> with) would need to be substantially modified to support
> >>> on-the-fly addition of new vfs.  Currently they assume that the
> >>> number of vfs is known at module init time.
> >>
> >> Why couldn't rtnl_link_ops be used for this? It is already the
> >> preferred interface to create VLANs, bond devices, and other
> >> virtual devices. The one open question is whether the created VFs
> >> exist in the kernel as devices or are only visible to the guest.
> > 
> > I would say that rtnl_link_ops are network oriented and not
> > appropriate for something like a storage controller or graphics
> > device, which are two other common SR-IOV capable devices.
> 
> Hi Dave,
> 
> We're currently fine-tuning our SRIOV support, which we will shortly
> send upstream.
> 
> We've encountered a problem though - all drivers currently supporting
> SRIOV do so with the usage of a module param: e.g., 'max_vfs' for
> ixgbe, 'num_vfs' for benet, etc.
> The SRIOV feature is disabled by default on all the drivers; it can
> only be enabled via usage of the module param.
> 
> We don't want the lack of SRIOV module param in the bnx2x driver to be
> the bottleneck when we submit the SRIOV feature upstream, and we
> also don't want to enable SRIOV by default (following the same logic
> of other drivers; most users don't use SRIOV and it would strain their
> resources).
> 
> As we see it, there are several possible ways of solving the issue:
>  1. Use some network-tool (e.g., ethtool).
>  2. Implement a standard sysfs interface for PCIe devices, as SRIOV is
>     not solely network-related (this should be done via the PCI linux
>     tree).

I was not able to attend the Linux conference held at the end of August
myself, but coworkers of mine here at Intel informed me that method 2 here
seems to be the preferred approach.  Perhaps some folks who attended
the conference can chime in with more specifics.

- Greg
LAN Access Division
Intel Corp.



>  3. Implement a module param in our bnx2x code.
> 
> We would like to know what's your preferred method for solving this
> issue, and to hear if you have another (better?) method by which we
> can add this kind of support.
> 
> Thanks,
> Yuval Mintz
> 
> 

^ permalink raw reply

* Re: [PATCH net-next] net: more accurate network taps in transmit path
From: Jamie Gloudon @ 2012-09-19 15:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1348037089.26523.397.camel@edumazet-glaptop>

Just to report: this patch fixed the invalid TCP TX checksum issue seen via taps for me. Thanks!

On Wed, Sep 19, 2012 at 08:44:49AM +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> dev_queue_xmit_nit() should be called right before ndo_start_xmit()
> calls or we might give wrong packet contents to taps users :
> 
> Packet checksum can be changed, or packet can be linearized or
> segmented, and segments partially sent for the latter case.
> 
> Also a memory allocation can fail and packet never really hit the
> driver entry point.
> 
> Reported-by: Jamie Gloudon <jamie.gloudon@gmail.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  net/core/dev.c |    9 ++++++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/dev.c b/net/core/dev.c
> index dcc673d..52cd1d7 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2213,9 +2213,6 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
>  			skb_dst_drop(skb);
>  
> -		if (!list_empty(&ptype_all))
> -			dev_queue_xmit_nit(skb, dev);
> -
>  		features = netif_skb_features(skb);
>  
>  		if (vlan_tx_tag_present(skb) &&
> @@ -2250,6 +2247,9 @@ int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
>  			}
>  		}
>  
> +		if (!list_empty(&ptype_all))
> +			dev_queue_xmit_nit(skb, dev);
> +
>  		skb_len = skb->len;
>  		rc = ops->ndo_start_xmit(skb, dev);
>  		trace_net_dev_xmit(skb, rc, dev, skb_len);
> @@ -2272,6 +2272,9 @@ gso:
>  		if (dev->priv_flags & IFF_XMIT_DST_RELEASE)
>  			skb_dst_drop(nskb);
>  
> +		if (!list_empty(&ptype_all))
> +			dev_queue_xmit_nit(nskb, dev);
> +
>  		skb_len = nskb->len;
>  		rc = ops->ndo_start_xmit(nskb, dev);
>  		trace_net_dev_xmit(nskb, rc, dev, skb_len);
> 
> 
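
The ordering issue described in the commit message can be illustrated with a toy model. This is not kernel code: the two-byte checksum field, the function names and the `tap_before_fixups` switch are assumptions for illustration, and the checksum fill-in stands in for the checksum, linearize and segmentation steps the commit message lists.

```python
def internet_checksum(data):
    """16-bit one's-complement checksum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verifies(packet):
    """A packet with a correct checksum field sums to zero."""
    return internet_checksum(packet) == 0


def transmit(payload, tap_before_fixups):
    """Return (what the tap captured, what reached the driver)."""
    packet = bytearray(b"\x00\x00" + payload)  # bytes 0-1: checksum not filled in yet
    captured = None
    if tap_before_fixups:
        captured = bytes(packet)  # old ordering: tap copies the unfinished packet
    packet[0:2] = internet_checksum(bytes(packet)).to_bytes(2, "big")
    if not tap_before_fixups:
        captured = bytes(packet)  # new ordering: tap runs right before the driver
    return captured, bytes(packet)


old_capture, wire = transmit(b"hello tap", tap_before_fixups=True)
new_capture, _ = transmit(b"hello tap", tap_before_fixups=False)
print("old ordering, tap sees valid checksum:", verifies(old_capture))
print("new ordering, tap sees valid checksum:", verifies(new_capture))
```

In this model the packet that reaches the driver is correct either way; only the tap's copy differs, which matches the symptom reported above of a bad TX checksum visible only in captures.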

^ permalink raw reply
