Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next v5 0/6] bonding: Patchset for rcu use in bonding
From: Ding Tianhong @ 2013-09-25  9:52 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Veaceslav Falico, Netdev

Hi:

The Patch Set convert the xmit of 3ad and alb mode to use rcu lock.
restructure and add more rcu list function.
add rtnl lock to protect bonding_store_xmit_hash().
remove read lock in bond_3ad_lacpdu_recv().

I test the patch well and no problems found till now.

v1: add 1 patch named remove the no effect lock for bond_3ad_lacpdu_recv().
    get the advice from Nikolay Aleksandrov, and modify some mistake in the code.

v2: accept the Nikolay Aleksandrov's advise, modify the patch 05/06.
    the new bond_for_each_slave_from will not use bond->slave_cnt and int cnt any more,
    it test well.
    move the rcu_dereference call of curr_active_slave after the lock avoid 
    a race condition with the slave removal. 

v3: according to the advise of Veaceslav Falico and David S. Miller, modify the patch 01/06 and patch 05/06.
    add rcu protect for read bond->slave_list.
    simplify the bond_for_each_slave_from() and  bond_for_each_slave_from_rcu()

v4: thanks for Veaceslav Falico's patient guidance, fix the problem of patch 01/06 and
patch 05/06.
    restructure the bond_first_slave_rcu().
    restructure the bond_for_each_slave() and bond_for_each_slave_rcu().

v5: modifty the patch 01/06,05/06,06/06
    restructure the bond_first_slave_rcu().
    remove the bond_for_each_slave_from_rcu(), it is no need to use till now, and the restructure is really
    hard work, so miss it, and rebuild the logic for bond_for_each_slave_from in rcu protect.

Ding Tianhong (4):
Wang Yufen (1):
Yang Yingliang (1):
  root (6):
  bonding: simplify and use RCU protection for 3ad xmit path
  bonding: remove the no effect lock for bond_3ad_lacpdu_recv()
  bonding: replace read_lock to rcu_read_lock for
  bond_3ad_get_active_agg_info()
  bonding: add rtnl lock for bonding_store_xmit_hash
  bonding: restructure and add rcu for bond_for_each_slave_next()
  bonding: use RCU protection for alb xmit path

 drivers/net/bonding/bond_3ad.c   | 38 ++++++++++++---------------
 drivers/net/bonding/bond_alb.c   | 23 +++++++----------
 drivers/net/bonding/bond_main.c  |  6 ++---
 drivers/net/bonding/bond_sysfs.c |  4 +++
 drivers/net/bonding/bonding.h    | 56 +++++++++++++++++++++++++++++++++++-----
 5 files changed, 82 insertions(+), 45 deletions(-)

-- 
1.8.2.1

^ permalink raw reply

* Re: [PATCH] ipvs: improved SH fallback strategy
From: Alexander Frolkin @ 2013-09-25  9:26 UTC (permalink / raw)
  To: Simon Horman
  Cc: Sergei Shtylyov, Julian Anastasov, lvs-devel, Wensong Zhang,
	netdev, linux-kernel
In-Reply-To: <20130925003033.GG26081@verge.net.au>

Improve the SH fallback realserver selection strategy.

With sh and sh-fallback, if a realserver is down, this attempts to
distribute the traffic that would have gone to that server evenly
among the remaining servers.
 
Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>

--
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index 3588fae..3d5ab7c 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -115,27 +115,47 @@ ip_vs_sh_get(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
 }
 
 
-/* As ip_vs_sh_get, but with fallback if selected server is unavailable */
+/* As ip_vs_sh_get, but with fallback if selected server is unavailable
+ *
+ * The fallback strategy loops around the table starting from a "random"
+ * point (in fact, it is chosen to be the original hash value to make the
+ * algorithm deterministic) to find a new server.
+ */
 static inline struct ip_vs_dest *
 ip_vs_sh_get_fallback(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
 		      const union nf_inet_addr *addr, __be16 port)
 {
-	unsigned int offset;
-	unsigned int hash;
+	unsigned int offset, roffset;
+	unsigned int hash, ihash;
 	struct ip_vs_dest *dest;
 
-	for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
-		hash = ip_vs_sh_hashkey(svc->af, addr, port, offset);
-		dest = rcu_dereference(s->buckets[hash].dest);
-		if (!dest)
-			break;
-		if (is_unavailable(dest))
-			IP_VS_DBG_BUF(6, "SH: selected unavailable server "
-				      "%s:%d (offset %d)",
+	/* first try the dest it's supposed to go to */
+	ihash = ip_vs_sh_hashkey(svc->af, addr, port, 0);
+	dest = rcu_dereference(s->buckets[ihash].dest);
+	if (!dest)
+		return NULL;
+	if (is_unavailable(dest)) {
+		IP_VS_DBG_BUF(6, "SH: selected unavailable server "
+		      "%s:%d, reselecting",
+		      IP_VS_DBG_ADDR(svc->af, &dest->addr),
+		      ntohs(dest->port));
+		/* if the original dest is unavailable, loop around the table
+		 * starting from ihash to find a new dest
+		 */
+		for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
+			roffset = (offset + ihash) % IP_VS_SH_TAB_SIZE;
+			hash = ip_vs_sh_hashkey(svc->af, addr, port, roffset);
+			dest = rcu_dereference(s->buckets[hash].dest);
+			if (is_unavailable(dest))
+				IP_VS_DBG_BUF(6, "SH: selected unavailable "
+				      "server %s:%d (offset %d), reselecting",
 				      IP_VS_DBG_ADDR(svc->af, &dest->addr),
-				      ntohs(dest->port), offset);
-		else
-			return dest;
+				      ntohs(dest->port), roffset);
+			else
+				return dest;
+		}
+	} else {
+		return dest;
 	}
 
 	return NULL;

^ permalink raw reply related

* [PATCH 2/2] net: qmi_wwan: fix checkpatch warnings
From: Fabio Porcedda @ 2013-09-25  9:21 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, David S. Miller, Bjørn Mork, Dan Williams
In-Reply-To: <1380100886-16531-1-git-send-email-fabio.porcedda@gmail.com>

Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com>
---
 drivers/net/usb/qmi_wwan.c | 56 +++++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 20 deletions(-)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 5f6b6fa..0e59f9e 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -143,16 +143,22 @@ static const struct net_device_ops qmi_wwan_netdev_ops = {
 	.ndo_validate_addr	= eth_validate_addr,
 };
 
-/* using a counter to merge subdriver requests with our own into a combined state */
+/* using a counter to merge subdriver requests with our own into a
+ * combined state
+ */
 static int qmi_wwan_manage_power(struct usbnet *dev, int on)
 {
 	struct qmi_wwan_state *info = (void *)&dev->data;
 	int rv = 0;
 
-	dev_dbg(&dev->intf->dev, "%s() pmcount=%d, on=%d\n", __func__, atomic_read(&info->pmcount), on);
+	dev_dbg(&dev->intf->dev, "%s() pmcount=%d, on=%d\n", __func__,
+		atomic_read(&info->pmcount), on);
 
-	if ((on && atomic_add_return(1, &info->pmcount) == 1) || (!on && atomic_dec_and_test(&info->pmcount))) {
-		/* need autopm_get/put here to ensure the usbcore sees the new value */
+	if ((on && atomic_add_return(1, &info->pmcount) == 1) ||
+	    (!on && atomic_dec_and_test(&info->pmcount))) {
+		/* need autopm_get/put here to ensure the usbcore sees
+		 * the new value
+		 */
 		rv = usb_autopm_get_interface(dev->intf);
 		if (rv < 0)
 			goto err;
@@ -199,7 +205,8 @@ static int qmi_wwan_register_subdriver(struct usbnet *dev)
 	atomic_set(&info->pmcount, 0);
 
 	/* register subdriver */
-	subdriver = usb_cdc_wdm_register(info->control, &dev->status->desc, 4096, &qmi_wwan_cdc_wdm_manage_power);
+	subdriver = usb_cdc_wdm_register(info->control, &dev->status->desc,
+					 4096, &qmi_wwan_cdc_wdm_manage_power);
 	if (IS_ERR(subdriver)) {
 		dev_err(&info->control->dev, "subdriver registration failed\n");
 		rv = PTR_ERR(subdriver);
@@ -228,7 +235,8 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 	struct usb_driver *driver = driver_of(intf);
 	struct qmi_wwan_state *info = (void *)&dev->data;
 
-	BUILD_BUG_ON((sizeof(((struct usbnet *)0)->data) < sizeof(struct qmi_wwan_state)));
+	BUILD_BUG_ON((sizeof(((struct usbnet *)0)->data) <
+		      sizeof(struct qmi_wwan_state)));
 
 	/* set up initial state */
 	info->control = intf;
@@ -250,7 +258,8 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 				goto err;
 			}
 			if (h->bLength != sizeof(struct usb_cdc_header_desc)) {
-				dev_dbg(&intf->dev, "CDC header len %u\n", h->bLength);
+				dev_dbg(&intf->dev, "CDC header len %u\n",
+					h->bLength);
 				goto err;
 			}
 			break;
@@ -260,7 +269,8 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 				goto err;
 			}
 			if (h->bLength != sizeof(struct usb_cdc_union_desc)) {
-				dev_dbg(&intf->dev, "CDC union len %u\n", h->bLength);
+				dev_dbg(&intf->dev, "CDC union len %u\n",
+					h->bLength);
 				goto err;
 			}
 			cdc_union = (struct usb_cdc_union_desc *)buf;
@@ -271,15 +281,15 @@ static int qmi_wwan_bind(struct usbnet *dev, struct usb_interface *intf)
 				goto err;
 			}
 			if (h->bLength != sizeof(struct usb_cdc_ether_desc)) {
-				dev_dbg(&intf->dev, "CDC ether len %u\n",  h->bLength);
+				dev_dbg(&intf->dev, "CDC ether len %u\n",
+					h->bLength);
 				goto err;
 			}
 			cdc_ether = (struct usb_cdc_ether_desc *)buf;
 			break;
 		}
 
-		/*
-		 * Remember which CDC functional descriptors we've seen.  Works
+		/* Remember which CDC functional descriptors we've seen.  Works
 		 * for all types we care about, of which USB_CDC_ETHERNET_TYPE
 		 * (0x0f) is the highest numbered
 		 */
@@ -293,10 +303,14 @@ next_desc:
 
 	/* Use separate control and data interfaces if we found a CDC Union */
 	if (cdc_union) {
-		info->data = usb_ifnum_to_if(dev->udev, cdc_union->bSlaveInterface0);
-		if (desc->bInterfaceNumber != cdc_union->bMasterInterface0 || !info->data) {
-			dev_err(&intf->dev, "bogus CDC Union: master=%u, slave=%u\n",
-				cdc_union->bMasterInterface0, cdc_union->bSlaveInterface0);
+		info->data = usb_ifnum_to_if(dev->udev,
+					     cdc_union->bSlaveInterface0);
+		if (desc->bInterfaceNumber != cdc_union->bMasterInterface0 ||
+		    !info->data) {
+			dev_err(&intf->dev,
+				"bogus CDC Union: master=%u, slave=%u\n",
+				cdc_union->bMasterInterface0,
+				cdc_union->bSlaveInterface0);
 			goto err;
 		}
 	}
@@ -374,8 +388,7 @@ static int qmi_wwan_suspend(struct usb_interface *intf, pm_message_t message)
 	struct qmi_wwan_state *info = (void *)&dev->data;
 	int ret;
 
-	/*
-	 * Both usbnet_suspend() and subdriver->suspend() MUST return 0
+	/* Both usbnet_suspend() and subdriver->suspend() MUST return 0
 	 * in system sleep context, otherwise, the resume callback has
 	 * to recover device from previous suspend failure.
 	 */
@@ -383,7 +396,8 @@ static int qmi_wwan_suspend(struct usb_interface *intf, pm_message_t message)
 	if (ret < 0)
 		goto err;
 
-	if (intf == info->control && info->subdriver && info->subdriver->suspend)
+	if (intf == info->control && info->subdriver &&
+	    info->subdriver->suspend)
 		ret = info->subdriver->suspend(intf, message);
 	if (ret < 0)
 		usbnet_resume(intf);
@@ -396,7 +410,8 @@ static int qmi_wwan_resume(struct usb_interface *intf)
 	struct usbnet *dev = usb_get_intfdata(intf);
 	struct qmi_wwan_state *info = (void *)&dev->data;
 	int ret = 0;
-	bool callsub = (intf == info->control && info->subdriver && info->subdriver->resume);
+	bool callsub = (intf == info->control && info->subdriver &&
+			info->subdriver->resume);
 
 	if (callsub)
 		ret = info->subdriver->resume(intf);
@@ -777,7 +792,8 @@ static const struct usb_device_id products[] = {
 };
 MODULE_DEVICE_TABLE(usb, products);
 
-static int qmi_wwan_probe(struct usb_interface *intf, const struct usb_device_id *prod)
+static int qmi_wwan_probe(struct usb_interface *intf,
+			  const struct usb_device_id *prod)
 {
 	struct usb_device_id *id = (struct usb_device_id *)prod;
 
-- 
1.8.4

^ permalink raw reply related

* [PATCH 1/2] net: qmi_wwan: add Telit LE920 newer firmware support
From: Fabio Porcedda @ 2013-09-25  9:21 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, David S. Miller,
	Bjørn Mork, Dan Williams

Newer firmware use a new pid and a different interface.

Signed-off-by: Fabio Porcedda <fabio.porcedda-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 drivers/net/usb/qmi_wwan.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 6312332..5f6b6fa 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -714,6 +714,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x2357, 0x0201, 4)},	/* TP-LINK HSUPA Modem MA180 */
 	{QMI_FIXED_INTF(0x2357, 0x9000, 4)},	/* TP-LINK MA260 */
 	{QMI_FIXED_INTF(0x1bc7, 0x1200, 5)},	/* Telit LE920 */
+	{QMI_FIXED_INTF(0x1bc7, 0x1201, 2)},	/* Telit LE920 */
 	{QMI_FIXED_INTF(0x1e2d, 0x12d1, 4)},	/* Cinterion PLxx */
 
 	/* 4. Gobi 1000 devices */
-- 
1.8.4

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH] ipvs: improved SH fallback strategy
From: Alexander Frolkin @ 2013-09-25  9:01 UTC (permalink / raw)
  To: Simon Horman
  Cc: Sergei Shtylyov, Julian Anastasov, lvs-devel, Wensong Zhang,
	netdev, linux-kernel
In-Reply-To: <20130925003033.GG26081@verge.net.au>

Hi,

> could you add some comments to the code or at least a description of the
> algorithm to the above the function.  The intent of original code may not
> have been obvious to the eye but this version certainly isn't obvious to
> mine.

Sure.  I have a bad habit of assuming that if I understand something,
then others automatically do too. :-)

The original code went through the table, starting at the same place as
the code without fallback and if that returned an unavailable
realserver, it offset the hash by one and repeated the lookup, then added
two, etc., up to IP_VS_SH_TAB_SIZE-1.  So the hash offset was 0,
1, ..., IP_VS_SH_TAB_SIZE-1.

The result is that if a server is down, all traffic destined for it
would fall back onto the next server in the list.

The new code also starts at the same place as the old code (offset 0),
but if that fails, it uses the same fallback strategy as the old code,
but the hash offset is now ihash, ihash + 1, ..., IP_VS_SH_TAB_SIZE-1,
0, 1, ..., ihash - 1, i.e., it starts at ihash instead of 0 and loops
around the table.  ihash could have been a random number, but choosing
it to be something based on the source IP and port (in which case it may
as well be the same hash [offset 0]) means that the behaviour will be
the same on different directors. 

This spreads the load of an unavailable server across the remaining
servers instead of just moving it to the next one in the list.

Hope that makes sense...

I'll submit a patch with a comment shortly.

Alex

^ permalink raw reply

* [PATCH RFC] random: introduce get_random_bytes_busy_wait_initialized
From: Hannes Frederic Sowa @ 2013-09-25  9:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, davem, netdev, jesse.brandeburg, tytso, linux-kernel
In-Reply-To: <1380028797.3165.65.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 06:19:57AM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> A host might need net_secret[] and never open a single socket. 
> 
> Problem added in commit aebda156a570782
> ("net: defer net_secret[] initialization")
> 
> Based on prior patch from Hannes Frederic Sowa.
> 
> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Perhaps we can even do a bit better? This patch is a RFC and I could split the
random and network parts if needed.

[PATCH RFC] random: introduce get_random_bytes_busy_wait_initialized

We want to use good entropy for initializing the secret keys used for
hashing in the core network stack. So busy wait before extracting random
data until the nonblocking_pool is initialized.

Further entropy is also gathered by interrupts, so we are guaranteed to
make progress here.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 drivers/char/random.c  | 18 ++++++++++++++++++
 include/linux/random.h |  1 +
 net/core/secure_seq.c  |  3 ++-
 net/ipv4/af_inet.c     |  2 +-
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 7737b5b..50e8030 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1058,6 +1058,24 @@ void get_random_bytes(void *buf, int nbytes)
 EXPORT_SYMBOL(get_random_bytes);
 
 /*
+ * Busy loop until the nonblocking_pool is intialized and return
+ * random data in buf of size nbytes.
+ *
+ * This is used by the network stack to defer the extraction of
+ * entropy from the nonblocking_pool until the pool is initialized.
+ *
+ * We need to busy loop here, because we could be called from an
+ * atomic section.
+ */
+void get_random_bytes_busy_wait_initialized(void *buf, int nbytes)
+{
+	while (!nonblocking_pool.initialized)
+		cpu_relax();
+	get_random_bytes(buf, nbytes);
+}
+EXPORT_SYMBOL(get_random_bytes_busy_wait_initialized);
+
+/*
  * This function will use the architecture-specific hardware random
  * number generator if it is available.  The arch-specific hw RNG will
  * almost certainly be faster than what we can do in software, but it
diff --git a/include/linux/random.h b/include/linux/random.h
index 3b9377d..0b7e7dd 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -15,6 +15,7 @@ extern void add_input_randomness(unsigned int type, unsigned int code,
 extern void add_interrupt_randomness(int irq, int irq_flags);
 
 extern void get_random_bytes(void *buf, int nbytes);
+void get_random_bytes_busy_wait_initialized(void *buf, int nbbytes);
 extern void get_random_bytes_arch(void *buf, int nbytes);
 void generate_random_uuid(unsigned char uuid_out[16]);
 
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 3f1ec15..ac55cb7 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -24,7 +24,8 @@ static void net_secret_init(void)
 
 	for (i = NET_SECRET_SIZE; i > 0;) {
 		do {
-			get_random_bytes(&tmp, sizeof(tmp));
+			get_random_bytes_busy_wait_initialized(&tmp,
+							       sizeof(tmp));
 		} while (!tmp);
 		cmpxchg(&net_secret[--i], 0, tmp);
 	}
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index cfeb85c..3edd277 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -260,7 +260,7 @@ void build_ehash_secret(void)
 	u32 rnd;
 
 	do {
-		get_random_bytes(&rnd, sizeof(rnd));
+		get_random_bytes_busy_wait_initialized(&rnd, sizeof(rnd));
 	} while (rnd == 0);
 
 	if (cmpxchg(&inet_ehash_secret, 0, rnd) == 0)
-- 
1.8.3.1

^ permalink raw reply related

* Re: BUG: MARK in OUTPUT + ip_tunnel causes kernel panic
From: Steffen Klassert @ 2013-09-25  8:59 UTC (permalink / raw)
  To: Konstantin Kuzov; +Cc: netdev
In-Reply-To: <loom.20130925T095425-819@post.gmane.org>

On Wed, Sep 25, 2013 at 08:31:52AM +0000, Konstantin Kuzov wrote:
> Kristian Evensen <kristian.evensen <at> gmail.com> writes:
> 
> > When trying to tunnel traffic originating from the same machine as the
> > tunnel endpoint, I am experiencing kernel panics for some types of
> > traffic (ICMP and UDP). TCP seems not to be affected by this, at least
> > I have not been able to trigger the panic.
> > 
> > I have one tunnel (without an IP address) and use policy routing to
> > steer some traffic through the tunnels.
> [...]
> > An interesting thing is that I have seen different kernel panics being
> > triggered. The other one I have seen has RIP pointing to
> > e1000_xmit_frame() and the message "protocol 0800 is buggy". However,
> > the one I have posted is by far the most common.
> I'm experiencing the same issue on two different machines. It happens on any 
> kernel starting from 3.10 when ip_tunnel/ip_tunnel_core were introduced.
> 

Can you please try the patch below?
I've posted the same patch already to netdev in the morning.

Subject: [PATCH net 1/2] ip_tunnel: Fix a memory corruption in ip_tunnel_xmit

We might extend the used aera of a skb beyond the total
headroom when we install the ipip header. Fix this by
calling skb_cow_head() unconditionally.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_tunnel.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index ac9fabe..b8ce640 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -641,13 +641,13 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
 
 	max_headroom = LL_RESERVED_SPACE(rt->dst.dev) + sizeof(struct iphdr)
 			+ rt->dst.header_len;
-	if (max_headroom > dev->needed_headroom) {
+	if (max_headroom > dev->needed_headroom)
 		dev->needed_headroom = max_headroom;
-		if (skb_cow_head(skb, dev->needed_headroom)) {
-			dev->stats.tx_dropped++;
-			dev_kfree_skb(skb);
-			return;
-		}
+
+	if (skb_cow_head(skb, dev->needed_headroom)) {
+		dev->stats.tx_dropped++;
+		dev_kfree_skb(skb);
+		return;
 	}
 
 	err = iptunnel_xmit(rt, skb, fl4.saddr, fl4.daddr, protocol,
-- 
1.7.9.5

^ permalink raw reply related

* Re: BUG: MARK in OUTPUT + ip_tunnel causes kernel panic
From: Konstantin Kuzov @ 2013-09-25  8:31 UTC (permalink / raw)
  To: netdev
In-Reply-To: <CAKfDRXgJwyhE6sg1Ej=S35A+-dnPiU9TBA69zb-n7hGXgG+Sxg@mail.gmail.com>

Kristian Evensen <kristian.evensen <at> gmail.com> writes:

> When trying to tunnel traffic originating from the same machine as the
> tunnel endpoint, I am experiencing kernel panics for some types of
> traffic (ICMP and UDP). TCP seems not to be affected by this, at least
> I have not been able to trigger the panic.
> 
> I have one tunnel (without an IP address) and use policy routing to
> steer some traffic through the tunnels.
[...]
> An interesting thing is that I have seen different kernel panics being
> triggered. The other one I have seen has RIP pointing to
> e1000_xmit_frame() and the message "protocol 0800 is buggy". However,
> the one I have posted is by far the most common.
I'm experiencing the same issue on two different machines. It happens on any 
kernel starting from 3.10 when ip_tunnel/ip_tunnel_core were introduced.

But in my configuration I have addresses on tunnel interfaces and only doing 
masquerading on postrouting...

Same as you it triggers different kernel panics depends on which kernel 
modules involved (nic, vbox, etc...) here are some samples:
http://nosferatu.g0x.ru/pub/kerneloops/

But most common one looks like that:

[   81.797190] skbuff: skb_under_panic: text:ffffffff8170307f len:142 put:14 
head:ffff88040cd12a00 data:ffff88040cd129ee tail:0x7c end:0xc0 dev:v33
[   81.797240] ------------[ cut here ]------------
[   81.797256] kernel BUG at net/core/skbuff.c:126!
[   81.797272] invalid opcode: 0000 [#1] SMP 
[   81.797291] Modules linked in: ext2
[   81.797309] CPU: 0 PID: 4654 Comm: ffmpeg Not tainted 3.11.1 #3
[   81.797328] Hardware name: Gigabyte Technology Co., Ltd. Z68A-D3H-
B3/Z68A-D3H-B3, BIOS F13 03/20/2012
[   81.797356] task: ffff88040cd5bc80 ti: ffff880407c38000 task.ti: 
ffff880407c38000
[   81.797380] RIP: 0010:[<ffffffff81881a55>]  [<ffffffff81881a55>] 
skb_panic+0x5e/0x60
[   81.797411] RSP: 0000:ffff88041fa03898  EFLAGS: 00010296
[   81.797428] RAX: 0000000000000084 RBX: ffff88040a2bd300 RCX: 
0000000000000000
[   81.797450] RDX: ffff88041fa0eb48 RSI: ffff88041fa0d258 RDI: 
ffff88041fa0d258
[   81.797472] RBP: ffff88041fa038b8 R08: 0000000000000000 R09: 
00000000000003ca
[   81.797494] R10: 0000000000000001 R11: 0000000000aaaaaa R12: 
ffff88040b90aed8
[   81.797516] R13: 000000000000000e R14: ffff88040b90aee8 R15: 
0000000000000000
[   81.797538] FS:  00007ff3002a0740(0000) GS:ffff88041fa00000(0000) 
knlGS:0000000000000000
[   81.797564] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   81.797582] CR2: 0000000008212000 CR3: 000000040a097000 CR4: 
00000000000407f0
[   81.797604] Stack:
[   81.797613]  ffff88040cd129ee 000000000000007c 00000000000000c0 
ffff88040c71e000
[   81.797643]  ffff88041fa038c8 ffffffff816971d5 ffff88041fa03928 
ffffffff8170307f
[   81.797673]  ffffffff81ee10a0 ffff88040c71e000 ffff88040d79dfc0 
fe971fac81ee10a0
[   81.797703] Call Trace:
[   81.797713]  <IRQ> 
[   81.797721] 
[   81.797730]  [<ffffffff816971d5>] skb_push+0x35/0x40
[   81.797745]  [<ffffffff8170307f>] ip_finish_output+0x2af/0x3a0
[   81.797765]  [<ffffffff81703a98>] ip_output+0x88/0x90
[   81.797782]  [<ffffffff81703214>] ip_local_out+0x24/0x30
[   81.797801]  [<ffffffff8174291b>] iptunnel_xmit+0x17b/0x1b0
[   81.797820]  [<ffffffff81744560>] ip_tunnel_xmit+0x2e0/0x7d0
[   81.797839]  [<ffffffff8174a9ec>] ipip_tunnel_xmit+0x5c/0x70
[   81.797859]  [<ffffffff816a8240>] dev_hard_start_xmit+0x300/0x510
[   81.798750]  [<ffffffff81758668>] ? nf_nat_ipv4_out+0x58/0x100
[   81.799648]  [<ffffffff816a8718>] dev_queue_xmit+0x2c8/0x460
[   81.800543]  [<ffffffff816ae14c>] neigh_direct_output+0xc/0x10
[   81.801434]  [<ffffffff81702f7f>] ip_finish_output+0x1af/0x3a0
[   81.802337]  [<ffffffff81703a98>] ip_output+0x88/0x90
[   81.803243]  [<ffffffff81703214>] ip_local_out+0x24/0x30
[   81.804143]  [<ffffffff817044f4>] ip_send_skb+0x14/0x50
[   81.805040]  [<ffffffff81704562>] ip_push_pending_frames+0x32/0x40
[   81.805950]  [<ffffffff8172e1ae>] icmp_push_reply+0xee/0x120
[   81.806851]  [<ffffffff8172e969>] icmp_send+0x419/0x490
[   81.807750]  [<ffffffff816a6082>] ? __netif_receive_skb_core+0x622/0x7f0
[   81.808651]  [<ffffffff81077c00>] ? update_curr+0x10/0x160
[   81.809552]  [<ffffffff816b00e0>] ? neigh_invalidate+0x120/0x120
[   81.810445]  [<ffffffff816f933d>] ipv4_link_failure+0x1d/0x70
[   81.811333]  [<ffffffff8172c50d>] arp_error_report+0x2d/0x40
[   81.812217]  [<ffffffff816b004c>] neigh_invalidate+0x8c/0x120
[   81.813102]  [<ffffffff816b0316>] neigh_timer_handler+0x236/0x2a0
[   81.813989]  [<ffffffff8104fb3a>] call_timer_fn+0x3a/0x110
[   81.814883]  [<ffffffff816b00e0>] ? neigh_invalidate+0x120/0x120
[   81.815787]  [<ffffffff81050ea0>] run_timer_softirq+0x1c0/0x2a0
[   81.816695]  [<ffffffff81048ee9>] __do_softirq+0xe9/0x230
[   81.817605]  [<ffffffff81049185>] irq_exit+0x95/0xa0
[   81.818513]  [<ffffffff8102da75>] smp_apic_timer_interrupt+0x45/0x60
[   81.819431]  [<ffffffff8188feca>] apic_timer_interrupt+0x6a/0x70
[   81.820355]  <EOI> 
[   81.820363] 
[   81.821277]  [<ffffffff8188f312>] ? system_call_fastpath+0x16/0x1b
[   81.822208] Code: 00 00 48 89 44 24 10 8b 87 d0 00 00 00 48 89 44 24 08 
48 8b 87 e0 00 00 00 48 c7 c7 80 fa c5 81 48 89 04 24 31 c0 e8 94 95 ff ff 
<0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 
[   81.823365] RIP  [<ffffffff81881a55>] skb_panic+0x5e/0x60
[   81.824409]  RSP <ffff88041fa03898>

I also can't trace why that happens and strangely I can't reproduce that 
issue on virtualbox. Have you discovered anything more about this issue in 
past month?

^ permalink raw reply

* Re: SMSC 9303 support
From: Florian Fainelli @ 2013-09-25  8:24 UTC (permalink / raw)
  To: Gary Thomas; +Cc: Ben Hutchings, netdev
In-Reply-To: <5241E448.1060403@mlbassoc.com>

2013/9/24 Gary Thomas <gary@mlbassoc.com>:
> On 2013-09-24 12:29, Florian Fainelli wrote:
>>
>> Hello,
>>
>> 2013/9/24 Gary Thomas <gary@mlbassoc.com>:
>>>
>>> On 2013-09-24 10:51, Ben Hutchings wrote:
>>>>
>>>>
>>>> On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
>>>>>
>>>>>
>>>>> I need to support the SMSC9303 in an embedded system.  I'm not
>>>>> finding any [explicit] support for this device in the latest
>>>>> mainline kernel.  Did I miss something?
>>>>>
>>>>> To be clear, the SMSC9303 is a 3-port managed ethernet switch
>>>>> capable of supporting 802.1D/802.1Q directly. This switch is
>>>>> driven by a single MAC via MII/RMII and exposes the other two
>>>>> ports via physical PHYs.  What I need it to do is behave like
>>>>> two external, separate devices.  I was thinking that what I need
>>>>> to do is treat these as VLAN devices since the switch can manage
>>>>> the routing.
>>>>>
>>>>> Does this seem like a reasonable approach?
>>>>
>>>>
>>>>
>>>> Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
>>>> of packets to indicate which switch port they are sent or received
>>>> through.  This was originally added to support some Marvell switch chips
>>>> and I don't know whether it would be suitable or extensible for this
>>>> one.
>>>
>>>
>>>
>>> I've used the DSA stuff for years (worked directly with the Marvell folks
>>> when it was being developed).  It might work for this device, I'll think
>>> some more about using it although I was hoping for a lighter weight
>>> solution.
>>
>>
>> I do not think DSA is suitable for pure 802.1q switches such as this
>> one. OpenWrt has an out of tree patch which adds some switch-specific
>> operations that can be controlled over netlink (currently trying to
>> get them in a shape where they can be submitted for mainline
>> inclusion) [1], which I think is much more suitable than DSA or any
>> other proprietary switch tagging mechanism.
>>
>> [1]:
>> https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/swconfig.c
>>
>
> This looks interesting.  Do you have any more information on how to
> integrate this and/or use it?

Here are a couple of drivers that implement these "switch ops", my
favorite being b53 because it shows nicely how SPI, MDIO or MMIO
switch can be supported within the same core:

https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/b53

adm6996 is also pretty straight forward:

https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/adm6996.c

and here is the user-space command line tool to query/control these
(needs libnl):

https://dev.openwrt.org/browser/trunk/package/network/config/swconfig/src
-- 
Florian

^ permalink raw reply

* [PATCH] iproute2: bridge: Close file with bridge monitor file
From: Petr Písař @ 2013-09-25  7:45 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger, Petr Písař

The `bridge monitor file FILENAME' reads dumped netlink messages from
a file. But it forgot to close the file after using it. This patch
fixes it.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 bridge/monitor.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/bridge/monitor.c b/bridge/monitor.c
index e96fcaf..76e7d47 100644
--- a/bridge/monitor.c
+++ b/bridge/monitor.c
@@ -132,12 +132,15 @@ int do_monitor(int argc, char **argv)
 
 	if (file) {
 		FILE *fp;
+		int err;
 		fp = fopen(file, "r");
 		if (fp == NULL) {
 			perror("Cannot fopen");
 			exit(-1);
 		}
-		return rtnl_from_file(fp, accept_msg, stdout);
+		err = rtnl_from_file(fp, accept_msg, stdout);
+		fclose(fp);
+		return err;
 	}
 
 	if (rtnl_open(&rth, groups) < 0)
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH] moxa: drop free_irq of devm_request_irq allocated irq
From: Wei Yongjun @ 2013-09-25  7:33 UTC (permalink / raw)
  To: grant.likely, rob.herring, davem, jg1.han, jonas.jensen
  Cc: yongjun_wei, netdev

From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>

irq allocated with devm_request_irq should not be freed using
free_irq, because doing so causes a dangling pointer, and a
subsequent double free.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
---
 drivers/net/ethernet/moxa/moxart_ether.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
index 83c2091..9a7fcb5 100644
--- a/drivers/net/ethernet/moxa/moxart_ether.c
+++ b/drivers/net/ethernet/moxa/moxart_ether.c
@@ -531,7 +531,6 @@ static int moxart_remove(struct platform_device *pdev)
 	struct net_device *ndev = platform_get_drvdata(pdev);
 
 	unregister_netdev(ndev);
-	free_irq(ndev->irq, ndev);
 	moxart_mac_free_memory(ndev);
 	free_netdev(ndev);
 

^ permalink raw reply related

* Kernel Panic Sending Frames Using dev_queue_xmit()
From: Merlin Davis @ 2013-09-25  7:32 UTC (permalink / raw)
  To: netdev

I am getting kernel panics in the 3.5.0 kernel (and 2.6.39) when
sending Ethernet frames using allocated SKBs passed to
dev_queue_xmit() from a kernel module.  I have trimmed the problem to
a very small module that continually sends the same (captured and
verified) frame from one net_device.  The frames are sent without a
socket in I believe the same way that net/ipv4/arp.c creates and sends
them.  The panic can occur after a few seconds or a few minutes, but
always occurs.  It is USUALLY a "kernel NULL pointer dereference" and
USUALLY occurs within __kfree_skb(), but both the error and the
location can vary.  I have not been successful in capturing a complete
oops message.

There is a GitHub repository with the module, a makefile, and more
notes about how I have tested and what I have tried at
https://github.com/me-the-wizard/ksend

I do not know if there is something I am doing wrong with the SKBs or
if I have stumbled upon a bug in the kernel.  After weeks of attempts
I believe I have reached the limit of my ability to figure this out.
Can anyone offer some guidance?

Thank you,
Merlin

^ permalink raw reply

* [PATCH v2] powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
From: Aida Mynzhasova @ 2013-09-25  7:24 UTC (permalink / raw)
  To: linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ
  Cc: devicetree-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	richardcochran-Re5JQEeQqe8AvxtiuMwx3w

Currently IEEE 1588 timer reference clock source is determined through
hard-coded value in gianfar_ptp driver. This patch allows to select ptp
clock source by means of device tree file node.

For instance:

	fsl,cksel = <0>;

for using external (TSEC_TMR_CLK input) high precision timer
reference clock.

Other acceptable values:

	<1> : eTSEC system clock
	<2> : eTSEC1 transmit clock
	<3> : RTC clock input

When this attribute isn't used, eTSEC system clock will serve as
IEEE 1588 timer reference clock.

Signed-off-by: Aida Mynzhasova <aida.mynzhasova-Fmhy8gsqeTEvJsYlp49lxw@public.gmane.org>
---
 Documentation/devicetree/bindings/net/fsl-tsec-phy.txt | 16 +++++++++++++++-
 drivers/net/ethernet/freescale/gianfar_ptp.c           |  4 +++-
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
index 2c6be03..eb06059 100644
--- a/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
+++ b/Documentation/devicetree/bindings/net/fsl-tsec-phy.txt
@@ -86,6 +86,7 @@ General Properties:
 
 Clock Properties:
 
+  - fsl,cksel        Timer reference clock source.
   - fsl,tclk-period  Timer reference clock period in nanoseconds.
   - fsl,tmr-prsc     Prescaler, divides the output clock.
   - fsl,tmr-add      Frequency compensation value.
@@ -97,7 +98,7 @@ Clock Properties:
   clock. You must choose these carefully for the clock to work right.
   Here is how to figure good values:
 
-  TimerOsc     = system clock               MHz
+  TimerOsc     = selected reference clock   MHz
   tclk_period  = desired clock period       nanoseconds
   NominalFreq  = 1000 / tclk_period         MHz
   FreqDivRatio = TimerOsc / NominalFreq     (must be greater that 1.0)
@@ -114,6 +115,18 @@ Clock Properties:
   Pulse Per Second (PPS) signal, since this will be offered to the PPS
   subsystem to synchronize the Linux clock.
 
+  "fsl,cksel" property allows to select different reference clock
+  sources:
+
+  <0> - external high precision timer reference clock (TSEC_TMR_CLK
+        input is used for this purpose);
+  <1> - eTSEC system clock;
+  <2> - eTSEC1 transmit clock;
+  <3> - RTC clock input.
+
+  When this attribute is not used, eTSEC system clock will serve as
+  IEEE 1588 timer reference clock.
+
 Example:
 
 	ptp_clock@24E00 {
@@ -121,6 +134,7 @@ Example:
 		reg = <0x24E00 0xB0>;
 		interrupts = <12 0x8 13 0x8>;
 		interrupt-parent = < &ipic >;
+		fsl,cksel       = <1>;
 		fsl,tclk-period = <10>;
 		fsl,tmr-prsc    = <100>;
 		fsl,tmr-add     = <0x999999A4>;
diff --git a/drivers/net/ethernet/freescale/gianfar_ptp.c b/drivers/net/ethernet/freescale/gianfar_ptp.c
index 098f133..e006a09 100644
--- a/drivers/net/ethernet/freescale/gianfar_ptp.c
+++ b/drivers/net/ethernet/freescale/gianfar_ptp.c
@@ -452,7 +452,9 @@ static int gianfar_ptp_probe(struct platform_device *dev)
 	err = -ENODEV;
 
 	etsects->caps = ptp_gianfar_caps;
-	etsects->cksel = DEFAULT_CKSEL;
+
+	if (get_of_u32(node, "fsl,cksel", &etsects->cksel))
+		etsects->cksel = DEFAULT_CKSEL;
 
 	if (get_of_u32(node, "fsl,tclk-period", &etsects->tclk_period) ||
 	    get_of_u32(node, "fsl,tmr-prsc", &etsects->tmr_prsc) ||
-- 
1.8.1.2

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH v5 net-next 12/27] bonding: rework rlb_next_rx_slave() to use bond_for_each_slave()
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Currently, we're using bond_for_each_slave_from(), which is really hard to
implement under RCU and/or neighbour list.

Remove it and use bond_for_each_slave() instead, taking care of the last
used slave.

Also, rename next_rx_slave to rx_slave and store the current (last)
rx_slave.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    New patch.

 drivers/net/bonding/bond_alb.c | 41 +++++++++++++++++++++--------------------
 drivers/net/bonding/bond_alb.h |  4 +---
 2 files changed, 22 insertions(+), 23 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index f4929ce..c5b85ad 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -383,30 +383,31 @@ out:
 static struct slave *rlb_next_rx_slave(struct bonding *bond)
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
-	struct slave *rx_slave, *slave, *start_at;
-	int i = 0;
-
-	if (bond_info->next_rx_slave)
-		start_at = bond_info->next_rx_slave;
-	else
-		start_at = bond_first_slave(bond);
-
-	rx_slave = NULL;
+	struct slave *before = NULL, *rx_slave = NULL, *slave;
+	struct list_head *iter;
+	bool found = false;
 
-	bond_for_each_slave_from(bond, slave, i, start_at) {
-		if (SLAVE_IS_OK(slave)) {
-			if (!rx_slave) {
-				rx_slave = slave;
-			} else if (slave->speed > rx_slave->speed) {
+	bond_for_each_slave(bond, slave, iter) {
+		if (!SLAVE_IS_OK(slave))
+			continue;
+		if (!found) {
+			if (!before || before->speed < slave->speed)
+				before = slave;
+		} else {
+			if (!rx_slave || rx_slave->speed < slave->speed)
 				rx_slave = slave;
-			}
 		}
+		if (slave == bond_info->rx_slave)
+			found = true;
 	}
+	/* we didn't find anything after the current or we have something
+	 * better before and up to the current slave
+	 */
+	if (!rx_slave || (before && rx_slave->speed < before->speed))
+		rx_slave = before;
 
-	if (rx_slave) {
-		slave = bond_next_slave(bond, rx_slave);
-		bond_info->next_rx_slave = slave;
-	}
+	if (rx_slave)
+		bond_info->rx_slave = rx_slave;
 
 	return rx_slave;
 }
@@ -1611,7 +1612,7 @@ void bond_alb_deinit_slave(struct bonding *bond, struct slave *slave)
 	tlb_clear_slave(bond, slave, 0);
 
 	if (bond->alb_info.rlb_enabled) {
-		bond->alb_info.next_rx_slave = NULL;
+		bond->alb_info.rx_slave = NULL;
 		rlb_clear_slave(bond, slave);
 	}
 }
diff --git a/drivers/net/bonding/bond_alb.h b/drivers/net/bonding/bond_alb.h
index c5eff5d..4226044 100644
--- a/drivers/net/bonding/bond_alb.h
+++ b/drivers/net/bonding/bond_alb.h
@@ -154,9 +154,7 @@ struct alb_bond_info {
 	u8			rx_ntt;	/* flag - need to transmit
 					 * to all rx clients
 					 */
-	struct slave		*next_rx_slave;/* next slave to be assigned
-						* to a new rx client for
-						*/
+	struct slave		*rx_slave;/* last slave to xmit from */
 	u8			primary_is_promisc;	   /* boolean */
 	u32			rlb_promisc_timeout_counter;/* counts primary
 							     * promiscuity time
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 27/27] net: create sysfs symlinks for neighbour devices
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev
  Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek,
	David S. Miller, Eric Dumazet, Alexander Duyck
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Also, remove the same functionality from bonding - it will be already done
for any device that links to its lower/upper neighbour.

The links will be created for dev's kobject, and will look like
lower_eth0 for lower device eth0 and upper_bridge0 for upper device
bridge0.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    Convert the logic to the new functions, which don't have the bool upper
    argument.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    Rename lower devices from slave_eth0 to lower_eth0, so that it meets
    the lower/upper naming. I've searched for any active users of bonding's
    slave_eth0, and seems that nobody's actively using it or relying on it.

 drivers/net/bonding/bond_main.c  | 11 +----------
 drivers/net/bonding/bond_sysfs.c | 21 ---------------------
 drivers/net/bonding/bonding.h    |  2 --
 net/core/dev.c                   | 35 ++++++++++++++++++++++++++++++++++-
 4 files changed, 35 insertions(+), 34 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index d494045..d5c3153 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1612,15 +1612,11 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 
 	read_unlock(&bond->lock);
 
-	res = bond_create_slave_symlinks(bond_dev, slave_dev);
-	if (res)
-		goto err_detach;
-
 	res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
 					 new_slave);
 	if (res) {
 		pr_debug("Error %d calling netdev_rx_handler_register\n", res);
-		goto err_dest_symlinks;
+		goto err_detach;
 	}
 
 	res = bond_master_upper_dev_link(bond_dev, slave_dev, new_slave);
@@ -1642,9 +1638,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 err_unregister:
 	netdev_rx_handler_unregister(slave_dev);
 
-err_dest_symlinks:
-	bond_destroy_slave_symlinks(bond_dev, slave_dev);
-
 err_detach:
 	if (!USES_PRIMARY(bond->params.mode))
 		bond_hw_addr_flush(bond_dev, slave_dev);
@@ -1842,8 +1835,6 @@ static int __bond_release_one(struct net_device *bond_dev,
 			bond_dev->name, slave_dev->name, bond_dev->name);
 
 	/* must do this from outside any spinlocks */
-	bond_destroy_slave_symlinks(bond_dev, slave_dev);
-
 	vlan_vids_del_by_dev(slave_dev, bond_dev);
 
 	/* If the mode USES_PRIMARY, then this cases was handled above by
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 1c57246..e06c644 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -168,27 +168,6 @@ static const struct class_attribute class_attr_bonding_masters = {
 	.namespace = bonding_namespace,
 };
 
-int bond_create_slave_symlinks(struct net_device *master,
-			       struct net_device *slave)
-{
-	char linkname[IFNAMSIZ+7];
-
-	/* create a link from the master to the slave */
-	sprintf(linkname, "slave_%s", slave->name);
-	return sysfs_create_link(&(master->dev.kobj), &(slave->dev.kobj),
-				 linkname);
-}
-
-void bond_destroy_slave_symlinks(struct net_device *master,
-				 struct net_device *slave)
-{
-	char linkname[IFNAMSIZ+7];
-
-	sprintf(linkname, "slave_%s", slave->name);
-	sysfs_remove_link(&(master->dev.kobj), linkname);
-}
-
-
 /*
  * Show the slaves in the current bond.
  */
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index e5c32a6..5b71601 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -436,8 +436,6 @@ int bond_create(struct net *net, const char *name);
 int bond_create_sysfs(struct bond_net *net);
 void bond_destroy_sysfs(struct bond_net *net);
 void bond_prepare_sysfs_group(struct bonding *bond);
-int bond_create_slave_symlinks(struct net_device *master, struct net_device *slave);
-void bond_destroy_slave_symlinks(struct net_device *master, struct net_device *slave);
 int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev);
 int bond_release(struct net_device *bond_dev, struct net_device *slave_dev);
 void bond_mii_monitor(struct work_struct *);
diff --git a/net/core/dev.c b/net/core/dev.c
index de443ee..25ab6fe 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4584,6 +4584,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 					void *private, bool master)
 {
 	struct netdev_adjacent *adj;
+	char linkname[IFNAMSIZ+7];
 	int ret;
 
 	adj = __netdev_find_adj(dev, adj_dev, dev_list);
@@ -4606,12 +4607,26 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 	pr_debug("dev_hold for %s, because of link added from %s to %s\n",
 		 adj_dev->name, dev->name, adj_dev->name);
 
+	if (dev_list == &dev->adj_list.lower) {
+		sprintf(linkname, "lower_%s", adj_dev->name);
+		ret = sysfs_create_link(&(dev->dev.kobj),
+					&(adj_dev->dev.kobj), linkname);
+		if (ret)
+			goto free_adj;
+	} else if (dev_list == &dev->adj_list.upper) {
+		sprintf(linkname, "upper_%s", adj_dev->name);
+		ret = sysfs_create_link(&(dev->dev.kobj),
+					&(adj_dev->dev.kobj), linkname);
+		if (ret)
+			goto free_adj;
+	}
+
 	/* Ensure that master link is always the first item in list. */
 	if (master) {
 		ret = sysfs_create_link(&(dev->dev.kobj),
 					&(adj_dev->dev.kobj), "master");
 		if (ret)
-			goto free_adj;
+			goto remove_symlinks;
 
 		list_add_rcu(&adj->list, dev_list);
 	} else {
@@ -4620,6 +4635,15 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 
 	return 0;
 
+remove_symlinks:
+	if (dev_list == &dev->adj_list.lower) {
+		sprintf(linkname, "lower_%s", adj_dev->name);
+		sysfs_remove_link(&(dev->dev.kobj), linkname);
+	} else if (dev_list == &dev->adj_list.upper) {
+		sprintf(linkname, "upper_%s", adj_dev->name);
+		sysfs_remove_link(&(dev->dev.kobj), linkname);
+	}
+
 free_adj:
 	kfree(adj);
 
@@ -4631,6 +4655,7 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
 				  struct list_head *dev_list)
 {
 	struct netdev_adjacent *adj;
+	char linkname[IFNAMSIZ+7];
 
 	adj = __netdev_find_adj(dev, adj_dev, dev_list);
 
@@ -4650,6 +4675,14 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
 	if (adj->master)
 		sysfs_remove_link(&(dev->dev.kobj), "master");
 
+	if (dev_list == &dev->adj_list.lower) {
+		sprintf(linkname, "lower_%s", adj_dev->name);
+		sysfs_remove_link(&(dev->dev.kobj), linkname);
+	} else if (dev_list == &dev->adj_list.upper) {
+		sprintf(linkname, "upper_%s", adj_dev->name);
+		sysfs_remove_link(&(dev->dev.kobj), linkname);
+	}
+
 	list_del_rcu(&adj->list);
 	pr_debug("dev_put for %s, because link removed from %s to %s\n",
 		 adj_dev->name, dev->name, adj_dev->name);
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 26/27] net: expose the master link to sysfs, and remove it from bond
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev
  Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek,
	David S. Miller, Eric Dumazet, Alexander Duyck
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Currently, we can have only one master upper neighbour, so it would be
useful to create a symlink to it in the sysfs device directory, the way
that bonding now does it, for every device. Lower devices from
bridge/team/etc will automagically get it, so we could rely on it.

Also, remove the same functionality from bonding.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    Convert the logic to the new functions, which don't have the bool upper
    argument.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    No changes.

 drivers/net/bonding/bond_sysfs.c | 20 +++-----------------
 net/core/dev.c                   | 19 +++++++++++++++++--
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 04d95d6..1c57246 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -172,24 +172,11 @@ int bond_create_slave_symlinks(struct net_device *master,
 			       struct net_device *slave)
 {
 	char linkname[IFNAMSIZ+7];
-	int ret = 0;
 
-	/* first, create a link from the slave back to the master */
-	ret = sysfs_create_link(&(slave->dev.kobj), &(master->dev.kobj),
-				"master");
-	if (ret)
-		return ret;
-	/* next, create a link from the master to the slave */
+	/* create a link from the master to the slave */
 	sprintf(linkname, "slave_%s", slave->name);
-	ret = sysfs_create_link(&(master->dev.kobj), &(slave->dev.kobj),
-				linkname);
-
-	/* free the master link created earlier in case of error */
-	if (ret)
-		sysfs_remove_link(&(slave->dev.kobj), "master");
-
-	return ret;
-
+	return sysfs_create_link(&(master->dev.kobj), &(slave->dev.kobj),
+				 linkname);
 }
 
 void bond_destroy_slave_symlinks(struct net_device *master,
@@ -197,7 +184,6 @@ void bond_destroy_slave_symlinks(struct net_device *master,
 {
 	char linkname[IFNAMSIZ+7];
 
-	sysfs_remove_link(&(slave->dev.kobj), "master");
 	sprintf(linkname, "slave_%s", slave->name);
 	sysfs_remove_link(&(master->dev.kobj), linkname);
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index acc1181..de443ee 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4584,6 +4584,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 					void *private, bool master)
 {
 	struct netdev_adjacent *adj;
+	int ret;
 
 	adj = __netdev_find_adj(dev, adj_dev, dev_list);
 
@@ -4606,12 +4607,23 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev,
 		 adj_dev->name, dev->name, adj_dev->name);
 
 	/* Ensure that master link is always the first item in list. */
-	if (master)
+	if (master) {
+		ret = sysfs_create_link(&(dev->dev.kobj),
+					&(adj_dev->dev.kobj), "master");
+		if (ret)
+			goto free_adj;
+
 		list_add_rcu(&adj->list, dev_list);
-	else
+	} else {
 		list_add_tail_rcu(&adj->list, dev_list);
+	}
 
 	return 0;
+
+free_adj:
+	kfree(adj);
+
+	return ret;
 }
 
 void __netdev_adjacent_dev_remove(struct net_device *dev,
@@ -4635,6 +4647,9 @@ void __netdev_adjacent_dev_remove(struct net_device *dev,
 		return;
 	}
 
+	if (adj->master)
+		sysfs_remove_link(&(dev->dev.kobj), "master");
+
 	list_del_rcu(&adj->list);
 	pr_debug("dev_put for %s, because link removed from %s to %s\n",
 		 adj_dev->name, dev->name, adj_dev->name);
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 25/27] vlan: unlink the upper neighbour before unregistering
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Patrick McHardy, David S. Miller
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

On netdev unregister we're removing also all of its sysfs-associated stuff,
including the sysfs symlinks that are controlled by netdev neighbour code.
Also, it's a subtle race condition - cause we can still access it after
unregistering.

Move the unlinking right before the unregistering to fix both.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.

    v3  -> v4:
    No change.

    v2  -> v3:
    No change.

    v1  -> v2:
    No changes.

    RFC -> v2:
    New patch.

 net/8021q/vlan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 69b4a35..b3d17d1 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -98,14 +98,14 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 		vlan_gvrp_request_leave(dev);

 	vlan_group_set_device(grp, vlan->vlan_proto, vlan_id, NULL);
+
+	netdev_upper_dev_unlink(real_dev, dev);
 	/* Because unregister_netdevice_queue() makes sure at least one rcu
 	 * grace period is respected before device freeing,
 	 * we dont need to call synchronize_net() here.
 	 */
 	unregister_netdevice_queue(dev, head);

-	netdev_upper_dev_unlink(real_dev, dev);
-
 	if (grp->nr_vlan_devs == 0) {
 		vlan_mvrp_uninit_applicant(real_dev);
 		vlan_gvrp_uninit_applicant(real_dev);
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 24/27] vlan: link the upper neighbour only after registering
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Patrick McHardy, David S. Miller
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Otherwise users might access it without being fully registered, as per
sysfs - it only inits in register_netdevice(), so is unusable till it is
called.

CC: Patrick McHardy <kaber@trash.net>
CC: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    No changes.

 net/8021q/vlan.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 61fc573..69b4a35 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -169,13 +169,13 @@ int register_vlan_dev(struct net_device *dev)
 	if (err < 0)
 		goto out_uninit_mvrp;
 
-	err = netdev_upper_dev_link(real_dev, dev);
-	if (err)
-		goto out_uninit_mvrp;
-
 	err = register_netdevice(dev);
 	if (err < 0)
-		goto out_upper_dev_unlink;
+		goto out_uninit_mvrp;
+
+	err = netdev_upper_dev_link(real_dev, dev);
+	if (err)
+		goto out_unregister_netdev;
 
 	/* Account for reference in struct vlan_dev_priv */
 	dev_hold(real_dev);
@@ -191,8 +191,8 @@ int register_vlan_dev(struct net_device *dev)
 
 	return 0;
 
-out_upper_dev_unlink:
-	netdev_upper_dev_unlink(real_dev, dev);
+out_unregister_netdev:
+	unregister_netdevice(dev);
 out_uninit_mvrp:
 	if (grp->nr_vlan_devs == 0)
 		vlan_mvrp_uninit_applicant(real_dev);
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 23/27] bonding: remove slave lists
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

And all the initialization.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    RFC -> v2:
    New patch.

 drivers/net/bonding/bond_main.c | 4 ----
 drivers/net/bonding/bonding.h   | 2 --
 2 files changed, 6 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6aa345a..d494045 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -972,7 +972,6 @@ void bond_select_active_slave(struct bonding *bond)
  */
 static void bond_attach_slave(struct bonding *bond, struct slave *new_slave)
 {
-	list_add_tail_rcu(&new_slave->list, &bond->slave_list);
 	bond->slave_cnt++;
 }
 
@@ -988,7 +987,6 @@ static void bond_attach_slave(struct bonding *bond, struct slave *new_slave)
  */
 static void bond_detach_slave(struct bonding *bond, struct slave *slave)
 {
-	list_del_rcu(&slave->list);
 	bond->slave_cnt--;
 }
 
@@ -1374,7 +1372,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 		res = -ENOMEM;
 		goto err_undo_flags;
 	}
-	INIT_LIST_HEAD(&new_slave->list);
 	/*
 	 * Set the new_slave's queue_id to be zero.  Queue ID mapping
 	 * is set via sysfs or module option if desired.
@@ -4022,7 +4019,6 @@ static void bond_setup(struct net_device *bond_dev)
 	/* initialize rwlocks */
 	rwlock_init(&bond->lock);
 	rwlock_init(&bond->curr_slave_lock);
-	INIT_LIST_HEAD(&bond->slave_list);
 	bond->params = bonding_defaults;
 
 	/* Initialize pointers */
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index f070557..e5c32a6 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -165,7 +165,6 @@ struct bond_parm_tbl {
 
 struct slave {
 	struct net_device *dev; /* first - useful for panic debug */
-	struct list_head list;
 	struct bonding *bond; /* our master */
 	int    delay;
 	unsigned long jiffies;
@@ -205,7 +204,6 @@ struct slave {
  */
 struct bonding {
 	struct   net_device *dev; /* first - useful for panic debug */
-	struct   list_head slave_list;
 	struct   slave *curr_active_slave;
 	struct   slave *current_arp_slave;
 	struct   slave *primary_slave;
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 21/27] bonding: add __bond_next_slave() which uses neighbours
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek,
	Ben Hutchings
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Add a new function, __bond_next_slave(), which uses neighbours to find the
next slave after the slave provided. It will be further used to gradually
go start using neighbour netdev_adjacent infrastructure instead of
bonding's own lists.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    RFC -> v4:
    New patch.

 drivers/net/bonding/bonding.h | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 454d6af..4a3fbe3 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -250,6 +250,34 @@ struct bonding {
 	((struct slave *) rtnl_dereference(dev->rx_handler_data))
 
 /**
+ * __bond_next_slave - get the next slave after the one provided
+ * @bond - bonding struct
+ * @slave - the slave provided
+ *
+ * Returns the next slave after the slave provided, first slave if the
+ * slave provided is the last slave and NULL if slave is not found
+ */
+static inline struct slave *__bond_next_slave(struct bonding *bond,
+					      struct slave *slave)
+{
+	struct slave *slave_iter;
+	struct list_head *iter;
+	bool found = false;
+
+	netdev_for_each_lower_private(bond->dev, slave_iter, iter) {
+		if (found)
+			return slave_iter;
+		if (slave_iter == slave)
+			found = true;
+	}
+
+	if (found)
+		return bond_first_slave(bond);
+
+	return NULL;
+}
+
+/**
  * Returns NULL if the net_device does not belong to any of the bond's slaves
  *
  * Caller must hold bond lock for read
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 22/27] bonding: use neighbours for bond_next_slave()
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

Use the new function __bond_next_slave().

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    Use the bonding-specific function, instead of the general netdev, which
    was removed.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    Change to standard logic - if it's the last slave - return the first one,
    otherwise - return the next via netdev_lower_dev_get_next_private().
    Also, bond_prev_slave() was dropped - so we don't need it.

 drivers/net/bonding/bonding.h | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 4a3fbe3..f070557 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -76,8 +76,6 @@
 
 #define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
 
-#define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
-
 /* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
 #define bond_first_slave(bond) \
 	(bond_has_slaves(bond) ? \
@@ -92,9 +90,7 @@
 #define bond_is_last_slave(bond, pos) (pos == bond_last_slave(bond))
 
 /* Since bond_first/last_slave can return NULL, these can return NULL too */
-#define bond_next_slave(bond, pos) \
-	(bond_is_last_slave(bond, pos) ? bond_first_slave(bond) : \
-					 bond_to_slave((pos)->list.next))
+#define bond_next_slave(bond, pos) __bond_next_slave(bond, pos)
 
 /**
  * bond_for_each_slave - iterate over all slaves
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 20/27] bonding: remove bond_prev_slave()
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

We don't really need it, and it's really hard to RCUify the list->prev.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    New patch.

 drivers/net/bonding/bond_main.c | 9 +++------
 drivers/net/bonding/bonding.h   | 4 ----
 2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 06ffc8a..6aa345a 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1255,7 +1255,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 {
 	struct bonding *bond = netdev_priv(bond_dev);
 	const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
-	struct slave *new_slave = NULL;
+	struct slave *new_slave = NULL, *prev_slave;
 	struct sockaddr addr;
 	int link_reporting;
 	int res = 0, i;
@@ -1472,6 +1472,7 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 
 	write_lock_bh(&bond->lock);
 
+	prev_slave = bond_last_slave(bond);
 	bond_attach_slave(bond, new_slave);
 
 	new_slave->delay = 0;
@@ -1566,9 +1567,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 			 */
 			bond_3ad_initialize(bond, 1000/AD_TIMER_INTERVAL);
 		} else {
-			struct slave *prev_slave;
-
-			prev_slave = bond_prev_slave(bond, new_slave);
 			SLAVE_AD_INFO(new_slave).id =
 				SLAVE_AD_INFO(prev_slave).id + 1;
 		}
@@ -3506,9 +3504,8 @@ static int bond_change_mtu(struct net_device *bond_dev, int new_mtu)
 	 */
 
 	bond_for_each_slave(bond, slave, iter) {
-		pr_debug("s %p s->p %p c_m %p\n",
+		pr_debug("s %p c_m %p\n",
 			 slave,
-			 bond_prev_slave(bond, slave),
 			 slave->dev->netdev_ops->ndo_change_mtu);
 
 		res = dev_set_mtu(slave->dev, new_mtu);
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 3eb464c..454d6af 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -96,10 +96,6 @@
 	(bond_is_last_slave(bond, pos) ? bond_first_slave(bond) : \
 					 bond_to_slave((pos)->list.next))
 
-#define bond_prev_slave(bond, pos) \
-	(bond_is_first_slave(bond, pos) ? bond_last_slave(bond) : \
-					  bond_to_slave((pos)->list.prev))
-
 /**
  * bond_for_each_slave - iterate over all slaves
  * @bond:	the bond holding this list
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 19/27] bonding: convert first/last slave logic to use neighbours
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

For that, use netdev_adjacent_get_private(list_head) on bond's lower
neighbour list members. Also, add a small macro - bond_slave_list(bond),
which returns the bond list via neighbour list.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    Rename neieghour_dev_list to adj_list.

 drivers/net/bonding/bonding.h | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 170c7fc..3eb464c 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -72,19 +72,24 @@
 	res; })
 
 /* slave list primitives */
-#define bond_has_slaves(bond) !list_empty(&(bond)->dev->adj_list.lower)
+#define bond_slave_list(bond) (&(bond)->dev->adj_list.lower)
+
+#define bond_has_slaves(bond) !list_empty(bond_slave_list(bond))
 
 #define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
 
 /* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
 #define bond_first_slave(bond) \
-	list_first_entry_or_null(&(bond)->slave_list, struct slave, list)
+	(bond_has_slaves(bond) ? \
+		netdev_adjacent_get_private(bond_slave_list(bond)->next) : \
+		NULL)
 #define bond_last_slave(bond) \
-	(list_empty(&(bond)->slave_list) ? NULL : \
-					   bond_to_slave((bond)->slave_list.prev))
+	(bond_has_slaves(bond) ? \
+		netdev_adjacent_get_private(bond_slave_list(bond)->prev) : \
+		NULL)
 
-#define bond_is_first_slave(bond, pos) ((pos)->list.prev == &(bond)->slave_list)
-#define bond_is_last_slave(bond, pos) ((pos)->list.next == &(bond)->slave_list)
+#define bond_is_first_slave(bond, pos) (pos == bond_first_slave(bond))
+#define bond_is_last_slave(bond, pos) (pos == bond_last_slave(bond))
 
 /* Since bond_first/last_slave can return NULL, these can return NULL too */
 #define bond_next_slave(bond, pos) \
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 18/27] net: add a possibility to get private from netdev_adjacent->list
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev
  Cc: jiri, Veaceslav Falico, David S. Miller, Eric Dumazet,
	Alexander Duyck
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

It will be useful to get first/last element.

CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <edumazet@google.com>
CC: Jiri Pirko <jiri@resnulli.us>
CC: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    No changes.

 include/linux/netdevice.h |  1 +
 net/core/dev.c            | 10 ++++++++++
 2 files changed, 11 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 168974e..b4cfb63 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2850,6 +2850,7 @@ extern void *netdev_lower_get_next_private_rcu(struct net_device *dev,
 	     priv; \
 	     priv = netdev_lower_get_next_private_rcu(dev, &(iter)))
 
+extern void *netdev_adjacent_get_private(struct list_head *adj_list);
 extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
 extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
 extern int netdev_upper_dev_link(struct net_device *dev,
diff --git a/net/core/dev.c b/net/core/dev.c
index 0aa844a..acc1181 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4466,6 +4466,16 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
 }
 EXPORT_SYMBOL(netdev_master_upper_dev_get);
 
+void *netdev_adjacent_get_private(struct list_head *adj_list)
+{
+	struct netdev_adjacent *adj;
+
+	adj = list_entry(adj_list, struct netdev_adjacent, list);
+
+	return adj->private;
+}
+EXPORT_SYMBOL(netdev_adjacent_get_private);
+
 /**
  * netdev_all_upper_get_next_dev_rcu - Get the next dev from upper list
  * @dev: device
-- 
1.8.4

^ permalink raw reply related

* [PATCH v5 net-next 17/27] bonding: convert bond_has_slaves() to use the neighbour list
From: Veaceslav Falico @ 2013-09-25  7:20 UTC (permalink / raw)
  To: netdev; +Cc: jiri, Veaceslav Falico, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-1-git-send-email-vfalico@redhat.com>

The same way as it was used for its own slave_list.

CC: Jay Vosburgh <fubar@us.ibm.com>
CC: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
---

Notes:
    v4  -> v5
    No change.
    
    v3  -> v4:
    No change.
    
    v2  -> v3:
    No change.
    
    v1  -> v2:
    No changes.
    
    RFC -> v1:
    Use the renamed adj_list instead of neighbour_dev_list.

 drivers/net/bonding/bonding.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index bcef15e..170c7fc 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -72,7 +72,7 @@
 	res; })
 
 /* slave list primitives */
-#define bond_has_slaves(bond) !list_empty(&(bond)->slave_list)
+#define bond_has_slaves(bond) !list_empty(&(bond)->dev->adj_list.lower)
 
 #define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
 
-- 
1.8.4

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox