Netdev List


* [PATCH net-next] cxgb3: Restore dependency on INET
From: Ben Hutchings @ 2012-11-28 20:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Fengguang Wu, Divy Le Ray

Commit ff33c0e1885cda44dd14c79f70df4706f83582a0 ('net: Remove bogus
dependencies on INET') wrongly removed this dependency.  cxgb3 uses
the arp_send() function defined in net/ipv4/arp.c.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
---
 drivers/net/ethernet/chelsio/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/Kconfig b/drivers/net/ethernet/chelsio/Kconfig
index a71c0f3..d40c994 100644
--- a/drivers/net/ethernet/chelsio/Kconfig
+++ b/drivers/net/ethernet/chelsio/Kconfig
@@ -48,7 +48,7 @@ config CHELSIO_T1_1G
 
 config CHELSIO_T3
 	tristate "Chelsio Communications T3 10Gb Ethernet support"
-	depends on PCI
+	depends on PCI && INET
 	select FW_LOADER
 	select MDIO
 	---help---
-- 
1.7.7.6


-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* Re: VPN traffic leaks in IPv6/IPv4 dual-stack networks/hosts
From: Jan Engelhardt @ 2012-11-28 20:06 UTC (permalink / raw)
  To: Fernando Gont; +Cc: netdev
In-Reply-To: <50B66CA1.5050907@gont.com.ar>

On Wednesday 2012-11-28 20:57, Fernando Gont wrote:

>On 11/27/2012 01:10 PM, Jan Engelhardt wrote:
>>> For a project such as OpenVPN, a (portable) fix might be non-trivial.
>> 
>> If the VPN server does not even advertise to-be-secured IPv6 prefixes, 
>> any client-side fix is questionable. 
>
>If the VPN is supposed to secure all traffic, and the VPN just fails to
>support v6, then for me, it's questionable to have your traffic leak out
>the VPN just because of that lack of IPv6 support.

Well, what I am saying is that a server may not
be conveying "all", but only "0.0.0.0/0".


* Re: VPN traffic leaks in IPv6/IPv4 dual-stack networks/hosts
From: Fernando Gont @ 2012-11-28 20:14 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev
In-Reply-To: <alpine.LNX.2.01.1211282102240.11155@nerf07.vanv.qr>

On 11/28/2012 05:06 PM, Jan Engelhardt wrote:
>> If the VPN is supposed to secure all traffic, and the VPN just fails to
>> support v6, then for me, it's questionable to have your traffic leak out
>> the VPN just because of that lack of IPv6 support.
> 
> Well, what I am saying is that a server may not
> be conveying "all", but only "0.0.0.0/0".

In such scenarios, doing nothing about IPv6 would be an oversight/error,
since IPv4 and IPv6 do not operate in isolation from each other.

Cheers,
-- 
Fernando Gont
e-mail: fernando@gont.com.ar || fgont@si6networks.com
PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1


* Re: [PATCH v2 3/3] pppoatm: protect against freeing of vcc
From: Krzysztof Mazur @ 2012-11-28 20:18 UTC (permalink / raw)
  To: David Woodhouse, chas williams - CONTRACTOR; +Cc: davem, netdev, linux-kernel
In-Reply-To: <20121127182843.GA11597@shrek.podlesie.net>

On Tue, Nov 27, 2012 at 07:28:43PM +0100, Krzysztof Mazur wrote:
> 
> While reviewing your br2684 patch I also found that some ATM drivers do
> not call ->pop() when ->send() fails; they should do:
> 
> 	if (vcc->pop)
> 		vcc->pop(vcc, skb);
> 	else
> 		dev_kfree_skb(skb);
> 
> but some drivers just call dev_kfree_skb(skb).
> 
> I think that we should add atm_pop() function that does that and fix all
> drivers.
> 

I'm sending a patch that implements that idea.

Currently we need two arguments, vcc and skb. However, ATM_SKB(skb)->vcc in
the skb control block is already reserved for keeping the vcc, so we could
create a single-argument version, vcc_pop(skb). In that case we would need
to move:

	ATM_SKB(skb)->vcc = vcc;

from ATM drivers to functions that call atmdev_ops->send().

Krzysiek
-- >8 --
Subject: [PATCH] atm: introduce vcc_pop_*()

ATM drivers that need to free an skb they got from ->send() cannot just use
dev_kfree_skb*(); they must use something like:

	if (vcc->pop)
		vcc->pop(vcc, skb);
	else
		dev_kfree_skb_any(skb);

That is, when vcc->pop is non-NULL they must call vcc->pop() instead of
freeing the skb directly. This causes duplicated code in many drivers, and
some drivers even forget to call vcc->pop() in some error handling code.

The new vcc_pop_*() functions mirror the dev_kfree_skb*() variants.
Currently they all free with dev_kfree_skb_any(), because using the other
variants is probably a worthless optimization: in ->pop() we already use
only dev_kfree_skb_any(). The other functions were added only so that
converting existing code that uses non-any dev_kfree_skb*() variants does
not lose that information.

Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
 include/linux/atmdev.h | 11 +++++++++++
 net/atm/common.c       |  9 +++++++++
 2 files changed, 20 insertions(+)

diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h
index c1da539..dedccad 100644
--- a/include/linux/atmdev.h
+++ b/include/linux/atmdev.h
@@ -283,6 +283,17 @@ int atm_pcr_goal(const struct atm_trafprm *tp);
 
 void vcc_release_async(struct atm_vcc *vcc, int reply);
 
+/*
+ * vcc_pop_*() functions should be used by ATM drivers to free transmitted
+ * skbs - skbs that were passed to the driver by atmdev_ops->send().
+ *
+ * We provide three functions that can be used in different contexts.
+ * See dev_kfree_skb*() documentation for details.
+ */
+void vcc_pop_any(struct atm_vcc *vcc, struct sk_buff *skb);
+#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
+#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)
+
 struct atm_ioctl {
 	struct module *owner;
 	/* A module reference is kept if appropriate over this call.
diff --git a/net/atm/common.c b/net/atm/common.c
index 806fc0a..ad9c77d 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -654,6 +654,15 @@ out:
 	return error;
 }
 
+void vcc_pop_any(struct atm_vcc *vcc, struct sk_buff *skb)
+{
+	if (vcc->pop)
+		vcc->pop(vcc, skb);
+	else
+		dev_kfree_skb_any(skb);
+}
+EXPORT_SYMBOL(vcc_pop_any);
+
 unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	struct sock *sk = sock->sk;
-- 
1.8.0.411.g71a7da8


* Re: [PATCH net-next] be2net: fix INTx ISR for interrupt behaviour on BE2
From: Ben Hutchings @ 2012-11-28 20:20 UTC (permalink / raw)
  To: Sathya Perla; +Cc: netdev
In-Reply-To: <d856540b-bacc-45e3-a7cc-49e7febafb95@CMEXHTCAS1.ad.emulex.com>

On Wed, 2012-11-28 at 11:20 +0530, Sathya Perla wrote:
> On BE2 chip, an interrupt may be raised even when EQ is in un-armed state.
> As a result be_intx()::events_get() and be_poll:events_get() can race and
> notify an EQ wrongly.
> 
> Fix this by counting events only in be_poll(). Commit 0b545a629 fixes
> the same issue in the MSI-x path.
> 
> But, on Lancer, INTx can be de-asserted only by notifying num evts. This
> is not an issue as the above BE2 behavior doesn't exist/has never been
> seen on Lancer.
[...]
> @@ -2014,15 +1996,23 @@ static int be_rx_cqs_create(struct be_adapter *adapter)
>  
>  static irqreturn_t be_intx(int irq, void *dev)
>  {
> -	struct be_adapter *adapter = dev;
> -	int num_evts;
> +	struct be_eq_obj *eqo = dev;
> +	struct be_adapter *adapter = eqo->adapter;
> +	int num_evts = 0;
>  
> -	/* With INTx only one EQ is used */
> -	num_evts = event_handle(&adapter->eq_obj[0]);
> -	if (num_evts)
> -		return IRQ_HANDLED;
> -	else
> -		return IRQ_NONE;
> +	/* On Lancer, clear-intr bit of the EQ DB does not work.
> +	 * INTx is de-asserted only on notifying num evts.
> +	 */
> +	if (lancer_chip(adapter))
> +		num_evts = events_get(eqo);
> +
> +	/* The EQ-notify may not de-assert INTx rightaway, causing
> +	 * the ISR to be invoked again. So, return HANDLED even when
> +	 * num_evts is zero.
> +	 */
> +	be_eq_notify(adapter, eqo->q.id, false, true, num_evts);
> +	napi_schedule(&eqo->napi);
> +	return IRQ_HANDLED;
>  }
[...]

You shouldn't unconditionally return IRQ_HANDLED.  This prevents
interrupt storm detection from working, not just for your device but for
anything else sharing its IRQ.

I understand there is a real problem to be fixed (PCIe write completions
overtaking INTx deassertion, and maybe a specific hardware bug).  The
way we dealt with such problems in sfc is to count the number of times
in a row that we don't see any events, and only return IRQ_HANDLED the
first time.
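
The counting idea can be sketched as a small userspace model. This is only
an illustration under stated assumptions, not the sfc or be2net code: the
names model_eq, model_intx() and the MODEL_IRQ_* values are hypothetical
stand-ins for the driver state, the ISR and irqreturn_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum model_irqreturn { MODEL_IRQ_NONE = 0, MODEL_IRQ_HANDLED = 1 };

/* Hypothetical per-queue state; not a real driver structure. */
struct model_eq {
	bool last_irq_was_empty;	/* previous interrupt saw no events */
};

/*
 * Model of an INTx handler that tolerates one interrupt without events
 * (for example, a late INTx deassertion) but returns "none" for any
 * further consecutive empty interrupts, so that the core's spurious
 * interrupt detection can still trigger on a real storm.
 */
static enum model_irqreturn model_intx(struct model_eq *eq, int num_evts)
{
	assert(eq != NULL);

	if (num_evts > 0) {
		eq->last_irq_was_empty = false;
		return MODEL_IRQ_HANDLED;
	}

	/* No events: claim only the first empty interrupt in a row. */
	if (!eq->last_irq_was_empty) {
		eq->last_irq_was_empty = true;
		return MODEL_IRQ_HANDLED;
	}
	return MODEL_IRQ_NONE;
}
```

In this model a single empty interrupt after real work is still claimed,
while back-to-back empty interrupts are reported as unhandled.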

Ben.



* Re: [PATCH net-next] be2net: fix INTx ISR for interrupt behaviour on BE2
From: Ben Hutchings @ 2012-11-28 20:25 UTC (permalink / raw)
  To: Sathya Perla; +Cc: netdev
In-Reply-To: <1354134001.2768.11.camel@bwh-desktop.uk.solarflarecom.com>

On Wed, 2012-11-28 at 20:20 +0000, Ben Hutchings wrote:
> On Wed, 2012-11-28 at 11:20 +0530, Sathya Perla wrote:
> > On BE2 chip, an interrupt may be raised even when EQ is in un-armed state.
> > As a result be_intx()::events_get() and be_poll:events_get() can race and
> > notify an EQ wrongly.
> > 
> > Fix this by counting events only in be_poll(). Commit 0b545a629 fixes
> > the same issue in the MSI-x path.
> > 
> > But, on Lancer, INTx can be de-asserted only by notifying num evts. This
> > is not an issue as the above BE2 behavior doesn't exist/has never been
> > seen on Lancer.
> [...]
> > @@ -2014,15 +1996,23 @@ static int be_rx_cqs_create(struct be_adapter *adapter)
> >  
> >  static irqreturn_t be_intx(int irq, void *dev)
> >  {
> > -	struct be_adapter *adapter = dev;
> > -	int num_evts;
> > +	struct be_eq_obj *eqo = dev;
> > +	struct be_adapter *adapter = eqo->adapter;
> > +	int num_evts = 0;
> >  
> > -	/* With INTx only one EQ is used */
> > -	num_evts = event_handle(&adapter->eq_obj[0]);
> > -	if (num_evts)
> > -		return IRQ_HANDLED;
> > -	else
> > -		return IRQ_NONE;
> > +	/* On Lancer, clear-intr bit of the EQ DB does not work.
> > +	 * INTx is de-asserted only on notifying num evts.
> > +	 */
> > +	if (lancer_chip(adapter))
> > +		num_evts = events_get(eqo);
> > +
> > +	/* The EQ-notify may not de-assert INTx rightaway, causing
> > +	 * the ISR to be invoked again. So, return HANDLED even when
> > +	 * num_evts is zero.
> > +	 */
> > +	be_eq_notify(adapter, eqo->q.id, false, true, num_evts);
> > +	napi_schedule(&eqo->napi);
> > +	return IRQ_HANDLED;
> >  }
> [...]
> 
> You shouldn't unconditionally return IRQ_HANDLED.  This prevents
> interrupt storm detection from working, not just for your device but for
> anything else sharing its IRQ.
> 
> I understand there is a real problem to be fixed (PCIe write completions
> overtaking INTx deassertion, and maybe a specific hardware bug).
[...]

I was thinking of read completions; there are no write completions to
wait for so you're pretty much guaranteed to get called a second time.
Maybe you should add an MMIO read after calling be_eq_notify().

Ben.



* Re: [PATCH v2 3/3] pppoatm: protect against freeing of vcc
From: David Woodhouse @ 2012-11-28 20:44 UTC (permalink / raw)
  To: Krzysztof Mazur; +Cc: chas williams - CONTRACTOR, davem, netdev, linux-kernel
In-Reply-To: <20121128201837.GA912@shrek.podlesie.net>


On Wed, 2012-11-28 at 21:18 +0100, Krzysztof Mazur wrote:
> 
> +void vcc_pop_any(struct atm_vcc *vcc, struct sk_buff *skb)
> +{
> +       if (vcc->pop)
> +               vcc->pop(vcc, skb);

if (vcc && vcc->pop) perhaps? 

-- 
dwmw2




* Re: [PATCH v2 3/3] pppoatm: protect against freeing of vcc
From: chas williams - CONTRACTOR @ 2012-11-28 21:20 UTC (permalink / raw)
  To: Krzysztof Mazur; +Cc: David Woodhouse, davem, netdev, linux-kernel
In-Reply-To: <20121128201837.GA912@shrek.podlesie.net>

On Wed, 28 Nov 2012 21:18:37 +0100
Krzysztof Mazur <krzysiek@podlesie.net> wrote:

> On Tue, Nov 27, 2012 at 07:28:43PM +0100, Krzysztof Mazur wrote:
> > I think that we should add atm_pop() function that does that and fix all
> > drivers.
> > 
> 
> I'm sending a patch that implements that idea.
> 
> Currently we need two arguments vcc and skb. However, we have reserved
> ATM_SKB(skb)->vcc in skb control block for keeping vcc
> and we can create single argument version vcc_pop(skb). In that case
> we need to move:
> 
> 	ATM_SKB(skb)->vcc = vcc;
> 
> from ATM drivers to functions that call atmdev_ops->send().

i dont like the vcc->pop() implementation and at one point i had the
crazy idea of using skb->destructors to handle it.  however, i think it
would be necessary to clone the skb's so any existing destructor is
preserved.

> +#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
> +#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)

don't define these if you don't plan on using them anyway.


* [PATCH resend net-next 0/3] myri10ge: LRO to GRO conversion
From: Andrew Gallatin @ 2012-11-28 21:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev

Hi,

The following patchset is a re-send of one I sent a few weeks
back, and converts myri10ge from using the old inet_lro
interface to GRO.

Note that a naive LRO->GRO conversion of myri10ge will result in a
performance regression for vlan tagged frames.  This is because
myri10ge does not offer hardware vlan tag offload, and because GRO
requires hardware vlan tag offload to aggregate vlan tagged frames.

To address this performance regression, I have implemented vlan tag
popping in the myri10ge driver, as it seems to be the lesser of two
evils.  As eric.dumazet@gmail.com commented when I asked about this on
netdev: "Given GRO assumes NIC does hardware vlan
offloading, I guess I would chose to do that.  It seems unfortunate to
add vlan decap in GRO path, already very complex."
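
As a rough illustration of what popping a vlan tag in software involves,
here is a userspace sketch that operates on a raw frame buffer. It is not
the driver code: model_vlan_pop() and the MODEL_* constants are
hypothetical names, and the real patch works on skb fragments rather than
a flat buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_ETH_ALEN		6	/* bytes in a MAC address */
#define MODEL_ETH_HLEN		14	/* dst MAC + src MAC + ethertype */
#define MODEL_VLAN_HLEN		4	/* TPID + TCI */
#define MODEL_ETH_P_8021Q	0x8100

/*
 * If the frame carries an 802.1Q tag, store the TCI (host order) in *tci,
 * slide the two MAC addresses forward over the tag, shrink *len by the tag
 * size and return a pointer to the new start of the frame. Otherwise
 * return frame unchanged.
 */
static uint8_t *model_vlan_pop(uint8_t *frame, size_t *len, uint16_t *tci)
{
	uint16_t tpid;

	if (*len < MODEL_ETH_HLEN + MODEL_VLAN_HLEN)
		return frame;

	/* The ethertype field is big-endian on the wire. */
	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != MODEL_ETH_P_8021Q)
		return frame;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Overwrite the tag by moving both MAC addresses 4 bytes forward. */
	memmove(frame + MODEL_VLAN_HLEN, frame, 2 * MODEL_ETH_ALEN);
	*len -= MODEL_VLAN_HLEN;
	return frame + MODEL_VLAN_HLEN;
}
```

After the move, the encapsulated ethertype sits directly behind the source
MAC address, so the result looks like an untagged frame that starts 4 bytes
later in the buffer.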

Andrew Gallatin (3):
myri10ge: Convert from LRO to GRO
myri10ge: Add vlan rx for better GRO perf.
myri10ge: Use skb_fill_page_desc().

  drivers/net/ethernet/myricom/Kconfig             |    1 -
  drivers/net/ethernet/myricom/myri10ge/myri10ge.c |  274 ++++++----------------
  2 files changed, 73 insertions(+), 202 deletions(-)


* [PATCH resend net-next 1/3] myri10ge: Convert from LRO to GRO
From: Andrew Gallatin @ 2012-11-28 21:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


Convert myri10ge from LRO to GRO, and simplify the driver by removing
various LRO-related code which is no longer needed including
ndo_fix_features op, custom skb building from frags, and LRO
header parsing.

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
---
  drivers/net/ethernet/myricom/Kconfig             |    1 -
  drivers/net/ethernet/myricom/myri10ge/myri10ge.c |  227 +++-------------------
  2 files changed, 31 insertions(+), 197 deletions(-)

diff --git a/drivers/net/ethernet/myricom/Kconfig b/drivers/net/ethernet/myricom/Kconfig
index 540f0c6..3932d08 100644
--- a/drivers/net/ethernet/myricom/Kconfig
+++ b/drivers/net/ethernet/myricom/Kconfig
@@ -23,7 +23,6 @@ config MYRI10GE
  	depends on PCI && INET
  	select FW_LOADER
  	select CRC32
-	select INET_LRO
  	---help---
  	  This driver supports Myricom Myri-10G Dual Protocol interface in
  	  Ethernet mode. If the eeprom on your board is not recent enough,
diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
index 83516e3..a5ab2f2 100644
--- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
@@ -50,7 +50,6 @@
  #include <linux/etherdevice.h>
  #include <linux/if_ether.h>
  #include <linux/if_vlan.h>
-#include <linux/inet_lro.h>
  #include <linux/dca.h>
  #include <linux/ip.h>
  #include <linux/inet.h>
@@ -96,8 +95,6 @@ MODULE_LICENSE("Dual BSD/GPL");

  #define MYRI10GE_EEPROM_STRINGS_SIZE 256
  #define MYRI10GE_MAX_SEND_DESC_TSO ((65536 / 2048) * 2)
-#define MYRI10GE_MAX_LRO_DESCRIPTORS 8
-#define MYRI10GE_LRO_MAX_PKTS 64

  #define MYRI10GE_NO_CONFIRM_DATA htonl(0xffffffff)
  #define MYRI10GE_NO_RESPONSE_RESULT 0xffffffff
@@ -165,8 +162,6 @@ struct myri10ge_rx_done {
  	dma_addr_t bus;
  	int cnt;
  	int idx;
-	struct net_lro_mgr lro_mgr;
-	struct net_lro_desc lro_desc[MYRI10GE_MAX_LRO_DESCRIPTORS];
  };

  struct myri10ge_slice_netstats {
@@ -338,11 +333,6 @@ static int myri10ge_debug = -1;	/* defaults above */
  module_param(myri10ge_debug, int, 0);
  MODULE_PARM_DESC(myri10ge_debug, "Debug level (0=none,...,16=all)");

-static int myri10ge_lro_max_pkts = MYRI10GE_LRO_MAX_PKTS;
-module_param(myri10ge_lro_max_pkts, int, S_IRUGO);
-MODULE_PARM_DESC(myri10ge_lro_max_pkts,
-		 "Number of LRO packets to be aggregated");
-
  static int myri10ge_fill_thresh = 256;
  module_param(myri10ge_fill_thresh, int, S_IRUGO | S_IWUSR);
  MODULE_PARM_DESC(myri10ge_fill_thresh, "Number of empty rx slots allowed");
@@ -1197,36 +1187,6 @@ static inline void myri10ge_vlan_ip_csum(struct sk_buff *skb, __wsum hw_csum)
  	}
  }

-static inline void
-myri10ge_rx_skb_build(struct sk_buff *skb, u8 * va,
-		      struct skb_frag_struct *rx_frags, int len, int hlen)
-{
-	struct skb_frag_struct *skb_frags;
-
-	skb->len = skb->data_len = len;
-	/* attach the page(s) */
-
-	skb_frags = skb_shinfo(skb)->frags;
-	while (len > 0) {
-		memcpy(skb_frags, rx_frags, sizeof(*skb_frags));
-		len -= skb_frag_size(rx_frags);
-		skb_frags++;
-		rx_frags++;
-		skb_shinfo(skb)->nr_frags++;
-	}
-
-	/* pskb_may_pull is not available in irq context, but
-	 * skb_pull() (for ether_pad and eth_type_trans()) requires
-	 * the beginning of the packet in skb_headlen(), move it
-	 * manually */
-	skb_copy_to_linear_data(skb, va, hlen);
-	skb_shinfo(skb)->frags[0].page_offset += hlen;
-	skb_frag_size_sub(&skb_shinfo(skb)->frags[0], hlen);
-	skb->data_len -= hlen;
-	skb->tail += hlen;
-	skb_pull(skb, MXGEFW_PAD);
-}
-
  static void
  myri10ge_alloc_rx_pages(struct myri10ge_priv *mgp, struct myri10ge_rx_buf *rx,
  			int bytes, int watchdog)
@@ -1304,18 +1264,14 @@ myri10ge_unmap_rx_page(struct pci_dev *pdev,
  	}
  }

-#define MYRI10GE_HLEN 64	/* The number of bytes to copy from a
-				 * page into an skb */
-
  static inline int
-myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
-		 bool lro_enabled)
+myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum)
  {
  	struct myri10ge_priv *mgp = ss->mgp;
  	struct sk_buff *skb;
-	struct skb_frag_struct rx_frags[MYRI10GE_MAX_FRAGS_PER_FRAME];
+	struct skb_frag_struct *rx_frags;
  	struct myri10ge_rx_buf *rx;
-	int i, idx, hlen, remainder, bytes;
+	int i, idx, remainder, bytes;
  	struct pci_dev *pdev = mgp->pdev;
  	struct net_device *dev = mgp->dev;
  	u8 *va;
@@ -1332,6 +1288,20 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
  	idx = rx->cnt & rx->mask;
  	va = page_address(rx->info[idx].page) + rx->info[idx].page_offset;
  	prefetch(va);
+
+	skb = napi_get_frags(&ss->napi);
+	if (unlikely(skb == NULL)) {
+		ss->stats.rx_dropped++;
+		for (i = 0, remainder = len; remainder > 0; i++) {
+			myri10ge_unmap_rx_page(pdev, &rx->info[idx], bytes);
+			put_page(rx->info[idx].page);
+			rx->cnt++;
+			idx = rx->cnt & rx->mask;
+			remainder -= MYRI10GE_ALLOC_SIZE;
+		}
+		return 0;
+	}
+	rx_frags = skb_shinfo(skb)->frags;
  	/* Fill skb_frag_struct(s) with data from our receive */
  	for (i = 0, remainder = len; remainder > 0; i++) {
  		myri10ge_unmap_rx_page(pdev, &rx->info[idx], bytes);
@@ -1345,54 +1315,23 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
  		idx = rx->cnt & rx->mask;
  		remainder -= MYRI10GE_ALLOC_SIZE;
  	}
+	skb_shinfo(skb)->nr_frags = i;

-	if (lro_enabled) {
-		rx_frags[0].page_offset += MXGEFW_PAD;
-		skb_frag_size_sub(&rx_frags[0], MXGEFW_PAD);
-		len -= MXGEFW_PAD;
-		lro_receive_frags(&ss->rx_done.lro_mgr, rx_frags,
-				  /* opaque, will come back in get_frag_header */
-				  len, len,
-				  (void *)(__force unsigned long)csum, csum);
+	/* remove padding */
+	rx_frags[0].page_offset += MXGEFW_PAD;
+	rx_frags[0].size -= MXGEFW_PAD;
+	len -= MXGEFW_PAD;

-		return 1;
-	}
-
-	hlen = MYRI10GE_HLEN > len ? len : MYRI10GE_HLEN;
-
-	/* allocate an skb to attach the page(s) to. This is done
-	 * after trying LRO, so as to avoid skb allocation overheads */
-
-	skb = netdev_alloc_skb(dev, MYRI10GE_HLEN + 16);
-	if (unlikely(skb == NULL)) {
-		ss->stats.rx_dropped++;
-		do {
-			i--;
-			__skb_frag_unref(&rx_frags[i]);
-		} while (i != 0);
-		return 0;
-	}
-
-	/* Attach the pages to the skb, and trim off any padding */
-	myri10ge_rx_skb_build(skb, va, rx_frags, len, hlen);
-	if (skb_frag_size(&skb_shinfo(skb)->frags[0]) <= 0) {
-		skb_frag_unref(skb, 0);
-		skb_shinfo(skb)->nr_frags = 0;
-	} else {
-		skb->truesize += bytes * skb_shinfo(skb)->nr_frags;
+	skb->len = len;
+	skb->data_len = len;
+	skb->truesize += len;
+	if (dev->features & NETIF_F_RXCSUM) {
+		skb->ip_summed = CHECKSUM_COMPLETE;
+		skb->csum = csum;
  	}
-	skb->protocol = eth_type_trans(skb, dev);
  	skb_record_rx_queue(skb, ss - &mgp->ss[0]);

-	if (dev->features & NETIF_F_RXCSUM) {
-		if ((skb->protocol == htons(ETH_P_IP)) ||
-		    (skb->protocol == htons(ETH_P_IPV6))) {
-			skb->csum = csum;
-			skb->ip_summed = CHECKSUM_COMPLETE;
-		} else
-			myri10ge_vlan_ip_csum(skb, csum);
-	}
-	netif_receive_skb(skb);
+	napi_gro_frags(&ss->napi);
  	return 1;
  }

@@ -1480,18 +1419,11 @@ myri10ge_clean_rx_done(struct myri10ge_slice_state *ss, int budget)
  	u16 length;
  	__wsum checksum;

-	/*
-	 * Prevent compiler from generating more than one ->features memory
-	 * access to avoid theoretical race condition with functions that
-	 * change NETIF_F_LRO flag at runtime.
-	 */
-	bool lro_enabled = !!(ACCESS_ONCE(mgp->dev->features) & NETIF_F_LRO);
-
  	while (rx_done->entry[idx].length != 0 && work_done < budget) {
  		length = ntohs(rx_done->entry[idx].length);
  		rx_done->entry[idx].length = 0;
  		checksum = csum_unfold(rx_done->entry[idx].checksum);
-		rx_ok = myri10ge_rx_done(ss, length, checksum, lro_enabled);
+		rx_ok = myri10ge_rx_done(ss, length, checksum);
  		rx_packets += rx_ok;
  		rx_bytes += rx_ok * (unsigned long)length;
  		cnt++;
@@ -1503,9 +1435,6 @@ myri10ge_clean_rx_done(struct myri10ge_slice_state *ss, int budget)
  	ss->stats.rx_packets += rx_packets;
  	ss->stats.rx_bytes += rx_bytes;

-	if (lro_enabled)
-		lro_flush_all(&rx_done->lro_mgr);
-
  	/* restock receive rings if needed */
  	if (ss->rx_small.fill_cnt - ss->rx_small.cnt < myri10ge_fill_thresh)
  		myri10ge_alloc_rx_pages(mgp, &ss->rx_small,
@@ -1779,7 +1708,6 @@ static const char myri10ge_gstrings_slice_stats[][ETH_GSTRING_LEN] = {
  	"tx_pkt_start", "tx_pkt_done", "tx_req", "tx_done",
  	"rx_small_cnt", "rx_big_cnt",
  	"wake_queue", "stop_queue", "tx_linearized",
-	"LRO aggregated", "LRO flushed", "LRO avg aggr", "LRO no_desc",
  };

  #define MYRI10GE_NET_STATS_LEN      21
@@ -1880,14 +1808,6 @@ myri10ge_get_ethtool_stats(struct net_device *netdev,
  		data[i++] = (unsigned int)ss->tx.wake_queue;
  		data[i++] = (unsigned int)ss->tx.stop_queue;
  		data[i++] = (unsigned int)ss->tx.linearized;
-		data[i++] = ss->rx_done.lro_mgr.stats.aggregated;
-		data[i++] = ss->rx_done.lro_mgr.stats.flushed;
-		if (ss->rx_done.lro_mgr.stats.flushed)
-			data[i++] = ss->rx_done.lro_mgr.stats.aggregated /
-			    ss->rx_done.lro_mgr.stats.flushed;
-		else
-			data[i++] = 0;
-		data[i++] = ss->rx_done.lro_mgr.stats.no_desc;
  	}
  }

@@ -2271,67 +2191,6 @@ static void myri10ge_free_irq(struct myri10ge_priv *mgp)
  		pci_disable_msix(pdev);
  }

-static int
-myri10ge_get_frag_header(struct skb_frag_struct *frag, void **mac_hdr,
-			 void **ip_hdr, void **tcpudp_hdr,
-			 u64 * hdr_flags, void *priv)
-{
-	struct ethhdr *eh;
-	struct vlan_ethhdr *veh;
-	struct iphdr *iph;
-	u8 *va = skb_frag_address(frag);
-	unsigned long ll_hlen;
-	/* passed opaque through lro_receive_frags() */
-	__wsum csum = (__force __wsum) (unsigned long)priv;
-
-	/* find the mac header, aborting if not IPv4 */
-
-	eh = (struct ethhdr *)va;
-	*mac_hdr = eh;
-	ll_hlen = ETH_HLEN;
-	if (eh->h_proto != htons(ETH_P_IP)) {
-		if (eh->h_proto == htons(ETH_P_8021Q)) {
-			veh = (struct vlan_ethhdr *)va;
-			if (veh->h_vlan_encapsulated_proto != htons(ETH_P_IP))
-				return -1;
-
-			ll_hlen += VLAN_HLEN;
-
-			/*
-			 *  HW checksum starts ETH_HLEN bytes into
-			 *  frame, so we must subtract off the VLAN
-			 *  header's checksum before csum can be used
-			 */
-			csum = csum_sub(csum, csum_partial(va + ETH_HLEN,
-							   VLAN_HLEN, 0));
-		} else {
-			return -1;
-		}
-	}
-	*hdr_flags = LRO_IPV4;
-
-	iph = (struct iphdr *)(va + ll_hlen);
-	*ip_hdr = iph;
-	if (iph->protocol != IPPROTO_TCP)
-		return -1;
-	if (ip_is_fragment(iph))
-		return -1;
-	*hdr_flags |= LRO_TCP;
-	*tcpudp_hdr = (u8 *) (*ip_hdr) + (iph->ihl << 2);
-
-	/* verify the IP checksum */
-	if (unlikely(ip_fast_csum((u8 *) iph, iph->ihl)))
-		return -1;
-
-	/* verify the  checksum */
-	if (unlikely(csum_tcpudp_magic(iph->saddr, iph->daddr,
-				       ntohs(iph->tot_len) - (iph->ihl << 2),
-				       IPPROTO_TCP, csum)))
-		return -1;
-
-	return 0;
-}
-
  static int myri10ge_get_txrx(struct myri10ge_priv *mgp, int slice)
  {
  	struct myri10ge_cmd cmd;
@@ -2402,7 +2261,6 @@ static int myri10ge_open(struct net_device *dev)
  	struct myri10ge_cmd cmd;
  	int i, status, big_pow2, slice;
  	u8 *itable;
-	struct net_lro_mgr *lro_mgr;

  	if (mgp->running != MYRI10GE_ETH_STOPPED)
  		return -EBUSY;
@@ -2513,19 +2371,6 @@ static int myri10ge_open(struct net_device *dev)
  			goto abort_with_rings;
  		}

-		lro_mgr = &ss->rx_done.lro_mgr;
-		lro_mgr->dev = dev;
-		lro_mgr->features = LRO_F_NAPI;
-		lro_mgr->ip_summed = CHECKSUM_COMPLETE;
-		lro_mgr->ip_summed_aggr = CHECKSUM_UNNECESSARY;
-		lro_mgr->max_desc = MYRI10GE_MAX_LRO_DESCRIPTORS;
-		lro_mgr->lro_arr = ss->rx_done.lro_desc;
-		lro_mgr->get_frag_header = myri10ge_get_frag_header;
-		lro_mgr->max_aggr = myri10ge_lro_max_pkts;
-		lro_mgr->frag_align_pad = 2;
-		if (lro_mgr->max_aggr > MAX_SKB_FRAGS)
-			lro_mgr->max_aggr = MAX_SKB_FRAGS;
-
  		/* must happen prior to any irq */
  		napi_enable(&(ss)->napi);
  	}
@@ -3143,15 +2988,6 @@ static int myri10ge_set_mac_address(struct net_device *dev, void *addr)
  	return 0;
  }

-static netdev_features_t myri10ge_fix_features(struct net_device *dev,
-	netdev_features_t features)
-{
-	if (!(features & NETIF_F_RXCSUM))
-		features &= ~NETIF_F_LRO;
-
-	return features;
-}
-
  static int myri10ge_change_mtu(struct net_device *dev, int new_mtu)
  {
  	struct myri10ge_priv *mgp = netdev_priv(dev);
@@ -3878,7 +3714,6 @@ static const struct net_device_ops myri10ge_netdev_ops = {
  	.ndo_get_stats64	= myri10ge_get_stats,
  	.ndo_validate_addr	= eth_validate_addr,
  	.ndo_change_mtu		= myri10ge_change_mtu,
-	.ndo_fix_features	= myri10ge_fix_features,
  	.ndo_set_rx_mode	= myri10ge_set_multicast_list,
  	.ndo_set_mac_address	= myri10ge_set_mac_address,
  };
@@ -4018,7 +3853,7 @@ static int myri10ge_probe(struct pci_dev *pdev, const struct pci_device_id *ent)

  	netdev->netdev_ops = &myri10ge_netdev_ops;
  	netdev->mtu = myri10ge_initial_mtu;
-	netdev->hw_features = mgp->features | NETIF_F_LRO | NETIF_F_RXCSUM;
+	netdev->hw_features = mgp->features | NETIF_F_RXCSUM;
  	netdev->features = netdev->hw_features;

  	if (dac_enabled)
-- 
1.7.9.5


* [PATCH resend net-next 2/3] myri10ge: Add vlan rx for better GRO perf.
From: Andrew Gallatin @ 2012-11-28 21:20 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


Unlike LRO, GRO requires that vlan tags be removed before
aggregation can occur.  Since the myri10ge NIC does not support
hardware vlan tag offload, we must remove the tag in the driver
to achieve performance comparable to LRO for vlan tagged frames.
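
Removing the tag in software also means the hardware checksum has to be
adjusted, because it covers the bytes that are dropped. The following
userspace sketch shows the ones'-complement arithmetic behind that kind of
fix-up. The helpers model_csum() and model_csum_sub() are hypothetical and
simplified (even lengths only); they are not the kernel's csum_partial()
or csum_sub().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones'-complement sum. */
static uint16_t model_fold(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement sum over len bytes; len is assumed to be even. */
static uint16_t model_csum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += (uint32_t)((buf[i] << 8) | buf[i + 1]);
	return model_fold(sum);
}

/*
 * a - b in ones'-complement arithmetic: add the bitwise complement of b.
 * Note that zero has two representations (0x0000 and 0xffff) here.
 */
static uint16_t model_csum_sub(uint16_t a, uint16_t b)
{
	return model_fold((uint32_t)a + (uint16_t)~b);
}
```

Subtracting the sum of the 4 removed bytes from the sum over the whole
region leaves the sum over the remaining bytes, which is what the stack
expects to see for an untagged frame.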

Thanks to Eric Dumazet for his help simplifying the original patch.

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
---
  drivers/net/ethernet/myricom/myri10ge/myri10ge.c |   40 ++++++++++++++++++++++
  1 file changed, 40 insertions(+)

diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
index a5ab2f2..93ed089 100644
--- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
@@ -1264,6 +1264,41 @@ myri10ge_unmap_rx_page(struct pci_dev *pdev,
  	}
  }

+/*
+ * GRO does not support acceleration of tagged vlan frames, and
+ * this NIC does not support vlan tag offload, so we must pop
+ * the tag ourselves to be able to achieve GRO performance that
+ * is comparable to LRO.
+ */
+
+static inline void
+myri10ge_vlan_rx(struct net_device *dev, void *addr, struct sk_buff *skb)
+{
+	u8 *va;
+	struct vlan_ethhdr *veh;
+	struct skb_frag_struct *frag;
+
+	va = addr;
+	va += MXGEFW_PAD;
+	veh = (struct vlan_ethhdr *) va;
+	if ((dev->features & (NETIF_F_HW_VLAN_RX)) == NETIF_F_HW_VLAN_RX &&
+	    (veh->h_vlan_proto == htons(ETH_P_8021Q))) {
+		/* fixup csum if needed */
+		if (skb->ip_summed == CHECKSUM_COMPLETE)
+			skb->csum = csum_sub(skb->csum,
+					     csum_partial(va + ETH_HLEN,
+							  VLAN_HLEN, 0));
+		/* pop tag */
+		__vlan_hwaccel_put_tag(skb, ntohs(veh->h_vlan_TCI));
+		memmove(va + VLAN_HLEN, va, 2 * ETH_ALEN);
+		skb->len -= VLAN_HLEN;
+		skb->data_len -= VLAN_HLEN;
+		frag = skb_shinfo(skb)->frags;
+		frag->page_offset += VLAN_HLEN;
+		skb_frag_size_set(frag, skb_frag_size(frag) - VLAN_HLEN);
+	}
+}
+
  static inline int
  myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum)
  {
@@ -1329,6 +1364,7 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum)
  		skb->ip_summed = CHECKSUM_COMPLETE;
  		skb->csum = csum;
  	}
+	myri10ge_vlan_rx(mgp->dev, va, skb);
  	skb_record_rx_queue(skb, ss - &mgp->ss[0]);

  	napi_gro_frags(&ss->napi);
@@ -3854,6 +3890,10 @@ static int myri10ge_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
  	netdev->netdev_ops = &myri10ge_netdev_ops;
  	netdev->mtu = myri10ge_initial_mtu;
  	netdev->hw_features = mgp->features | NETIF_F_RXCSUM;
+
+	/* fake NETIF_F_HW_VLAN_RX for good GRO performance */
+	netdev->hw_features |= NETIF_F_HW_VLAN_RX;
+
  	netdev->features = netdev->hw_features;

  	if (dac_enabled)
-- 
1.7.9.5


* [PATCH resend net-next 3/3] myri10ge: Use skb_fill_page_desc().
From: Andrew Gallatin @ 2012-11-28 21:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


Now that LRO is gone, the receive routine is much simpler, and
we are able to use the standard skb_fill_page_desc() in myri10ge.
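
The per-fragment size passed to skb_fill_page_desc() follows a simple
pattern: every fragment is one full buffer except possibly the last. A
userspace sketch of just that sizing logic is shown here. The function
model_split_frags() is hypothetical, and MODEL_ALLOC_SIZE is an assumed
value standing in for MYRI10GE_ALLOC_SIZE.

```c
#include <assert.h>

#define MODEL_ALLOC_SIZE 4096	/* assumed per-buffer size, for illustration */

/*
 * Split a received length into per-buffer fragment sizes, using the same
 * "remainder" loop shape as the receive path. Returns the number of
 * entries written to sizes[], which is at most max_frags.
 */
static int model_split_frags(int len, int *sizes, int max_frags)
{
	int i, remainder;

	for (i = 0, remainder = len; remainder > 0 && i < max_frags; i++) {
		sizes[i] = remainder < MODEL_ALLOC_SIZE ?
			   remainder : MODEL_ALLOC_SIZE;
		remainder -= MODEL_ALLOC_SIZE;
	}
	return i;
}
```

With the assumed 4096-byte buffers, a 9000-byte frame yields three
fragments: two full buffers and one partial buffer.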

Signed-off-by: Andrew Gallatin <gallatin@myri.com>
---
  drivers/net/ethernet/myricom/myri10ge/myri10ge.c |   11 ++++-------
  1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
index 93ed089..6bf1d26 100644
--- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
+++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c
@@ -1340,17 +1340,14 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum)
  	/* Fill skb_frag_struct(s) with data from our receive */
  	for (i = 0, remainder = len; remainder > 0; i++) {
  		myri10ge_unmap_rx_page(pdev, &rx->info[idx], bytes);
-		__skb_frag_set_page(&rx_frags[i], rx->info[idx].page);
-		rx_frags[i].page_offset = rx->info[idx].page_offset;
-		if (remainder < MYRI10GE_ALLOC_SIZE)
-			skb_frag_size_set(&rx_frags[i], remainder);
-		else
-			skb_frag_size_set(&rx_frags[i], MYRI10GE_ALLOC_SIZE);
+		skb_fill_page_desc(skb, i, rx->info[idx].page,
+				   rx->info[idx].page_offset,
+				   remainder < MYRI10GE_ALLOC_SIZE ?
+				   remainder : MYRI10GE_ALLOC_SIZE);
  		rx->cnt++;
  		idx = rx->cnt & rx->mask;
  		remainder -= MYRI10GE_ALLOC_SIZE;
  	}
-	skb_shinfo(skb)->nr_frags = i;

  	/* remove padding */
  	rx_frags[0].page_offset += MXGEFW_PAD;
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH v2 3/3] pppoatm: protect against freeing of vcc
From: Krzysztof Mazur @ 2012-11-28 21:24 UTC (permalink / raw)
  To: David Woodhouse; +Cc: chas williams - CONTRACTOR, davem, netdev, linux-kernel
In-Reply-To: <1354135481.21562.98.camel@shinybook.infradead.org>

On Wed, Nov 28, 2012 at 08:44:41PM +0000, David Woodhouse wrote:
> On Wed, 2012-11-28 at 21:18 +0100, Krzysztof Mazur wrote:
> > 
> > +void vcc_pop_any(struct atm_vcc *vcc, struct sk_buff *skb)
> > +{
> > +       if (vcc->pop)
> > +               vcc->pop(vcc, skb);
> 
> if (vcc && vcc->pop) perhaps? 
> 

yes, we can do that, at least "he" and "nicstar" use that in some cases.
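
A minimal userspace sketch of the NULL-safe helper being discussed, just to pin down the intended behaviour. The struct layouts and counters are stand-ins invented for illustration (they are not the kernel's struct atm_vcc or struct sk_buff), and free() stands in for dev_kfree_skb_any():

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types, for illustration only. */
struct sk_buff { int len; };
struct atm_vcc {
	void (*pop)(struct atm_vcc *vcc, struct sk_buff *skb);
	int popped;	/* counts skbs handed back through ->pop() */
};

static int freed_directly;	/* counts skbs freed without ->pop() */

/* Example ->pop() callback: account for the skb, then release it. */
static void example_pop(struct atm_vcc *vcc, struct sk_buff *skb)
{
	vcc->popped++;
	free(skb);
}

/* Prefer vcc->pop(); fall back to a plain free if vcc or ->pop is NULL. */
static void vcc_pop_any(struct atm_vcc *vcc, struct sk_buff *skb)
{
	assert(skb != NULL);
	if (vcc && vcc->pop) {
		vcc->pop(vcc, skb);
	} else {
		freed_directly++;
		free(skb);	/* stands in for dev_kfree_skb_any() */
	}
}
```

All three cases (callback set, callback NULL, vcc itself NULL) then go through one function, and the skb is released exactly once in each.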

Krzysiek

^ permalink raw reply

* Re: VPN traffic leaks in IPv6/IPv4 dual-stack networks/hosts
From: Jan Engelhardt @ 2012-11-28 21:37 UTC (permalink / raw)
  To: Fernando Gont; +Cc: netdev
In-Reply-To: <50B6708A.8020701@gont.com.ar>

On Wednesday 2012-11-28 21:14, Fernando Gont wrote:

>On 11/28/2012 05:06 PM, Jan Engelhardt wrote:
>>> If the VPN is supposed to secure all traffic, and the VPN just fails to
>>> support v6, then for me, it's questionable to have your traffic leak out
>>> the VPN just because of that lack of IPv6 support.
>> 
>> Well, what I am saying is that a server may not
>> be conveying "all", but only "0.0.0.0/0"[0/0].
>
>In such scenarios, doing nothing about IPv6 would be an oversight/error,

Without additional input from the user, e.g. by means of a config 
setting, the software itself cannot distinguish between an 
oversight/error and a deliberate configuration.

^ permalink raw reply

* [PATCH] atm: introduce vcc_pop()
From: Krzysztof Mazur @ 2012-11-28 21:45 UTC (permalink / raw)
  To: chas williams - CONTRACTOR; +Cc: David Woodhouse, davem, netdev, linux-kernel
In-Reply-To: <20121128162001.3f326720@thirdoffive.cmf.nrl.navy.mil>

On Wed, Nov 28, 2012 at 04:20:01PM -0500, chas williams - CONTRACTOR wrote:
> On Wed, 28 Nov 2012 21:18:37 +0100
> Krzysztof Mazur <krzysiek@podlesie.net> wrote:
> 
> > On Tue, Nov 27, 2012 at 07:28:43PM +0100, Krzysztof Mazur wrote:
> > > I think that we should add atm_pop() function that does that and fix all
> > > drivers.
> > > 
> > 
> > I'm sending a patch that implements that idea.
> > 
> > Currently we need two arguments vcc and skb. However, we have reserved
> > ATM_SKB(skb)->vcc in skb control block for keeping vcc
> > and we can create single argument version vcc_pop(skb). In that case
> > we need to move:
> > 
> > 	ATM_SKB(skb)->vcc = vcc;
> > 
> > from ATM drivers to functions that call atmdev_ops->send().
> 
> i dont like the vcc->pop() implementation and at one point i had the
> crazy idea of using skb->destructors to handle it.  however, i think it
> would be necessary to clone the skb's so any existing destructor is
> preserved.

With this patch we will kill vcc->pop() in drivers and in future
we can do that without changes in drivers.

> 
> > +#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
> > +#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)
> 
> don't define these if you dont plan on using them anway.

I removed them. I also added check if vcc is NULL, as David Woodhouse
suggested, some drivers use that.

Krzysiek
-- >8 --
Subject: [PATCH v2] atm: introduce vcc_pop()

ATM drivers cannot simply use dev_kfree_skb*() to free an skb that they
got from ->send(); they must use something like:

	if (vcc->pop)
		vcc->pop(vcc, skb);
	else
		dev_kfree_skb_any(skb);

That is, when vcc->pop() is non-NULL, the driver must call it instead of
freeing the skb directly.  This causes duplicated code in many drivers,
and some drivers even forget to call vcc->pop() in some error handling
paths.

Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
 include/linux/atmdev.h | 8 ++++++++
 net/atm/common.c       | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h
index c1da539..57bd93f 100644
--- a/include/linux/atmdev.h
+++ b/include/linux/atmdev.h
@@ -283,6 +283,14 @@ int atm_pcr_goal(const struct atm_trafprm *tp);
 
 void vcc_release_async(struct atm_vcc *vcc, int reply);
 
+/**
+ * vcc_pop - free transmitted ATM skb
+ *
+ * vcc_pop() should be used by ATM drivers to free skbs that were sent
+ * to the driver by the atmdev_ops->send() function.
+ */
+void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb);
+
 struct atm_ioctl {
 	struct module *owner;
 	/* A module reference is kept if appropriate over this call.
diff --git a/net/atm/common.c b/net/atm/common.c
index 806fc0a..76bf76c 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -654,6 +654,15 @@ out:
 	return error;
 }
 
+void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb)
+{
+	if (vcc && vcc->pop)
+		vcc->pop(vcc, skb);
+	else
+		dev_kfree_skb_any(skb);
+}
+EXPORT_SYMBOL(vcc_pop);
+
 unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	struct sock *sk = sock->sk;
-- 
1.8.0.411.g71a7da8

^ permalink raw reply related

* Re: [PATCH] atm: introduce vcc_pop()
From: chas williams - CONTRACTOR @ 2012-11-28 21:59 UTC (permalink / raw)
  To: Krzysztof Mazur; +Cc: David Woodhouse, davem, netdev, linux-kernel
In-Reply-To: <20121128214533.GA25486@shrek.podlesie.net>

On Wed, 28 Nov 2012 22:45:34 +0100
Krzysztof Mazur <krzysiek@podlesie.net> wrote:

> On Wed, Nov 28, 2012 at 04:20:01PM -0500, chas williams - CONTRACTOR wrote:
> > i dont like the vcc->pop() implementation and at one point i had the
> > crazy idea of using skb->destructors to handle it.  however, i think it
> > would be necessary to clone the skb's so any existing destructor is
> > preserved.
> 
> With this patch we will kill vcc->pop() in drivers and in future
> we can do that without changes in drivers.

ok

> > 
> > > +#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
> > > +#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)
> > 
> > don't define these if you dont plan on using them anway.
> 
> I removed them. I also added check if vcc is NULL, as David Woodhouse
> suggested, some drivers use that.

it should probably be if (likely(vcc) && likely(vcc->pop)) since it
will almost always be the case.
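
For readers unfamiliar with the annotation, a rough userspace sketch of what likely() does. The macros below follow the usual GCC/Clang __builtin_expect idiom; struct vcc_model, echo_pop() and pop_or_fallback() are hypothetical names made up for this example. The hint only influences code layout, never the outcome of the test:

```c
#include <assert.h>
#include <stddef.h>

/* Branch-prediction hints built on the GCC/Clang builtin. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

struct vcc_model {
	int (*pop)(int token);	/* hypothetical callback */
};

static int echo_pop(int token)
{
	return token;
}

/* Returns the callback's result on the hot path, -1 on the fallback path. */
static int pop_or_fallback(struct vcc_model *vcc, int token)
{
	assert(token >= 0);
	if (likely(vcc) && likely(vcc->pop))
		return vcc->pop(token);
	return -1;
}
```

The short-circuit order still matters: vcc is tested before vcc->pop is dereferenced, exactly as in the unannotated version.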

^ permalink raw reply

* Re: [PATCH] atm: introduce vcc_pop()
From: Krzysztof Mazur @ 2012-11-28 22:10 UTC (permalink / raw)
  To: chas williams - CONTRACTOR; +Cc: David Woodhouse, davem, netdev, linux-kernel
In-Reply-To: <20121128165906.05ef5e2b@thirdoffive.cmf.nrl.navy.mil>

On Wed, Nov 28, 2012 at 04:59:06PM -0500, chas williams - CONTRACTOR wrote:
> On Wed, 28 Nov 2012 22:45:34 +0100
> Krzysztof Mazur <krzysiek@podlesie.net> wrote:
> 
> > On Wed, Nov 28, 2012 at 04:20:01PM -0500, chas williams - CONTRACTOR wrote:
> > > i dont like the vcc->pop() implementation and at one point i had the
> > > crazy idea of using skb->destructors to handle it.  however, i think it
> > > would be necessary to clone the skb's so any existing destructor is
> > > preserved.
> > 
> > With this patch we will kill vcc->pop() in drivers and in future
> > we can do that without changes in drivers.
> 
> ok
> 
> > > 
> > > > +#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
> > > > +#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)
> > > 
> > > don't define these if you dont plan on using them anway.
> > 
> > I removed them. I also added check if vcc is NULL, as David Woodhouse
> > suggested, some drivers use that.
> 
> it should probably be if (likely(vcc) && likely(vcc->pop)) since it
> will almost always be the case.

Thanks,

Krzysiek
-- >8 --
Subject: [PATCH v3] atm: introduce vcc_pop()

ATM drivers cannot simply use dev_kfree_skb*() to free an skb that they
got from ->send(); they must use something like:

	if (vcc->pop)
		vcc->pop(vcc, skb);
	else
		dev_kfree_skb_any(skb);

That is, when vcc->pop() is non-NULL, the driver must call it instead of
freeing the skb directly.  This causes duplicated code in many drivers,
and some drivers even forget to call vcc->pop() in some error handling
paths.

Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
 include/linux/atmdev.h | 8 ++++++++
 net/atm/common.c       | 9 +++++++++
 2 files changed, 17 insertions(+)

diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h
index c1da539..57bd93f 100644
--- a/include/linux/atmdev.h
+++ b/include/linux/atmdev.h
@@ -283,6 +283,14 @@ int atm_pcr_goal(const struct atm_trafprm *tp);
 
 void vcc_release_async(struct atm_vcc *vcc, int reply);
 
+/**
+ * vcc_pop - free transmitted ATM skb
+ *
+ * vcc_pop() should be used by ATM drivers to free skbs that were sent
+ * to the driver by the atmdev_ops->send() function.
+ */
+void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb);
+
 struct atm_ioctl {
 	struct module *owner;
 	/* A module reference is kept if appropriate over this call.
diff --git a/net/atm/common.c b/net/atm/common.c
index 806fc0a..c42ff62 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -654,6 +654,15 @@ out:
 	return error;
 }
 
+void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb)
+{
+	if (likely(vcc) && likely(vcc->pop))
+		vcc->pop(vcc, skb);
+	else
+		dev_kfree_skb_any(skb);
+}
+EXPORT_SYMBOL(vcc_pop);
+
 unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	struct sock *sk = sock->sk;
-- 
1.8.0.411.g71a7da8

^ permalink raw reply related

* Re: [PATCH v2 3/3] pppoatm: protect against freeing of vcc
From: David Woodhouse @ 2012-11-28 22:18 UTC (permalink / raw)
  To: David Laight
  Cc: chas williams - CONTRACTOR, Krzysztof Mazur, davem, netdev,
	linux-kernel, nathan
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70C9@saturn3.aculab.com>

On Wed, 2012-11-28 at 09:21 +0000, David Laight wrote:
> Even when it might make sense to sleep in close until tx drains
> there needs to be a finite timeout before it become abortive.

You are, of course, right. We should never wait for hardware for ever.
And just to serve me right, I seem to have hit a bug in the latest Solos
firmware (1.11) which makes it sometimes lock up when I reboot. So it
never responds to the PKT_PCLOSE packet... and thus it deadlocks when I
try to kill pppd and unload the module to reset it :)
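
A minimal userspace sketch of the "finite timeout" idea, under the assumption that a polled flag can stand in for the hardware's acknowledgement. struct completion_model and wait_done_timeout() are hypothetical names. Note that the timeout argument is relative and the absolute deadline is computed inside the wait routine; as far as I know the kernel's wait_for_completion_timeout() likewise expects a relative timeout, in jiffies:

```c
#define _POSIX_C_SOURCE 200809L	/* for clock_gettime() and nanosleep() */
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a completion: a flag set by the "hardware". */
struct completion_model {
	volatile bool done;
};

/* Milliseconds on the monotonic clock. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Wait until c->done is set or timeout_ms elapses.  The timeout is
 * relative; the absolute deadline is computed here, once.
 * Returns true on completion, false on timeout.
 */
static bool wait_done_timeout(struct completion_model *c, long timeout_ms)
{
	const struct timespec nap = { 0, 1000000 };	/* 1 ms */
	long long deadline;

	assert(timeout_ms >= 0);
	deadline = now_ms() + timeout_ms;
	while (!c->done) {
		if (now_ms() >= deadline)
			return false;
		nanosleep(&nap, NULL);
	}
	return true;
}
```

A caller that gets false back can log a warning and carry on with the teardown instead of blocking forever.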

New version...

From 53dd01c08fec5b26006a009b25e4210127fdb27a Mon Sep 17 00:00:00 2001
From: David Woodhouse <David.Woodhouse@intel.com>
Date: Tue, 27 Nov 2012 23:49:24 +0000
Subject: [PATCH] solos-pci: Wait for pending TX to complete when releasing
 vcc

We should no longer be calling the old pop routine for the vcc, after
vcc_release() has completed. Make sure we wait for any pending TX skbs
to complete, by waiting for our own PKT_PCLOSE control skb to be sent.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/atm/solos-pci.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index 9851093..3720670 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -92,6 +92,7 @@ struct pkt_hdr {
 };
 
 struct solos_skb_cb {
+	struct completion c;
 	struct atm_vcc *vcc;
 	uint32_t dma_addr;
 };
@@ -881,11 +882,18 @@ static void pclose(struct atm_vcc *vcc)
 	header->vci = cpu_to_le16(vcc->vci);
 	header->type = cpu_to_le16(PKT_PCLOSE);
 
+	init_completion(&SKB_CB(skb)->c);
+
 	fpga_queue(card, SOLOS_CHAN(vcc->dev), skb, NULL);
 
 	clear_bit(ATM_VF_ADDR, &vcc->flags);
 	clear_bit(ATM_VF_READY, &vcc->flags);
 
+	if (!wait_for_completion_timeout(&SKB_CB(skb)->c,
+					 msecs_to_jiffies(5000)))
+		dev_warn(&card->dev->dev, "Timeout waiting for VCC close on port %d\n",
+			 SOLOS_CHAN(vcc->dev));
+
 	/* Hold up vcc_destroy_socket() (our caller) until solos_bh() in the
 	   tasklet has finished processing any incoming packets (and, more to
 	   the point, using the vcc pointer). */
@@ -1011,9 +1019,12 @@ static uint32_t fpga_tx(struct solos_card *card)
 			if (vcc) {
 				atomic_inc(&vcc->stats->tx);
 				solos_pop(vcc, oldskb);
-			} else
+			} else {
+				struct pkt_hdr *header = (void *)oldskb->data;
+				if (le16_to_cpu(header->type) == PKT_PCLOSE)
+					complete(&SKB_CB(oldskb)->c);
 				dev_kfree_skb_irq(oldskb);
-
+			}
 		}
 	}
 	/* For non-DMA TX, write the 'TX start' bit for all four ports simultaneously */
@@ -1345,6 +1356,8 @@ static struct pci_driver fpga_driver = {
 
 static int __init solos_pci_init(void)
 {
+	BUILD_BUG_ON(sizeof(struct solos_skb_cb) > sizeof(((struct sk_buff *)0)->cb));
+
 	printk(KERN_INFO "Solos PCI Driver Version %s\n", VERSION);
 	return pci_register_driver(&fpga_driver);
 }
-- 
1.8.0


-- 
dwmw2


^ permalink raw reply related

* [PATCH] atm: introduce vcc_pop_skb()
From: Krzysztof Mazur @ 2012-11-28 22:33 UTC (permalink / raw)
  To: chas williams - CONTRACTOR; +Cc: David Woodhouse, davem, netdev, linux-kernel
In-Reply-To: <20121128221040.GA24035@shrek.podlesie.net>

On Wed, Nov 28, 2012 at 11:10:40PM +0100, Krzysztof Mazur wrote:
> On Wed, Nov 28, 2012 at 04:59:06PM -0500, chas williams - CONTRACTOR wrote:
> > On Wed, 28 Nov 2012 22:45:34 +0100
> > Krzysztof Mazur <krzysiek@podlesie.net> wrote:
> > 
> > > On Wed, Nov 28, 2012 at 04:20:01PM -0500, chas williams - CONTRACTOR wrote:
> > > > i dont like the vcc->pop() implementation and at one point i had the
> > > > crazy idea of using skb->destructors to handle it.  however, i think it
> > > > would be necessary to clone the skb's so any existing destructor is
> > > > preserved.
> > > 
> > > With this patch we will kill vcc->pop() in drivers and in future
> > > we can do that without changes in drivers.
> > 
> > ok
> > 
> > > > 
> > > > > +#define vcc_pop(vcc, skb) vcc_pop_any(vcc, skb)
> > > > > +#define vcc_pop_irq(vcc, skb) vcc_pop_any(vcc, skb)
> > > > 
> > > > don't define these if you dont plan on using them anway.
> > > 
> > > I removed them. I also added check if vcc is NULL, as David Woodhouse
> > > suggested, some drivers use that.
> > 
> > it should probably be if (likely(vcc) && likely(vcc->pop)) since it
> > will almost always be the case.
> 

I think that we should also add that single-argument skb-only version.
Currently it can be used only after the driver does ATM_SKB(skb)->vcc = vcc.
Most drivers do that.

Thanks,

Krzysiek
-- >8 --
Subject: [PATCH] atm: introduce vcc_pop_skb()

Many ATM drivers store vcc in ATM_SKB(skb)->vcc and use it for
freeing skbs. Now they can just use vcc_pop_skb() to free such
buffers.

Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
---
 include/linux/atmdev.h | 8 ++++++++
 net/atm/common.c       | 6 ++++++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h
index 57bd93f..648fb79 100644
--- a/include/linux/atmdev.h
+++ b/include/linux/atmdev.h
@@ -291,6 +291,14 @@ void vcc_release_async(struct atm_vcc *vcc, int reply);
  */
 void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb);
 
+/**
+ * vcc_pop_skb - free transmitted ATM skb
+ *
+ * This variant of vcc_pop() assumes that ATM_SKB(skb)->vcc is set
+ * by driver.
+ */
+void vcc_pop_skb(struct sk_buff *skb);
+
 struct atm_ioctl {
 	struct module *owner;
 	/* A module reference is kept if appropriate over this call.
diff --git a/net/atm/common.c b/net/atm/common.c
index c42ff62..378c911 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -663,6 +663,12 @@ void vcc_pop(struct atm_vcc *vcc, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(vcc_pop);
 
+void vcc_pop_skb(struct sk_buff *skb)
+{
+	vcc_pop(ATM_SKB(skb)->vcc, skb);
+}
+EXPORT_SYMBOL(vcc_pop_skb);
+
 unsigned int vcc_poll(struct file *file, struct socket *sock, poll_table *wait)
 {
 	struct sock *sk = sock->sk;
-- 
1.8.0.411.g71a7da8

^ permalink raw reply related

* Re: [PATCH net-next] cxgb3: Restore dependency on INET
From: David Miller @ 2012-11-28 22:41 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, fengguang.wu, divy
In-Reply-To: <1354132983.2768.1.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 28 Nov 2012 20:03:03 +0000

> Commit ff33c0e1885cda44dd14c79f70df4706f83582a0 ('net: Remove bogus
> dependencies on INET') wrongly removed this dependency.  cxgb3 uses
> the arp_send() function defined in net/ipv4/arp.c.
> 
> Reported-by: kbuild test robot <fengguang.wu@intel.com>
> Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/1] net: ethernet: cpsw: fix build warnings for CPSW when CPTS not selected
From: David Miller @ 2012-11-28 22:51 UTC (permalink / raw)
  To: mugunthanvnm; +Cc: netdev, linux-arm-kernel, linux-omap, richardcochran
In-Reply-To: <1354038820-11095-1-git-send-email-mugunthanvnm@ti.com>

From: Mugunthan V N <mugunthanvnm@ti.com>
Date: Tue, 27 Nov 2012 23:23:40 +0530

>   CC      drivers/net/ethernet/ti/cpsw.o
> drivers/net/ethernet/ti/cpsw.c: In function 'cpsw_ndo_ioctl':
> drivers/net/ethernet/ti/cpsw.c:881:20: warning: unused variable 'priv'
> 
> The build warning is generated when CPTS is not selected in Kernel Build.
> Fixing by passing the net_device pointer to cpts IOCTL instead of passing priv
> 
> Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>

Applied.

^ permalink raw reply

* Re: [PATCH] bonding: fix miimon and arp_interval delayed work race conditions
From: Jay Vosburgh @ 2012-11-28 19:15 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, andy, davem
In-Reply-To: <1353759471-30323-1-git-send-email-nikolay@redhat.com>

Nikolay Aleksandrov <nikolay@redhat.com> wrote:

>First I would give three observations which will be used later.
>Observation 1: if (delayed_work_pending(wq)) cancel_delayed_work(wq)
> This usage is wrong because the pending bit is cleared just before the work's fn is
> executed and if the function re-arms itself we might end up with the work still
> running. It is safe to call cancel_delayed_work_sync() even if the work is not queued
> at all.
>Observation 2: Use of INIT_DELAYED_WORK()
> Work needs to be initialized only once prior to (de/en)queueing.
>Observation 3: IFF_UP is set only after ndo_open is called
>
>Related race conditions:
>1. Race between bonding_store_miimon() and bonding_store_arp_interval()
> Because of Obs.1 we can end up having both works enqueued.
>2. Multiple races with INIT_DELAYED_WORK()
> Since the works are not protected by anything between INIT_DELAYED_WORK() and
> calls to (en/de)queue it is possible for races between the following functions:
> (races are also possible between the calls to INIT_DELAYED_WORK() and workqueue code)
> bonding_store_miimon() - bonding_store_arp_interval(), bond_close(), bond_open(),
>			  enqueued functions
> bonding_store_arp_interval() - bonding_store_miimon(), bond_close(), bond_open(),
>				enqueued functions
>3. By Obs.1 we need to change bond_cancel_all()
>
>Bugs 1 and 2 are fixed by moving all work initializations in bond_open which by
>Obs. 2 and Obs. 3 and the fact that we make sure that all works are cancelled in
>bond_close(), is guaranteed not to have any work enqueued. Also RTNL lock is now
>acquired in bonding_store_miimon/arp_interval so they can't race with bond_close
>and bond_open. The opposing work is cancelled only if the IFF_UP flag is set
>and it is cancelled unconditionally. The opposing work is already cancelled if
>the interface is down so no need to cancel it again. This way we don't need new
>synchronizations for the bonding workqueue. These bug (and fixes) are tied 
>together and belong in the same patch.
>Note: I have left 1 line intentionally over 80 characters (84) because I didn't
>      like how it looks broken down. If you'd prefer it otherwise, then simply
>      break it.
>
>Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>

	The patch looks good, although, once applied, the commit message
as shown by "git log" is hard to read due to the formatting (long
lines).  Can you reflow the text to less than 75 columns to make it more
readable in the log?

	This is true for the other two patches as well (that they look
good, and their text runs long), although the log messages are much
shorter.

	-J

>---
> drivers/net/bonding/bond_main.c  | 88 ++++++++++++----------------------------
> drivers/net/bonding/bond_sysfs.c | 34 +++++-----------
> 2 files changed, 36 insertions(+), 86 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 5f5b69f..1445c7d 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3459,6 +3459,28 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
>
> /*-------------------------- Device entry points ----------------------------*/
>
>+static void bond_work_init_all(struct bonding *bond)
>+{
>+	INIT_DELAYED_WORK(&bond->mcast_work,
>+			  bond_resend_igmp_join_requests_delayed);
>+	INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
>+	INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
>+	if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
>+		INIT_DELAYED_WORK(&bond->arp_work, bond_activebackup_arp_mon);
>+	else
>+		INIT_DELAYED_WORK(&bond->arp_work, bond_loadbalance_arp_mon);
>+	INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
>+}
>+
>+static void bond_work_cancel_all(struct bonding *bond)
>+{
>+	cancel_delayed_work_sync(&bond->mii_work);
>+	cancel_delayed_work_sync(&bond->arp_work);
>+	cancel_delayed_work_sync(&bond->alb_work);
>+	cancel_delayed_work_sync(&bond->ad_work);
>+	cancel_delayed_work_sync(&bond->mcast_work);
>+}
>+
> static int bond_open(struct net_device *bond_dev)
> {
> 	struct bonding *bond = netdev_priv(bond_dev);
>@@ -3481,41 +3503,27 @@ static int bond_open(struct net_device *bond_dev)
> 	}
> 	read_unlock(&bond->lock);
>
>-	INIT_DELAYED_WORK(&bond->mcast_work, bond_resend_igmp_join_requests_delayed);
>+	bond_work_init_all(bond);
>
> 	if (bond_is_lb(bond)) {
> 		/* bond_alb_initialize must be called before the timer
> 		 * is started.
> 		 */
>-		if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB))) {
>-			/* something went wrong - fail the open operation */
>+		if (bond_alb_initialize(bond, (bond->params.mode == BOND_MODE_ALB)))
> 			return -ENOMEM;
>-		}
>-
>-		INIT_DELAYED_WORK(&bond->alb_work, bond_alb_monitor);
> 		queue_delayed_work(bond->wq, &bond->alb_work, 0);
> 	}
>
>-	if (bond->params.miimon) {  /* link check interval, in milliseconds. */
>-		INIT_DELAYED_WORK(&bond->mii_work, bond_mii_monitor);
>+	if (bond->params.miimon)  /* link check interval, in milliseconds. */
> 		queue_delayed_work(bond->wq, &bond->mii_work, 0);
>-	}
>
> 	if (bond->params.arp_interval) {  /* arp interval, in milliseconds. */
>-		if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
>-			INIT_DELAYED_WORK(&bond->arp_work,
>-					  bond_activebackup_arp_mon);
>-		else
>-			INIT_DELAYED_WORK(&bond->arp_work,
>-					  bond_loadbalance_arp_mon);
>-
> 		queue_delayed_work(bond->wq, &bond->arp_work, 0);
> 		if (bond->params.arp_validate)
> 			bond->recv_probe = bond_arp_rcv;
> 	}
>
> 	if (bond->params.mode == BOND_MODE_8023AD) {
>-		INIT_DELAYED_WORK(&bond->ad_work, bond_3ad_state_machine_handler);
> 		queue_delayed_work(bond->wq, &bond->ad_work, 0);
> 		/* register to receive LACPDUs */
> 		bond->recv_probe = bond_3ad_lacpdu_recv;
>@@ -3530,34 +3538,10 @@ static int bond_close(struct net_device *bond_dev)
> 	struct bonding *bond = netdev_priv(bond_dev);
>
> 	write_lock_bh(&bond->lock);
>-
> 	bond->send_peer_notif = 0;
>-
> 	write_unlock_bh(&bond->lock);
>
>-	if (bond->params.miimon) {  /* link check interval, in milliseconds. */
>-		cancel_delayed_work_sync(&bond->mii_work);
>-	}
>-
>-	if (bond->params.arp_interval) {  /* arp interval, in milliseconds. */
>-		cancel_delayed_work_sync(&bond->arp_work);
>-	}
>-
>-	switch (bond->params.mode) {
>-	case BOND_MODE_8023AD:
>-		cancel_delayed_work_sync(&bond->ad_work);
>-		break;
>-	case BOND_MODE_TLB:
>-	case BOND_MODE_ALB:
>-		cancel_delayed_work_sync(&bond->alb_work);
>-		break;
>-	default:
>-		break;
>-	}
>-
>-	if (delayed_work_pending(&bond->mcast_work))
>-		cancel_delayed_work_sync(&bond->mcast_work);
>-
>+	bond_work_cancel_all(bond);
> 	if (bond_is_lb(bond)) {
> 		/* Must be called only after all
> 		 * slaves have been released
>@@ -4436,26 +4420,6 @@ static void bond_setup(struct net_device *bond_dev)
> 	bond_dev->features |= bond_dev->hw_features;
> }
>
>-static void bond_work_cancel_all(struct bonding *bond)
>-{
>-	if (bond->params.miimon && delayed_work_pending(&bond->mii_work))
>-		cancel_delayed_work_sync(&bond->mii_work);
>-
>-	if (bond->params.arp_interval && delayed_work_pending(&bond->arp_work))
>-		cancel_delayed_work_sync(&bond->arp_work);
>-
>-	if (bond->params.mode == BOND_MODE_ALB &&
>-	    delayed_work_pending(&bond->alb_work))
>-		cancel_delayed_work_sync(&bond->alb_work);
>-
>-	if (bond->params.mode == BOND_MODE_8023AD &&
>-	    delayed_work_pending(&bond->ad_work))
>-		cancel_delayed_work_sync(&bond->ad_work);
>-
>-	if (delayed_work_pending(&bond->mcast_work))
>-		cancel_delayed_work_sync(&bond->mcast_work);
>-}
>-
> /*
> * Destroy a bonding device.
> * Must be under rtnl_lock when this function is called.
>diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
>index ef8d2a0..3327a07 100644
>--- a/drivers/net/bonding/bond_sysfs.c
>+++ b/drivers/net/bonding/bond_sysfs.c
>@@ -513,6 +513,8 @@ static ssize_t bonding_store_arp_interval(struct device *d,
> 	int new_value, ret = count;
> 	struct bonding *bond = to_bond(d);
>
>+	if (!rtnl_trylock())
>+		return restart_syscall();
> 	if (sscanf(buf, "%d", &new_value) != 1) {
> 		pr_err("%s: no arp_interval value specified.\n",
> 		       bond->dev->name);
>@@ -539,10 +541,6 @@ static ssize_t bonding_store_arp_interval(struct device *d,
> 		pr_info("%s: ARP monitoring cannot be used with MII monitoring. %s Disabling MII monitoring.\n",
> 			bond->dev->name, bond->dev->name);
> 		bond->params.miimon = 0;
>-		if (delayed_work_pending(&bond->mii_work)) {
>-			cancel_delayed_work(&bond->mii_work);
>-			flush_workqueue(bond->wq);
>-		}
> 	}
> 	if (!bond->params.arp_targets[0]) {
> 		pr_info("%s: ARP monitoring has been set up, but no ARP targets have been specified.\n",
>@@ -554,19 +552,12 @@ static ssize_t bonding_store_arp_interval(struct device *d,
> 		 * timer will get fired off when the open function
> 		 * is called.
> 		 */
>-		if (!delayed_work_pending(&bond->arp_work)) {
>-			if (bond->params.mode == BOND_MODE_ACTIVEBACKUP)
>-				INIT_DELAYED_WORK(&bond->arp_work,
>-						  bond_activebackup_arp_mon);
>-			else
>-				INIT_DELAYED_WORK(&bond->arp_work,
>-						  bond_loadbalance_arp_mon);
>-
>-			queue_delayed_work(bond->wq, &bond->arp_work, 0);
>-		}
>+		cancel_delayed_work_sync(&bond->mii_work);
>+		queue_delayed_work(bond->wq, &bond->arp_work, 0);
> 	}
>
> out:
>+	rtnl_unlock();
> 	return ret;
> }
> static DEVICE_ATTR(arp_interval, S_IRUGO | S_IWUSR,
>@@ -962,6 +953,8 @@ static ssize_t bonding_store_miimon(struct device *d,
> 	int new_value, ret = count;
> 	struct bonding *bond = to_bond(d);
>
>+	if (!rtnl_trylock())
>+		return restart_syscall();
> 	if (sscanf(buf, "%d", &new_value) != 1) {
> 		pr_err("%s: no miimon value specified.\n",
> 		       bond->dev->name);
>@@ -993,10 +986,6 @@ static ssize_t bonding_store_miimon(struct device *d,
> 				bond->params.arp_validate =
> 					BOND_ARP_VALIDATE_NONE;
> 			}
>-			if (delayed_work_pending(&bond->arp_work)) {
>-				cancel_delayed_work(&bond->arp_work);
>-				flush_workqueue(bond->wq);
>-			}
> 		}
>
> 		if (bond->dev->flags & IFF_UP) {
>@@ -1005,15 +994,12 @@ static ssize_t bonding_store_miimon(struct device *d,
> 			 * timer will get fired off when the open function
> 			 * is called.
> 			 */
>-			if (!delayed_work_pending(&bond->mii_work)) {
>-				INIT_DELAYED_WORK(&bond->mii_work,
>-						  bond_mii_monitor);
>-				queue_delayed_work(bond->wq,
>-						   &bond->mii_work, 0);
>-			}
>+			cancel_delayed_work_sync(&bond->arp_work);
>+			queue_delayed_work(bond->wq, &bond->mii_work, 0);
> 		}
> 	}
> out:
>+	rtnl_unlock();
> 	return ret;
> }
> static DEVICE_ATTR(miimon, S_IRUGO | S_IWUSR,
>-- 
>1.7.11.7
>

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com

^ permalink raw reply

* Re: [PATCH 1/1] Introduce notification events for routing changes
From: David Miller @ 2012-11-28 22:59 UTC (permalink / raw)
  To: kadlec; +Cc: netdev, netfilter-devel
In-Reply-To: <1354048045-17846-2-git-send-email-kadlec@blackhole.kfki.hu>

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Date: Tue, 27 Nov 2012 21:27:25 +0100

> The netfilter MASQUERADE target does not handle the case when the routing
> changes and the source address of existing connections become invalid.
> The problem can be solved if routing modifications create events to which
> the MASQUERADE target can subscribe and then delete the affected
> connections.
> 
> The patch adds the required event support for IPv4/IPv6.
> 
> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

What part of the information are you actually interested in?

Because just saying that a route is added or removed using fib_info X
doesn't tell you a whole lot.

fib_info only encapsulates the information that can be shared heavily
among many ipv4 routes.  It doesn't include the TOS or other aspects
stored in the fib_alias part.  I can only guess that you did not
use fib_alias in order to avoid having to export that structure to
the callers, as it is currently private to net/ipv4/.

The notifier doesn't seem to distinguish between adds and removes
either, making it less useful in another way.

I would suggest passing a super-structure that gives the event type:

	struct route_changed_info {
		enum {
			add,
			remove,
		} event_type;
		void *data;
	};

or something like that.
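
To make the suggestion concrete, here is a small userspace sketch of a subscriber that dispatches on the event type. The enum constant names, struct route_stats and route_changed() are hypothetical and chosen only for this illustration; the real notifier would go through the kernel's notifier chain rather than a direct call:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the general shape of the struct is given above. */
enum route_event_type {
	ROUTE_EVENT_ADD,
	ROUTE_EVENT_REMOVE,
};

struct route_changed_info {
	enum route_event_type event_type;
	void *data;	/* per-route payload, opaque to this sketch */
};

struct route_stats {
	int added;
	int removed;
};

/* Example subscriber: count adds and removes separately. */
static void route_changed(struct route_stats *stats,
			  const struct route_changed_info *info)
{
	assert(stats != NULL && info != NULL);
	switch (info->event_type) {
	case ROUTE_EVENT_ADD:
		stats->added++;
		break;
	case ROUTE_EVENT_REMOVE:
		stats->removed++;
		break;
	}
}
```

A subscriber that only cares about removals would simply ignore the ROUTE_EVENT_ADD case.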

Can you also show us exactly how this will be used?  Otherwise we
have to guess.

^ permalink raw reply

* Re: [GIT net] Open vSwitch
From: David Miller @ 2012-11-28 23:04 UTC (permalink / raw)
  To: jesse; +Cc: netdev, dev
In-Reply-To: <1354041423-3050-1-git-send-email-jesse@nicira.com>

From: Jesse Gross <jesse@nicira.com>
Date: Tue, 27 Nov 2012 10:37:01 -0800

> These two small bug fixes are intended for 3.7/net if there is still time.

Pulled, thanks.

^ permalink raw reply

* Re: pull request: wireless-next 2012-11-28
From: David Miller @ 2012-11-28 23:05 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev
In-Reply-To: <20121128192352.GB9118@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Wed, 28 Nov 2012 14:23:52 -0500

> This pull request is intended for the 3.8 stream.  It is a bit large
> -- I guess Thanksgiving got me off track!  At least the code got to
> spend some time in linux-next... :-)

Wow, that's a lot.

Pulled, thanks.

I'll push it out after I do a bunch of build tests.

Thanks.

^ permalink raw reply
