Netdev List
* Re: Question about xfrm by MARK feature
From: jamal @ 2010-06-24 12:04 UTC (permalink / raw)
  To: Gerd v. Egidy; +Cc: timo.teras, kaber, herbert, netdev
In-Reply-To: <201006231803.17261.lists@egidy.de>

Hi Gerd,

On Wed, 2010-06-23 at 18:03 +0200, Gerd v. Egidy wrote:
> Hi Jamal,
> 
> while looking through the 2.6.34 changelog I found the xfrm by MARK feature 
> you developed in february. I'm currently working on NAT for ipsec connections 
> and thought your feature might help me.
> 
> For example I have 2 different remote networks with the same ip network each 
> and both of them have a tunnel to the same local network. 

It seems "same IP network" means that the two remote locations use
exactly the same IP addresses? This is hard of course, but NAT may do it.
There's also the nat zones feature that Patrick introduced a while back
that may help you.

> I map their IPs to 
> something different so I can distinguish them in the local network. But after 
> the nat the xfrm code sees two tunnels with exactly the same values. So this 
> can't work.
> 

Can you look at the incoming encrypted packet headers and tell if they
are from different remotes? If not, are different remotes coming in via
a different network device? If yes, you can install a tc rule to mark
them as they come in before decryption and that mark should stay with
them even after they get decrypted.
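
For illustration, the value/mask comparison behind mark-based selection can be sketched in userspace C. This is only a model, not the kernel code: struct mark_sel, mark_matches(), struct tunnel and tunnel_lookup() are hypothetical names, and the marks are assumed to have been set earlier (for example by a tc or iptables rule).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value/mask pair, modelled on the v and m fields of an xfrm mark. */
struct mark_sel {
	uint32_t v;	/* value the masked packet mark must equal */
	uint32_t m;	/* mask applied to the packet mark first */
};

/* A packet mark matches when its masked bits equal the selector value. */
static bool mark_matches(uint32_t pkt_mark, const struct mark_sel *sel)
{
	return (pkt_mark & sel->m) == sel->v;
}

/* Hypothetical tunnel entry: all other selector fields are assumed equal. */
struct tunnel {
	const char *name;
	struct mark_sel sel;
};

/* Return the first tunnel whose mark selector matches, or NULL if none. */
static const struct tunnel *tunnel_lookup(const struct tunnel *tbl, size_t n,
					  uint32_t pkt_mark)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (mark_matches(pkt_mark, &tbl[i].sel))
			return &tbl[i];
	return NULL;
}
```

With two entries whose selectors are {1, 0xffffffff} and {2, 0xffffffff}, a packet marked 1 resolves to the first entry and a packet marked 2 to the second, even when addresses and ports are identical.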

> But if I understood your feature correctly, I can now mark the packets (e.g. 
> in iptables with ... -j MARK --set-mark 1) and have xfrm select the correct 
> ipsec tunnel via the mark. Correct?
> 
> But does your feature also set the mark on packets decrypted by xfrm? I need 
> some way to find out from which tunnel the packet came to correctly treat it. 
> 

Refer to above and also to policy routing.

> Do you know if any of the ipsec solutions for linux (e.g. strongswan, 
> openswan, racoon) already have support for this feature or are developing on 
> it?

AFAIK, only iproute2 can use marks. I believe the IKE daemons can be
made to use reqid (as Herbert mentioned) but I am not sure that is
sufficient for what you want.

cheers,
jamal


^ permalink raw reply

* Re: [PATCH 31/40] trace syscalls: Convert various generic compat syscalls
From: Michal Marek @ 2010-06-24 12:05 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, linuxppc-dev, linux-fsdevel, kexec, netdev
In-Reply-To: <4C21E3F8.9000405@linux.intel.com>

On 23.6.2010 12:37, Andi Kleen wrote:
> It also has maintenance costs, e.g. I doubt ctags and cscope
> will be able to deal with these kinds of macros, so it has a
> high cost for everyone using these tools.

FWIW, patch 16/40 of this series teaches 'make tags' to recognize these
macros: http://permalink.gmane.org/gmane.linux.kernel/1002103

Michal


^ permalink raw reply


* Re: [PATCH net-next-2.6 2/2] 3c59x: Use fine-grained locks for MII and windowed register access
From: Ben Hutchings @ 2010-06-24 12:57 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: David Miller, netdev, Chase Douglas, Arne Nordmark
In-Reply-To: <20100624120517.GI5570@secunet.com>


On Thu, 2010-06-24 at 14:05 +0200, Steffen Klassert wrote:
> Hi.
> 
> On Thu, Jun 24, 2010 at 12:55:41AM +0100, Ben Hutchings wrote:
> > This avoids scheduling in atomic context and also means that IRQs
> > will only be deferred for relatively short periods of time.
> > 
> > Previously discussed in:
> > http://article.gmane.org/gmane.linux.network/155024
> > 
> > Reported-by: Arne Nordmark <nordmark@mech.kth.se>
> > Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
> > Tested-by: Arne Nordmark <nordmark@mech.kth.se> [against 2.6.32]
> > ---
> >  drivers/net/3c59x.c |   66 ++++++++++++++++++++++++++++++---------------------
> >  1 files changed, 39 insertions(+), 27 deletions(-)
> > 
> > diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
> > index beddef9..f4a3fb1 100644
> > --- a/drivers/net/3c59x.c
> > +++ b/drivers/net/3c59x.c
> > @@ -644,9 +644,15 @@ struct vortex_private {
> >  	u16 deferred;						/* Resend these interrupts when we
> >  										 * bale from the ISR */
> >  	u16 io_size;						/* Size of PCI region (for release_region) */
> > -	spinlock_t lock;					/* Serialise access to device & its vortex_private */
> > -	struct mii_if_info mii;				/* MII lib hooks/info */
> > -	int window;					/* Register window */
> > +
> > +	/* Serialises access to hardware other than MII and variables below.
> > +	 * The lock hierarchy is rtnl_lock > lock > mii_lock > window_lock. */
> > +	spinlock_t lock;
> > +
> > +	spinlock_t mii_lock;		/* Serialises access to MII */
> > +	struct mii_if_info mii;		/* MII lib hooks/info */
> > +	spinlock_t window_lock;		/* Serialises access to windowed regs */
> 
> You should initialize the new locks properly with spin_lock_init().

Oops, yes, obviously.

> > +	int window;			/* Register window */
> >  };
> >  
> >  static void window_set(struct vortex_private *vp, int window)
> > @@ -661,15 +667,23 @@ static void window_set(struct vortex_private *vp, int window)
> >  static u ## size							\
> >  window_read ## size(struct vortex_private *vp, int window, int addr)	\
> >  {									\
> > +	unsigned long flags;						\
> > +	u ## size ret;							\
> > +	spin_lock_irqsave(&vp->window_lock, flags);			\
> >  	window_set(vp, window);						\
> > -	return ioread ## size(vp->ioaddr + addr);			\
> > +	ret = ioread ## size(vp->ioaddr + addr);			\
> > +	spin_unlock_irqrestore(&vp->window_lock, flags);		\
> > +	return ret;							\
> >  }									\
> >  static void								\
> >  window_write ## size(struct vortex_private *vp, u ## size value,	\
> >  		     int window, int addr)				\
> >  {									\
> > +	unsigned long flags;						\
> > +	spin_lock_irqsave(&vp->window_lock, flags);			\
> >  	window_set(vp, window);						\
> >  	iowrite ## size(value, vp->ioaddr + addr);			\
> > +	spin_unlock_irqrestore(&vp->window_lock, flags);		\
> >  }
> 
> This adds a lot of calls to spin_lock_irqsave/spin_unlock_irqrestore to many
> places where this is not necessary at all. For example during device probe and
> device open, window_read/window_write are called multiple times, each time
> disabling the interrupts. I'd suggest to have unlocked, locked and irqsave
> versions of window_read/window_write and use them in appropriate places.

So what?  These are not speed-critical.  The fast-path functions do
acquire the lock just once.

> >  DEFINE_WINDOW_IO(8)
> >  DEFINE_WINDOW_IO(16)
> > @@ -1784,7 +1798,6 @@ vortex_timer(unsigned long data)
> >  		pr_debug("dev->watchdog_timeo=%d\n", dev->watchdog_timeo);
> >  	}
> >  
> > -	disable_irq_lockdep(dev->irq);
> >  	media_status = window_read16(vp, 4, Wn4_Media);
> >  	switch (dev->if_port) {
> >  	case XCVR_10baseT:  case XCVR_100baseTx:  case XCVR_100baseFx:
> > @@ -1805,10 +1818,7 @@ vortex_timer(unsigned long data)
> >  	case XCVR_MII: case XCVR_NWAY:
> >  		{
> >  			ok = 1;
> > -			/* Interrupts are already disabled */
> > -			spin_lock(&vp->lock);
> >  			vortex_check_media(dev, 0);
> > -			spin_unlock(&vp->lock);
> >  		}
> >  		break;
> >  	  default:					/* Other media types handled by Tx timeouts. */
> > @@ -1827,6 +1837,8 @@ vortex_timer(unsigned long data)
> >  	if (!ok) {
> >  		unsigned int config;
> >  
> > +		spin_lock_irq(&vp->lock);
> 
> This can still happen every 5 seconds if the NIC has no link beat and
> medialock is not set. So what about defering this locked codepath to
> a workqueue, or moving the whole vortex_timer to a delayed workqueue?
> In this case we don't need to disable all the interrups on the cpu, we
> could still use disable_irq then.

This locked section is now very short - 5 or 6 register read/writes and
no delays.  We might even be able to get away without locking here as
the only software state this accesses is dev->if_port and I don't think
it can race with anything except SIOCGIFMAP (which seems harmless).

Ben.

> The rest looks quite good to me.
> 
> Thanks,
> 
> Steffen
> 

-- 
Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.


^ permalink raw reply

* Re: [PATCH net-next-2.6 2/2] 3c59x: Use fine-grained locks for MII and windowed register access
From: Steffen Klassert @ 2010-06-24 14:00 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: David Miller, netdev, Chase Douglas, Arne Nordmark
In-Reply-To: <1277384239.26161.162.camel@localhost>

On Thu, Jun 24, 2010 at 01:57:19PM +0100, Ben Hutchings wrote:
> > 
> > This adds a lot of calls to spin_lock_irqsave/spin_unlock_irqrestore to many
> > places where this is not necessary at all. For example during device probe and
> > device open, window_read/window_write are called multiple times, each time
> > disabling the interrupts. I'd suggest to have unlocked, locked and irqsave
> > versions of window_read/window_write and use them in appropriate places.
> 
> So what?  These are not speed-critical.  The fast-path functions do
> acquire the lock just once.
> 

The point is that we should not disable the interrupts if we don't need to
do so. It is not speed critical for the 3c59x driver but disabling the
interrupts should be avoided whenever possible. For example during device
probe and device open we can't race against an interrupt handler because
the device is not yet running.

An example from vortex_probe1() is:

for (i = 0; i < 6; i++)
	window_write8(vp, dev->dev_addr[i], 2, i);

which expands to something like:

for (i = 0; i < 6; i++) {
	unsigned long flags;
	spin_lock_irqsave(&vp->window_lock, flags);
	window_set(vp, 2);
	iowrite8(dev->dev_addr[i], vp->ioaddr + i);
	spin_unlock_irqrestore(&vp->window_lock, flags);
}

which is quite odd in a codepath that could simply do:

for (i = 0; i < 6; i++) {
	window_set(vp, 2);
	iowrite8(dev->dev_addr[i], vp->ioaddr + i);
}
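
To make the suggestion concrete, here is a rough userspace model of an unlocked/locked pair of accessors. It is only a sketch under assumptions: struct mock_nic, struct mock_lock and all mock_* functions are hypothetical stand-ins, and the "lock" merely counts acquisitions instead of disabling interrupts.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_WINDOWS 8
#define WINDOW_SIZE 16

/* Stand-in for a spinlock: it only counts acquisitions to show the cost. */
struct mock_lock {
	int held;
	unsigned int acquisitions;
};

static void mock_lock_acquire(struct mock_lock *l)
{
	assert(!l->held);	/* a real spinlock would deadlock here */
	l->held = 1;
	l->acquisitions++;
}

static void mock_lock_release(struct mock_lock *l)
{
	assert(l->held);
	l->held = 0;
}

/* Toy model of a NIC with windowed registers. */
struct mock_nic {
	struct mock_lock window_lock;
	int window;
	uint8_t regs[NUM_WINDOWS][WINDOW_SIZE];
};

/* Unlocked variant: the caller guarantees exclusive access (probe, open). */
static void mock_window_write8_unlocked(struct mock_nic *nic, uint8_t value,
					int window, int addr)
{
	nic->window = window;
	nic->regs[nic->window][addr] = value;
}

/* Locked variant: window select and access form one critical section. */
static void mock_window_write8(struct mock_nic *nic, uint8_t value,
			       int window, int addr)
{
	mock_lock_acquire(&nic->window_lock);
	mock_window_write8_unlocked(nic, value, window, addr);
	mock_lock_release(&nic->window_lock);
}

/* Probe-time MAC programming: six writes, no lock traffic at all. */
static void mock_set_mac_probe(struct mock_nic *nic, const uint8_t mac[6])
{
	int i;

	for (i = 0; i < 6; i++)
		mock_window_write8_unlocked(nic, mac[i], 2, i);
}
```

In this model the probe path never touches the lock, while paths that can run concurrently keep using the locked accessor.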

> 
> This locked section is now very short - 5 or 6 register read/writes and
> no delays.  We might even be able to get away without locking here as
> the only software state this accesses is dev->if_port and I don't think
> it can race with anything except SIOCGIFMAP (which seems harmless).
> 

It would be best if we didn't need to disable the interrupts on this CPU
at all. But then we probably need to disable the interrupt line with
disable_irq. That's why I suggested to move the timer to thread context.

Steffen

^ permalink raw reply

* [PATCH] vxge: fix memory leak in vxge_alloc_msix() error path
From: Michal Schmidt @ 2010-06-24 14:13 UTC (permalink / raw)
  To: Ramkrishna Vepa; +Cc: netdev

When pci_enable_msix() returned ret<0, entries and vxge_entries were leaked.
While at it, use the centralized exit idiom in the function.

Not tested. It compiles OK.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/vxge/vxge-main.c |   29 ++++++++++++++++++++---------
 1 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/net/vxge/vxge-main.c b/drivers/net/vxge/vxge-main.c
index 45c5dc2..8b9e73b 100644
--- a/drivers/net/vxge/vxge-main.c
+++ b/drivers/net/vxge/vxge-main.c
@@ -2262,7 +2262,8 @@ start:
 		vxge_debug_init(VXGE_ERR,
 			"%s: memory allocation failed",
 			VXGE_DRIVER_NAME);
-		return  -ENOMEM;
+		ret = -ENOMEM;
+		goto alloc_entries_failed;
 	}
 
 	vdev->vxge_entries =
@@ -2271,8 +2272,8 @@ start:
 	if (!vdev->vxge_entries) {
 		vxge_debug_init(VXGE_ERR, "%s: memory allocation failed",
 			VXGE_DRIVER_NAME);
-		kfree(vdev->entries);
-		return -ENOMEM;
+		ret = -ENOMEM;
+		goto alloc_vxge_entries_failed;
 	}
 
 	for (i = 0, j = 0; i < vdev->no_of_vpath; i++) {
@@ -2303,22 +2304,32 @@ start:
 		vxge_debug_init(VXGE_ERR,
 			"%s: MSI-X enable failed for %d vectors, ret: %d",
 			VXGE_DRIVER_NAME, vdev->intr_cnt, ret);
+		if ((max_config_vpath != VXGE_USE_DEFAULT) || (ret < 3)) {
+			ret = -ENODEV;
+			goto enable_msix_failed;
+		}
+
 		kfree(vdev->entries);
 		kfree(vdev->vxge_entries);
 		vdev->entries = NULL;
 		vdev->vxge_entries = NULL;
-
-		if ((max_config_vpath != VXGE_USE_DEFAULT) || (ret < 3))
-			return -ENODEV;
 		/* Try with less no of vector by reducing no of vpaths count */
 		temp = (ret - 1)/2;
 		vxge_close_vpaths(vdev, temp);
 		vdev->no_of_vpath = temp;
 		goto start;
-	} else if (ret < 0)
-		return -ENODEV;
-
+	} else if (ret < 0) {
+		ret = -ENODEV;
+		goto enable_msix_failed;
+	}
 	return 0;
+
+enable_msix_failed:
+	kfree(vdev->vxge_entries);
+alloc_vxge_entries_failed:
+	kfree(vdev->entries);
+alloc_entries_failed:
+	return ret;
 }
 
 static int vxge_enable_msix(struct vxgedev *vdev)
-- 
1.7.1
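
For readers unfamiliar with the centralized exit idiom used in the patch above, a minimal userspace sketch follows. It is an illustration under assumptions, not the driver code: struct msix_tables, tables_alloc() and the two enable callbacks are hypothetical, malloc-family calls stand in for the kernel allocators, and n is assumed to be greater than zero.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical pair of tables that must be allocated and freed together. */
struct msix_tables {
	int *entries;
	int *vxge_entries;
};

/* Stand-in callbacks for the step that may fail after both allocations. */
static int enable_always_ok(const struct msix_tables *t)
{
	(void)t;
	return 0;
}

static int enable_always_fails(const struct msix_tables *t)
{
	(void)t;
	return -1;
}

/*
 * Each failure jumps to the label that undoes exactly what has been
 * set up so far; the labels fall through in reverse allocation order.
 */
static int tables_alloc(struct msix_tables *t, size_t n,
			int (*enable)(const struct msix_tables *))
{
	int ret;

	t->entries = NULL;
	t->vxge_entries = NULL;

	t->entries = calloc(n, sizeof(*t->entries));
	if (!t->entries) {
		ret = -ENOMEM;
		goto alloc_entries_failed;
	}

	t->vxge_entries = calloc(n, sizeof(*t->vxge_entries));
	if (!t->vxge_entries) {
		ret = -ENOMEM;
		goto alloc_vxge_entries_failed;
	}

	if (enable(t) != 0) {
		ret = -ENODEV;
		goto enable_failed;
	}
	return 0;

enable_failed:
	free(t->vxge_entries);
	t->vxge_entries = NULL;
alloc_vxge_entries_failed:
	free(t->entries);
	t->entries = NULL;
alloc_entries_failed:
	return ret;
}
```

Because every error path funnels through the same labels, adding a new failure point later only requires one more goto rather than another copy of the cleanup code.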


^ permalink raw reply related

* Re: [PATCH net-next-2.6 3/4] macvlan: 64 bit rx counters
From: Patrick McHardy @ 2010-06-24 14:55 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1277376861.2816.284.camel@edumazet-laptop>

Eric Dumazet wrote:
> Use u64_stats_sync infrastructure to implement 64bit stats.
>
>   

Looks good to me, thanks, I actually wanted to do this myself yesterday :)

Acked-by: Patrick McHardy <kaber@trash.net>


^ permalink raw reply

* Re: [PATCH net-next-2.6 4/4] vlan: 64 bit rx counters
From: Patrick McHardy @ 2010-06-24 14:57 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1277376906.2816.287.camel@edumazet-laptop>

Eric Dumazet wrote:
> Use u64_stats_sync infrastructure to implement 64bit rx stats.
>
> (tx stats are addressed later)

Acked-by: Patrick McHardy <kaber@trash.net>

^ permalink raw reply

* Re: [PATCH net-next-2.6 5/5] sfc: Record hardware RX hash on each skb where possible
From: Ben Hutchings @ 2010-06-24 15:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers
In-Reply-To: <1277328688.2101.12.camel@achroite.uk.solarflarecom.com>

David,

Unfortunately, this version of the patch can hit a hardware bug that
results in bogus hashes.  Let me know whether you've applied it, and
I'll send an incremental or a replacement patch.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH net-next-2.6] netfilter: allow nf_tproxy_core module to be removed
From: Patrick McHardy @ 2010-06-24 15:29 UTC (permalink / raw)
  To: David Miller; +Cc: fw, jpirko, netdev, Balazs Scheidler, KOVACS Krisztian
In-Reply-To: <20100623.115558.189705237.davem@davemloft.net>

David Miller wrote:
> From: Florian Westphal <fw@strlen.de>
> Date: Wed, 23 Jun 2010 20:46:11 +0200
>
>   
>> tproxy assigns skb->destructor, what prevents module unload while such skbs may
>> still be around?
>>     
>
> The only reference to nf_tproxy_core.ko is for the symbol, "nf_tproxy_assign_sock".
> xt_TPROXY.c, which references this symbol, thus creates a symbol dependency on this
> module, so xt_TPROXY.o needs to unload before nf_tproxy_core.ko can unload, and
> xt_TPROXY.o has it's own manner for handling module references properly.
>   

I don't see anything waiting for skbs in flight using the tproxy
destructor in either xt_TPROXY or nf_tproxy_core though, so I think
Florian is correct.


^ permalink raw reply

* [RFC net-next-2.6] snmp: ipstats_mib becomes u64 for all arches
From: Eric Dumazet @ 2010-06-24 16:47 UTC (permalink / raw)
  To: David Miller; +Cc: netdev


David,

I will respin this patch after net-next-2.6 tree stabilizes a bit.

(It needs u64_stats_fetch_begin_bh() and u64_stats_fetch_retry_bh(), and
probably the SNMP fix I sent earlier, currently in net-2.6 only)

Thanks

[RFC net-next-2.6] snmp: ipstats_mib becomes u64 for all arches

/proc/net/snmp and /proc/net/netstat expose SNMP counters.

The width of these counters is either 32 or 64 bits, depending on the
size of "unsigned long" in the kernel.

This means that user programs parsing these files must already be prepared
to deal with 64bit values, regardless of whether the program is 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

This uses u64_stats_sync infrastructure.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/net/ip.h    |   20 ++++++--
 include/net/ipv6.h  |   12 ++---
 include/net/snmp.h  |   98 ++++++++++++++++++++++++++++++++++++++----
 net/ipv4/af_inet.c  |   33 ++++++++++++++
 net/ipv4/proc.c     |   12 ++---
 net/ipv6/addrconf.c |   18 +++++++
 net/ipv6/proc.c     |   13 ++++-
 7 files changed, 176 insertions(+), 30 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 3b524df..890f972 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -165,12 +165,12 @@ struct ipv4_config {
 };
 
 extern struct ipv4_config ipv4_config;
-#define IP_INC_STATS(net, field)	SNMP_INC_STATS((net)->mib.ip_statistics, field)
-#define IP_INC_STATS_BH(net, field)	SNMP_INC_STATS_BH((net)->mib.ip_statistics, field)
-#define IP_ADD_STATS(net, field, val)	SNMP_ADD_STATS((net)->mib.ip_statistics, field, val)
-#define IP_ADD_STATS_BH(net, field, val) SNMP_ADD_STATS_BH((net)->mib.ip_statistics, field, val)
-#define IP_UPD_PO_STATS(net, field, val) SNMP_UPD_PO_STATS((net)->mib.ip_statistics, field, val)
-#define IP_UPD_PO_STATS_BH(net, field, val) SNMP_UPD_PO_STATS_BH((net)->mib.ip_statistics, field, val)
+#define IP_INC_STATS(net, field)	SNMP_INC_STATS64((net)->mib.ip_statistics, field)
+#define IP_INC_STATS_BH(net, field)	SNMP_INC_STATS64_BH((net)->mib.ip_statistics, field)
+#define IP_ADD_STATS(net, field, val)	SNMP_ADD_STATS64((net)->mib.ip_statistics, field, val)
+#define IP_ADD_STATS_BH(net, field, val) SNMP_ADD_STATS64_BH((net)->mib.ip_statistics, field, val)
+#define IP_UPD_PO_STATS(net, field, val) SNMP_UPD_PO_STATS64((net)->mib.ip_statistics, field, val)
+#define IP_UPD_PO_STATS_BH(net, field, val) SNMP_UPD_PO_STATS64_BH((net)->mib.ip_statistics, field, val)
 #define NET_INC_STATS(net, field)	SNMP_INC_STATS((net)->mib.net_statistics, field)
 #define NET_INC_STATS_BH(net, field)	SNMP_INC_STATS_BH((net)->mib.net_statistics, field)
 #define NET_INC_STATS_USER(net, field) 	SNMP_INC_STATS_USER((net)->mib.net_statistics, field)
@@ -178,6 +178,14 @@ extern struct ipv4_config ipv4_config;
 #define NET_ADD_STATS_USER(net, field, adnd) SNMP_ADD_STATS_USER((net)->mib.net_statistics, field, adnd)
 
 extern unsigned long snmp_fold_field(void __percpu *mib[], int offt);
+#if BITS_PER_LONG==32
+extern u64 snmp_fold_field64(void __percpu *mib[], int offt, size_t sync_off);
+#else
+static inline u64 snmp_fold_field64(void __percpu *mib[], int offt, size_t syncp_off)
+{
+	return snmp_fold_field(mib, offt);
+}
+#endif
 extern int snmp_mib_init(void __percpu *ptr[2], size_t mibsize, size_t align);
 extern void snmp_mib_free(void __percpu *ptr[2]);
 
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index f5808d5..1f84124 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -136,17 +136,17 @@ extern struct ctl_path net_ipv6_ctl_path[];
 /* MIBs */
 
 #define IP6_INC_STATS(net, idev,field)		\
-		_DEVINC(net, ipv6, , idev, field)
+		_DEVINC(net, ipv6, 64, idev, field)
 #define IP6_INC_STATS_BH(net, idev,field)	\
-		_DEVINC(net, ipv6, _BH, idev, field)
+		_DEVINC(net, ipv6, 64_BH, idev, field)
 #define IP6_ADD_STATS(net, idev,field,val)	\
-		_DEVADD(net, ipv6, , idev, field, val)
+		_DEVADD(net, ipv6, 64, idev, field, val)
 #define IP6_ADD_STATS_BH(net, idev,field,val)	\
-		_DEVADD(net, ipv6, _BH, idev, field, val)
+		_DEVADD(net, ipv6, 64_BH, idev, field, val)
 #define IP6_UPD_PO_STATS(net, idev,field,val)   \
-		_DEVUPD(net, ipv6, , idev, field, val)
+		_DEVUPD(net, ipv6, 64, idev, field, val)
 #define IP6_UPD_PO_STATS_BH(net, idev,field,val)   \
-		_DEVUPD(net, ipv6, _BH, idev, field, val)
+		_DEVUPD(net, ipv6, 64_BH, idev, field, val)
 #define ICMP6_INC_STATS(net, idev, field)	\
 		_DEVINC(net, icmpv6, , idev, field)
 #define ICMP6_INC_STATS_BH(net, idev, field)	\
diff --git a/include/net/snmp.h b/include/net/snmp.h
index 92456f1..af7e9dc 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -47,15 +47,15 @@ struct snmp_mib {
 }
 
 /*
- * We use all unsigned longs. Linux will soon be so reliable that even 
- * these will rapidly get too small 8-). Seriously consider the IpInReceives 
- * count on the 20Gb/s + networks people expect in a few years time!
+ * We use unsigned longs for most mibs but u64 for ipstats.
  */
+#include <linux/u64_stats_sync.h>
 
 /* IPstats */
 #define IPSTATS_MIB_MAX	__IPSTATS_MIB_MAX
 struct ipstats_mib {
-	unsigned long	mibs[IPSTATS_MIB_MAX];
+	u64		mibs[IPSTATS_MIB_MAX];
+	struct u64_stats_sync syncp;
 };
 
 /* ICMP */
@@ -122,19 +122,31 @@ struct linux_xfrm_mib {
 #define SNMP_STAT_USRPTR(name)	(name[1])
 
 #define SNMP_INC_STATS_BH(mib, field)	\
-			__this_cpu_inc(mib[0]->mibs[field])
+	do {									\
+		BUILD_BUG_ON(sizeof(mib[0]->mibs[field]) > sizeof(long));	\
+		__this_cpu_inc(mib[0]->mibs[field]);				\
+	} while (0)
 #define SNMP_INC_STATS_USER(mib, field)	\
-			this_cpu_inc(mib[1]->mibs[field])
+	do {									\
+		BUILD_BUG_ON(sizeof(mib[1]->mibs[field]) > sizeof(long));	\
+		this_cpu_inc(mib[1]->mibs[field]);				\
+	} while (0)
 #define SNMP_INC_STATS(mib, field)	\
-			this_cpu_inc(mib[!in_softirq()]->mibs[field])
+	do {									\
+		BUILD_BUG_ON(sizeof(mib[0]->mibs[field]) > sizeof(long));	\
+		this_cpu_inc(mib[!in_softirq()]->mibs[field]);			\
+	} while (0)
 #define SNMP_DEC_STATS(mib, field)	\
-			this_cpu_dec(mib[!in_softirq()]->mibs[field])
+	do {									\
+		BUILD_BUG_ON(sizeof(mib[0]->mibs[field]) > sizeof(long));	\
+		this_cpu_dec(mib[!in_softirq()]->mibs[field]);			\
+	} while (0)
 #define SNMP_ADD_STATS_BH(mib, field, addend)	\
 			__this_cpu_add(mib[0]->mibs[field], addend)
 #define SNMP_ADD_STATS_USER(mib, field, addend)	\
 			this_cpu_add(mib[1]->mibs[field], addend)
 #define SNMP_ADD_STATS(mib, field, addend)	\
-			this_cpu_add(mib[0]->mibs[field], addend)
+			this_cpu_add(mib[!in_softirq()]->mibs[field], addend)
 /*
  * Use "__typeof__(*mib[0]) *ptr" instead of "__typeof__(mib[0]) ptr"
  * to make @ptr a non-percpu pointer.
@@ -144,6 +156,7 @@ struct linux_xfrm_mib {
 		__typeof__(*mib[0]) *ptr; \
 		preempt_disable(); \
 		ptr = this_cpu_ptr((mib)[!in_softirq()]); \
+		BUILD_BUG_ON(sizeof(mib[0]->mibs[0]) > sizeof(long)); \
 		ptr->mibs[basefield##PKTS]++; \
 		ptr->mibs[basefield##OCTETS] += addend;\
 		preempt_enable(); \
@@ -152,7 +165,74 @@ struct linux_xfrm_mib {
 	do { \
 		__typeof__(*mib[0]) *ptr = \
 			__this_cpu_ptr((mib)[!in_softirq()]); \
+		BUILD_BUG_ON(sizeof(mib[0]->mibs[0]) > sizeof(long)); \
 		ptr->mibs[basefield##PKTS]++; \
 		ptr->mibs[basefield##OCTETS] += addend;\
 	} while (0)
+
+
+#if BITS_PER_LONG==32
+
+#define SNMP_ADD_STATS64_BH(mib, field, addend) 			\
+	do {								\
+		__typeof__(*mib[0]) *ptr = __this_cpu_ptr((mib)[0]);	\
+		u64_stats_update_begin(&ptr->syncp);			\
+		ptr->mibs[field] += addend;				\
+		u64_stats_update_end(&ptr->syncp);			\
+	} while (0)
+#define SNMP_ADD_STATS64_USER(mib, field, addend) 			\
+	do {								\
+		__typeof__(*mib[0]) *ptr;				\
+		preempt_disable();					\
+		ptr = __this_cpu_ptr((mib)[1]);				\
+		u64_stats_update_begin(&ptr->syncp);			\
+		ptr->mibs[field] += addend;				\
+		u64_stats_update_end(&ptr->syncp);			\
+		preempt_enable();					\
+	} while (0)
+#define SNMP_ADD_STATS64(mib, field, addend)				\
+	do {								\
+		__typeof__(*mib[0]) *ptr;				\
+		preempt_disable();					\
+		ptr = __this_cpu_ptr((mib)[!in_softirq()]);		\
+		u64_stats_update_begin(&ptr->syncp);			\
+		ptr->mibs[field] += addend;				\
+		u64_stats_update_end(&ptr->syncp);			\
+		preempt_enable();					\
+	} while (0)
+#define SNMP_INC_STATS64_BH(mib, field) SNMP_ADD_STATS64_BH(mib, field, 1)
+#define SNMP_INC_STATS64_USER(mib, field) SNMP_ADD_STATS64_USER(mib, field, 1)
+#define SNMP_INC_STATS64(mib, field) SNMP_ADD_STATS64(mib, field, 1)
+#define SNMP_UPD_PO_STATS64(mib, basefield, addend)			\
+	do {								\
+		__typeof__(*mib[0]) *ptr;				\
+		preempt_disable();					\
+		ptr = __this_cpu_ptr((mib)[!in_softirq()]);		\
+		u64_stats_update_begin(&ptr->syncp);			\
+		ptr->mibs[basefield##PKTS]++;				\
+		ptr->mibs[basefield##OCTETS] += addend;			\
+		u64_stats_update_end(&ptr->syncp);			\
+		preempt_enable();					\
+	} while (0)
+#define SNMP_UPD_PO_STATS64_BH(mib, basefield, addend)			\
+	do {								\
+		__typeof__(*mib[0]) *ptr;				\
+		ptr = __this_cpu_ptr((mib)[!in_softirq()]);		\
+		u64_stats_update_begin(&ptr->syncp);			\
+		ptr->mibs[basefield##PKTS]++;				\
+		ptr->mibs[basefield##OCTETS] += addend;			\
+		u64_stats_update_end(&ptr->syncp);			\
+	} while (0)
+#else
+#define SNMP_INC_STATS64_BH(mib, field)		SNMP_INC_STATS_BH(mib, field)
+#define SNMP_INC_STATS64_USER(mib, field)	SNMP_INC_STATS_USER(mib, field)
+#define SNMP_INC_STATS64(mib, field)		SNMP_INC_STATS(mib, field)
+#define SNMP_DEC_STATS64(mib, field)		SNMP_DEC_STATS(mib, field)
+#define SNMP_ADD_STATS64_BH(mib, field, addend) SNMP_ADD_STATS_BH(mib, field, addend)
+#define SNMP_ADD_STATS64_USER(mib, field, addend) SNMP_ADD_STATS_USER(mib, field, addend)
+#define SNMP_ADD_STATS64(mib, field, addend)	SNMP_ADD_STATS(mib, field, addend)
+#define SNMP_UPD_PO_STATS64(mib, basefield, addend) SNMP_UPD_PO_STATS(mib, basefield, addend)
+#define SNMP_UPD_PO_STATS64_BH(mib, basefield, addend) SNMP_UPD_PO_STATS_BH(mib, basefield, addend)
+#endif
+
 #endif
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 640db9b..aeb178e 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1427,6 +1427,39 @@ unsigned long snmp_fold_field(void __percpu *mib[], int offt)
 }
 EXPORT_SYMBOL_GPL(snmp_fold_field);
 
+#if BITS_PER_LONG==32
+
+u64 snmp_fold_field64(void __percpu *mib[], int offt, size_t syncp_offset)
+{
+	u64 res = 0;
+	int i;
+
+	for_each_possible_cpu(i) {
+		struct u64_stats_sync *syncp;
+		u64 v0, v1;
+		unsigned int start;
+
+		/* first mib used by softirq context */
+		syncp = (struct u64_stats_sync *)(per_cpu_ptr(mib[0], i) + syncp_offset);
+		do {
+			start = u64_stats_fetch_begin_bh(syncp);
+			v0 = *(((u64 *) per_cpu_ptr(mib[0], i)) + offt);
+		} while (u64_stats_fetch_retry_bh(syncp, start));
+
+		/* second mib used in USER context */
+		syncp = (struct u64_stats_sync *)(per_cpu_ptr(mib[1], i) + syncp_offset);
+		do {
+			start = u64_stats_fetch_begin(syncp);
+			v1 = *(((u64 *) per_cpu_ptr(mib[1], i)) + offt);
+		} while (u64_stats_fetch_retry(syncp, start));
+
+		res += v0 + v1;
+	}
+	return res;
+}
+EXPORT_SYMBOL_GPL(snmp_fold_field64);
+#endif
+
 int snmp_mib_init(void __percpu *ptr[2], size_t mibsize, size_t align)
 {
 	BUG_ON(ptr == NULL);
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index e320ca6..d41cf26 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -344,9 +344,9 @@ static int snmp_seq_show(struct seq_file *seq, void *v)
 		   sysctl_ip_default_ttl);
 
 	for (i = 0; snmp4_ipstats_list[i].name != NULL; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field((void __percpu **)net->mib.ip_statistics,
-					   snmp4_ipstats_list[i].entry));
+		seq_printf(seq, " %llu",
+			   snmp_fold_field64((void __percpu **)net->mib.ip_statistics,
+					   snmp4_ipstats_list[i].entry, offsetof(struct ipstats_mib, syncp)));
 
 	icmp_put(seq);	/* RFC 2011 compatibility */
 	icmpmsg_put(seq);
@@ -432,9 +432,9 @@ static int netstat_seq_show(struct seq_file *seq, void *v)
 
 	seq_puts(seq, "\nIpExt:");
 	for (i = 0; snmp4_ipextstats_list[i].name != NULL; i++)
-		seq_printf(seq, " %lu",
-			   snmp_fold_field((void __percpu **)net->mib.ip_statistics,
-					   snmp4_ipextstats_list[i].entry));
+		seq_printf(seq, " %llu",
+			   snmp_fold_field64((void __percpu **)net->mib.ip_statistics,
+					   snmp4_ipextstats_list[i].entry, offsetof(struct ipstats_mib, syncp)));
 
 	seq_putc(seq, '\n');
 	return 0;
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index c20a7c2..56165ae 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -3858,12 +3858,28 @@ static inline void __snmp6_fill_stats(u64 *stats, void __percpu **mib,
 	memset(&stats[items], 0, pad);
 }
 
+static inline void __snmp6_fill_stats64(u64 *stats, void __percpu **mib,
+				      int items, int bytes, size_t syncpoff)
+{
+	int i;
+	int pad = bytes - sizeof(u64) * items;
+	BUG_ON(pad < 0);
+
+	/* Use put_unaligned() because stats may not be aligned for u64. */
+	put_unaligned(items, &stats[0]);
+	for (i = 1; i < items; i++)
+		put_unaligned(snmp_fold_field64(mib, i, syncpoff), &stats[i]);
+
+	memset(&stats[items], 0, pad);
+}
+
 static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype,
 			     int bytes)
 {
 	switch (attrtype) {
 	case IFLA_INET6_STATS:
-		__snmp6_fill_stats(stats, (void __percpu **)idev->stats.ipv6, IPSTATS_MIB_MAX, bytes);
+		__snmp6_fill_stats64(stats, (void __percpu **)idev->stats.ipv6,
+				     IPSTATS_MIB_MAX, bytes, offsetof(struct ipstats_mib, syncp));
 		break;
 	case IFLA_INET6_ICMP6STATS:
 		__snmp6_fill_stats(stats, (void __percpu **)idev->stats.icmpv6, ICMP6_MIB_MAX, bytes);
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 566798d..acc1960 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -179,12 +179,21 @@ static void snmp6_seq_show_item(struct seq_file *seq, void __percpu **mib,
 			   snmp_fold_field(mib, itemlist[i].entry));
 }
 
+static void snmp6_seq_show_item64(struct seq_file *seq, void __percpu **mib,
+				const struct snmp_mib *itemlist, size_t syncpoff)
+{
+	int i;
+	for (i=0; itemlist[i].name; i++)
+		seq_printf(seq, "%-32s\t%llu\n", itemlist[i].name,
+			   snmp_fold_field64(mib, itemlist[i].entry, syncpoff));
+}
+
 static int snmp6_seq_show(struct seq_file *seq, void *v)
 {
 	struct net *net = (struct net *)seq->private;
 
-	snmp6_seq_show_item(seq, (void __percpu **)net->mib.ipv6_statistics,
-			    snmp6_ipstats_list);
+	snmp6_seq_show_item64(seq, (void __percpu **)net->mib.ipv6_statistics,
+			    snmp6_ipstats_list, offsetof(struct ipstats_mib, syncp));
 	snmp6_seq_show_item(seq, (void __percpu **)net->mib.icmpv6_statistics,
 			    snmp6_icmp6_list);
 	snmp6_seq_show_icmpv6msg(seq,



^ permalink raw reply related

* Re: [PATCH 0/3] b43: logging cleanups
From: John W. Linville @ 2010-06-24 18:53 UTC (permalink / raw)
  To: Joe Perches
  Cc: Stefano Brivio, linux-wireless, netdev, linux-kernel,
	Larry.Finger, mb, zajec5
In-Reply-To: <cover.1276988387.git.joe@perches.com>

On Sat, Jun 19, 2010 at 04:30:08PM -0700, Joe Perches wrote:
> Just some small cleanups
> 
> Joe Perches (3):
>   drivers/net/wireless/b43: Use local ratelimit_state
>   drivers/net/wireless/b43: Logging cleanups
>   drivers/net/wireless/b43: Rename b43_debug to b43_debugging

Any of the b43 guys want to express an opinion on these?

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

* Re: [PATCH] vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)
From: Pedro Garcia Pelaez @ 2010-06-24 18:28 UTC (permalink / raw)
  To: Pedro Garcia; +Cc: netdev, Patrick McHardy, Ben Hutchings, Eric Dumazet
In-Reply-To: <311b59aee7d648c6124a84b5ca06ac60@dondevamos.com>

On Wed, 16 Jun 2010 10:49:27 +0200, Pedro Garcia
<pedro.netdev@dondevamos.com> wrote:
> On Mon, 14 Jun 2010 21:12:52 +0200, Eric Dumazet <eric.dumazet@gmail.com>
> wrote:
>> Le lundi 14 juin 2010 à 19:11 +0200, Patrick McHardy a écrit :
>>> Ben Hutchings wrote:
>>> > On Mon, 2010-06-14 at 18:49 +0200, Pedro Garcia wrote:
>>> >   
>>> >> On Sun, 13 Jun 2010 22:56:30 +0100, Ben Hutchings
>>> >> <bhutchings@solarflare.com> wrote:
>>> >>     
>>> >>> I have no particular opinion on this change, but you need to read
>>> >>> and follow Documentation/SubmittingPatches.
>>> >>>
>>> >>> Ben.
>>> >>>       
>>> >> Sorry, first kernel patch, and I did not know about it. I resubmit
>>> >> with the correct style / format:
>>> >>     
>>> > [...]
>>> >
>>> > Sorry, no you haven't.
>>> >
>>> > - Networking changes go through David Miller's net-next-2.6 tree so
>>> > you need to use that as the baseline, not 2.6.26
>>> > - Patches should be applicable with -p1, not -p0 (so if you use diff,
>>> > you should run it from one directory level up)
>>> > - The patch was word-wrapped
>>> 
>>> Additionally:
>>> 
>>> - please use the proper comment style, meaning each line begins
>>>   with a '*'
>>> 
>>> - the pr_debug statements may be useful for debugging, but are
>>>   a bit excessive for the final version
>>> 
>>> - + /* 2010-06-13: Pedro Garcia
>>> 
>>>    We have changelogs for this, simply explaining what the code
>>>    does is enough.
>>> 
>>> - Please CC the maintainer (which is me)
>>> --
>> 
>> Pedro, we have two kind of vlan setups :
>> 
>> accelerated and non accelerated ones.
>> 
>> Your patch address non accelated ones only, since you only touch
>> vlan_skb_recv()
>> 
>> Accelerated vlan can follow these paths :
>> 
>> 1) NAPI devices
>> 
>> vlan_gro_receive() -> vlan_gro_common()
>> 
>> 2) non NAPI devices
>> 
>> __vlan_hwaccel_rx() 
>> 
>> So you might also patch __vlan_hwaccel_rx() and vlan_gro_common()
>> 
>> Please merge following bits to your patch submission :
>> 
>> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868
>> 
>> 
>> Good luck for your first patch !
> 
> Here it is again. I added the modifications in
> http://kerneltrap.org/mailarchive/linux-netdev/2010/5/23/6277868 for HW
> accelerated incoming packets (it did not apply cleanly on the last version
> of the kernel, so I applied manually). Now, if the VLAN 0 is not explicitly
> created by the user, VLAN 0 packets will be treated as no VLAN (802.1p
> packets), instead of dropping them.
> 
> The patch is now for two files: vlan_core (accel) and vlan_dev (non accel)
> 
> I can not test on HW accelerated devices, so if someone can check it I
> will appreciate (even though in the thread above it looked like yes). For
> non accel I tested in 2.6.26. Now the patch is for
> net-next-2.6, and it compiles OK, but I have to setup a test environment
> to check it is still OK (should, but better to test).
> 

I tested the patch under net-next-2.6, and it OOPSed the kernel (it worked
under 2.6.26 but not under 2.6.35). I have found why and solved it, but
now, to my surprise, it only works when I leave the interface in
promiscuous mode.

After a lot of debugging, it looks like the skb does not even arrive at
__netif_receive_skb unless the interface is in promiscuous mode. Under what
circumstances could this happen?

Pedro 
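
The VLAN 0 rule discussed in this thread can be sketched in plain C. This
is only an illustration, not the kernel code: the names tci_decode(),
treat_as_untagged() and the TCI_* masks are hypothetical, and the field
layout assumed is the standard 802.1Q one (3 priority bits, 1 CFI bit,
12 VLAN ID bits).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 802.1Q Tag Control Information layout (hypothetical names). */
#define TCI_VID_MASK	0x0fffu	/* low 12 bits: VLAN ID */
#define TCI_CFI_MASK	0x1000u	/* bit 12: CFI */
#define TCI_PRIO_SHIFT	13	/* top 3 bits: 802.1p priority */

struct tci_fields {
	uint16_t vid;
	uint8_t cfi;
	uint8_t prio;
};

/* Split a host-order TCI value into its three fields. */
static struct tci_fields tci_decode(uint16_t tci)
{
	struct tci_fields f;

	f.vid = tci & TCI_VID_MASK;
	f.cfi = (tci & TCI_CFI_MASK) ? 1 : 0;
	f.prio = (uint8_t)(tci >> TCI_PRIO_SHIFT);
	return f;
}

/*
 * A frame tagged with VLAN ID 0 carries only a priority. If the user has
 * not explicitly created VLAN 0, handle it like an untagged frame instead
 * of dropping it.
 */
static bool treat_as_untagged(uint16_t tci, bool vlan0_configured)
{
	return (tci & TCI_VID_MASK) == 0 && !vlan0_configured;
}
```

With this layout, a TCI of 0xA000 decodes to priority 5 with VLAN ID 0, so
it would be passed up as an untagged frame unless VLAN 0 was configured.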

^ permalink raw reply

* [PATCH 2/2] netdev: mdio-octeon: Fix section mismatch errors.
From: David Daney @ 2010-06-24 19:14 UTC (permalink / raw)
  To: netdev; +Cc: linux-mips, David Daney
In-Reply-To: <1277406888-26309-1-git-send-email-ddaney@caviumnetworks.com>

We started getting:

WARNING: vmlinux.o(.data+0x20bd0): Section mismatch in reference from
the variable octeon_mdiobus_driver to the function
.init.text:octeon_mdiobus_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
---
 drivers/net/phy/mdio-octeon.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/phy/mdio-octeon.c b/drivers/net/phy/mdio-octeon.c
index f443d43..bd12ba9 100644
--- a/drivers/net/phy/mdio-octeon.c
+++ b/drivers/net/phy/mdio-octeon.c
@@ -85,7 +85,7 @@ static int octeon_mdiobus_write(struct mii_bus *bus, int phy_id,
 	return 0;
 }
 
-static int __init octeon_mdiobus_probe(struct platform_device *pdev)
+static int __devinit octeon_mdiobus_probe(struct platform_device *pdev)
 {
 	struct octeon_mdiobus *bus;
 	union cvmx_smix_en smi_en;
@@ -143,7 +143,7 @@ err:
 	return err;
 }
 
-static int __exit octeon_mdiobus_remove(struct platform_device *pdev)
+static int __devexit octeon_mdiobus_remove(struct platform_device *pdev)
 {
 	struct octeon_mdiobus *bus;
 	union cvmx_smix_en smi_en;
@@ -163,7 +163,7 @@ static struct platform_driver octeon_mdiobus_driver = {
 		.owner		= THIS_MODULE,
 	},
 	.probe		= octeon_mdiobus_probe,
-	.remove		= __exit_p(octeon_mdiobus_remove),
+	.remove		= __devexit_p(octeon_mdiobus_remove),
 };
 
 void octeon_mdiobus_force_mod_depencency(void)
-- 
1.6.6.1


^ permalink raw reply related

* [PATCH 1/2] netdev: octeon_mgmt: Fix section mismatch errors.
From: David Daney @ 2010-06-24 19:14 UTC (permalink / raw)
  To: netdev; +Cc: linux-mips, David Daney
In-Reply-To: <1277406888-26309-1-git-send-email-ddaney@caviumnetworks.com>

We started getting:

WARNING: drivers/net/built-in.o(.data+0x10f0): Section mismatch in
reference from the variable octeon_mgmt_driver to the function
.init.text:octeon_mgmt_probe()

This fixes it.

Signed-off-by: David Daney <ddaney@caviumnetworks.com>
---
 drivers/net/octeon/octeon_mgmt.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/octeon/octeon_mgmt.c b/drivers/net/octeon/octeon_mgmt.c
index 000e792..f4a0f08 100644
--- a/drivers/net/octeon/octeon_mgmt.c
+++ b/drivers/net/octeon/octeon_mgmt.c
@@ -1067,7 +1067,7 @@ static const struct net_device_ops octeon_mgmt_ops = {
 #endif
 };
 
-static int __init octeon_mgmt_probe(struct platform_device *pdev)
+static int __devinit octeon_mgmt_probe(struct platform_device *pdev)
 {
 	struct resource *res_irq;
 	struct net_device *netdev;
@@ -1124,7 +1124,7 @@ err:
 	return -ENOENT;
 }
 
-static int __exit octeon_mgmt_remove(struct platform_device *pdev)
+static int __devexit octeon_mgmt_remove(struct platform_device *pdev)
 {
 	struct net_device *netdev = dev_get_drvdata(&pdev->dev);
 
@@ -1139,7 +1139,7 @@ static struct platform_driver octeon_mgmt_driver = {
 		.owner		= THIS_MODULE,
 	},
 	.probe		= octeon_mgmt_probe,
-	.remove		= __exit_p(octeon_mgmt_remove),
+	.remove		= __devexit_p(octeon_mgmt_remove),
 };
 
 extern void octeon_mdiobus_force_mod_depencency(void);
-- 
1.6.6.1


^ permalink raw reply related

* [PATCH 0/2] Fix some section mismatch errors for Octeon network drivers
From: David Daney @ 2010-06-24 19:14 UTC (permalink / raw)
  To: netdev; +Cc: linux-mips, David Daney

It seems the kernel build system is becoming more rigorous in its
checking of section mismatches.  Please consider these two patches to
correct the problems.

The warnings were not seen under 2.6.34, so one could argue that these
are fixing a regression (although admittedly it is a minor one).

David Daney (2):
  netdev: octeon_mgmt: Fix section mismatch errors.
  netdev: mdio-octeon: Fix section mismatch errors.

 drivers/net/octeon/octeon_mgmt.c |    6 +++---
 drivers/net/phy/mdio-octeon.c    |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)
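
For reference, the shape of the fix in both patches can be sketched as
below. This is a userspace illustration only: the *_sketch types and the
demo_* functions are hypothetical, and the annotation macros are defined
as empty stand-ins so the block compiles outside the kernel. In a 2.6.x
kernel, __init code lives in .init.text and is freed after boot, so a
driver structure in .data must not keep a pointer to it; __devinit and
__devexit mark code that has to stay available when devices can be probed
or removed later.

```c
#include <assert.h>
#include <stddef.h>

/* Empty stand-ins for the kernel annotations (assumption for this sketch). */
#define __devinit
#define __devexit
#define __devexit_p(fn)	(fn)	/* simplified: the kernel may yield NULL */

struct platform_device_sketch {
	int bound;
};

struct platform_driver_sketch {
	int (*probe)(struct platform_device_sketch *pdev);
	int (*remove)(struct platform_device_sketch *pdev);
};

/* Was "static int __init ..." before the fix. */
static int __devinit demo_probe(struct platform_device_sketch *pdev)
{
	pdev->bound = 1;
	return 0;
}

/* Was "static int __exit ..." before the fix. */
static int __devexit demo_remove(struct platform_device_sketch *pdev)
{
	pdev->bound = 0;
	return 0;
}

/* The long-lived driver structure that references both callbacks. */
static struct platform_driver_sketch demo_driver = {
	.probe	= demo_probe,
	.remove	= __devexit_p(demo_remove),
};
```

The point of the annotations is lifetime: the driver structure outlives
the init phase, so anything it points to has to outlive it as well.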


^ permalink raw reply

* Re: [RFC PATCH v2 1/5] irq: add tracepoint to softirq_raise
From: Steven Rostedt @ 2010-06-24 19:15 UTC (permalink / raw)
  To: Koki Sanagi
  Cc: netdev, davem, scott.a.mcmillan, kaneshige.kenji, izumi.taku,
	linux-kernel
In-Reply-To: <4C23145B.30805@jp.fujitsu.com>

Hi Koki,

Your subject says 1/5 but I do not see any other patches.


On Thu, 2010-06-24 at 17:16 +0900, Koki Sanagi wrote:
> This patch adds a tracepoint to raising of softirq.
> This is useful if you want to detect which hard interrupt raise softirq
> and lets you know a time between raising softirq and performing softirq.
> Combinating with other tracepoint, it lets us know a process of packets
> (See patch 0/5).
> 
>           <idle>-0     [001] 241229.957184: softirq_raise: vec=3 [action=NET_RX]
>           <idle>-0     [000] 241229.993399: softirq_raise: vec=1 [action=TIMER]
>           <idle>-0     [000] 241229.993400: softirq_raise: vec=9 [action=RCU]
> 
> This is a same patch Lai Jiangshan submitted.
> http://marc.info/?l=linux-kernel&m=126026122728732&w=2
> 
> Signed-off-by: Koki Sanagi <sanagi.koki@jp.fujitsu.com>
> ---
>  include/linux/interrupt.h  |    8 +++++++-
>  include/trace/events/irq.h |   34 +++++++++++++++++++++++++++++++---
>  2 files changed, 38 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index c233113..1cb5726 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -18,6 +18,7 @@
>  #include <asm/atomic.h>
>  #include <asm/ptrace.h>
>  #include <asm/system.h>
> +#include <trace/events/irq.h>
>  
>  /*
>   * These correspond to the IORESOURCE_IRQ_* defines in
> @@ -402,7 +403,12 @@ asmlinkage void do_softirq(void);
>  asmlinkage void __do_softirq(void);
>  extern void open_softirq(int nr, void (*action)(struct softirq_action *));
>  extern void softirq_init(void);
> -#define __raise_softirq_irqoff(nr) do { or_softirq_pending(1UL << (nr)); } while (0)
> +static inline void __raise_softirq_irqoff(unsigned int nr)
> +{
> +	trace_softirq_raise(nr);
> +	or_softirq_pending(1UL << nr);
> +}
> +
>  extern void raise_softirq_irqoff(unsigned int nr);
>  extern void raise_softirq(unsigned int nr);
>  extern void wakeup_softirqd(void);
> diff --git a/include/trace/events/irq.h b/include/trace/events/irq.h
> index 0e4cfb6..7cb7435 100644
> --- a/include/trace/events/irq.h
> +++ b/include/trace/events/irq.h
> @@ -5,7 +5,9 @@
>  #define _TRACE_IRQ_H
>  
>  #include <linux/tracepoint.h>
> -#include <linux/interrupt.h>
> +
> +struct irqaction;
> +struct softirq_action;
>  
>  #define softirq_name(sirq) { sirq##_SOFTIRQ, #sirq }
>  #define show_softirq_name(val)				\
> @@ -82,6 +84,32 @@ TRACE_EVENT(irq_handler_exit,
>  		  __entry->irq, __entry->ret ? "handled" : "unhandled")
>  );
>  
> +/**
> + * softirq_raise - called immediately when a softirq is raised
> + * @nr: softirq vector number
> + *
> + * Tracepoint for tracing when softirq action is raised.
> + * Also, when used in combination with the softirq_entry tracepoint
> + * we can determine the softirq raise latency.
> + */
> +TRACE_EVENT(softirq_raise,
> +
> +	TP_PROTO(unsigned int nr),
> +
> +	TP_ARGS(nr),
> +
> +	TP_STRUCT__entry(
> +		__field(	unsigned int,	vec	)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->vec	= nr;
> +	),
> +
> +	TP_printk("vec=%d [action=%s]", __entry->vec,
> +		show_softirq_name(__entry->vec))

Hmm, is there a way to reuse a DECLARE_EVENT_CLASS here?

> +);
> +
>  DECLARE_EVENT_CLASS(softirq,
>  
>  	TP_PROTO(struct softirq_action *h, struct softirq_action *vec),
> @@ -89,11 +117,11 @@ DECLARE_EVENT_CLASS(softirq,
>  	TP_ARGS(h, vec),
>  
>  	TP_STRUCT__entry(
> -		__field(	int,	vec			)
> +		__field(	unsigned int,	vec	)
>  	),
>  
>  	TP_fast_assign(
> -		__entry->vec = (int)(h - vec);
> +		__entry->vec = (unsigned int)(h - vec);

Just curious, did you see the original as a bug?

-- Steve

>  	),
>  
>  	TP_printk("vec=%d [action=%s]", __entry->vec,
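
As a rough userspace sketch of what the softirq_raise event makes
measurable (the time between raising a softirq and running it), assuming
a simple pending bitmask like the one or_softirq_pending() updates. The
names raise_softirq_sketch() and run_one_softirq_sketch() and the
timestamp array are hypothetical, and timestamps are passed in by the
caller rather than read from a clock.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SOFTIRQS_SKETCH 10

/* Hypothetical stand-ins; not the kernel's per-CPU data structures. */
static unsigned long pending;
static uint64_t raised_at[NR_SOFTIRQS_SKETCH];

/*
 * Same shape as the patched __raise_softirq_irqoff(): note the raise,
 * then set the bit. Only the first raise of a pending vector is
 * timestamped here, a simplification of a per-raise trace event.
 */
static void raise_softirq_sketch(unsigned int nr, uint64_t now)
{
	if (!(pending & (1UL << nr)))
		raised_at[nr] = now;
	pending |= 1UL << nr;
}

/*
 * Handle the lowest pending vector and return the raise-to-entry
 * latency, or -1 if nothing is pending.
 */
static int64_t run_one_softirq_sketch(uint64_t now, unsigned int *vec)
{
	unsigned int nr;

	for (nr = 0; nr < NR_SOFTIRQS_SKETCH; nr++) {
		if (pending & (1UL << nr)) {
			pending &= ~(1UL << nr);
			*vec = nr;
			return (int64_t)(now - raised_at[nr]);
		}
	}
	return -1;
}
```

Pairing each softirq_raise record with the following softirq_entry for
the same vec gives the same kind of difference from real trace output.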

^ permalink raw reply

* Re: [PATCH 0/3] b43: logging cleanups
From: Larry Finger @ 2010-06-24 19:20 UTC (permalink / raw)
  To: John W. Linville
  Cc: Joe Perches, Stefano Brivio,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, mb-fseUSCV1ubazQB+pC5nmwQ,
	zajec5-Re5JQEeQqe8AvxtiuMwx3w
In-Reply-To: <20100624185339.GC2368-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>

On 06/24/2010 01:53 PM, John W. Linville wrote:
> On Sat, Jun 19, 2010 at 04:30:08PM -0700, Joe Perches wrote:
>> Just some small cleanups
>>
>> Joe Perches (3):
>>   drivers/net/wireless/b43: Use local ratelimit_state
>>   drivers/net/wireless/b43: Logging cleanups
>>   drivers/net/wireless/b43: Rename b43_debug to b43_debugging
> 
> Any of the b43 guys want to express an opinion on these?

The local ratelimit patch is OK. My personal opinion is that the others
are just churning the source for no real reason, but I have no major
objections.

Larry

^ permalink raw reply

* Re: [PATCH 0/3] b43: logging cleanups
From: Joe Perches @ 2010-06-24 19:40 UTC (permalink / raw)
  To: John W. Linville
  Cc: Stefano Brivio, linux-wireless, netdev, linux-kernel,
	Larry.Finger, mb, zajec5
In-Reply-To: <20100624185339.GC2368@tuxdriver.com>

On Thu, 2010-06-24 at 14:53 -0400, John W. Linville wrote:
> On Sat, Jun 19, 2010 at 04:30:08PM -0700, Joe Perches wrote:
> > Just some small cleanups
> > Joe Perches (3):
> >   drivers/net/wireless/b43: Use local ratelimit_state
> >   drivers/net/wireless/b43: Logging cleanups
> >   drivers/net/wireless/b43: Rename b43_debug to b43_debugging
> Any of the b43 guys want to express an opinion on these?

Stefano, are you active here?
Your last ack for b43 was Feb 2008.
There have been 400+ commits to b43 without your ack.

Should your name be moved from MAINTAINERS to CREDITS?

$ ./scripts/get_maintainer.pl --rolestats -f drivers/net/wireless/b43/
Stefano Brivio <stefano.brivio@polimi.it> (maintainer:B43 WIRELESS DRIVER)
"John W. Linville" <linville@tuxdriver.com> (maintainer:NETWORKING [WIREL...,commit_signer:204/240=85%)
"Rafał Miłecki" <zajec5@gmail.com> (commit_signer:83/240=35%)
"Gábor Stefanik" <netrolller.3d@gmail.com> (commit_signer:44/240=18%)
Michael Buesch <mb@bu3sch.de> (commit_signer:39/240=16%)
Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:13/240=5%)
linux-wireless@vger.kernel.org (open list:B43 WIRELESS DRIVER)
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)



^ permalink raw reply

* Re: [PATCH 0/3] b43: logging cleanups
From: Larry Finger @ 2010-06-24 19:56 UTC (permalink / raw)
  To: Joe Perches
  Cc: John W. Linville, Stefano Brivio, linux-wireless, netdev,
	linux-kernel, mb, zajec5
In-Reply-To: <1277408432.1654.80.camel@Joe-Laptop.home>

On 06/24/2010 02:40 PM, Joe Perches wrote:
> On Thu, 2010-06-24 at 14:53 -0400, John W. Linville wrote:
>> On Sat, Jun 19, 2010 at 04:30:08PM -0700, Joe Perches wrote:
>>> Just some small cleanups
>>> Joe Perches (3):
>>>   drivers/net/wireless/b43: Use local ratelimit_state
>>>   drivers/net/wireless/b43: Logging cleanups
>>>   drivers/net/wireless/b43: Rename b43_debug to b43_debugging
>> Any of the b43 guys want to express an opinion on these?
> 
> Stefano, are you active here?
> Your last ack for b43 was Feb 2008.
> There have been 400+ commits to b43 without your ack.
> 
> Should your name be moved from MAINTAINERS to CREDITS?
> 
> $ ./scripts/get_maintainer.pl --rolestats -f drivers/net/wireless/b43/
> Stefano Brivio <stefano.brivio@polimi.it> (maintainer:B43 WIRELESS DRIVER)
> "John W. Linville" <linville@tuxdriver.com> (maintainer:NETWORKING [WIREL...,commit_signer:204/240=85%)
> "Rafał Miłecki" <zajec5@gmail.com> (commit_signer:83/240=35%)
> "Gábor Stefanik" <netrolller.3d@gmail.com> (commit_signer:44/240=18%)
> Michael Buesch <mb@bu3sch.de> (commit_signer:39/240=16%)
> Larry Finger <Larry.Finger@lwfinger.net> (commit_signer:13/240=5%)
> linux-wireless@vger.kernel.org (open list:B43 WIRELESS DRIVER)
> netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
> linux-kernel@vger.kernel.org (open list)

The primary arbitrator for patches to the subtle parts of b43 is Michael
Buesch; however, he is no longer an official MAINTAINER. I can ACK some
things; however, any changes that are associated with my reverse
engineering of the Broadcom drivers are off limits.

Larry

^ permalink raw reply

* [net-next-2.6 PATCH 0/10] enic: updates to version 1.4.1.1
From: Vasanthy Kolluri @ 2010-06-24 20:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, scofeldm, vkolluri, roprabhu

The following patch series implements enic driver updates:

01/10 - Feature Add: Replace LRO with GRO
02/10 - Bug Fix: Change hardware ingress vlan rewrite mode
03/10 - Use a lighter reset operation for enic devices
04/10 - Clean up: Add wrapper routines for firmware devcmd calls
05/10 - Use (netdev|dev|pr)_<level> macro helpers for logging
06/10 - Add new firmware devcmds
07/10 - Use receive queue buffer blocks of 32/64 entries
08/10 - Feature Add: Add loopback capability to enic devices
09/10 - Bug Fix: Handle surprise hardware removals
10/10 - Clean ups

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>

^ permalink raw reply

* [net-next-2.6 PATCH 01/10] enic: Feature Add: Replace LRO with GRO
From: Vasanthy Kolluri @ 2010-06-24 20:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, scofeldm, vkolluri, roprabhu
In-Reply-To: <20100624204723.22595.42990.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

enic now uses the GRO mechanism instead of LRO to pass skbs to upper
layers.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
---
 drivers/net/Kconfig          |    1 -
 drivers/net/enic/enic.h      |    9 +----
 drivers/net/enic/enic_main.c |   79 ++++--------------------------------------
 3 files changed, 9 insertions(+), 80 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index fe113d0..a0182a2 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2615,7 +2615,6 @@ config EHEA
 config ENIC
 	tristate "Cisco VIC Ethernet NIC Support"
 	depends on PCI && INET
-	select INET_LRO
 	help
 	  This enables the support for the Cisco VIC Ethernet card.
 
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index 45e86d1..81c2ade 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -20,8 +20,6 @@
 #ifndef _ENIC_H_
 #define _ENIC_H_
 
-#include <linux/inet_lro.h>
-
 #include "vnic_enet.h"
 #include "vnic_dev.h"
 #include "vnic_wq.h"
@@ -34,13 +32,10 @@
 
 #define DRV_NAME		"enic"
 #define DRV_DESCRIPTION		"Cisco VIC Ethernet NIC Driver"
-#define DRV_VERSION		"1.3.1.1-pp"
+#define DRV_VERSION		"1.4.1.1"
 #define DRV_COPYRIGHT		"Copyright 2008-2009 Cisco Systems, Inc"
 #define PFX			DRV_NAME ": "
 
-#define ENIC_LRO_MAX_DESC	8
-#define ENIC_LRO_MAX_AGGR	64
-
 #define ENIC_BARS_MAX		6
 
 #define ENIC_WQ_MAX		8
@@ -124,8 +119,6 @@ struct enic {
 	u64 rq_truncated_pkts;
 	u64 rq_bad_fcs;
 	struct napi_struct napi;
-	struct net_lro_mgr lro_mgr;
-	struct net_lro_desc lro_desc[ENIC_LRO_MAX_DESC];
 
 	/* interrupt resource cache line section */
 	____cacheline_aligned struct vnic_intr intr[ENIC_INTR_MAX];
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index bc7d6b9..c2b848f 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1287,51 +1287,6 @@ static int enic_set_rq_alloc_buf(struct enic *enic)
 	return 0;
 }
 
-static int enic_get_skb_header(struct sk_buff *skb, void **iphdr,
-	void **tcph, u64 *hdr_flags, void *priv)
-{
-	struct cq_enet_rq_desc *cq_desc = priv;
-	unsigned int ip_len;
-	struct iphdr *iph;
-
-	u8 type, color, eop, sop, ingress_port, vlan_stripped;
-	u8 fcoe, fcoe_sof, fcoe_fc_crc_ok, fcoe_enc_error, fcoe_eof;
-	u8 tcp_udp_csum_ok, udp, tcp, ipv4_csum_ok;
-	u8 ipv6, ipv4, ipv4_fragment, fcs_ok, rss_type, csum_not_calc;
-	u8 packet_error;
-	u16 q_number, completed_index, bytes_written, vlan, checksum;
-	u32 rss_hash;
-
-	cq_enet_rq_desc_dec(cq_desc,
-		&type, &color, &q_number, &completed_index,
-		&ingress_port, &fcoe, &eop, &sop, &rss_type,
-		&csum_not_calc, &rss_hash, &bytes_written,
-		&packet_error, &vlan_stripped, &vlan, &checksum,
-		&fcoe_sof, &fcoe_fc_crc_ok, &fcoe_enc_error,
-		&fcoe_eof, &tcp_udp_csum_ok, &udp, &tcp,
-		&ipv4_csum_ok, &ipv6, &ipv4, &ipv4_fragment,
-		&fcs_ok);
-
-	if (!(ipv4 && tcp && !ipv4_fragment))
-		return -1;
-
-	skb_reset_network_header(skb);
-	iph = ip_hdr(skb);
-
-	ip_len = ip_hdrlen(skb);
-	skb_set_transport_header(skb, ip_len);
-
-	/* check if ip header and tcp header are complete */
-	if (ntohs(iph->tot_len) < ip_len + tcp_hdrlen(skb))
-		return -1;
-
-	*hdr_flags = LRO_IPV4 | LRO_TCP;
-	*tcph = tcp_hdr(skb);
-	*iphdr = iph;
-
-	return 0;
-}
-
 static void enic_rq_indicate_buf(struct vnic_rq *rq,
 	struct cq_desc *cq_desc, struct vnic_rq_buf *buf,
 	int skipped, void *opaque)
@@ -1397,18 +1352,17 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
 
 		if (enic->vlan_group && vlan_stripped) {
 
-			if ((netdev->features & NETIF_F_LRO) && ipv4)
-				lro_vlan_hwaccel_receive_skb(&enic->lro_mgr,
-					skb, enic->vlan_group,
-					vlan, cq_desc);
+			if (netdev->features & NETIF_F_GRO)
+				vlan_gro_receive(&enic->napi, enic->vlan_group,
+					vlan, skb);
 			else
 				vlan_hwaccel_receive_skb(skb,
 					enic->vlan_group, vlan);
 
 		} else {
 
-			if ((netdev->features & NETIF_F_LRO) && ipv4)
-				lro_receive_skb(&enic->lro_mgr, skb, cq_desc);
+			if (netdev->features & NETIF_F_GRO)
+				napi_gro_receive(&enic->napi, skb);
 			else
 				netif_receive_skb(skb);
 
@@ -1438,7 +1392,6 @@ static int enic_rq_service(struct vnic_dev *vdev, struct cq_desc *cq_desc,
 static int enic_poll(struct napi_struct *napi, int budget)
 {
 	struct enic *enic = container_of(napi, struct enic, napi);
-	struct net_device *netdev = enic->netdev;
 	unsigned int rq_work_to_do = budget;
 	unsigned int wq_work_to_do = -1; /* no limit */
 	unsigned int  work_done, rq_work_done, wq_work_done;
@@ -1478,12 +1431,9 @@ static int enic_poll(struct napi_struct *napi, int budget)
 	if (rq_work_done < rq_work_to_do) {
 
 		/* Some work done, but not enough to stay in polling,
-		 * flush all LROs and exit polling
+		 * exit polling
 		 */
 
-		if (netdev->features & NETIF_F_LRO)
-			lro_flush_all(&enic->lro_mgr);
-
 		napi_complete(napi);
 		vnic_intr_unmask(&enic->intr[ENIC_INTX_WQ_RQ]);
 	}
@@ -1494,7 +1444,6 @@ static int enic_poll(struct napi_struct *napi, int budget)
 static int enic_poll_msix(struct napi_struct *napi, int budget)
 {
 	struct enic *enic = container_of(napi, struct enic, napi);
-	struct net_device *netdev = enic->netdev;
 	unsigned int work_to_do = budget;
 	unsigned int work_done;
 	int err;
@@ -1528,12 +1477,9 @@ static int enic_poll_msix(struct napi_struct *napi, int budget)
 	if (work_done < work_to_do) {
 
 		/* Some work done, but not enough to stay in polling,
-		 * flush all LROs and exit polling
+		 * exit polling
 		 */
 
-		if (netdev->features & NETIF_F_LRO)
-			lro_flush_all(&enic->lro_mgr);
-
 		napi_complete(napi);
 		vnic_intr_unmask(&enic->intr[ENIC_MSIX_RQ]);
 	}
@@ -2378,21 +2324,12 @@ static int __devinit enic_probe(struct pci_dev *pdev,
 		netdev->features |= NETIF_F_TSO |
 			NETIF_F_TSO6 | NETIF_F_TSO_ECN;
 	if (ENIC_SETTING(enic, LRO))
-		netdev->features |= NETIF_F_LRO;
+		netdev->features |= NETIF_F_GRO;
 	if (using_dac)
 		netdev->features |= NETIF_F_HIGHDMA;
 
 	enic->csum_rx_enabled = ENIC_SETTING(enic, RXCSUM);
 
-	enic->lro_mgr.max_aggr = ENIC_LRO_MAX_AGGR;
-	enic->lro_mgr.max_desc = ENIC_LRO_MAX_DESC;
-	enic->lro_mgr.lro_arr = enic->lro_desc;
-	enic->lro_mgr.get_skb_header = enic_get_skb_header;
-	enic->lro_mgr.features	= LRO_F_NAPI | LRO_F_EXTRACT_VLAN_ID;
-	enic->lro_mgr.dev = netdev;
-	enic->lro_mgr.ip_summed = CHECKSUM_COMPLETE;
-	enic->lro_mgr.ip_summed_aggr = CHECKSUM_UNNECESSARY;
-
 	err = register_netdev(netdev);
 	if (err) {
 		printk(KERN_ERR PFX


^ permalink raw reply related

* [net-next-2.6 PATCH 02/10] enic: Bug Fix: Change hardware ingress vlan rewrite mode
From: Vasanthy Kolluri @ 2010-06-24 20:49 UTC (permalink / raw)
  To: davem; +Cc: netdev, scofeldm, vkolluri, roprabhu
In-Reply-To: <20100624204723.22595.42990.stgit@savbu-pc100.cisco.com>

From: Vasanthy Kolluri <vkolluri@cisco.com>

The current ingress vlan rewrite mode setting lets the hardware strip off
the tag control information of a packet received on native vlan. As a
result, the priority bits are also lost. The fix is to change the ingress
vlan rewrite mode setting such that the complete tag control information is
retained for packets that belong to native vlan.

Signed-off-by: Scott Feldman <scofeldm@cisco.com>
Signed-off-by: Vasanthy Kolluri <vkolluri@cisco.com>
Signed-off-by: Roopa Prabhu <roprabhu@cisco.com>
---
 drivers/net/enic/cq_enet_desc.h |   16 ++++++++++++++--
 drivers/net/enic/enic_main.c    |   31 ++++++++++++++++++++++++++-----
 drivers/net/enic/vnic_dev.c     |   14 ++++++++++++++
 drivers/net/enic/vnic_dev.h     |    2 ++
 drivers/net/enic/vnic_devcmd.h  |   12 ++++++++++++
 5 files changed, 68 insertions(+), 7 deletions(-)

diff --git a/drivers/net/enic/cq_enet_desc.h b/drivers/net/enic/cq_enet_desc.h
index 337d194..f2d98bb 100644
--- a/drivers/net/enic/cq_enet_desc.h
+++ b/drivers/net/enic/cq_enet_desc.h
@@ -73,6 +73,15 @@ struct cq_enet_rq_desc {
 #define CQ_ENET_RQ_DESC_FLAGS_TRUNCATED             (0x1 << 14)
 #define CQ_ENET_RQ_DESC_FLAGS_VLAN_STRIPPED         (0x1 << 15)
 
+#define CQ_ENET_RQ_DESC_VLAN_TCI_VLAN_BITS          12
+#define CQ_ENET_RQ_DESC_VLAN_TCI_VLAN_MASK \
+	((1 << CQ_ENET_RQ_DESC_VLAN_TCI_VLAN_BITS) - 1)
+#define CQ_ENET_RQ_DESC_VLAN_TCI_CFI_MASK           (0x1 << 12)
+#define CQ_ENET_RQ_DESC_VLAN_TCI_USER_PRIO_BITS     3
+#define CQ_ENET_RQ_DESC_VLAN_TCI_USER_PRIO_MASK \
+	((1 << CQ_ENET_RQ_DESC_VLAN_TCI_USER_PRIO_BITS) - 1)
+#define CQ_ENET_RQ_DESC_VLAN_TCI_USER_PRIO_SHIFT    13
+
 #define CQ_ENET_RQ_DESC_FCOE_SOF_BITS               4
 #define CQ_ENET_RQ_DESC_FCOE_SOF_MASK \
 	((1 << CQ_ENET_RQ_DESC_FCOE_SOF_BITS) - 1)
@@ -96,7 +105,7 @@ static inline void cq_enet_rq_desc_dec(struct cq_enet_rq_desc *desc,
 	u8 *type, u8 *color, u16 *q_number, u16 *completed_index,
 	u8 *ingress_port, u8 *fcoe, u8 *eop, u8 *sop, u8 *rss_type,
 	u8 *csum_not_calc, u32 *rss_hash, u16 *bytes_written, u8 *packet_error,
-	u8 *vlan_stripped, u16 *vlan, u16 *checksum, u8 *fcoe_sof,
+	u8 *vlan_stripped, u16 *vlan_tci, u16 *checksum, u8 *fcoe_sof,
 	u8 *fcoe_fc_crc_ok, u8 *fcoe_enc_error, u8 *fcoe_eof,
 	u8 *tcp_udp_csum_ok, u8 *udp, u8 *tcp, u8 *ipv4_csum_ok,
 	u8 *ipv6, u8 *ipv4, u8 *ipv4_fragment, u8 *fcs_ok)
@@ -136,7 +145,10 @@ static inline void cq_enet_rq_desc_dec(struct cq_enet_rq_desc *desc,
 	*vlan_stripped = (bytes_written_flags &
 		CQ_ENET_RQ_DESC_FLAGS_VLAN_STRIPPED) ? 1 : 0;
 
-	*vlan = le16_to_cpu(desc->vlan);
+	/*
+	 * Tag Control Information(16) = user_priority(3) + cfi(1) + vlan(12)
+	 */
+	*vlan_tci = le16_to_cpu(desc->vlan);
 
 	if (*fcoe) {
 		*fcoe_sof = (u8)(le16_to_cpu(desc->checksum_fcoe) &
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index c2b848f..7f98af1 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -1300,7 +1300,7 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
 	u8 tcp_udp_csum_ok, udp, tcp, ipv4_csum_ok;
 	u8 ipv6, ipv4, ipv4_fragment, fcs_ok, rss_type, csum_not_calc;
 	u8 packet_error;
-	u16 q_number, completed_index, bytes_written, vlan, checksum;
+	u16 q_number, completed_index, bytes_written, vlan_tci, checksum;
 	u32 rss_hash;
 
 	if (skipped)
@@ -1315,7 +1315,7 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
 		&type, &color, &q_number, &completed_index,
 		&ingress_port, &fcoe, &eop, &sop, &rss_type,
 		&csum_not_calc, &rss_hash, &bytes_written,
-		&packet_error, &vlan_stripped, &vlan, &checksum,
+		&packet_error, &vlan_stripped, &vlan_tci, &checksum,
 		&fcoe_sof, &fcoe_fc_crc_ok, &fcoe_enc_error,
 		&fcoe_eof, &tcp_udp_csum_ok, &udp, &tcp,
 		&ipv4_csum_ok, &ipv6, &ipv4, &ipv4_fragment,
@@ -1350,14 +1350,15 @@ static void enic_rq_indicate_buf(struct vnic_rq *rq,
 
 		skb->dev = netdev;
 
-		if (enic->vlan_group && vlan_stripped) {
+		if (enic->vlan_group && vlan_stripped &&
+			(vlan_tci & CQ_ENET_RQ_DESC_VLAN_TCI_VLAN_MASK)) {
 
 			if (netdev->features & NETIF_F_GRO)
 				vlan_gro_receive(&enic->napi, enic->vlan_group,
-					vlan, skb);
+					vlan_tci, skb);
 			else
 				vlan_hwaccel_receive_skb(skb,
-					enic->vlan_group, vlan);
+					enic->vlan_group, vlan_tci);
 
 		} else {
 
@@ -1879,6 +1880,18 @@ static int enic_set_niccfg(struct enic *enic)
 		ig_vlan_strip_en);
 }
 
+int enic_dev_set_ig_vlan_rewrite_mode(struct enic *enic)
+{
+	int err;
+
+	spin_lock(&enic->devcmd_lock);
+	err = vnic_dev_set_ig_vlan_rewrite_mode(enic->vdev,
+		IG_VLAN_REWRITE_MODE_PRIORITY_TAG_DEFAULT_VLAN);
+	spin_unlock(&enic->devcmd_lock);
+
+	return err;
+}
+
 static void enic_reset(struct work_struct *work)
 {
 	struct enic *enic = container_of(work, struct enic, reset);
@@ -1898,6 +1911,7 @@ static void enic_reset(struct work_struct *work)
 	enic_reset_mcaddrs(enic);
 	enic_init_vnic_resources(enic);
 	enic_set_niccfg(enic);
+	enic_dev_set_ig_vlan_rewrite_mode(enic);
 	enic_open(enic->netdev);
 
 	rtnl_unlock();
@@ -2110,6 +2124,13 @@ int enic_dev_init(struct enic *enic)
 		goto err_out_free_vnic_resources;
 	}
 
+	err = enic_dev_set_ig_vlan_rewrite_mode(enic);
+	if (err) {
+		printk(KERN_ERR PFX
+			"Failed to set ingress vlan rewrite mode, aborting.\n");
+		goto err_out_free_vnic_resources;
+	}
+
 	switch (vnic_dev_get_intr_mode(enic->vdev)) {
 	default:
 		netif_napi_add(netdev, &enic->napi, enic_poll, 64);
diff --git a/drivers/net/enic/vnic_dev.c b/drivers/net/enic/vnic_dev.c
index e0d3328..e3742fa 100644
--- a/drivers/net/enic/vnic_dev.c
+++ b/drivers/net/enic/vnic_dev.c
@@ -564,6 +564,20 @@ int vnic_dev_del_addr(struct vnic_dev *vdev, u8 *addr)
 	return err;
 }
 
+int vnic_dev_set_ig_vlan_rewrite_mode(struct vnic_dev *vdev,
+	u8 ig_vlan_rewrite_mode)
+{
+	u64 a0 = ig_vlan_rewrite_mode, a1 = 0;
+	int wait = 1000;
+	int err;
+
+	err = vnic_dev_cmd(vdev, CMD_IG_VLAN_REWRITE_MODE, &a0, &a1, wait);
+	if (err == ERR_ECMDUNKNOWN)
+		return 0;
+
+	return err;
+}
+
 int vnic_dev_raise_intr(struct vnic_dev *vdev, u16 intr)
 {
 	u64 a0 = intr, a1 = 0;
diff --git a/drivers/net/enic/vnic_dev.h b/drivers/net/enic/vnic_dev.h
index caccce3..780c3cd 100644
--- a/drivers/net/enic/vnic_dev.h
+++ b/drivers/net/enic/vnic_dev.h
@@ -133,6 +133,8 @@ void vnic_dev_set_intr_mode(struct vnic_dev *vdev,
 	enum vnic_dev_intr_mode intr_mode);
 enum vnic_dev_intr_mode vnic_dev_get_intr_mode(struct vnic_dev *vdev);
 void vnic_dev_unregister(struct vnic_dev *vdev);
+int vnic_dev_set_ig_vlan_rewrite_mode(struct vnic_dev *vdev,
+	u8 ig_vlan_rewrite_mode);
 struct vnic_dev *vnic_dev_register(struct vnic_dev *vdev,
 	void *priv, struct pci_dev *pdev, struct vnic_dev_bar *bar,
 	unsigned int num_bars);
diff --git a/drivers/net/enic/vnic_devcmd.h b/drivers/net/enic/vnic_devcmd.h
index d78bbcc..c5ff4ff 100644
--- a/drivers/net/enic/vnic_devcmd.h
+++ b/drivers/net/enic/vnic_devcmd.h
@@ -211,6 +211,12 @@ enum vnic_devcmd_cmd {
 	 * in: (u16)a0=interrupt number to assert
 	 */
 	CMD_IAR			= _CMDCNW(_CMD_DIR_WRITE, _CMD_VTYPE_ALL, 38),
+
+	/*
+	 * Set hw ingress packet vlan rewrite mode:
+	 * in:  (u32)a0=new vlan rewrite mode
+	 * out: (u32)a0=old vlan rewrite mode */
+	CMD_IG_VLAN_REWRITE_MODE = _CMDC(_CMD_DIR_RW, _CMD_VTYPE_ENET, 41),
 };
 
 /* flags for CMD_OPEN */
@@ -226,6 +232,12 @@ enum vnic_devcmd_cmd {
 #define CMD_PFILTER_PROMISCUOUS		0x08
 #define CMD_PFILTER_ALL_MULTICAST	0x10
 
+/* rewrite modes for CMD_IG_VLAN_REWRITE_MODE */
+#define IG_VLAN_REWRITE_MODE_DEFAULT_TRUNK              0
+#define IG_VLAN_REWRITE_MODE_UNTAG_DEFAULT_VLAN         1
+#define IG_VLAN_REWRITE_MODE_PRIORITY_TAG_DEFAULT_VLAN  2
+#define IG_VLAN_REWRITE_MODE_PASS_THRU                  3
+
 enum vnic_devcmd_status {
 	STAT_NONE = 0,
 	STAT_BUSY = 1 << 0,	/* cmd in progress */
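
The receive-path change in enic_rq_indicate_buf() can be looked at in isolation. Below is a minimal userspace sketch, not driver code. It assumes that CQ_ENET_RQ_DESC_VLAN_TCI_VLAN_MASK selects the 12-bit VLAN ID field of the 802.1Q TCI; its definition is not shown in this part of the patch. VLAN_VID_MASK, use_vlan_accel_path() and rx_path_demo() are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 802.1Q TCI layout: PCP (3 bits) | DEI (1 bit) | VLAN ID (12 bits).
 * Hypothetical stand-in for the driver's TCI VLAN mask (assumption). */
#define VLAN_VID_MASK 0x0fffu

/* Hypothetical helper mirroring the patched condition: take the
 * VLAN-accelerated receive path only when a vlan_group is registered,
 * the hardware stripped a tag, and the VLAN ID in the TCI is non-zero. */
static bool use_vlan_accel_path(bool have_vlan_group, bool vlan_stripped,
				uint16_t vlan_tci)
{
	return have_vlan_group && vlan_stripped &&
		(vlan_tci & VLAN_VID_MASK) != 0;
}

static void rx_path_demo(void)
{
	/* VLAN 100 with priority 5: real VLAN traffic, VLAN path. */
	assert(use_vlan_accel_path(true, true, (5u << 13) | 100u));

	/* Priority-tagged frame: priority bits set, VLAN ID 0.
	 * It falls through to the untagged receive path. */
	assert(!use_vlan_accel_path(true, true, 5u << 13));

	/* No tag stripped by hardware: untagged receive path. */
	assert(!use_vlan_accel_path(true, false, 100u));
}
```

Under that assumption, a frame that carries only priority bits is delivered like an untagged frame, which appears to be what the IG_VLAN_REWRITE_MODE_PRIORITY_TAG_DEFAULT_VLAN setting relies on for default-VLAN traffic.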

