Netdev List


* [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
From: Muraliraja Muniraju @ 2011-10-12  1:28 UTC (permalink / raw)
  To: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy <kabe
  Cc: linux-kernel, netdev, Murali Raja
In-Reply-To: <20111010145206.6f7e9ee2@nehalam.linuxnetplumber.net>

From: Murali Raja <muralira@google.com>

This patch exposes the TOS value for TCP sockets when the TOS flag
is requested in the ext_flags of the inet_diag request. It is mainly meant
to expose TOS values for both TCP and UDP sockets. Currently only TCP is
supported. When netlink support for UDP is added, support for exposing
the TOS values will be added there as well.

Signed-off-by: Murali Raja <muralira@google.com>
---
Changelog since v2:
- Adding support for IPv6 traffic class and using the right APIs
Changelog since v1:
- Removing reserved field 

 include/linux/inet_diag.h |    9 ++++++++-
 net/ipv4/inet_diag.c      |    5 +++++
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
index bc8c490..e36093d 100644
--- a/include/linux/inet_diag.h
+++ b/include/linux/inet_diag.h
@@ -97,9 +97,10 @@ enum {
 	INET_DIAG_INFO,
 	INET_DIAG_VEGASINFO,
 	INET_DIAG_CONG,
+	INET_DIAG_TOS,
 };
 
-#define INET_DIAG_MAX INET_DIAG_CONG
+#define INET_DIAG_MAX INET_DIAG_TOS
 
 
 /* INET_DIAG_MEM */
@@ -120,6 +121,12 @@ struct tcpvegas_info {
 	__u32	tcpv_minrtt;
 };
 
+/* INET_DIAG_TOS */
+
+struct inet_diag_tos {
+	__u8	idiag_tos;
+};
+
 #ifdef __KERNEL__
 struct sock;
 struct inet_hashinfo;
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 389a2e6..f5e2bda 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,6 +108,9 @@ static int inet_csk_diag_fill(struct sock *sk,
 		       icsk->icsk_ca_ops->name);
 	}
 
+	if ((ext & (1 << (INET_DIAG_TOS - 1))) && (sk->sk_family != AF_INET6))
+		RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
+
 	r->idiag_family = sk->sk_family;
 	r->idiag_state = sk->sk_state;
 	r->idiag_timer = 0;
@@ -130,6 +133,8 @@ static int inet_csk_diag_fill(struct sock *sk,
 			       &np->rcv_saddr);
 		ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
 			       &np->daddr);
+		if (ext & (1 << (INET_DIAG_TOS - 1)))
+			RTA_PUT_U8(skb, INET_DIAG_TOS, np->tclass);
 	}
 #endif
 
-- 
1.7.3.1
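
For readers following the thread, the test `ext & (1 << (INET_DIAG_TOS - 1))` in the patch maps an attribute number to a bit in the request's extension mask. The following is a minimal user-space sketch of that mapping, not the kernel code. The `DIAG_*` names and both helper functions are hypothetical, and the two enum entries before `DIAG_INFO` are assumed, since the hunk above does not show them.

```c
#include <stdint.h>

/* Mirror of the attribute enum touched by the patch. The first two
 * entries are assumptions; the hunk only shows INFO onwards. */
enum {
	DIAG_NONE,	/* assumed */
	DIAG_MEMINFO,	/* assumed */
	DIAG_INFO,
	DIAG_VEGASINFO,
	DIAG_CONG,
	DIAG_TOS,
};

/* Bit in the extension mask that selects attribute `attr` (attr >= 1). */
static inline uint32_t diag_ext_bit(int attr)
{
	return 1u << (attr - 1);
}

/* Nonzero when the request mask `ext` asks for attribute `attr`. */
static inline int diag_ext_wanted(uint32_t ext, int attr)
{
	return (ext & diag_ext_bit(attr)) != 0;
}
```

Under these assumptions a requester would set the bit returned by `diag_ext_bit(DIAG_TOS)` in its extension mask, and the fill routine would emit the attribute only when `diag_ext_wanted()` is true.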

^ permalink raw reply related

* Re: Host got crash when guest running netperf client with UDP_STREAM protocol with IPV6
From: Qunfang Zhang @ 2011-10-12  1:33 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1318313091.2606.4.camel@edumazet-laptop>

Hi, Eric
Sorry, I just learned that Jason has committed a patch for it. So, please
ignore my mail.  Thanks.

On 10/11/2011 02:04 PM, Eric Dumazet wrote:
> On Tuesday, 11 October 2011 at 10:26 +0800, Qunfang Zhang wrote:
>    
>> Hi, guys
>>
>> I found the following bug on the RHEL6.2 kernel, and after re-testing,
>> the latest stable kernel also has this problem.
>> Jiri suggests it would be best to resolve this via upstream as well,
>> so I am sending you this mail.  Thank you.
>>
>> Bug 740465 - Host got crash when guest running netperf client with
>> UDP_STREAM protocol with IPV6
>> https://bugzilla.redhat.com/show_bug.cgi?id=740465
>>
>> The vmcore log is also available as an attachment to the bug.
>>
>>
>>
>>      
> Isn't it fixed by Jason Wang's patch?
>
> ( ipv6: fix NULL dereference in udp6_ufo_fragment() )
>
>
>
>
>    

^ permalink raw reply

* [PATCH v2 2/2] phylib: Convert MDIO bitbang to new MDIO 45 format
From: Andy Fleming @ 2011-10-12  1:20 UTC (permalink / raw)
  To: davem; +Cc: netdev
In-Reply-To: <1318382422-2133-1-git-send-email-afleming@freescale.com>

Now that we've added somewhat more complete MDIO 45 support to the PHY
Lib, convert the MDIO bitbang driver to use this new infrastructure.

Signed-off-by: Andy Fleming <afleming@freescale.com>
---
 drivers/net/phy/mdio-bitbang.c |   29 +++++++++++++++--------------
 1 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/phy/mdio-bitbang.c b/drivers/net/phy/mdio-bitbang.c
index 2f6f02e..df7f496 100644
--- a/drivers/net/phy/mdio-bitbang.c
+++ b/drivers/net/phy/mdio-bitbang.c
@@ -134,11 +134,10 @@ static void mdiobb_cmd(struct mdiobb_ctrl *ctrl, int op, u8 phy, u8 reg)
    MII_ADDR_C45 into the address. Theoretically clause 45 and normal devices
    can exist on the same bus. Normal devices should ignore the MDIO_ADDR
    phase. */
-static int mdiobb_cmd_addr(struct mdiobb_ctrl *ctrl, int phy, u32 addr)
+static void mdiobb_cmd_addr(struct mdiobb_ctrl *ctrl, int phy, int devad,
+				int reg)
 {
-	unsigned int dev_addr = (addr >> 16) & 0x1F;
-	unsigned int reg = addr & 0xFFFF;
-	mdiobb_cmd(ctrl, MDIO_C45_ADDR, phy, dev_addr);
+	mdiobb_cmd(ctrl, MDIO_C45_ADDR, phy, devad);
 
 	/* send the turnaround (10) */
 	mdiobb_send_bit(ctrl, 1);
@@ -148,8 +147,6 @@ static int mdiobb_cmd_addr(struct mdiobb_ctrl *ctrl, int phy, u32 addr)
 
 	ctrl->ops->set_mdio_dir(ctrl, 0);
 	mdiobb_get_bit(ctrl);
-
-	return dev_addr;
 }
 
 static int mdiobb_read(struct mii_bus *bus, int phy, int devad, int reg)
@@ -157,11 +154,13 @@ static int mdiobb_read(struct mii_bus *bus, int phy, int devad, int reg)
 	struct mdiobb_ctrl *ctrl = bus->priv;
 	int ret, i;
 
-	if (reg & MII_ADDR_C45) {
-		reg = mdiobb_cmd_addr(ctrl, phy, reg);
-		mdiobb_cmd(ctrl, MDIO_C45_READ, phy, reg);
-	} else
+	/* Clause 22 PHYs don't have a devad */
+	if (devad == MDIO_DEVAD_NONE)
 		mdiobb_cmd(ctrl, MDIO_READ, phy, reg);
+	else {
+		mdiobb_cmd_addr(ctrl, phy, devad, reg);
+		mdiobb_cmd(ctrl, MDIO_C45_READ, phy, devad);
+	}
 
 	ctrl->ops->set_mdio_dir(ctrl, 0);
 
@@ -186,11 +185,13 @@ static int mdiobb_write(struct mii_bus *bus, int phy, int devad, int reg,
 {
 	struct mdiobb_ctrl *ctrl = bus->priv;
 
-	if (reg & MII_ADDR_C45) {
-		reg = mdiobb_cmd_addr(ctrl, phy, reg);
-		mdiobb_cmd(ctrl, MDIO_C45_WRITE, phy, reg);
-	} else
+	/* Clause 22 PHYs don't have a devad */
+	if (devad == MDIO_DEVAD_NONE)
 		mdiobb_cmd(ctrl, MDIO_WRITE, phy, reg);
+	else {
+		mdiobb_cmd_addr(ctrl, phy, devad, reg);
+		mdiobb_cmd(ctrl, MDIO_C45_WRITE, phy, devad);
+	}
 
 	/* send the turnaround (10) */
 	mdiobb_send_bit(ctrl, 1);
-- 
1.7.3.4
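
As a reading aid, the lines removed by this patch decoded a single packed register value: a clause 45 flag, a 5-bit device address in bits 16-20, and a 16-bit register number in the low bits. The sketch below shows how such a packed value could be split into the separate `devad` and `reg` arguments the new interface takes. It is an illustration only: `mdio_split_legacy` and `struct mdio_addr` are hypothetical names, and the values used for the clause 45 flag (bit 30) and for "no devad" (-1) are assumptions about `MII_ADDR_C45` and `MDIO_DEVAD_NONE`.

```c
#include <stdint.h>

#define SKETCH_ADDR_C45   (1u << 30)	/* assumed value of MII_ADDR_C45 */
#define SKETCH_DEVAD_NONE (-1)		/* assumed value of MDIO_DEVAD_NONE */

struct mdio_addr {
	int devad;	/* device address, or SKETCH_DEVAD_NONE for clause 22 */
	int reg;	/* register number */
};

/* Split an old-style packed register value into (devad, reg). */
static struct mdio_addr mdio_split_legacy(uint32_t addr)
{
	struct mdio_addr a;

	if (addr & SKETCH_ADDR_C45) {
		a.devad = (int)((addr >> 16) & 0x1F);	/* same mask as the removed code */
		a.reg = (int)(addr & 0xFFFF);
	} else {
		/* Clause 22 PHYs don't have a devad */
		a.devad = SKETCH_DEVAD_NONE;
		a.reg = (int)addr;
	}
	return a;
}
```

With the flag folded into a sentinel `devad`, callers can branch on `devad == SKETCH_DEVAD_NONE` instead of testing a bit inside `reg`, which is the shape of the check the patched read and write paths use.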

^ permalink raw reply related

* Re: [PATCH] netconsole: enable netconsole can make net_device refcnt
From: Gao feng @ 2011-10-12  1:45 UTC (permalink / raw)
  To: Flavio Leitner; +Cc: Wanlong Gao, netdev, davem
In-Reply-To: <20111011221148.2ef4487d@asterix.rh>

12.10.2011 09:11, Flavio Leitner wrote:
> On Tue, 11 Oct 2011 16:05:48 +0800
> Gao feng <gaofeng@cn.fujitsu.com> wrote:
> 
>> I'm so sorry.
>> The first patch had some formatting errors.
>> Please use this one.
>> Thanks, Wanlong! ^V^
>>
>> 11.10.2011 15:50, Wanlong Gao wrote:
>>> There is no check for whether netconsole is currently enabled,
>>> so each time "echo 1 > enabled" is executed,
>>> the net_device reference count is incremented again.
>>>
>>> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
>>> ---
>>>  drivers/net/netconsole.c |    2 ++
>>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
>>> index ed2a397..4e6323df 100644
>>> --- a/drivers/net/netconsole.c
>>> +++ b/drivers/net/netconsole.c
>>> @@ -307,6 +307,8 @@ static ssize_t store_enabled(struct
>>> netconsole_target *nt, return err;
>>>  	if (enabled < 0 || enabled > 1)
>>>  		return -EINVAL;
>>> +	if (enabled == nt->enabled)
>>> +		return err;
> 
> It looks like 'err' will be 0.  Maybe it is better to
> return -EINVAL?
> 
> fbl
> 

Yes, you are right, and I will add a printk. Thanks.

^ permalink raw reply

* [PATCH v2] netconsole: enable netconsole can make net_device refcnt incorrect
From: Gao feng @ 2011-10-12  2:08 UTC (permalink / raw)
  To: netdev; +Cc: davem, eric.dumazet, fbl, Gao feng

There is no check for whether netconsole is currently enabled,
so each time "echo 1 > enabled" is executed,
the net_device reference count is incremented again.

Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
---
 drivers/net/netconsole.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ed2a397..e888202 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -307,6 +307,11 @@ static ssize_t store_enabled(struct netconsole_target *nt,
 		return err;
 	if (enabled < 0 || enabled > 1)
 		return -EINVAL;
+	if (enabled == nt->enabled) {
+		printk(KERN_INFO "netconsole: network logging has already %s\n",
+				nt->enabled ? "started" : "stopped");
+		return -EINVAL;
+	}
 
 	if (enabled) {	/* 1 */
 
-- 
1.7.1

^ permalink raw reply related

* linux-next: build failure after merge of the wireless tree
From: Stephen Rothwell @ 2011-10-12  2:36 UTC (permalink / raw)
  To: John W. Linville
  Cc: linux-next, linux-kernel, Jiri Pirko, David Miller, netdev,
	Arend van Spriel

[-- Attachment #1: Type: text/plain, Size: 1948 bytes --]

Hi John,

After merging the wireless tree, today's linux-next build (x86_64
allmodconfig) failed like this:

drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c:1131:2: error: unknown field 'ndo_set_multicast_list' specified in initializer
drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c:1132:1: warning: initialization from incompatible pointer type [enabled by default]
drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c:1132:1: warning: (near initialization for 'brcmf_netdev_ops_pri.ndo_validate_addr') [enabled by default]

Caused by commit 5b435de0d786 ("net: wireless: add brcm80211 drivers")
interacting with commit b81693d9149c ("net: remove ndo_set_multicast_list
callback") from the net-next tree.

I have applied the following merge fixup patch for today.

From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Wed, 12 Oct 2011 13:33:10 +1100
Subject: [PATCH] net: wireless: brcm80211: replace ndo_set_multicast_list  with ndo_set_rx_mode

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
---
 .../net/wireless/brcm80211/brcmfmac/dhd_linux.c    |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c b/drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c
index 99ba5e3..03607ca 100644
--- a/drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c
+++ b/drivers/net/wireless/brcm80211/brcmfmac/dhd_linux.c
@@ -1128,7 +1128,7 @@ static struct net_device_ops brcmf_netdev_ops_pri = {
 	.ndo_do_ioctl = brcmf_netdev_ioctl_entry,
 	.ndo_start_xmit = brcmf_netdev_start_xmit,
 	.ndo_set_mac_address = brcmf_netdev_set_mac_address,
-	.ndo_set_multicast_list = brcmf_netdev_set_multicast_list
+	.ndo_set_rx_mode = brcmf_netdev_set_multicast_list
 };
 
 int brcmf_net_attach(struct brcmf_pub *drvr, int ifidx)
-- 
1.7.6.3

-- 
Cheers,
Stephen Rothwell                    sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/

[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply related

* Re: [PATCH v3] net-netlink: Add a new attribute to expose TOS values via netlink
From: Eric Dumazet @ 2011-10-12  2:48 UTC (permalink / raw)
  To: Muraliraja Muniraju
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, linux-kernel, netdev
In-Reply-To: <1318382887-31824-1-git-send-email-muralira@google.com>

On Tuesday, 11 October 2011 at 18:28 -0700, Muraliraja Muniraju wrote:
> From: Murali Raja <muralira@google.com>
> 
> This patch exposes the TOS value for TCP sockets when the TOS flag
> is requested in the ext_flags of the inet_diag request. It is mainly meant
> to expose TOS values for both TCP and UDP sockets. Currently only TCP is
> supported. When netlink support for UDP is added, support for exposing
> the TOS values will be added there as well.
> 

You could mention TCLASS support for IPv6

> Signed-off-by: Murali Raja <muralira@google.com>
> ---
> Changelog since v2:
> - Adding support for IPv6 traffic class and using the right APIs
> Changelog since v1:
> - Removing reserved field 
> 
>  include/linux/inet_diag.h |    9 ++++++++-
>  net/ipv4/inet_diag.c      |    5 +++++
>  2 files changed, 13 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
> index bc8c490..e36093d 100644
> --- a/include/linux/inet_diag.h
> +++ b/include/linux/inet_diag.h
> @@ -97,9 +97,10 @@ enum {
>  	INET_DIAG_INFO,
>  	INET_DIAG_VEGASINFO,
>  	INET_DIAG_CONG,
> +	INET_DIAG_TOS,
>  };
>  
> -#define INET_DIAG_MAX INET_DIAG_CONG
> +#define INET_DIAG_MAX INET_DIAG_TOS
>  
> 
>  /* INET_DIAG_MEM */
> @@ -120,6 +121,12 @@ struct tcpvegas_info {
>  	__u32	tcpv_minrtt;
>  };
>  
> +/* INET_DIAG_TOS */
> +
> +struct inet_diag_tos {
> +	__u8	idiag_tos;
> +};

Are you sure it's still needed?

I am now wondering what is done in TIME_WAIT state.

^ permalink raw reply

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
From: Andy Gospodarek @ 2011-10-12  2:51 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, Yinglin Sun, netdev, John Eaglesham
In-Reply-To: <23119.1318348739@death>

On Tue, Oct 11, 2011 at 08:58:59AM -0700, Jay Vosburgh wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
[...]
> >
> >There have been some attempts to add support for ipv6 hashing in
> >the past, but none have been committed.  The best one I had seen was one
> >that did some extensive testing on a wide variety of ipv6 traffic and
> >it showed nice traffic distribution.  I'm not sure if it was ever posted
> >upstream, so I will see if I can dig it up.
> >
> >Can you quantify how traffic was distributed with this algorithm?
> 
> 	As I recall, the IPv6 issues had to do with the "layer3+4" hash,
> because the IPv6 TCP or UDP port numbers can be harder to get at than in
> IPv4 (which typically has a fixed size header).  The above is just for
> layer 2, so it only hits the IPv6 addresses, which don't move around.
> 
> 	That said, I believe that many IPv6 addresses are derived from
> the MAC address, the autoconf addresses in particular, so s6_addr32[3]
> may not show a lot more variation than just the MAC address.  I don't
> know for sure though, since I haven't tested it.
> 
> 	I don't recall seeing the patch you mention, Andy, that checks
> ipv6 traffic; can you post it?
> 

I found the patch, cleaned it up, and compile tested it against
net-next.  I traded some emails with John Eaglesham (cc'd) earlier this
year and though he planned to post it, I never followed up.

His comments about this patch were as follows:

"I've attached my patch for IPv6 transmit hashing for the nic bonding
driver.

"The algorithm I chose is based on 273,913 IPv6 client addresses I
gathered from webservers and ran through a test program that implemented
several algorithms. This algorithm provided the most even distribution
while using the fewest instructions.

"I've tested this on 2.6.39-rc4 and a similar patch to 2.6.18 (from
RHEL5 5.4.3) and it has performed as expected in both cases.

"Please let me know if you have any comments, otherwise I suppose the
next step is to propose the patch to LKML."

I would suggest we use this.  John or I could write an official
changelog and post this in its own thread if it looks good to others.

---
 drivers/net/bonding/bond_main.c |   30 +++++++++++++++++++++++++-----
 1 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 6191e63..335cb67 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3368,11 +3368,20 @@ static struct notifier_block bond_inetaddr_notifier = {
 static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
 
 	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = ip_hdr(skb);
 		return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
 			(data->h_dest[5] ^ data->h_source[5])) % count;
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+		u32 v6hash = (
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
+		);
+		v6hash = (v6hash >> 16) ^ (v6hash >> 8) ^ v6hash;
+		return (v6hash ^ data->h_dest[5] ^ data->h_source[5]) % count;
 	}
 
 	return (data->h_dest[5] ^ data->h_source[5]) % count;
@@ -3386,11 +3395,11 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
 static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 {
 	struct ethhdr *data = (struct ethhdr *)skb->data;
-	struct iphdr *iph = ip_hdr(skb);
-	__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
-	int layer4_xor = 0;
+	u32 layer4_xor = 0;
 
 	if (skb->protocol == htons(ETH_P_IP)) {
+		struct iphdr *iph = ip_hdr(skb);
+		__be16 *layer4hdr = (__be16 *)((u32 *)iph + iph->ihl);
 		if (!ip_is_fragment(iph) &&
 		    (iph->protocol == IPPROTO_TCP ||
 		     iph->protocol == IPPROTO_UDP)) {
@@ -3398,7 +3407,18 @@ static int bond_xmit_hash_policy_l34(struct sk_buff *skb, int count)
 		}
 		return (layer4_xor ^
 			((ntohl(iph->saddr ^ iph->daddr)) & 0xffff)) % count;
-
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ipv6h = ipv6_hdr(skb);
+		__be16 *layer4hdrv6 = (__be16 *)((u8 *)ipv6h + sizeof(*ipv6h));
+		if (ipv6h->nexthdr == IPPROTO_TCP || ipv6h->nexthdr == IPPROTO_UDP) {
+			layer4_xor = (*layer4hdrv6 ^ *(layer4hdrv6 + 1));
+		}
+		layer4_xor ^= (
+			(ipv6h->saddr.s6_addr32[1] ^ ipv6h->daddr.s6_addr32[1]) ^
+			(ipv6h->saddr.s6_addr32[2] ^ ipv6h->daddr.s6_addr32[2]) ^
+			(ipv6h->saddr.s6_addr32[3] ^ ipv6h->daddr.s6_addr32[3])
+		);
+		return ((layer4_xor >> 16) ^ (layer4_xor >> 8) ^ layer4_xor) % count;
 	}
 
 	return (data->h_dest[5] ^ data->h_source[5]) % count;
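
To make the layer 2+3 IPv6 branch of the patch above easier to follow outside the kernel, here is a stand-alone sketch of the same arithmetic: XOR words 1 to 3 of the source and destination addresses, fold the result with shifts of 16 and 8, mix in the last byte of each MAC address, and reduce modulo the slave count. The function name and parameter layout are hypothetical, the words are hashed exactly as passed in (no byte-order conversion, as in the patch), and `count` is assumed to be nonzero.

```c
#include <stdint.h>

/*
 * s, d:      the four 32-bit words of the source/destination IPv6 address
 * mac_dst5,
 * mac_src5:  last byte of the destination/source MAC address
 * count:     number of slaves, assumed > 0
 */
static unsigned int l23_ipv6_hash(const uint32_t s[4], const uint32_t d[4],
				  uint8_t mac_dst5, uint8_t mac_src5,
				  unsigned int count)
{
	/* Word 0 is deliberately left out, as in the patch. */
	uint32_t h = (s[1] ^ d[1]) ^ (s[2] ^ d[2]) ^ (s[3] ^ d[3]);

	/* Fold the upper bytes down into the low bits. */
	h = (h >> 16) ^ (h >> 8) ^ h;

	return (h ^ mac_dst5 ^ mac_src5) % count;
}
```

Because every step is an XOR, swapping source and destination gives the same slave index, so both directions of a flow hash alike when the two ends use the same policy and slave count.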

^ permalink raw reply related

* Re: [PATCH v2 1/3] phylib: Convert MDIO and PHY Lib drivers to support 10G
From: David Miller @ 2011-10-12  2:58 UTC (permalink / raw)
  To: afleming; +Cc: netdev
In-Reply-To: <1318382422-2133-1-git-send-email-afleming@freescale.com>

What is up with your patch numbering?  You submitted a "1/3" and
a "2/2", and then a patch without any numbering at all.

What's the deal?

I'm tossing this entire series, please submit something coherent.

^ permalink raw reply

* [PATCH iproute2] ss: report ecnseen
From: Eric Dumazet @ 2011-10-12  3:18 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20111011165908.0d6ed7de@nehalam.linuxnetplumber.net>

Support ECNSEEN reporting in ss command.

ESTAB      0      0           10.170.73.123:4900
10.170.73.125:51001    uid:501 ino:385994 sk:f31e5f00
	 mem:(r0,w0,f0,t0) ts sack ecn ecnseen bic wscale:8,8 rto:210
rtt:18.75/15 ato:40 cwnd:10 send 69.9Mbps rcv_space:32768

"ecn" means the TCP session negotiated ECN capability at setup time.

"ecnseen" means at least one frame with ECT(0), ECT(1), or ECN (IP layer)
was received from the peer.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 include/netinet/tcp.h |    1 +
 misc/ss.c             |    2 ++
 2 files changed, 3 insertions(+)

diff --git a/include/netinet/tcp.h b/include/netinet/tcp.h
index 282b29c..95a4fe6 100644
--- a/include/netinet/tcp.h
+++ b/include/netinet/tcp.h
@@ -172,6 +172,7 @@ enum
 # define TCPI_OPT_SACK		2
 # define TCPI_OPT_WSCALE	4
 # define TCPI_OPT_ECN		8
+# define TCPI_OPT_ECNSEEN	16
 
 /* Values for tcpi_state.  */
 enum tcp_ca_state
diff --git a/misc/ss.c b/misc/ss.c
index b00841b..778bf0a 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -1357,6 +1357,8 @@ static void tcp_show_info(const struct nlmsghdr *nlh, struct inet_diag_msg *r)
 				printf(" sack");
 			if (info->tcpi_options & TCPI_OPT_ECN)
 				printf(" ecn");
+			if (info->tcpi_options & TCPI_OPT_ECNSEEN)
+				printf(" ecnseen");
 		}
 
 		if (tb[INET_DIAG_CONG])
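
The hunk above tests one more bit of `tcpi_options`. A small stand-alone sketch of that kind of flag decoding follows, using only the bit values shown in the header hunk (2, 8, and 16). `format_tcp_opts` and the `OPT_*` names are hypothetical and are not part of ss.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define OPT_SACK	2	/* TCPI_OPT_SACK */
#define OPT_ECN		8	/* TCPI_OPT_ECN */
#define OPT_ECNSEEN	16	/* TCPI_OPT_ECNSEEN, added by the patch */

/* Write " name" for every flag set in `options` into buf (size len). */
static void format_tcp_opts(uint8_t options, char *buf, size_t len)
{
	static const struct { uint8_t bit; const char *name; } flags[] = {
		{ OPT_SACK, "sack" },
		{ OPT_ECN, "ecn" },
		{ OPT_ECNSEEN, "ecnseen" },
	};
	size_t off = 0;
	size_t i;

	if (len == 0)
		return;
	buf[0] = '\0';
	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(options & flags[i].bit))
			continue;
		n = snprintf(buf + off, len - off, " %s", flags[i].name);
		if (n < 0 || (size_t)n >= len - off)
			break;	/* buffer full: stop, output is truncated */
		off += (size_t)n;
	}
}
```

A session that negotiated ECN and has also received ECN-marked traffic would have both bits set, so the two words appear side by side, as in the sample ss output in the commit message.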

^ permalink raw reply related

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
From: Yinglin Sun @ 2011-10-12  3:27 UTC (permalink / raw)
  To: Andy Gospodarek; +Cc: Jay Vosburgh, netdev
In-Reply-To: <20111011143348.GA20605@gospo.rdu.redhat.com>

On Tue, Oct 11, 2011 at 7:33 AM, Andy Gospodarek <andy@greyhouse.net> wrote:
> On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
>> Add IPv6 support in L2L3 xmit policy.
>> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>>
>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>> ---
>>  drivers/net/bonding/bond_main.c |    7 +++++++
>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>> index 6d79b78..d6fd282 100644
>> --- a/drivers/net/bonding/bond_main.c
>> +++ b/drivers/net/bonding/bond_main.c
>> @@ -41,8 +41,10 @@
>>  #include <linux/ptrace.h>
>>  #include <linux/ioport.h>
>>  #include <linux/in.h>
>> +#include <linux/in6.h>
>>  #include <net/ip.h>
>>  #include <linux/ip.h>
>> +#include <linux/ipv6.h>
>>  #include <linux/tcp.h>
>>  #include <linux/udp.h>
>>  #include <linux/slab.h>
>> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>>  {
>>       struct ethhdr *data = (struct ethhdr *)skb->data;
>>       struct iphdr *iph = ip_hdr(skb);
>> +     struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>
>>       if (skb->protocol == htons(ETH_P_IP)) {
>>               return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>>                       (data->h_dest[5] ^ data->h_source[5])) % count;
>> +     } else if (skb->protocol == htons(ETH_P_IPV6)) {
>> +             return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
>> +                            ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
>> +                     (data->h_dest[5] ^ data->h_source[5])) % count;
>>       }
>>
>
> There have been some attempts to add support for ipv6 hashing in
> the past, but none have been committed.  The best one I had seen was one
> that did some extensive testing on a wide variety of ipv6 traffic and
> it showed nice traffic distribution.  I'm not sure if it was ever posted
> upstream, so I will see if I can dig it up.
>
> Can you quantify how traffic was distributed with this algorithm?
>

My test was not extensive. I manually set some addresses that differ in
the last 32-bit portion, and the traffic was distributed evenly across
slaves.

I agree that real-world IPv6 traffic should be used for more extensive testing.

Yinglin

^ permalink raw reply

* Re: [PATCH] bonding: L2L3 xmit doesn't support IPv6
From: Yinglin Sun @ 2011-10-12  3:30 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: Andy Gospodarek, netdev
In-Reply-To: <23119.1318348739@death>

On Tue, Oct 11, 2011 at 8:58 AM, Jay Vosburgh <fubar@us.ibm.com> wrote:
> Andy Gospodarek <andy@greyhouse.net> wrote:
>
>>On Fri, Oct 07, 2011 at 10:36:45PM -0700, Yinglin Sun wrote:
>>> Add IPv6 support in L2L3 xmit policy.
>>> L3L4 doesn't support IPv6 either, and I'll try to fix that later.
>>>
>>> Signed-off-by: Yinglin Sun <Yinglin.Sun@emc.com>
>>> ---
>>>  drivers/net/bonding/bond_main.c |    7 +++++++
>>>  1 files changed, 7 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>>> index 6d79b78..d6fd282 100644
>>> --- a/drivers/net/bonding/bond_main.c
>>> +++ b/drivers/net/bonding/bond_main.c
>>> @@ -41,8 +41,10 @@
>>>  #include <linux/ptrace.h>
>>>  #include <linux/ioport.h>
>>>  #include <linux/in.h>
>>> +#include <linux/in6.h>
>>>  #include <net/ip.h>
>>>  #include <linux/ip.h>
>>> +#include <linux/ipv6.h>
>>>  #include <linux/tcp.h>
>>>  #include <linux/udp.h>
>>>  #include <linux/slab.h>
>>> @@ -3372,10 +3374,15 @@ static int bond_xmit_hash_policy_l23(struct sk_buff *skb, int count)
>>>  {
>>>      struct ethhdr *data = (struct ethhdr *)skb->data;
>>>      struct iphdr *iph = ip_hdr(skb);
>>> +    struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>>>
>>>      if (skb->protocol == htons(ETH_P_IP)) {
>>>              return ((ntohl(iph->saddr ^ iph->daddr) & 0xffff) ^
>>>                      (data->h_dest[5] ^ data->h_source[5])) % count;
>>> +    } else if (skb->protocol == htons(ETH_P_IPV6)) {
>>> +            return ((ntohl(ipv6h->saddr.s6_addr32[3] ^
>>> +                           ipv6h->daddr.s6_addr32[3]) & 0xffff) ^
>>> +                    (data->h_dest[5] ^ data->h_source[5])) % count;
>>>      }
>>>
>>
>>There have been some attempts to add support for ipv6 hashing in
>>the past, but none have been committed.  The best one I had seen was one
>>that did some extensive testing on a wide variety of ipv6 traffic and
>>it showed nice traffic distribution.  I'm not sure if it was ever posted
>>upstream, so I will see if I can dig it up.
>>
>>Can you quantify how traffic was distributed with this algorithm?
>
>        As I recall, the IPv6 issues had to do with the "layer3+4" hash,
> because the IPv6 TCP or UDP port numbers can be harder to get at than in
> IPv4 (which typically has a fixed size header).  The above is just for
> layer 2, so it only hits the IPv6 addresses, which don't move around.
>
>        That said, I believe that many IPv6 addresses are derived from
> the MAC address, the autoconf addresses in particular, so s6_addr32[3]
> may not show a lot more variation than just the MAC address.  I don't
> know for sure though, since I haven't tested it.
>

This is a good point. The last 32-bit portion is not enough in some cases.

Yinglin

^ permalink raw reply

* [net-next 00/12][pull request] Intel Wired LAN Driver Update
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann

The following series contains updates to ixgbe and igb.  This
version of the series contains the following changes:

- ixgbe add FCoE stats, add protection from invalid DMA and fix
  check for change in FCoE priority
- igb version bump, fix sparse warnings and finish up the cleanup
  work of igb by Alex Duyck.

The following are changes since commit 3e5057114643d14679c2ba05594eb5f844488287:
  igb: add support for NETIF_F_RXHASH
and are available in the git repository at
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next.git
or
  git://github.com/Jkirsher/net-next.git

Akeem G. Abodunrin (1):
  igb: Loopback funtionality supports for i350 devices

Alexander Duyck (5):
  igb: avoid unnecessarily creating a local copy of the q_vector
  igb: Make certain one vector is always assigned in igb_request_irq
  igb: Fix features that are currently 82580 only and should also be
    i350
  igb: Drop unnecessary write of E1000_IMS from igb_msix_other
  igb: Add workaround for byte swapped VLAN on i350 local traffic

Amir Hanania (1):
  ixgbe: Add FCoE DDP allocation failure counters to ethtool stats.

Carolyn Wyborny (1):
  igb: Version bump.

Emil Tantilov (1):
  igb: fix static function warnings reported by sparse

Greg Rose (1):
  ixgbe: Add protection from VF invalid target DMA

Jacob Sowles (1):
  igb: move DMA Coalescing feature code into separate function.

Mark Rustad (1):
  ixgbe: Correct check for change in FCoE priority

 drivers/net/ethernet/intel/igb/e1000_82575.c     |   38 +++--
 drivers/net/ethernet/intel/igb/e1000_defines.h   |    1 +
 drivers/net/ethernet/intel/igb/e1000_regs.h      |    1 +
 drivers/net/ethernet/intel/igb/igb.h             |    1 +
 drivers/net/ethernet/intel/igb/igb_main.c        |  178 ++++++++++++++-------
 drivers/net/ethernet/intel/ixgbe/ixgbe.h         |    4 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c  |   12 ++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c    |   44 ++++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h    |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |  189 +++++++++++++++++++++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h    |    2 +
 12 files changed, 390 insertions(+), 84 deletions(-)

-- 
1.7.6.4

^ permalink raw reply

* [net-next 01/12] ixgbe: Add protection from VF invalid target DMA
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Greg Rose, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Greg Rose <gregory.v.rose@intel.com>

It is possible for a VF to set an invalid target DMA address in its
Tx/Rx descriptor buffer pointers.  The workarounds in this patch
will guard against such an event and issue a VFLR to the VF in response.
The VFLR will shut down the VF until an administrator can take action
to investigate the event and correct the problem.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe.h      |    4 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  172 ++++++++++++++++++++++++-
 2 files changed, 175 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
index 38940d7..c1f76aa 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
@@ -116,6 +116,8 @@
 #define MAX_EMULATION_MAC_ADDRS         16
 #define IXGBE_MAX_PF_MACVLANS           15
 #define VMDQ_P(p)   ((p) + adapter->num_vfs)
+#define IXGBE_82599_VF_DEVICE_ID        0x10ED
+#define IXGBE_X540_VF_DEVICE_ID         0x1515
 
 struct vf_data_storage {
 	unsigned char vf_mac_addresses[ETH_ALEN];
@@ -512,6 +514,8 @@ struct ixgbe_adapter {
 	struct hlist_head fdir_filter_list;
 	union ixgbe_atr_input fdir_mask;
 	int fdir_filter_count;
+	u32 timer_event_accumulator;
+	u32 vferr_refcount;
 };
 
 struct ixgbe_fdir_filter {
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1519a23..b95c6e9 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -6112,6 +6112,51 @@ static void ixgbe_sfp_link_config_subtask(struct ixgbe_adapter *adapter)
 	clear_bit(__IXGBE_IN_SFP_INIT, &adapter->state);
 }
 
+#ifdef CONFIG_PCI_IOV
+static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
+{
+	int vf;
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct net_device *netdev = adapter->netdev;
+	u32 gpc;
+	u32 ciaa, ciad;
+
+	gpc = IXGBE_READ_REG(hw, IXGBE_TXDGPC);
+	if (gpc) /* If incrementing then no need for the check below */
+		return;
+	/*
+	 * Check to see if a bad DMA write target from an errant or
+	 * malicious VF has caused a PCIe error.  If so then we can
+	 * issue a VFLR to the offending VF(s) and then resume without
+	 * requesting a full slot reset.
+	 */
+
+	for (vf = 0; vf < adapter->num_vfs; vf++) {
+		ciaa = (vf << 16) | 0x80000000;
+		/* 32 bit read so align, we really want status at offset 6 */
+		ciaa |= PCI_COMMAND;
+		IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+		ciad = IXGBE_READ_REG(hw, IXGBE_CIAD_82599);
+		ciaa &= 0x7FFFFFFF;
+		/* disable debug mode asap after reading data */
+		IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+		/* Get the upper 16 bits which will be the PCI status reg */
+		ciad >>= 16;
+		if (ciad & PCI_STATUS_REC_MASTER_ABORT) {
+			netdev_err(netdev, "VF %d Hung DMA\n", vf);
+			/* Issue VFLR */
+			ciaa = (vf << 16) | 0x80000000;
+			ciaa |= 0xA8;
+			IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+			ciad = 0x00008000;  /* VFLR */
+			IXGBE_WRITE_REG(hw, IXGBE_CIAD_82599, ciad);
+			ciaa &= 0x7FFFFFFF;
+			IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+		}
+	}
+}
+
+#endif
 /**
  * ixgbe_service_timer - Timer Call-back
  * @data: pointer to adapter cast into an unsigned long
@@ -6120,17 +6165,49 @@ static void ixgbe_service_timer(unsigned long data)
 {
 	struct ixgbe_adapter *adapter = (struct ixgbe_adapter *)data;
 	unsigned long next_event_offset;
+	bool ready = true;
 
+#ifdef CONFIG_PCI_IOV
+	ready = false;
+
+	/*
+	 * don't bother with SR-IOV VF DMA hang check if there are
+	 * no VFs or the link is down
+	 */
+	if (!adapter->num_vfs ||
+	    (adapter->flags & IXGBE_FLAG_NEED_LINK_UPDATE)) {
+		ready = true;
+		goto normal_timer_service;
+	}
+
+	/* If we have VFs allocated then we must check for DMA hangs */
+	ixgbe_check_for_bad_vf(adapter);
+	next_event_offset = HZ / 50;
+	adapter->timer_event_accumulator++;
+
+	if (adapter->timer_event_accumulator >= 100) {
+		ready = true;
+		adapter->timer_event_accumulator = 0;
+	}
+
+	goto schedule_event;
+
+normal_timer_service:
+#endif
 	/* poll faster when waiting for link */
 	if (adapter->flags & IXGBE_FLAG_NEED_LINK_UPDATE)
 		next_event_offset = HZ / 10;
 	else
 		next_event_offset = HZ * 2;
 
+#ifdef CONFIG_PCI_IOV
+schedule_event:
+#endif
 	/* Reset the timer */
 	mod_timer(&adapter->service_timer, next_event_offset + jiffies);
 
-	ixgbe_service_event_schedule(adapter);
+	if (ready)
+		ixgbe_service_event_schedule(adapter);
 }
 
 static void ixgbe_reset_subtask(struct ixgbe_adapter *adapter)
@@ -7717,6 +7794,91 @@ static pci_ers_result_t ixgbe_io_error_detected(struct pci_dev *pdev,
 	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
 	struct net_device *netdev = adapter->netdev;
 
+#ifdef CONFIG_PCI_IOV
+	struct pci_dev *bdev, *vfdev;
+	u32 dw0, dw1, dw2, dw3;
+	int vf, pos;
+	u16 req_id, pf_func;
+
+	if (adapter->hw.mac.type == ixgbe_mac_82598EB ||
+	    adapter->num_vfs == 0)
+		goto skip_bad_vf_detection;
+
+	bdev = pdev->bus->self;
+	while (bdev && (bdev->pcie_type != PCI_EXP_TYPE_ROOT_PORT))
+		bdev = bdev->bus->self;
+
+	if (!bdev)
+		goto skip_bad_vf_detection;
+
+	pos = pci_find_ext_capability(bdev, PCI_EXT_CAP_ID_ERR);
+	if (!pos)
+		goto skip_bad_vf_detection;
+
+	pci_read_config_dword(bdev, pos + PCI_ERR_HEADER_LOG, &dw0);
+	pci_read_config_dword(bdev, pos + PCI_ERR_HEADER_LOG + 4, &dw1);
+	pci_read_config_dword(bdev, pos + PCI_ERR_HEADER_LOG + 8, &dw2);
+	pci_read_config_dword(bdev, pos + PCI_ERR_HEADER_LOG + 12, &dw3);
+
+	req_id = dw1 >> 16;
+	/* On the 82599 if bit 7 of the requestor ID is set then it's a VF */
+	if (!(req_id & 0x0080))
+		goto skip_bad_vf_detection;
+
+	pf_func = req_id & 0x01;
+	if ((pf_func & 1) == (pdev->devfn & 1)) {
+		unsigned int device_id;
+
+		vf = (req_id & 0x7F) >> 1;
+		e_dev_err("VF %d has caused a PCIe error\n", vf);
+		e_dev_err("TLP: dw0: %8.8x\tdw1: %8.8x\tdw2: "
+				"%8.8x\tdw3: %8.8x\n",
+		dw0, dw1, dw2, dw3);
+		switch (adapter->hw.mac.type) {
+		case ixgbe_mac_82599EB:
+			device_id = IXGBE_82599_VF_DEVICE_ID;
+			break;
+		case ixgbe_mac_X540:
+			device_id = IXGBE_X540_VF_DEVICE_ID;
+			break;
+		default:
+			device_id = 0;
+			break;
+		}
+
+		/* Find the pci device of the offending VF */
+		vfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID, device_id, NULL);
+		while (vfdev) {
+			if (vfdev->devfn == (req_id & 0xFF))
+				break;
+			vfdev = pci_get_device(IXGBE_INTEL_VENDOR_ID,
+					       device_id, vfdev);
+		}
+		/*
+		 * There's a slim chance the VF could have been hot plugged,
+		 * so if it is no longer present we don't need to issue the
+		 * VFLR.  Just clean up the AER in that case.
+		 */
+		if (vfdev) {
+			e_dev_err("Issuing VFLR to VF %d\n", vf);
+			pci_write_config_dword(vfdev, 0xA8, 0x00008000);
+		}
+
+		pci_cleanup_aer_uncorrect_error_status(pdev);
+	}
+
+	/*
+	 * Even though the error may have occurred on the other port
+	 * we still need to increment the vf error reference count for
+	 * both ports because the I/O resume function will be called
+	 * for both of them.
+	 */
+	adapter->vferr_refcount++;
+
+	return PCI_ERS_RESULT_RECOVERED;
+
+skip_bad_vf_detection:
+#endif /* CONFIG_PCI_IOV */
 	netif_device_detach(netdev);
 
 	if (state == pci_channel_io_perm_failure)
@@ -7779,6 +7941,14 @@ static void ixgbe_io_resume(struct pci_dev *pdev)
 	struct ixgbe_adapter *adapter = pci_get_drvdata(pdev);
 	struct net_device *netdev = adapter->netdev;
 
+#ifdef CONFIG_PCI_IOV
+	if (adapter->vferr_refcount) {
+		e_info(drv, "Resuming after VF err\n");
+		adapter->vferr_refcount--;
+		return;
+	}
+
+#endif
 	if (netif_running(netdev))
 		ixgbe_up(adapter);
 
-- 
1.7.6.4

* [net-next 02/12] ixgbe: Add FCoE DDP allocation failure counters to ethtool stats.
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Amir Hanania, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Amir Hanania <amir.hanania@intel.com>

Add 2 new counters to ethtool:
	1. fcoe_noddp: counts DDP allocation failures that occur because
		the maximum number of buffers allowed in one DDP context
		has been reached.
	2. fcoe_noddp_ext_buff: counts DDP allocation failures that occur
		because that maximum has been reached when an extra buffer
		must be allocated.
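
As a rough userspace illustration of the counting scheme in the diff below
(each CPU bumps its own slot, and the stats update adds the slots up), here
is a minimal sketch.  The struct, the helper names and the fixed CPU count
are hypothetical stand-ins for alloc_percpu()/per_cpu_ptr(), not driver code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this model */

/* One slot per CPU, standing in for a u64 __percpu allocation. */
struct noddp_counter {
	uint64_t slot[NR_CPUS_SKETCH];
};

/* Writer side: bump only the slot that belongs to the given CPU. */
static void noddp_inc(struct noddp_counter *c, unsigned int cpu)
{
	assert(cpu < NR_CPUS_SKETCH);
	c->slot[cpu] += 1;
}

/* Reader side: fold every slot into one total for reporting. */
static uint64_t noddp_sum(const struct noddp_counter *c)
{
	uint64_t sum = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		sum += c->slot[cpu];
	return sum;
}
```

The writer never touches another CPU's slot, so the hot path needs no shared
lock; only the (infrequent) reader walks all slots.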

Signed-off-by: Amir Hanania <amir.hanania@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c    |   44 ++++++++++++++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h    |    2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    |   17 ++++++++
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h    |    2 +
 5 files changed, 56 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 18520ce..e102ff6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -113,6 +113,8 @@ static struct ixgbe_stats ixgbe_gstrings_stats[] = {
 	{"rx_fcoe_dropped", IXGBE_STAT(stats.fcoerpdc)},
 	{"rx_fcoe_packets", IXGBE_STAT(stats.fcoeprc)},
 	{"rx_fcoe_dwords", IXGBE_STAT(stats.fcoedwrc)},
+	{"fcoe_noddp", IXGBE_STAT(stats.fcoe_noddp)},
+	{"fcoe_noddp_ext_buff", IXGBE_STAT(stats.fcoe_noddp_ext_buff)},
 	{"tx_fcoe_packets", IXGBE_STAT(stats.fcoeptc)},
 	{"tx_fcoe_dwords", IXGBE_STAT(stats.fcoedwtc)},
 #endif /* IXGBE_FCOE */
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
index 323f452..df3b1be 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
@@ -145,6 +145,7 @@ static int ixgbe_fcoe_ddp_setup(struct net_device *netdev, u16 xid,
 	u32 fcbuff, fcdmarw, fcfltrw, fcrxctl;
 	dma_addr_t addr = 0;
 	struct pci_pool *pool;
+	unsigned int cpu;
 
 	if (!netdev || !sgl)
 		return 0;
@@ -182,7 +183,8 @@ static int ixgbe_fcoe_ddp_setup(struct net_device *netdev, u16 xid,
 	}
 
 	/* alloc the udl from per cpu ddp pool */
-	pool = *per_cpu_ptr(fcoe->pool, get_cpu());
+	cpu = get_cpu();
+	pool = *per_cpu_ptr(fcoe->pool, cpu);
 	ddp->udl = pci_pool_alloc(pool, GFP_ATOMIC, &ddp->udp);
 	if (!ddp->udl) {
 		e_err(drv, "failed allocated ddp context\n");
@@ -199,9 +201,7 @@ static int ixgbe_fcoe_ddp_setup(struct net_device *netdev, u16 xid,
 		while (len) {
 			/* max number of buffers allowed in one DDP context */
 			if (j >= IXGBE_BUFFCNT_MAX) {
-				e_err(drv, "xid=%x:%d,%d,%d:addr=%llx "
-				      "not enough descriptors\n",
-				      xid, i, j, dmacount, (u64)addr);
+				*per_cpu_ptr(fcoe->pcpu_noddp, cpu) += 1;
 				goto out_noddp_free;
 			}
 
@@ -241,12 +241,7 @@ static int ixgbe_fcoe_ddp_setup(struct net_device *netdev, u16 xid,
 	 */
 	if (lastsize == bufflen) {
 		if (j >= IXGBE_BUFFCNT_MAX) {
-			printk_once("Will NOT use DDP since there are not "
-				    "enough user buffers. We need an  extra "
-				    "buffer because lastsize is bufflen. "
-				    "xid=%x:%d,%d,%d:addr=%llx\n",
-				    xid, i, j, dmacount, (u64)addr);
-
+			*per_cpu_ptr(fcoe->pcpu_noddp_ext_buff, cpu) += 1;
 			goto out_noddp_free;
 		}
 
@@ -600,6 +595,7 @@ void ixgbe_configure_fcoe(struct ixgbe_adapter *adapter)
 	struct ixgbe_hw *hw = &adapter->hw;
 	struct ixgbe_fcoe *fcoe = &adapter->fcoe;
 	struct ixgbe_ring_feature *f = &adapter->ring_feature[RING_F_FCOE];
+	unsigned int cpu;
 
 	if (!fcoe->pool) {
 		spin_lock_init(&fcoe->lock);
@@ -627,6 +623,24 @@ void ixgbe_configure_fcoe(struct ixgbe_adapter *adapter)
 			e_err(drv, "failed to map extra DDP buffer\n");
 			goto out_extra_ddp_buffer;
 		}
+
+		/* Alloc per cpu mem to count the ddp alloc failure number */
+		fcoe->pcpu_noddp = alloc_percpu(u64);
+		if (!fcoe->pcpu_noddp) {
+			e_err(drv, "failed to alloc noddp counter\n");
+			goto out_pcpu_noddp_alloc_fail;
+		}
+
+		fcoe->pcpu_noddp_ext_buff = alloc_percpu(u64);
+		if (!fcoe->pcpu_noddp_ext_buff) {
+			e_err(drv, "failed to alloc noddp extra buff cnt\n");
+			goto out_pcpu_noddp_extra_buff_alloc_fail;
+		}
+
+		for_each_possible_cpu(cpu) {
+			*per_cpu_ptr(fcoe->pcpu_noddp, cpu) = 0;
+			*per_cpu_ptr(fcoe->pcpu_noddp_ext_buff, cpu) = 0;
+		}
 	}
 
 	/* Enable L2 eth type filter for FCoE */
@@ -664,7 +678,13 @@ void ixgbe_configure_fcoe(struct ixgbe_adapter *adapter)
 	IXGBE_WRITE_REG(hw, IXGBE_FCRXCTRL, IXGBE_FCRXCTRL_FCCRCBO |
 			(FC_FCOE_VER << IXGBE_FCRXCTRL_FCOEVER_SHIFT));
 	return;
-
+out_pcpu_noddp_extra_buff_alloc_fail:
+	free_percpu(fcoe->pcpu_noddp);
+out_pcpu_noddp_alloc_fail:
+	dma_unmap_single(&adapter->pdev->dev,
+			 fcoe->extra_ddp_buffer_dma,
+			 IXGBE_FCBUFF_MIN,
+			 DMA_FROM_DEVICE);
 out_extra_ddp_buffer:
 	kfree(fcoe->extra_ddp_buffer);
 out_ddp_pools:
@@ -693,6 +713,8 @@ void ixgbe_cleanup_fcoe(struct ixgbe_adapter *adapter)
 			 fcoe->extra_ddp_buffer_dma,
 			 IXGBE_FCBUFF_MIN,
 			 DMA_FROM_DEVICE);
+	free_percpu(fcoe->pcpu_noddp);
+	free_percpu(fcoe->pcpu_noddp_ext_buff);
 	kfree(fcoe->extra_ddp_buffer);
 	ixgbe_fcoe_ddp_pools_free(fcoe);
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h
index 99de145..261fd62 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.h
@@ -73,6 +73,8 @@ struct ixgbe_fcoe {
 	unsigned char *extra_ddp_buffer;
 	dma_addr_t extra_ddp_buffer_dma;
 	unsigned long mode;
+	u64 __percpu *pcpu_noddp;
+	u64 __percpu *pcpu_noddp_ext_buff;
 #ifdef CONFIG_IXGBE_DCB
 	u8 up;
 #endif
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index b95c6e9..f6fea67 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -5552,6 +5552,11 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 	u64 non_eop_descs = 0, restart_queue = 0, tx_busy = 0;
 	u64 alloc_rx_page_failed = 0, alloc_rx_buff_failed = 0;
 	u64 bytes = 0, packets = 0;
+#ifdef IXGBE_FCOE
+	struct ixgbe_fcoe *fcoe = &adapter->fcoe;
+	unsigned int cpu;
+	u64 fcoe_noddp_counts_sum = 0, fcoe_noddp_ext_buff_counts_sum = 0;
+#endif /* IXGBE_FCOE */
 
 	if (test_bit(__IXGBE_DOWN, &adapter->state) ||
 	    test_bit(__IXGBE_RESETTING, &adapter->state))
@@ -5679,6 +5684,18 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 		hwstats->fcoeptc += IXGBE_READ_REG(hw, IXGBE_FCOEPTC);
 		hwstats->fcoedwrc += IXGBE_READ_REG(hw, IXGBE_FCOEDWRC);
 		hwstats->fcoedwtc += IXGBE_READ_REG(hw, IXGBE_FCOEDWTC);
+		/* Add up per cpu counters for total ddp aloc fail */
+		if (fcoe->pcpu_noddp && fcoe->pcpu_noddp_ext_buff) {
+			for_each_possible_cpu(cpu) {
+				fcoe_noddp_counts_sum +=
+					*per_cpu_ptr(fcoe->pcpu_noddp, cpu);
+				fcoe_noddp_ext_buff_counts_sum +=
+					*per_cpu_ptr(fcoe->
+						pcpu_noddp_ext_buff, cpu);
+			}
+		}
+		hwstats->fcoe_noddp = fcoe_noddp_counts_sum;
+		hwstats->fcoe_noddp_ext_buff = fcoe_noddp_ext_buff_counts_sum;
 #endif /* IXGBE_FCOE */
 		break;
 	default:
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index d1d6894..6c5cca8 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -2682,6 +2682,8 @@ struct ixgbe_hw_stats {
 	u64 fcoeptc;
 	u64 fcoedwrc;
 	u64 fcoedwtc;
+	u64 fcoe_noddp;
+	u64 fcoe_noddp_ext_buff;
 	u64 b2ospc;
 	u64 b2ogprc;
 	u64 o2bgptc;
-- 
1.7.6.4

* [net-next 05/12] igb: Make certain one vector is always assigned in igb_request_irq
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change makes certain that one interrupt is always initialized in
igb_request_irq.  In addition we drop the use of adapter->pdev and
instead just use pdev, since we made a local copy of the pointer earlier in
the function.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by:  Aaron Brown  <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 2768c35..29ec64f 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1215,7 +1215,7 @@ static int igb_request_irq(struct igb_adapter *adapter)
 			goto request_done;
 		/* fall back to MSI */
 		igb_clear_interrupt_scheme(adapter);
-		if (!pci_enable_msi(adapter->pdev))
+		if (!pci_enable_msi(pdev))
 			adapter->flags |= IGB_FLAG_HAS_MSI;
 		igb_free_all_tx_resources(adapter);
 		igb_free_all_rx_resources(adapter);
@@ -1237,12 +1237,12 @@ static int igb_request_irq(struct igb_adapter *adapter)
 		}
 		igb_setup_all_tx_resources(adapter);
 		igb_setup_all_rx_resources(adapter);
-	} else {
-		igb_assign_vector(adapter->q_vector[0], 0);
 	}
 
+	igb_assign_vector(adapter->q_vector[0], 0);
+
 	if (adapter->flags & IGB_FLAG_HAS_MSI) {
-		err = request_irq(adapter->pdev->irq, igb_intr_msi, 0,
+		err = request_irq(pdev->irq, igb_intr_msi, 0,
 				  netdev->name, adapter);
 		if (!err)
 			goto request_done;
@@ -1252,11 +1252,11 @@ static int igb_request_irq(struct igb_adapter *adapter)
 		adapter->flags &= ~IGB_FLAG_HAS_MSI;
 	}
 
-	err = request_irq(adapter->pdev->irq, igb_intr, IRQF_SHARED,
+	err = request_irq(pdev->irq, igb_intr, IRQF_SHARED,
 			  netdev->name, adapter);
 
 	if (err)
-		dev_err(&adapter->pdev->dev, "Error %d getting interrupt\n",
+		dev_err(&pdev->dev, "Error %d getting interrupt\n",
 			err);
 
 request_done:
-- 
1.7.6.4

* [net-next 04/12] igb: avoid unnecessarily creating a local copy of the q_vector
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This is mostly a drop of unnecessary pointer defines for q_vector when we
don't have issues with line width and don't have multiple references to
the pointer.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by:  Aaron Brown  <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |   46 ++++++++++------------------
 1 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index b3a2e3d..2768c35 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -1270,11 +1270,9 @@ static void igb_free_irq(struct igb_adapter *adapter)
 
 		free_irq(adapter->msix_entries[vector++].vector, adapter);
 
-		for (i = 0; i < adapter->num_q_vectors; i++) {
-			struct igb_q_vector *q_vector = adapter->q_vector[i];
+		for (i = 0; i < adapter->num_q_vectors; i++)
 			free_irq(adapter->msix_entries[vector++].vector,
-			         q_vector);
-		}
+				 adapter->q_vector[i]);
 	} else {
 		free_irq(adapter->pdev->irq, adapter);
 	}
@@ -1476,10 +1474,9 @@ int igb_up(struct igb_adapter *adapter)
 
 	clear_bit(__IGB_DOWN, &adapter->state);
 
-	for (i = 0; i < adapter->num_q_vectors; i++) {
-		struct igb_q_vector *q_vector = adapter->q_vector[i];
-		napi_enable(&q_vector->napi);
-	}
+	for (i = 0; i < adapter->num_q_vectors; i++)
+		napi_enable(&(adapter->q_vector[i]->napi));
+
 	if (adapter->msix_entries)
 		igb_configure_msix(adapter);
 	else
@@ -1531,10 +1528,8 @@ void igb_down(struct igb_adapter *adapter)
 	wrfl();
 	msleep(10);
 
-	for (i = 0; i < adapter->num_q_vectors; i++) {
-		struct igb_q_vector *q_vector = adapter->q_vector[i];
-		napi_disable(&q_vector->napi);
-	}
+	for (i = 0; i < adapter->num_q_vectors; i++)
+		napi_disable(&(adapter->q_vector[i]->napi));
 
 	igb_irq_disable(adapter);
 
@@ -2497,10 +2492,8 @@ static int igb_open(struct net_device *netdev)
 	/* From here on the code is the same as igb_up() */
 	clear_bit(__IGB_DOWN, &adapter->state);
 
-	for (i = 0; i < adapter->num_q_vectors; i++) {
-		struct igb_q_vector *q_vector = adapter->q_vector[i];
-		napi_enable(&q_vector->napi);
-	}
+	for (i = 0; i < adapter->num_q_vectors; i++)
+		napi_enable(&(adapter->q_vector[i]->napi));
 
 	/* Clear any pending interrupts. */
 	rd32(E1000_ICR);
@@ -3699,10 +3692,8 @@ static void igb_watchdog_task(struct work_struct *work)
 	/* Cause software interrupt to ensure rx ring is cleaned */
 	if (adapter->msix_entries) {
 		u32 eics = 0;
-		for (i = 0; i < adapter->num_q_vectors; i++) {
-			struct igb_q_vector *q_vector = adapter->q_vector[i];
-			eics |= q_vector->eims_value;
-		}
+		for (i = 0; i < adapter->num_q_vectors; i++)
+			eics |= adapter->q_vector[i]->eims_value;
 		wr32(E1000_EICS, eics);
 	} else {
 		wr32(E1000_ICS, E1000_ICS_RXDMT0);
@@ -6601,18 +6592,15 @@ static void igb_netpoll(struct net_device *netdev)
 {
 	struct igb_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	struct igb_q_vector *q_vector;
 	int i;
 
-	if (!adapter->msix_entries) {
-		struct igb_q_vector *q_vector = adapter->q_vector[0];
-		igb_irq_disable(adapter);
-		napi_schedule(&q_vector->napi);
-		return;
-	}
-
 	for (i = 0; i < adapter->num_q_vectors; i++) {
-		struct igb_q_vector *q_vector = adapter->q_vector[i];
-		wr32(E1000_EIMC, q_vector->eims_value);
+		q_vector = adapter->q_vector[i];
+		if (adapter->msix_entries)
+			wr32(E1000_EIMC, q_vector->eims_value);
+		else
+			igb_irq_disable(adapter);
 		napi_schedule(&q_vector->napi);
 	}
 }
-- 
1.7.6.4

* [net-next 03/12] ixgbe: Correct check for change in FCoE priority
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Mark Rustad, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mark Rustad <mark.d.rustad@intel.com>

Correct a check for change in FCoE priority when IEEE mode DCB is in use.
In IEEE mode a different function has to be used to get the FCoE priority
mask. Also, the check for the mask assumed that only one priority was set.
In case there should be more than one, check just the bit.

These changes help avoid link flapping issues that can come up when IEEE
DCB is in use.
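
To make the difference between the two checks concrete, here is a small
standalone sketch.  The function names are hypothetical; the expressions
mirror the old and new conditions in the diff below, with the priority mask
and the configured FCoE priority passed in as plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Old check: reports a change unless the mask is exactly the one bit. */
static bool up_changed_old(uint8_t mask, unsigned int fcoe_up)
{
	assert(fcoe_up < 8);
	return mask && (mask != (1u << fcoe_up));
}

/* New check: reports a change only if the configured bit is not set. */
static bool up_changed_new(uint8_t mask, unsigned int fcoe_up)
{
	assert(fcoe_up < 8);
	return mask && !(mask & (1u << fcoe_up));
}
```

With priorities 3 and 4 both set (mask 0x18) and the configured priority at
3, the old check reports a change while the new one does not.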

Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
Tested-by: Ross Brattain <ross.b.brattain@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
index be66bb6..3631d63 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
@@ -318,7 +318,15 @@ static u8 ixgbe_dcbnl_set_all(struct net_device *netdev)
 			      .selector = DCB_APP_IDTYPE_ETHTYPE,
 			      .protocol = ETH_P_FCOE,
 			     };
-	u8 up = dcb_getapp(netdev, &app);
+	u8 up;
+
+	/* In IEEE mode, use the IEEE Ethertype selector value */
+	if (adapter->dcbx_cap & DCB_CAP_DCBX_VER_IEEE) {
+		app.selector = IEEE_8021QAZ_APP_SEL_ETHERTYPE;
+		up = dcb_ieee_getapp_mask(netdev, &app);
+	} else {
+		up = dcb_getapp(netdev, &app);
+	}
 #endif
 
 	/* Fail command if not in CEE mode */
@@ -331,7 +339,7 @@ static u8 ixgbe_dcbnl_set_all(struct net_device *netdev)
 		return DCB_NO_HW_CHG;
 
 #ifdef IXGBE_FCOE
-	if (up && (up != (1 << adapter->fcoe.up)))
+	if (up && !(up & (1 << adapter->fcoe.up)))
 		adapter->dcb_set_bitmap |= BIT_APP_UPCHG;
 
 	/*
-- 
1.7.6.4

* [net-next 08/12] igb: Add workaround for byte swapped VLAN on i350 local traffic
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

On i350 when traffic is looped back from a VF to the PF the VLAN tag is
byte swapped from the normal format.  In order to address this we need to
add a flag indicating that the ring will need to byte swap the VLAN tag of
loopback packets prior to processing them.
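
The decision can be sketched in plain C as follows.  This is only a model:
the helper names are hypothetical, and the tag is assumed to have already
been converted from the descriptor's usual little-endian layout to host
order, so the workaround reduces to one extra 16-bit swap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value. */
static uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/*
 * raw:        VLAN field after the normal little-endian conversion
 * loopback:   descriptor status says the packet was looped back
 * ring_bswap: ring was flagged as needing the workaround
 */
static uint16_t rx_vlan_tag(uint16_t raw, bool loopback, bool ring_bswap)
{
	if (loopback && ring_bswap)
		return swab16_sketch(raw);
	return raw;
}
```

Both conditions must hold before the swap is applied, so non-loopback
traffic and rings without the flag are left alone.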

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by:  Aaron Brown  <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/e1000_defines.h |    1 +
 drivers/net/ethernet/intel/igb/igb.h           |    1 +
 drivers/net/ethernet/intel/igb/igb_main.c      |   29 +++++++++++++++++++-----
 3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index 68558be..f5fc572 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -85,6 +85,7 @@
 #define E1000_RXD_STAT_TCPCS    0x20    /* TCP xsum calculated */
 #define E1000_RXD_STAT_TS       0x10000 /* Pkt was time stamped */
 
+#define E1000_RXDEXT_STATERR_LB    0x00040000
 #define E1000_RXDEXT_STATERR_CE    0x01000000
 #define E1000_RXDEXT_STATERR_SE    0x02000000
 #define E1000_RXDEXT_STATERR_SEQ   0x04000000
diff --git a/drivers/net/ethernet/intel/igb/igb.h b/drivers/net/ethernet/intel/igb/igb.h
index 5def94c..e6e770d 100644
--- a/drivers/net/ethernet/intel/igb/igb.h
+++ b/drivers/net/ethernet/intel/igb/igb.h
@@ -243,6 +243,7 @@ struct igb_ring {
 
 enum e1000_ring_flags_t {
 	IGB_RING_FLAG_RX_SCTP_CSUM,
+	IGB_RING_FLAG_RX_LB_VLAN_BSWAP,
 	IGB_RING_FLAG_TX_CTX_IDX,
 	IGB_RING_FLAG_TX_DETECT_HANG
 };
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 43caf84..7c6e526 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -712,6 +712,11 @@ static int igb_alloc_queues(struct igb_adapter *adapter)
 		/* set flag indicating ring supports SCTP checksum offload */
 		if (adapter->hw.mac.type >= e1000_82576)
 			set_bit(IGB_RING_FLAG_RX_SCTP_CSUM, &ring->flags);
+
+		/* On i350, loopback VLAN packets have the tag byte-swapped. */
+		if (adapter->hw.mac.type == e1000_i350)
+			set_bit(IGB_RING_FLAG_RX_LB_VLAN_BSWAP, &ring->flags);
+
 		adapter->rx_ring[i] = ring;
 	}
 
@@ -5794,6 +5799,23 @@ static void igb_rx_hwtstamp(struct igb_q_vector *q_vector,
 
 	igb_systim_to_hwtstamp(adapter, skb_hwtstamps(skb), regval);
 }
+
+static void igb_rx_vlan(struct igb_ring *ring,
+			union e1000_adv_rx_desc *rx_desc,
+			struct sk_buff *skb)
+{
+	if (igb_test_staterr(rx_desc, E1000_RXD_STAT_VP)) {
+		u16 vid;
+		if (igb_test_staterr(rx_desc, E1000_RXDEXT_STATERR_LB) &&
+		    test_bit(IGB_RING_FLAG_RX_LB_VLAN_BSWAP, &ring->flags))
+			vid = be16_to_cpu(rx_desc->wb.upper.vlan);
+		else
+			vid = le16_to_cpu(rx_desc->wb.upper.vlan);
+
+		__vlan_hwaccel_put_tag(skb, vid);
+	}
+}
+
 static inline u16 igb_get_hlen(union e1000_adv_rx_desc *rx_desc)
 {
 	/* HW will not DMA in data larger than the given buffer, even if it
@@ -5890,12 +5912,7 @@ static bool igb_clean_rx_irq(struct igb_q_vector *q_vector, int budget)
 		igb_rx_hwtstamp(q_vector, rx_desc, skb);
 		igb_rx_hash(rx_ring, rx_desc, skb);
 		igb_rx_checksum(rx_ring, rx_desc, skb);
-
-		if (igb_test_staterr(rx_desc, E1000_RXD_STAT_VP)) {
-			u16 vid = le16_to_cpu(rx_desc->wb.upper.vlan);
-
-			__vlan_hwaccel_put_tag(skb, vid);
-		}
+		igb_rx_vlan(rx_ring, rx_desc, skb);
 
 		total_bytes += skb->len;
 		total_packets++;
-- 
1.7.6.4

* [net-next 06/12] igb: Fix features that are currently 82580 only and should also be i350
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

This change allows support for per packet timesync and global device reset
on the i350 adapter.  These features were supported on both 82580 and i350
however it looks like several checks were not updated and as such the i350
support was not enabled.
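
The fix relies on the MAC types being ordered, so that ">=" covers later
parts as well.  A minimal sketch of that idea, using a hypothetical enum
whose ordering is assumed to match the driver's (i350 after 82580):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's MAC type enum, oldest first. */
enum mac_type_sketch {
	MAC_82575_SKETCH,
	MAC_82576_SKETCH,
	MAC_82580_SKETCH,
	MAC_I350_SKETCH,
};

/* Old style: only the 82580 itself passes. */
static bool has_feature_old(enum mac_type_sketch t)
{
	return t == MAC_82580_SKETCH;
}

/* New style: the 82580 and anything ordered after it pass. */
static bool has_feature_new(enum mac_type_sketch t)
{
	assert(t <= MAC_I350_SKETCH);
	return t >= MAC_82580_SKETCH;
}
```

The trade-off is that any MAC type added after 82580 in the enum picks up
these features by default.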

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by:  Aaron Brown  <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |   15 ++++++---------
 1 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 29ec64f..ff4a335 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -563,7 +563,7 @@ static cycle_t igb_read_clock(const struct cyclecounter *tc)
 	 * the lowest register is SYSTIMR instead of SYSTIML.  However we never
 	 * adjusted TIMINCA so SYSTIMR will just read as all 0s so ignore it.
 	 */
-	if (hw->mac.type == e1000_82580) {
+	if (hw->mac.type >= e1000_82580) {
 		stamp = rd32(E1000_SYSTIMR) >> 8;
 		shift = IGB_82580_TSYNC_SHIFT;
 	}
@@ -1320,7 +1320,7 @@ static void igb_irq_enable(struct igb_adapter *adapter)
 	struct e1000_hw *hw = &adapter->hw;
 
 	if (adapter->msix_entries) {
-		u32 ims = E1000_IMS_LSC | E1000_IMS_DOUTSYNC;
+		u32 ims = E1000_IMS_LSC | E1000_IMS_DOUTSYNC | E1000_IMS_DRSTA;
 		u32 regval = rd32(E1000_EIAC);
 		wr32(E1000_EIAC, regval | adapter->eims_enable_mask);
 		regval = rd32(E1000_EIAM);
@@ -1330,9 +1330,6 @@ static void igb_irq_enable(struct igb_adapter *adapter)
 			wr32(E1000_MBVFIMR, 0xFF);
 			ims |= E1000_IMS_VMMB;
 		}
-		if (adapter->hw.mac.type == e1000_82580)
-			ims |= E1000_IMS_DRSTA;
-
 		wr32(E1000_IMS, ims);
 	} else {
 		wr32(E1000_IMS, IMS_ENABLE_MASK |
@@ -3042,7 +3039,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
 	srrctl |= (PAGE_SIZE / 2) >> E1000_SRRCTL_BSIZEPKT_SHIFT;
 #endif
 	srrctl |= E1000_SRRCTL_DESCTYPE_HDR_SPLIT_ALWAYS;
-	if (hw->mac.type == e1000_82580)
+	if (hw->mac.type >= e1000_82580)
 		srrctl |= E1000_SRRCTL_TIMESTAMP;
 	/* Only set Drop Enable if we are supporting multiple queues */
 	if (adapter->vfs_allocated_count || adapter->num_rx_queues > 1)
@@ -4393,7 +4390,7 @@ static void igb_tx_timeout(struct net_device *netdev)
 	/* Do the reset outside of interrupt context */
 	adapter->tx_timeout_count++;
 
-	if (hw->mac.type == e1000_82580)
+	if (hw->mac.type >= e1000_82580)
 		hw->dev_spec._82575.global_device_reset = true;
 
 	schedule_work(&adapter->reset_task);
@@ -5511,7 +5508,7 @@ static void igb_systim_to_hwtstamp(struct igb_adapter *adapter,
 	 * The 82580 starts with 1ns at bit 0 in RX/TXSTMPL, shift this up to
 	 * 24 to match clock shift we setup earlier.
 	 */
-	if (adapter->hw.mac.type == e1000_82580)
+	if (adapter->hw.mac.type >= e1000_82580)
 		regval <<= IGB_82580_TSYNC_SHIFT;
 
 	ns = timecounter_cyc2time(&adapter->clock, regval);
@@ -6206,7 +6203,7 @@ static int igb_hwtstamp_ioctl(struct net_device *netdev,
 	 * timestamped, so enable timestamping in all packets as
 	 * long as one rx filter was configured.
 	 */
-	if ((hw->mac.type == e1000_82580) && tsync_rx_ctl) {
+	if ((hw->mac.type >= e1000_82580) && tsync_rx_ctl) {
 		tsync_rx_ctl = E1000_TSYNCRXCTL_ENABLED;
 		tsync_rx_ctl |= E1000_TSYNCRXCTL_TYPE_ALL;
 	}
-- 
1.7.6.4

* [net-next 10/12] igb: move DMA Coalescing feature code into separate function.
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Jacob Sowles, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jacob Sowles <jacob.r.sowles@intel.com>

DMA Coalescing code needs to be executed during reset, if enabled.  Moved
code to separate function and added a call to it in igb_reset.
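
One detail worth spelling out: the function programs the watchdog field as
(1000 >> 5), i.e. roughly 1000 usec expressed in 32 usec intervals.  The
arithmetic can be sketched as below; the macro and helper names are
hypothetical and only illustrate the unit conversion.

```c
#include <assert.h>
#include <stdint.h>

#define DMAC_TICK_SHIFT_SKETCH 5	/* hypothetical: 2^5 = 32 usec per tick */

/* Convert microseconds to whole 32 usec ticks, rounding down. */
static uint32_t usec_to_dmac_ticks(uint32_t usec)
{
	return usec >> DMAC_TICK_SHIFT_SKETCH;
}

/* Convert ticks back to microseconds to see the effective timeout. */
static uint32_t dmac_ticks_to_usec(uint32_t ticks)
{
	assert(ticks <= (UINT32_MAX >> DMAC_TICK_SHIFT_SKETCH));
	return ticks << DMAC_TICK_SHIFT_SKETCH;
}
```

Because the shift rounds down, 1000 usec becomes 31 ticks, which is 992 usec
in practice.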

Signed-off-by: Jacob Sowles <jacob.r.sowles@intel.com>
Tested-by:  Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |   66 +++++++++++++++++++++++++++++
 1 files changed, 66 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 7c6e526..6fdf2e0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6907,4 +6907,70 @@ static void igb_vmm_control(struct igb_adapter *adapter)
 	}
 }
 
+static void igb_init_dmac(struct igb_adapter *adapter, u32 pba)
+{
+	struct e1000_hw *hw = &adapter->hw;
+	u32 dmac_thr;
+	u16 hwm;
+
+	if (hw->mac.type > e1000_82580) {
+		if (adapter->flags & IGB_FLAG_DMAC) {
+			u32 reg;
+
+			/* force threshold to 0. */
+			wr32(E1000_DMCTXTH, 0);
+
+			/*
+			 * DMA Coalescing high water mark needs to be higher
+			 * than the RX threshold. set hwm to PBA -  2 * max
+			 * frame size
+			 */
+			hwm = pba - (2 * adapter->max_frame_size);
+			reg = rd32(E1000_DMACR);
+			reg &= ~E1000_DMACR_DMACTHR_MASK;
+			dmac_thr = pba - 4;
+
+			reg |= ((dmac_thr << E1000_DMACR_DMACTHR_SHIFT)
+				& E1000_DMACR_DMACTHR_MASK);
+
+			/* transition to L0x or L1 if available..*/
+			reg |= (E1000_DMACR_DMAC_EN | E1000_DMACR_DMAC_LX_MASK);
+
+			/* watchdog timer= +-1000 usec in 32usec intervals */
+			reg |= (1000 >> 5);
+			wr32(E1000_DMACR, reg);
+
+			/*
+			 * no lower threshold to disable
+			 * coalescing(smart fifb)-UTRESH=0
+			 */
+			wr32(E1000_DMCRTRH, 0);
+			wr32(E1000_FCRTC, hwm);
+
+			reg = (IGB_DMCTLX_DCFLUSH_DIS | 0x4);
+
+			wr32(E1000_DMCTLX, reg);
+
+			/*
+			 * free space in tx packet buffer to wake from
+			 * DMA coal
+			 */
+			wr32(E1000_DMCTXTH, (IGB_MIN_TXPBSIZE -
+			     (IGB_TX_BUF_4096 + adapter->max_frame_size)) >> 6);
+
+			/*
+			 * make low power state decision controlled
+			 * by DMA coal
+			 */
+			reg = rd32(E1000_PCIEMISC);
+			reg &= ~E1000_PCIEMISC_LX_DECISION;
+			wr32(E1000_PCIEMISC, reg);
+		} /* endif adapter->dmac is not disabled */
+	} else if (hw->mac.type == e1000_82580) {
+		u32 reg = rd32(E1000_PCIEMISC);
+		wr32(E1000_PCIEMISC, reg & ~E1000_PCIEMISC_LX_DECISION);
+		wr32(E1000_DMACR, 0);
+	}
+}
+
 /* igb_main.c */
-- 
1.7.6.4

* [net-next 07/12] igb: Drop unnecessary write of E1000_IMS from igb_msix_other
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Alexander Duyck, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Alexander Duyck <alexander.h.duyck@intel.com>

Since we mask interrupts in EIMS not in IMS there is no need to re-enable
mask bits in that register.  As such we can remove the write to IMS from
the end of igb_msix_other.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by:  Aaron Brown  <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/igb_main.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index ff4a335..43caf84 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -4696,12 +4696,6 @@ static irqreturn_t igb_msix_other(int irq, void *data)
 			mod_timer(&adapter->watchdog_timer, jiffies + 1);
 	}
 
-	if (adapter->vfs_allocated_count)
-		wr32(E1000_IMS, E1000_IMS_LSC |
-				E1000_IMS_VMMB |
-				E1000_IMS_DOUTSYNC);
-	else
-		wr32(E1000_IMS, E1000_IMS_LSC | E1000_IMS_DOUTSYNC);
 	wr32(E1000_EIMS, adapter->eims_other);
 
 	return IRQ_HANDLED;
-- 
1.7.6.4


* [net-next 11/12] igb: Loopback functionality support for i350 devices
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Akeem G. Abodunrin, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: "Akeem G. Abodunrin" <akeem.g.abodunrin@intel.com>

This patch adds VMDq loopback PF support for i350 devices. The patch is
necessary because, on i350, the register that enables loopback was moved
and renamed from DTXSWC to TXSWC.

Signed-off-by: "Akeem G. Abodunrin" <akeem.g.abodunrin@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
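Note: the diff below repeats the same set/clear logic in both case
branches. A minimal userspace sketch of one way to avoid that is to pick
the register per MAC type first and then do a single read-modify-write.
Everything here is hypothetical and only for illustration: the demo_*
names, the struct that stands in for the hardware registers, and the bit
position used for the loopback enable flag are assumptions, not the
driver's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for hw->mac.type values. */
enum demo_mac_type { DEMO_MAC_82575, DEMO_MAC_82576, DEMO_MAC_I350 };

/* Assumed bit position for the loopback enable flag (illustration only). */
#define DEMO_VMDQ_LOOPBACK_EN (1u << 31)

/* Mock device: two plain variables stand in for the MMIO registers. */
struct demo_hw {
	enum demo_mac_type mac_type;
	uint32_t dtxswc;	/* used by 82576 */
	uint32_t txswc;		/* used by i350 */
};

/* Return the register that controls loopback, or NULL if unsupported. */
static uint32_t *demo_loopback_reg(struct demo_hw *hw)
{
	switch (hw->mac_type) {
	case DEMO_MAC_82576:
		return &hw->dtxswc;
	case DEMO_MAC_I350:
		return &hw->txswc;
	default:
		/* no other mock hardware supports loopback */
		return NULL;
	}
}

/* Single read-modify-write shared by every supported MAC type. */
void demo_set_loopback(struct demo_hw *hw, bool enable)
{
	uint32_t *reg = demo_loopback_reg(hw);

	if (!reg)
		return;

	if (enable)
		*reg |= DEMO_VMDQ_LOOPBACK_EN;
	else
		*reg &= ~DEMO_VMDQ_LOOPBACK_EN;
}
```

In the real driver the same shape would go through rd32()/wr32() with a
register offset instead of a pointer; the sketch only shows the control
flow, not the MMIO access.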
 drivers/net/ethernet/intel/igb/e1000_82575.c |   29 ++++++++++++++++++++-----
 drivers/net/ethernet/intel/igb/e1000_regs.h  |    1 +
 2 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.c b/drivers/net/ethernet/intel/igb/e1000_82575.c
index 3771bd2..6580cea 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.c
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.c
@@ -1580,14 +1580,31 @@ void igb_vmdq_set_anti_spoofing_pf(struct e1000_hw *hw, bool enable, int pf)
  **/
 void igb_vmdq_set_loopback_pf(struct e1000_hw *hw, bool enable)
 {
-	u32 dtxswc = rd32(E1000_DTXSWC);
+	u32 dtxswc;
+
+	switch (hw->mac.type) {
+	case e1000_82576:
+		dtxswc = rd32(E1000_DTXSWC);
+		if (enable)
+			dtxswc |= E1000_DTXSWC_VMDQ_LOOPBACK_EN;
+		else
+			dtxswc &= ~E1000_DTXSWC_VMDQ_LOOPBACK_EN;
+		wr32(E1000_DTXSWC, dtxswc);
+		break;
+	case e1000_i350:
+		dtxswc = rd32(E1000_TXSWC);
+		if (enable)
+			dtxswc |= E1000_DTXSWC_VMDQ_LOOPBACK_EN;
+		else
+			dtxswc &= ~E1000_DTXSWC_VMDQ_LOOPBACK_EN;
+		wr32(E1000_TXSWC, dtxswc);
+		break;
+	default:
+		/* Currently no other hardware supports loopback */
+		break;
+	}
 
-	if (enable)
-		dtxswc |= E1000_DTXSWC_VMDQ_LOOPBACK_EN;
-	else
-		dtxswc &= ~E1000_DTXSWC_VMDQ_LOOPBACK_EN;
 
-	wr32(E1000_DTXSWC, dtxswc);
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/igb/e1000_regs.h b/drivers/net/ethernet/intel/igb/e1000_regs.h
index 0990f6d..0a860bc 100644
--- a/drivers/net/ethernet/intel/igb/e1000_regs.h
+++ b/drivers/net/ethernet/intel/igb/e1000_regs.h
@@ -318,6 +318,7 @@
 #define E1000_RPLOLR    0x05AF0 /* Replication Offload - RW */
 #define E1000_UTA       0x0A000 /* Unicast Table Array - RW */
 #define E1000_IOVTCL    0x05BBC /* IOV Control Register */
+#define E1000_TXSWC     0x05ACC /* Tx Switch Control */
 /* These act per VF so an array friendly macro is used */
 #define E1000_P2VMAILBOX(_n)   (0x00C00 + (4 * (_n)))
 #define E1000_VMBMEM(_n)       (0x00800 + (64 * (_n)))
-- 
1.7.6.4


* [net-next 09/12] igb: fix static function warnings reported by sparse
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Emil Tantilov, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Emil Tantilov <emil.s.tantilov@intel.com>

igb_update/validate_nvm_checksum_with_offset() should be static.
Also remove the unneeded prototypes for the above functions.

Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/igb/e1000_82575.c |    9 +++------
 1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/e1000_82575.c b/drivers/net/ethernet/intel/igb/e1000_82575.c
index c0857bd..3771bd2 100644
--- a/drivers/net/ethernet/intel/igb/e1000_82575.c
+++ b/drivers/net/ethernet/intel/igb/e1000_82575.c
@@ -66,10 +66,6 @@ static s32  igb_set_pcie_completion_timeout(struct e1000_hw *hw);
 static s32  igb_reset_mdicnfg_82580(struct e1000_hw *hw);
 static s32  igb_validate_nvm_checksum_82580(struct e1000_hw *hw);
 static s32  igb_update_nvm_checksum_82580(struct e1000_hw *hw);
-static s32  igb_update_nvm_checksum_with_offset(struct e1000_hw *hw,
-						u16 offset);
-static s32 igb_validate_nvm_checksum_with_offset(struct e1000_hw *hw,
-						u16 offset);
 static s32 igb_validate_nvm_checksum_i350(struct e1000_hw *hw);
 static s32 igb_update_nvm_checksum_i350(struct e1000_hw *hw);
 static const u16 e1000_82580_rxpbs_table[] =
@@ -1820,7 +1816,8 @@ u16 igb_rxpbs_adjust_82580(u32 data)
  *  Calculates the EEPROM checksum by reading/adding each word of the EEPROM
  *  and then verifies that the sum of the EEPROM is equal to 0xBABA.
  **/
-s32 igb_validate_nvm_checksum_with_offset(struct e1000_hw *hw, u16 offset)
+static s32 igb_validate_nvm_checksum_with_offset(struct e1000_hw *hw,
+						 u16 offset)
 {
 	s32 ret_val = 0;
 	u16 checksum = 0;
@@ -1855,7 +1852,7 @@ out:
  *  up to the checksum.  Then calculates the EEPROM checksum and writes the
  *  value to the EEPROM.
  **/
-s32 igb_update_nvm_checksum_with_offset(struct e1000_hw *hw, u16 offset)
+static s32 igb_update_nvm_checksum_with_offset(struct e1000_hw *hw, u16 offset)
 {
 	s32 ret_val;
 	u16 checksum = 0;
-- 
1.7.6.4


* [net-next 12/12] igb: Version bump.
From: Jeff Kirsher @ 2011-10-12  3:38 UTC (permalink / raw)
  To: davem; +Cc: Carolyn Wyborny, netdev, gospo, sassmann, Jeff Kirsher
In-Reply-To: <1318390708-12232-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Carolyn Wyborny <carolyn.wyborny@intel.com>

This change updates the driver version to 3.2.10.

Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
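Note: for readers unfamiliar with how DRV_VERSION is assembled, here is a
small userspace sketch of the two-level stringify idiom. The DEMO_* macros
and demo_driver_version() are hypothetical stand-ins; DEMO_STRINGIFY is
assumed to behave like the kernel's __stringify() for simple integer
macros such as these.

```c
/*
 * Two-level expansion: the outer macro expands its argument first
 * (DEMO_MIN -> 2), then the inner macro turns the result into a string.
 */
#define DEMO_STRINGIFY_1(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_1(x)

/* Values taken from the version bump in this patch. */
#define DEMO_MAJ 3
#define DEMO_MIN 2
#define DEMO_BUILD 10

/* Adjacent string literals are concatenated by the compiler. */
#define DEMO_DRV_VERSION DEMO_STRINGIFY(DEMO_MAJ) "." \
	DEMO_STRINGIFY(DEMO_MIN) "." DEMO_STRINGIFY(DEMO_BUILD) "-k"

/* Hypothetical accessor so the resulting string can be inspected. */
const char *demo_driver_version(void)
{
	return DEMO_DRV_VERSION;
}
```

With the values above the resulting string is "3.2.10-k", which matches
the version named in the commit message plus the "-k" suffix.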
 drivers/net/ethernet/intel/igb/igb_main.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 6fdf2e0..8ba0889 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -57,8 +57,8 @@
 #include "igb.h"
 
 #define MAJ 3
-#define MIN 0
-#define BUILD 6
+#define MIN 2
+#define BUILD 10
 #define DRV_VERSION __stringify(MAJ) "." __stringify(MIN) "." \
 __stringify(BUILD) "-k"
 char igb_driver_name[] = "igb";
-- 
1.7.6.4

