Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [patch iproute2] tc: add support for vlan tc action
From: Jamal Hadi Salim @ 2014-11-11 13:36 UTC (permalink / raw)
  To: Jiri Pirko, netdev
  Cc: davem, pshelar, therbert, edumazet, willemb, dborkman, mst, fw,
	Paul.Durrant, tgraf, Stephen Hemminger
In-Reply-To: <1415700966-9225-1-git-send-email-jiri@resnulli.us>

[-- Attachment #1: Type: text/plain, Size: 115 bytes --]


Ive run out of time - but will test; so far looks good.
Attached a small patchlet on top of yours.

cheers,
jamal

[-- Attachment #2: patch-for-jiri --]
[-- Type: text/plain, Size: 1493 bytes --]

diff --git a/tc/m_vlan.c b/tc/m_vlan.c
index 54c0ce7..2755e20 100644
--- a/tc/m_vlan.c
+++ b/tc/m_vlan.c
@@ -13,6 +13,7 @@
 #include <stdlib.h>
 #include <unistd.h>
 #include <string.h>
+#include <linux/if_ether.h>
 #include "utils.h"
 #include "rt_names.h"
 #include "tc_util.h"
@@ -22,6 +23,8 @@ static void explain(void)
 {
 	fprintf(stderr, "Usage: vlan pop\n");
 	fprintf(stderr, "       vlan push [ protocol VLANPROTO ] id VLANID\n");
+	fprintf(stderr, "       VLANPROTO is one of 802.1Q or 802.1ad\n");
+	fprintf(stderr, "            with default: 802.1Q\n");
 }
 
 static void usage(void)
@@ -121,7 +124,7 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
 		if (matches(*argv, "index") == 0) {
 			NEXT_ARG();
 			if (get_u32(&parm.index, *argv, 10)) {
-				fprintf(stderr, "Pedit: Illegal \"index\"\n");
+				fprintf(stderr, "vlan: Illegal \"index\"\n");
 				return -1;
 			}
 			argc--;
@@ -141,8 +144,15 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
 	addattr_l(n, MAX_MSG, TCA_VLAN_PARMS, &parm, sizeof(parm));
 	if (id_set)
 		addattr_l(n, MAX_MSG, TCA_VLAN_PUSH_VLAN_ID, &id, 2);
-	if (proto_set)
+	if (proto_set) {
+		if (proto != ETH_P_8021Q && proto != ETH_P_8021AD) {
+			fprintf(stderr, "protocol id 0x%x not supported\n", proto);
+			explain();
+			return -1;
+		}
+
 		addattr_l(n, MAX_MSG, TCA_VLAN_PUSH_VLAN_PROTOCOL, &proto, 2);
+	}
 	tail->rta_len = (char *)NLMSG_TAIL(n) - (char *)tail;
 
 	*argc_p = argc;

^ permalink raw reply related

* Re: [patch iproute2] tc: add support for vlan tc action
From: Jiri Pirko @ 2014-11-11 13:46 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: netdev, davem, pshelar, therbert, edumazet, willemb, dborkman,
	mst, fw, Paul.Durrant, tgraf, Stephen Hemminger
In-Reply-To: <546210EA.7090201@mojatatu.com>

Tue, Nov 11, 2014 at 02:36:42PM CET, jhs@mojatatu.com wrote:
>
>Ive run out of time - but will test; so far looks good.
>Attached a small patchlet on top of yours.

Will squash it in and send v2.

>
>cheers,
>jamal

>diff --git a/tc/m_vlan.c b/tc/m_vlan.c
>index 54c0ce7..2755e20 100644
>--- a/tc/m_vlan.c
>+++ b/tc/m_vlan.c
>@@ -13,6 +13,7 @@
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
>+#include <linux/if_ether.h>
> #include "utils.h"
> #include "rt_names.h"
> #include "tc_util.h"
>@@ -22,6 +23,8 @@ static void explain(void)
> {
> 	fprintf(stderr, "Usage: vlan pop\n");
> 	fprintf(stderr, "       vlan push [ protocol VLANPROTO ] id VLANID\n");
>+	fprintf(stderr, "       VLANPROTO is one of 802.1Q or 802.1ad\n");
>+	fprintf(stderr, "            with default: 802.1Q\n");
> }
> 
> static void usage(void)
>@@ -121,7 +124,7 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
> 		if (matches(*argv, "index") == 0) {
> 			NEXT_ARG();
> 			if (get_u32(&parm.index, *argv, 10)) {
>-				fprintf(stderr, "Pedit: Illegal \"index\"\n");
>+				fprintf(stderr, "vlan: Illegal \"index\"\n");
> 				return -1;
> 			}
> 			argc--;
>@@ -141,8 +144,15 @@ static int parse_vlan(struct action_util *a, int *argc_p, char ***argv_p,
> 	addattr_l(n, MAX_MSG, TCA_VLAN_PARMS, &parm, sizeof(parm));
> 	if (id_set)
> 		addattr_l(n, MAX_MSG, TCA_VLAN_PUSH_VLAN_ID, &id, 2);
>-	if (proto_set)
>+	if (proto_set) {
>+		if (proto != ETH_P_8021Q && proto != ETH_P_8021AD) {
>+			fprintf(stderr, "protocol id 0x%x not supported\n", proto);
>+			explain();
>+			return -1;
>+		}
>+
> 		addattr_l(n, MAX_MSG, TCA_VLAN_PUSH_VLAN_PROTOCOL, &proto, 2);
>+	}
> 	tail->rta_len = (char *)NLMSG_TAIL(n) - (char *)tail;
> 
> 	*argc_p = argc;

^ permalink raw reply

* Re: [PATCH -next 2/2] seq_putc: Convert to return void and convert uses too.
From: Petr Mladek @ 2014-11-11 13:47 UTC (permalink / raw)
  To: Joe Perches
  Cc: Steven Rostedt, Corey Minyard, Alexander Viro, Pablo Neira Ayuso,
	Patrick McHardy, Jozsef Kadlecsik, David S. Miller,
	Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI,
	openipmi-developer, linux-kernel, linux-fsdevel, netfilter-devel,
	coreteam, netdev
In-Reply-To: <56c4c0d5ec721134cff4913e5e3f8923169c35ef.1415645477.git.joe@perches.com>

On Mon 2014-11-10 10:58:57, Joe Perches wrote:
> Using the return value of seq_putc is error-prone, so
> make it return void instead.
> 
> Reverse the logic in seq_putc to make it like seq_puts.
> 
> Signed-off-by: Joe Perches <joe@perches.com>

Reviewed-by: Petr Mladek <pmladek@suse.cz>

The changes are correct. The show() functions should return 0
even when there is an overflow. They are called by traverse()
from seq_read() that might increase the buffer size and try again.


Best Regards,
Petr

^ permalink raw reply

* Re: [PATCH 3/4] i40e: remove pci_assigned_vfs() check while disabling VFs
From: Jeff Kirsher @ 2014-11-11 13:52 UTC (permalink / raw)
  To: Sathya Perla
  Cc: linux-pci@vger.kernel.org, netdev, ariel.elior, linux.nics,
	shahed.shaikh, ddutile
In-Reply-To: <1415620410-4937-4-git-send-email-sathya.perla@emulex.com>

On Mon, Nov 10, 2014 at 3:53 AM, Sathya Perla <sathya.perla@emulex.com> wrote:
> From: Vasundhara Volam <vasundhara.volam@emulex.com>
>
> The pci_assigned_vfs() check (while disabling VFs) is being moved to the
> pci-sysfs.c file and will be done before invoking sriov_configure().
>
> Signed-off-by: Vasundhara Volam <vasundhara.volam@emulex.com>
> Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
> ---
>  drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |    7 +------
>  1 files changed, 1 insertions(+), 6 deletions(-)

Thanks I will add your patch to my queue.

^ permalink raw reply

* [PATCH v2 net-next 0/2] net: SO_INCOMING_CPU support
From: Eric Dumazet @ 2014-11-11 13:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Neal Cardwell, Willem de Bruijn, Ying Cai, Eric Dumazet

SO_INCOMING_CPU socket option (read by getsockopt()) provides
an alternative to RPS/RFS for high performance servers using
multi queues NIC.

TCP should use sk_mark_napi_id() for established sockets only.

Eric Dumazet (2):
  tcp: move sk_mark_napi_id() at the right place
  net: introduce SO_INCOMING_CPU

 arch/alpha/include/uapi/asm/socket.h   |  2 ++
 arch/avr32/include/uapi/asm/socket.h   |  2 ++
 arch/cris/include/uapi/asm/socket.h    |  2 ++
 arch/frv/include/uapi/asm/socket.h     |  2 ++
 arch/ia64/include/uapi/asm/socket.h    |  2 ++
 arch/m32r/include/uapi/asm/socket.h    |  2 ++
 arch/mips/include/uapi/asm/socket.h    |  2 ++
 arch/mn10300/include/uapi/asm/socket.h |  2 ++
 arch/parisc/include/uapi/asm/socket.h  |  2 ++
 arch/powerpc/include/uapi/asm/socket.h |  2 ++
 arch/s390/include/uapi/asm/socket.h    |  2 ++
 arch/sparc/include/uapi/asm/socket.h   |  2 ++
 arch/xtensa/include/uapi/asm/socket.h  |  2 ++
 include/net/sock.h                     | 12 ++++++++++++
 include/uapi/asm-generic/socket.h      |  2 ++
 net/core/sock.c                        |  5 +++++
 net/ipv4/tcp_ipv4.c                    |  4 +++-
 net/ipv4/udp.c                         |  1 +
 net/ipv6/tcp_ipv6.c                    |  4 +++-
 net/ipv6/udp.c                         |  1 +
 net/sctp/ulpqueue.c                    |  5 +++--
 21 files changed, 56 insertions(+), 4 deletions(-)

-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply

* [PATCH v2 net-next 1/2] tcp: move sk_mark_napi_id() at the right place
From: Eric Dumazet @ 2014-11-11 13:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Neal Cardwell, Willem de Bruijn, Ying Cai, Eric Dumazet
In-Reply-To: <1415714068-21028-1-git-send-email-edumazet@google.com>

sk_mark_napi_id() is used to record for a flow napi id of incoming
packets for busypoll sake.
We should do this only on established flows, not on listeners.

This was 'working' by virtue of the socket cloning, but doing
this on SYN packets in unecessary cache line dirtying.

Even if we move sk_napi_id in the same cache line than sk_lock,
we are working to make SYN processing lockless, so it is desirable
to set sk_napi_id only for established flows.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv4/tcp_ipv4.c | 3 ++-
 net/ipv6/tcp_ipv6.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 9c7d7621466b..8893598a4124 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1429,6 +1429,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 		struct dst_entry *dst = sk->sk_rx_dst;
 
 		sock_rps_save_rxhash(sk, skb);
+		sk_mark_napi_id(sk, skb);
 		if (dst) {
 			if (inet_sk(sk)->rx_dst_ifindex != skb->skb_iif ||
 			    dst->ops->check(dst, 0) == NULL) {
@@ -1450,6 +1451,7 @@ int tcp_v4_do_rcv(struct sock *sk, struct sk_buff *skb)
 
 		if (nsk != sk) {
 			sock_rps_save_rxhash(nsk, skb);
+			sk_mark_napi_id(sk, skb);
 			if (tcp_child_process(sk, nsk, skb)) {
 				rsk = nsk;
 				goto reset;
@@ -1661,7 +1663,6 @@ process:
 	if (sk_filter(sk, skb))
 		goto discard_and_relse;
 
-	sk_mark_napi_id(sk, skb);
 	skb->dev = NULL;
 
 	bh_lock_sock_nested(sk);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index ace29b60813c..fd8e50b380e7 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1293,6 +1293,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 		struct dst_entry *dst = sk->sk_rx_dst;
 
 		sock_rps_save_rxhash(sk, skb);
+		sk_mark_napi_id(sk, skb);
 		if (dst) {
 			if (inet_sk(sk)->rx_dst_ifindex != skb->skb_iif ||
 			    dst->ops->check(dst, np->rx_dst_cookie) == NULL) {
@@ -1322,6 +1323,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb)
 		 */
 		if (nsk != sk) {
 			sock_rps_save_rxhash(nsk, skb);
+			sk_mark_napi_id(sk, skb);
 			if (tcp_child_process(sk, nsk, skb))
 				goto reset;
 			if (opt_skb)
@@ -1454,7 +1456,6 @@ process:
 	if (sk_filter(sk, skb))
 		goto discard_and_relse;
 
-	sk_mark_napi_id(sk, skb);
 	skb->dev = NULL;
 
 	bh_lock_sock_nested(sk);
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related

* [PATCH v2 net-next 2/2] net: introduce SO_INCOMING_CPU
From: Eric Dumazet @ 2014-11-11 13:54 UTC (permalink / raw)
  To: David S. Miller
  Cc: netdev, Neal Cardwell, Willem de Bruijn, Ying Cai, Eric Dumazet
In-Reply-To: <1415714068-21028-1-git-send-email-edumazet@google.com>

Alternative to RPS/RFS is to use hardware support for multiple
queues.

Then split a set of million of sockets into worker threads, each
one using epoll() to manage events on its own socket pool.

Ideally, we want one thread per RX/TX queue/cpu, but we have no way to
know after accept() or connect() on which queue/cpu a socket is managed.

We normally use one cpu per RX queue (IRQ smp_affinity being properly
set), so remembering on socket structure which cpu delivered last packet
is enough to solve the problem.

After accept(), connect(), or even file descriptor passing around
processes, applications can use :

 int cpu;
 socklen_t len = sizeof(cpu);

 getsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);

And use this information to put the socket into the right silo
for optimal performance, as all networking stack should run
on the appropriate cpu, without need to send IPI (RPS/RFS).

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 arch/alpha/include/uapi/asm/socket.h   |  2 ++
 arch/avr32/include/uapi/asm/socket.h   |  2 ++
 arch/cris/include/uapi/asm/socket.h    |  2 ++
 arch/frv/include/uapi/asm/socket.h     |  2 ++
 arch/ia64/include/uapi/asm/socket.h    |  2 ++
 arch/m32r/include/uapi/asm/socket.h    |  2 ++
 arch/mips/include/uapi/asm/socket.h    |  2 ++
 arch/mn10300/include/uapi/asm/socket.h |  2 ++
 arch/parisc/include/uapi/asm/socket.h  |  2 ++
 arch/powerpc/include/uapi/asm/socket.h |  2 ++
 arch/s390/include/uapi/asm/socket.h    |  2 ++
 arch/sparc/include/uapi/asm/socket.h   |  2 ++
 arch/xtensa/include/uapi/asm/socket.h  |  2 ++
 include/net/sock.h                     | 12 ++++++++++++
 include/uapi/asm-generic/socket.h      |  2 ++
 net/core/sock.c                        |  5 +++++
 net/ipv4/tcp_ipv4.c                    |  1 +
 net/ipv4/udp.c                         |  1 +
 net/ipv6/tcp_ipv6.c                    |  1 +
 net/ipv6/udp.c                         |  1 +
 net/sctp/ulpqueue.c                    |  5 +++--
 21 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 3de1394bcab8..e2fe0700b3b4 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -87,4 +87,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/avr32/include/uapi/asm/socket.h b/arch/avr32/include/uapi/asm/socket.h
index 6e6cd159924b..92121b0f5b98 100644
--- a/arch/avr32/include/uapi/asm/socket.h
+++ b/arch/avr32/include/uapi/asm/socket.h
@@ -80,4 +80,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _UAPI__ASM_AVR32_SOCKET_H */
diff --git a/arch/cris/include/uapi/asm/socket.h b/arch/cris/include/uapi/asm/socket.h
index ed94e5ed0a23..60f60f5b9b35 100644
--- a/arch/cris/include/uapi/asm/socket.h
+++ b/arch/cris/include/uapi/asm/socket.h
@@ -82,6 +82,8 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_SOCKET_H */
 
 
diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h
index ca2c6e6f31c6..2c6890209ea6 100644
--- a/arch/frv/include/uapi/asm/socket.h
+++ b/arch/frv/include/uapi/asm/socket.h
@@ -80,5 +80,7 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_SOCKET_H */
 
diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h
index a1b49bac7951..09a93fb566f6 100644
--- a/arch/ia64/include/uapi/asm/socket.h
+++ b/arch/ia64/include/uapi/asm/socket.h
@@ -89,4 +89,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_IA64_SOCKET_H */
diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h
index 6c9a24b3aefa..e8589819c274 100644
--- a/arch/m32r/include/uapi/asm/socket.h
+++ b/arch/m32r/include/uapi/asm/socket.h
@@ -80,4 +80,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_M32R_SOCKET_H */
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index a14baa218c76..2e9ee8c55a10 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -98,4 +98,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h
index 6aa3ce1854aa..f3492e8c9f70 100644
--- a/arch/mn10300/include/uapi/asm/socket.h
+++ b/arch/mn10300/include/uapi/asm/socket.h
@@ -80,4 +80,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index fe35ceacf0e7..7984a1cab3da 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -79,4 +79,6 @@
 
 #define SO_BPF_EXTENSIONS	0x4029
 
+#define SO_INCOMING_CPU		0x402A
+
 #endif /* _UAPI_ASM_SOCKET_H */
diff --git a/arch/powerpc/include/uapi/asm/socket.h b/arch/powerpc/include/uapi/asm/socket.h
index a9c3e2e18c05..3474e4ef166d 100644
--- a/arch/powerpc/include/uapi/asm/socket.h
+++ b/arch/powerpc/include/uapi/asm/socket.h
@@ -87,4 +87,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif	/* _ASM_POWERPC_SOCKET_H */
diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h
index e031332096d7..8457636c33e1 100644
--- a/arch/s390/include/uapi/asm/socket.h
+++ b/arch/s390/include/uapi/asm/socket.h
@@ -86,4 +86,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* _ASM_SOCKET_H */
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 54d9608681b6..4a8003a94163 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -76,6 +76,8 @@
 
 #define SO_BPF_EXTENSIONS	0x0032
 
+#define SO_INCOMING_CPU		0x0033
+
 /* Security levels - as per NRL IPv6 - don't actually do anything */
 #define SO_SECURITY_AUTHENTICATION		0x5001
 #define SO_SECURITY_ENCRYPTION_TRANSPORT	0x5002
diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h
index 39acec0cf0b1..c46f6a696849 100644
--- a/arch/xtensa/include/uapi/asm/socket.h
+++ b/arch/xtensa/include/uapi/asm/socket.h
@@ -91,4 +91,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif	/* _XTENSA_SOCKET_H */
diff --git a/include/net/sock.h b/include/net/sock.h
index 7db3db112baa..ff2c3f11fb8f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -273,6 +273,7 @@ struct cg_proto;
   *	@sk_rcvtimeo: %SO_RCVTIMEO setting
   *	@sk_sndtimeo: %SO_SNDTIMEO setting
   *	@sk_rxhash: flow hash received from netif layer
+  *	@sk_incoming_cpu: record cpu processing incoming packets
   *	@sk_txhash: computed flow hash for use on transmit
   *	@sk_filter: socket filtering instructions
   *	@sk_protinfo: private area, net family specific, when not using slab
@@ -350,6 +351,12 @@ struct sock {
 #ifdef CONFIG_RPS
 	__u32			sk_rxhash;
 #endif
+	u16			sk_incoming_cpu;
+	/* 16bit hole
+	 * Warned : sk_incoming_cpu can be set from softirq,
+	 * Do not use this hole without fully understanding possible issues.
+	 */
+
 	__u32			sk_txhash;
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	unsigned int		sk_napi_id;
@@ -833,6 +840,11 @@ static inline int sk_backlog_rcv(struct sock *sk, struct sk_buff *skb)
 	return sk->sk_backlog_rcv(sk, skb);
 }
 
+static inline void sk_incoming_cpu_update(struct sock *sk)
+{
+	sk->sk_incoming_cpu = raw_smp_processor_id();
+}
+
 static inline void sock_rps_record_flow_hash(__u32 hash)
 {
 #ifdef CONFIG_RPS
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index ea0796bdcf88..f541ccefd4ac 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -82,4 +82,6 @@
 
 #define SO_BPF_EXTENSIONS	48
 
+#define SO_INCOMING_CPU		49
+
 #endif /* __ASM_GENERIC_SOCKET_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index 15e0c67b1069..14998b161035 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1213,6 +1213,10 @@ int sock_getsockopt(struct socket *sock, int level, int optname,
 		v.val = sk->sk_max_pacing_rate;
 		break;
 
+	case SO_INCOMING_CPU:
+		v.val = sk->sk_incoming_cpu;
+		break;
+
 	default:
 		return -ENOPROTOOPT;
 	}
@@ -1517,6 +1521,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
 
 		newsk->sk_err	   = 0;
 		newsk->sk_priority = 0;
+		newsk->sk_incoming_cpu = raw_smp_processor_id();
 		/*
 		 * Before updating sk_refcnt, we must commit prior changes to memory
 		 * (Documentation/RCU/rculist_nulls.txt for details)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 8893598a4124..2c6a955fd5c3 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1663,6 +1663,7 @@ process:
 	if (sk_filter(sk, skb))
 		goto discard_and_relse;
 
+	sk_incoming_cpu_update(sk);
 	skb->dev = NULL;
 
 	bh_lock_sock_nested(sk);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd0db5471bb5..52235ca1f352 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1445,6 +1445,7 @@ static int __udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (inet_sk(sk)->inet_daddr) {
 		sock_rps_save_rxhash(sk, skb);
 		sk_mark_napi_id(sk, skb);
+		sk_incoming_cpu_update(sk);
 	}
 
 	rc = sock_queue_rcv_skb(sk, skb);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index fd8e50b380e7..1985b4933a6b 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1456,6 +1456,7 @@ process:
 	if (sk_filter(sk, skb))
 		goto discard_and_relse;
 
+	sk_incoming_cpu_update(sk);
 	skb->dev = NULL;
 
 	bh_lock_sock_nested(sk);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f6ba535b6feb..2c7790c9ac65 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -577,6 +577,7 @@ static int __udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	if (!ipv6_addr_any(&sk->sk_v6_daddr)) {
 		sock_rps_save_rxhash(sk, skb);
 		sk_mark_napi_id(sk, skb);
+		sk_incoming_cpu_update(sk);
 	}
 
 	rc = sock_queue_rcv_skb(sk, skb);
diff --git a/net/sctp/ulpqueue.c b/net/sctp/ulpqueue.c
index d49dc2ed30ad..ce469d648ffb 100644
--- a/net/sctp/ulpqueue.c
+++ b/net/sctp/ulpqueue.c
@@ -205,9 +205,10 @@ int sctp_ulpq_tail_event(struct sctp_ulpq *ulpq, struct sctp_ulpevent *event)
 	if (sock_flag(sk, SOCK_DEAD) || (sk->sk_shutdown & RCV_SHUTDOWN))
 		goto out_free;
 
-	if (!sctp_ulpevent_is_notification(event))
+	if (!sctp_ulpevent_is_notification(event)) {
 		sk_mark_napi_id(sk, skb);
-
+		sk_incoming_cpu_update(sk);
+	}
 	/* Check if the user wishes to receive this event.  */
 	if (!sctp_ulpevent_is_enabled(event, &sctp_sk(sk)->subscribe))
 		goto out_free;
-- 
2.1.0.rc2.206.gedb03e5

^ permalink raw reply related

* Re: [patch net-next v2 03/10] rtnl: expose physical switch id for particular device
From: Roopa Prabhu @ 2014-11-11 13:55 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Jiri Pirko, Netdev, David S. Miller, nhorman, Andy Gospodarek,
	Thomas Graf, dborkman, ogerlitz, jesse, pshelar, azhou, ben,
	stephen, Kirsher, Jeffrey T, vyasevic, Cong Wang,
	Fastabend, John R, Eric Dumazet, Jamal Hadi Salim,
	Florian Fainelli, John Linville, jasowang, ebiederm,
	Nicolas Dichtel, ryazanov.s.a, buytenh, aviadr, nbd,
	Alexei Starovoitov
In-Reply-To: <CAE4R7bA_u=E-Vm5Zx74NJugV1JgSj1dOmBGkBZG18kcnkXH1OQ@mail.gmail.com>

On 11/10/14, 12:02 PM, Scott Feldman wrote:
> On Mon, Nov 10, 2014 at 7:58 AM, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>> On 11/9/14, 2:51 AM, Jiri Pirko wrote:
>>> The netdevice represents a port in a switch, it will expose
>>> IFLA_PHYS_SWITCH_ID value via rtnl. Two netdevices with the same value
>>> belong to one physical switch.
>>>
>>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>> ---
>>>    include/uapi/linux/if_link.h |  1 +
>>>    net/core/rtnetlink.c         | 26 +++++++++++++++++++++++++-
>>>    2 files changed, 26 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>>> index 7072d83..4163753 100644
>>> --- a/include/uapi/linux/if_link.h
>>> +++ b/include/uapi/linux/if_link.h
>>> @@ -145,6 +145,7 @@ enum {
>>>          IFLA_CARRIER,
>>>          IFLA_PHYS_PORT_ID,
>>>          IFLA_CARRIER_CHANGES,
>>> +       IFLA_PHYS_SWITCH_ID,
>>
>> Jiri, since we have not really converged on the switchdev class or having a
>> separate switchdev instance,
>> am thinking it is better if we dont expose any such switch_id to userspace
>> yet until absolutely needed. Do you need it today ?
>> There is no real in kernel hw switch driver that will use it today. And
>> quite likely this will need to change when we introduce real hw switch
>> drivers.
> How will it change when real hw switch drivers are introduced?  Will
> the real sw driver not be able to give a up unique ID for the switch?
  With my question i was trying to see if there are other ways to manage 
the relationship between the
switch device and the ports, instead of an random id provided by each 
switch driver. Today the switch id namespace seems
to be with each switch driver. On my systems on the first switch chip 
quite likely i will choose an id 0.,,and possibly some other
driver will choose the same id. The switch id namespace handling was not 
clear to me.

But, to your question, am sure we will have some id to go in there since 
the field is now available.

Thanks,
Roopa

^ permalink raw reply

* Re: [patch net-next v2 08/10] bridge: add API to notify bridge driver of learned FBD on offloaded device
From: Roopa Prabhu @ 2014-11-11 14:21 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: netdev, davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse,
	pshelar, azhou, ben, stephen, jeffrey.t.kirsher, vyasevic,
	xiyou.wangcong, john.r.fastabend, edumazet, jhs, sfeldma,
	f.fainelli, linville, jasowang, ebiederm, nicolas.dichtel,
	ryazanov.s.a, buytenh, aviadr, nbd, alexei.starovoitov,
	Neil.Jerram, ronye, simon.horman, alexander.h.duyck, john.ronciak,
	mleitner, shrijeet, gospo, bcrl
In-Reply-To: <1415530280-9190-9-git-send-email-jiri@resnulli.us>

On 11/9/14, 2:51 AM, Jiri Pirko wrote:
> From: Scott Feldman <sfeldma@gmail.com>
>
> When the swdev device learns a new mac/vlan on a port, it sends some async
> notification to the driver and the driver installs an FDB in the device.
> To give a holistic system view, the learned mac/vlan should be reflected
> in the bridge's FBD table, so the user, using normal iproute2 cmds, can view
> what is currently learned by the device.  This API on the bridge driver gives
> a way for the swdev driver to install an FBD entry in the bridge FBD table.
> (And remove one).
>
> This is equivalent to the device running these cmds:
>
>    bridge fdb [add|del] <mac> dev <dev> vid <vlan id> master
>
> This patch needs some extra eyeballs for review, in paricular around the
> locking and contexts.

scott/jiri, love that you have handled this case!, This will be useful.
But, quick question, Cant this also be done using the same ndo_op that 
is done to add the static fdb..?

Thanks!.
>
> Signed-off-by: Scott Feldman <sfeldma@gmail.com>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>   include/linux/if_bridge.h | 18 ++++++++++
>   net/bridge/br_fdb.c       | 84 +++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 102 insertions(+)
>
> diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
> index 808dcb8..27ab217 100644
> --- a/include/linux/if_bridge.h
> +++ b/include/linux/if_bridge.h
> @@ -37,6 +37,24 @@ extern void brioctl_set(int (*ioctl_hook)(struct net *, unsigned int, void __use
>   typedef int br_should_route_hook_t(struct sk_buff *skb);
>   extern br_should_route_hook_t __rcu *br_should_route_hook;
>   
> +#if IS_ENABLED(CONFIG_BRIDGE)
> +int br_fdb_learn_add(struct net_device *dev,
> +		     const unsigned char *addr, u16 vid);
> +int br_fdb_learn_del(struct net_device *dev,
> +		     const unsigned char *addr, u16 vid);
> +#else
> +static inline int br_fdb_learn_add(struct net_device *dev,
> +				   const unsigned char *addr, u16 vid)
> +{
> +	return 0;
> +}
> +static inline int br_fdb_learn_del(struct net_device *dev,
> +				   const unsigned char *addr, u16 vid)
> +{
> +	return 0;
> +}
> +#endif
> +
>   #if IS_ENABLED(CONFIG_BRIDGE) && IS_ENABLED(CONFIG_BRIDGE_IGMP_SNOOPING)
>   int br_multicast_list_adjacent(struct net_device *dev,
>   			       struct list_head *br_ip_list);
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index f6f8bb5..e02d21b 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -1022,3 +1022,87 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
>   		}
>   	}
>   }
> +
> +int br_fdb_learn_add(struct net_device *dev, const unsigned char *addr,
> +		     u16 vid)
> +{
> +	struct net_bridge_port *p;
> +	struct net_bridge *br;
> +	struct hlist_head *head;
> +	struct net_bridge_fdb_entry *fdb;
> +	int err = 0;
> +
> +	rtnl_lock();
> +
> +	p = br_port_get_rtnl(dev);
> +	if (p == NULL) {
> +		pr_info("bridge: %s not a bridge port\n", dev->name);
> +		err = -EINVAL;
> +		goto err_rtnl_unlock;
> +	}
> +
> +	br = p->br;
> +
> +	spin_lock(&br->hash_lock);
> +
> +	head = &br->hash[br_mac_hash(addr, vid)];
> +	fdb = fdb_find(head, addr, vid);
> +	if (fdb == NULL) {
> +		fdb = fdb_create(head, p, addr, vid);
> +		if (!fdb) {
> +			err = -ENOMEM;
> +			goto err_unlock;
> +		}
> +		fdb->is_local = 1;
> +		fdb->used = jiffies;
> +		fdb->updated = jiffies;
> +		fdb_notify(br, fdb, RTM_NEWNEIGH);
> +	} else {
> +		err = -EEXIST;
> +	}
> +
> +err_unlock:
> +	spin_unlock(&br->hash_lock);
> +err_rtnl_unlock:
> +	rtnl_unlock();
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(br_fdb_learn_add);
> +
> +int br_fdb_learn_del(struct net_device *dev, const unsigned char *addr,
> +		     u16 vid)
> +{
> +	struct net_bridge_port *p;
> +	struct net_bridge *br;
> +	struct hlist_head *head;
> +	struct net_bridge_fdb_entry *fdb;
> +	int err = 0;
> +
> +	rtnl_lock();
> +
> +	p = br_port_get_rtnl(dev);
> +	if (p == NULL) {
> +		pr_info("bridge: %s not a bridge port\n", dev->name);
> +		err = -EINVAL;
> +		goto err_rtnl_unlock;
> +	}
> +
> +	br = p->br;
> +
> +	spin_lock(&br->hash_lock);
> +
> +	head = &br->hash[br_mac_hash(addr, vid)];
> +	fdb = fdb_find(head, addr, vid);
> +	if (fdb)
> +		fdb_delete(br, fdb);
> +	else
> +		err = -ENOENT;
> +
> +	spin_unlock(&br->hash_lock);
> +err_rtnl_unlock:
> +	rtnl_unlock();
> +
> +	return err;
> +}
> +EXPORT_SYMBOL(br_fdb_learn_del);

^ permalink raw reply

* Re: [PATCH] drivers: atm: eni: Add pci_dma_mapping_error() call
From: Tina Johnson @ 2014-11-11 14:22 UTC (permalink / raw)
  To: David Miller
  Cc: tinajohnson.1234, chas, linux-atm-general, netdev, linux-kernel,
	julia.lawall
In-Reply-To: <20141110.153228.2187951801277440496.davem@davemloft.net>



On Mon, 10 Nov 2014, David Miller wrote:

> The 'trouble' label assumes that it is recovering and unwinding state
> when an error occurs after the DMA buffer is successfully mapped.
>
> It unconditionally does pci_unmap_single() if 'paddr' is non-zero
> which it might be in the error case depending upon how DMA errors
> are represented on a given platform.

Sorry for the trouble and thankyou for your very helpful comment. I
will send the revised patch.

^ permalink raw reply

* Re: [patch net-next v2 05/10] rocker: introduce rocker switch driver
From: Thomas Graf @ 2014-11-11 14:29 UTC (permalink / raw)
  To: John Fastabend
  Cc: Jiri Pirko, netdev, davem, nhorman, andy, dborkman, ogerlitz,
	jesse, pshelar, azhou, ben, stephen, jeffrey.t.kirsher, vyasevic,
	xiyou.wangcong, john.r.fastabend, edumazet, jhs, sfeldma,
	f.fainelli, roopa, linville, jasowang, ebiederm, nicolas.dichtel,
	ryazanov.s.a, buytenh, aviadr, nbd, alexei.starovoitov,
	Neil.Jerram, ronye, simon.horman, alexander.h.duyck, john.ronciak,
	mleitner, shrijeet, gospo
In-Reply-To: <5461366A.9050900@gmail.com>

On 11/10/14 at 02:04pm, John Fastabend wrote:
> On 11/09/2014 02:51 AM, Jiri Pirko wrote:
> >+static int rocker_port_sw_parent_id_get(struct net_device *dev,
> >+					struct netdev_phys_item_id *psid)
> >+{
> >+	struct rocker_port *rocker_port = netdev_priv(dev);
> >+	struct rocker *rocker = rocker_port->rocker;
> >+
> 
> hmm looks like you read this out of a magic switch register :) but
> my switch doesn't have this magic reg. I suposse the switch MAC address
> should work.

This needs more work afterwards. Either we define that the switch ID
is only unique in combination with the parent ifindex or we need to
introduce a notation of uniquness into the switch ID itself.

Is the goal to expose a hardware ID here to allow identification of
the hardware chip?

MAC is tempting but I'm pretty sure that we'll have pure L3 devices
being handled by this API at some point.

^ permalink raw reply

* [net-next v2 07/10] ixgbe: cleanup ixgbe_ndo_set_vf_vlan
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

Clean up functionality in ixgbe_ndo_set_vf_vlan that will simplify later
patches.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
v2 initialized err in the first function because there was a possibility
   of it being used before it being assigned a value

 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 94 +++++++++++++++++---------
 1 file changed, 61 insertions(+), 33 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 97c85b8..07d1c04 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -1079,52 +1079,80 @@ int ixgbe_ndo_set_vf_mac(struct net_device *netdev, int vf, u8 *mac)
 	return ixgbe_set_vf_mac(adapter, vf, mac);
 }
 
+static int ixgbe_enable_port_vlan(struct ixgbe_adapter *adapter, int vf,
+				  u16 vlan, u8 qos)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int err = 0;
+
+	if (adapter->vfinfo[vf].pf_vlan)
+		err = ixgbe_set_vf_vlan(adapter, false,
+					adapter->vfinfo[vf].pf_vlan,
+					vf);
+	if (err)
+		goto out;
+	ixgbe_set_vmvir(adapter, vlan, qos, vf);
+	ixgbe_set_vmolr(hw, vf, false);
+	if (adapter->vfinfo[vf].spoofchk_enabled)
+		hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
+	adapter->vfinfo[vf].vlan_count++;
+	adapter->vfinfo[vf].pf_vlan = vlan;
+	adapter->vfinfo[vf].pf_qos = qos;
+	dev_info(&adapter->pdev->dev,
+		 "Setting VLAN %d, QOS 0x%x on VF %d\n", vlan, qos, vf);
+	if (test_bit(__IXGBE_DOWN, &adapter->state)) {
+		dev_warn(&adapter->pdev->dev,
+			 "The VF VLAN has been set, but the PF device is not up.\n");
+		dev_warn(&adapter->pdev->dev,
+			 "Bring the PF device up before attempting to use the VF device.\n");
+	}
+
+out:
+	return err;
+}
+
+static int ixgbe_disable_port_vlan(struct ixgbe_adapter *adapter, int vf)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	int err;
+
+	err = ixgbe_set_vf_vlan(adapter, false,
+				adapter->vfinfo[vf].pf_vlan, vf);
+	ixgbe_clear_vmvir(adapter, vf);
+	ixgbe_set_vmolr(hw, vf, true);
+	hw->mac.ops.set_vlan_anti_spoofing(hw, false, vf);
+	if (adapter->vfinfo[vf].vlan_count)
+		adapter->vfinfo[vf].vlan_count--;
+	adapter->vfinfo[vf].pf_vlan = 0;
+	adapter->vfinfo[vf].pf_qos = 0;
+
+	return err;
+}
+
 int ixgbe_ndo_set_vf_vlan(struct net_device *netdev, int vf, u16 vlan, u8 qos)
 {
 	int err = 0;
 	struct ixgbe_adapter *adapter = netdev_priv(netdev);
-	struct ixgbe_hw *hw = &adapter->hw;
 
 	if ((vf >= adapter->num_vfs) || (vlan > 4095) || (qos > 7))
 		return -EINVAL;
 	if (vlan || qos) {
+		/* Check if there is already a port VLAN set, if so
+		 * we have to delete the old one first before we
+		 * can set the new one.  The usage model had
+		 * previously assumed the user would delete the
+		 * old port VLAN before setting a new one but this
+		 * is not necessarily the case.
+		 */
 		if (adapter->vfinfo[vf].pf_vlan)
-			err = ixgbe_set_vf_vlan(adapter, false,
-						adapter->vfinfo[vf].pf_vlan,
-						vf);
-		if (err)
-			goto out;
-		err = ixgbe_set_vf_vlan(adapter, true, vlan, vf);
+			err = ixgbe_disable_port_vlan(adapter, vf);
 		if (err)
 			goto out;
-		ixgbe_set_vmvir(adapter, vlan, qos, vf);
-		ixgbe_set_vmolr(hw, vf, false);
-		if (adapter->vfinfo[vf].spoofchk_enabled)
-			hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
-		adapter->vfinfo[vf].vlan_count++;
-		adapter->vfinfo[vf].pf_vlan = vlan;
-		adapter->vfinfo[vf].pf_qos = qos;
-		dev_info(&adapter->pdev->dev,
-			 "Setting VLAN %d, QOS 0x%x on VF %d\n", vlan, qos, vf);
-		if (test_bit(__IXGBE_DOWN, &adapter->state)) {
-			dev_warn(&adapter->pdev->dev,
-				 "The VF VLAN has been set,"
-				 " but the PF device is not up.\n");
-			dev_warn(&adapter->pdev->dev,
-				 "Bring the PF device up before"
-				 " attempting to use the VF device.\n");
-		}
+		err = ixgbe_enable_port_vlan(adapter, vf, vlan, qos);
 	} else {
-		err = ixgbe_set_vf_vlan(adapter, false,
-					adapter->vfinfo[vf].pf_vlan, vf);
-		ixgbe_clear_vmvir(adapter, vf);
-		ixgbe_set_vmolr(hw, vf, true);
-		hw->mac.ops.set_vlan_anti_spoofing(hw, false, vf);
-		if (adapter->vfinfo[vf].vlan_count)
-			adapter->vfinfo[vf].vlan_count--;
-		adapter->vfinfo[vf].pf_vlan = 0;
-		adapter->vfinfo[vf].pf_qos = 0;
+		err = ixgbe_disable_port_vlan(adapter, vf);
 	}
+
 out:
 	return err;
 }
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 08/10] ixgbe: cleanup move setting PFQDE.HIDE_VLAN to support function.
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem; +Cc: Don Skidmore, netdev, nhorman, sassmann, jogreene, Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

Move setting of drop enable to support function.  This not only makes the
code more readable but is also prep for following patches that add
additional MAC support.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c | 31 ++++++++++++++++++--------
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 07d1c04..0c25df5 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -618,6 +618,27 @@ int ixgbe_vf_configuration(struct pci_dev *pdev, unsigned int event_mask)
 	return 0;
 }
 
+static inline void ixgbe_write_qde(struct ixgbe_adapter *adapter, u32 vf,
+				   u32 qde)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
+	u32 q_per_pool = __ALIGN_MASK(1, ~vmdq->mask);
+	int i;
+
+	for (i = vf * q_per_pool; i < ((vf + 1) * q_per_pool); i++) {
+		u32 reg;
+
+		/* flush previous write */
+		IXGBE_WRITE_FLUSH(hw);
+
+		/* indicate to hardware that we want to set drop enable */
+		reg = IXGBE_QDE_WRITE | IXGBE_QDE_ENABLE;
+		reg |= i <<  IXGBE_QDE_IDX_SHIFT;
+		IXGBE_WRITE_REG(hw, IXGBE_QDE, reg);
+	}
+}
+
 static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 {
 	struct ixgbe_ring_feature *vmdq = &adapter->ring_feature[RING_F_VMDQ];
@@ -647,15 +668,7 @@ static int ixgbe_vf_reset_msg(struct ixgbe_adapter *adapter, u32 vf)
 	IXGBE_WRITE_REG(hw, IXGBE_VFTE(reg_offset), reg);
 
 	/* force drop enable for all VF Rx queues */
-	for (i = vf * q_per_pool; i < ((vf + 1) * q_per_pool); i++) {
-		/* flush previous write */
-		IXGBE_WRITE_FLUSH(hw);
-
-		/* indicate to hardware that we want to set drop enable */
-		reg = IXGBE_QDE_WRITE | IXGBE_QDE_ENABLE;
-		reg |= i <<  IXGBE_QDE_IDX_SHIFT;
-		IXGBE_WRITE_REG(hw, IXGBE_QDE, reg);
-	}
+	ixgbe_write_qde(adapter, vf, IXGBE_QDE_ENABLE);
 
 	/* enable receive for vf */
 	reg = IXGBE_READ_REG(hw, IXGBE_VFRE(reg_offset));
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 00/10][pull request] Intel Wired LAN Driver Updates 2014-11-11
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, nhorman, sassmann, jogreene, john.ronciak

This series contains updates to i40e, i40evf and ixgbe.

Kamil updated the i40e and i40evf driver to poll the firmware slower
since we were polling faster than the firmware could respond.

Shannon updates i40e to add a check to keep the service_task from
running the periodic tasks more than once per second, while still
allowing quick action to service the events.

Jesse cleans up the throttle rate code by fixing the minimum interrupt
throttle rate and removing some unused defines.

Mitch makes the early init admin queue message receive code more robust
by handling messages in a loop and ignoring those that we are not
interested in.  This also gets rid of some scary log messages that
really do not indicate a problem.

Don provides several ixgbe patches, first fixes an issue with x540
completion timeout where on topologies including few levels of PCIe
switching for x540 can run into an unexpected completion error.  Cleans
up the functionality in ixgbe_ndo_set_vf_vlan() in preparation for
future work.  Adds support for x550 MAC's to the driver.

v2:
 - Remove code comment in patch 01 of the series, based on feedback from
   David Liaght
 - Updated the "goto" to "break" statements in patch 06 of the series,
   based on feedback from Sergei Shtylyov
 - Initialized the variable err due to the possibility of use before
   being assigned a value in patch 07 of the series
 - Added patch "ixgbe: add helper function for setting RSS key in
   preparation of X550" since it is needed for the addition of X550 MAC
   support

The following are changes since commit 2e1af7d74f4f7b4d4c1b0fbf5da3b5f92d9c332f:
  mlx4: restore conditional call to napi_complete_done()
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master

Don Skidmore (5):
  ixgbe: fix X540 Completion timeout
  ixgbe: cleanup ixgbe_ndo_set_vf_vlan
  ixgbe: cleanup move setting PFQDE.HIDE_VLAN to support function.
  ixgbe: Add new support for X550 MAC's
  ixgbe: add helper function for setting RSS key in perperation of X550

Jesse Brandeburg (1):
  i40e: clean up throttle rate code

Kamil Krawczyk (1):
  i40e: poll firmware slower

Mitch Williams (2):
  i40evf: make early init processing more robust
  i40evf: don't use more queues than CPUs

Shannon Nelson (1):
  i40e: don't do link_status or stats collection on every ARQ

 drivers/net/ethernet/intel/i40e/i40e.h             |   3 +-
 drivers/net/ethernet/intel/i40e/i40e_adminq.c      |   6 +-
 drivers/net/ethernet/intel/i40e/i40e_adminq.h      |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c     |  11 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        |  14 ++-
 drivers/net/ethernet/intel/i40e/i40e_txrx.h        |   5 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c    |   6 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h    |   2 +-
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h      |   5 +-
 drivers/net/ethernet/intel/i40evf/i40evf.h         |   1 +
 drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c |  16 +--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |  50 ++++----
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    |  65 +++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c    |  50 ++++++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c       |   8 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c    |   1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c   |  37 +++++-
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c       |   2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      | 122 +++++++++++++++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c       |   4 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c       |  64 +++++-----
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c     | 131 ++++++++++++++-------
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h      |  45 +++++--
 23 files changed, 441 insertions(+), 209 deletions(-)

-- 
1.9.3

^ permalink raw reply

* [net-next v2 03/10] i40e: clean up throttle rate code
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Jesse Brandeburg, netdev, nhorman, sassmann, jogreene, Patrick Lu,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Jesse Brandeburg <jesse.brandeburg@intel.com>

The interrupt throttle rate minimum is actually 2us, so
fix that define and while we are there, remove some unused defines.

Change some strings in the function to be a bit less wrappy, and
express the correct limits.

Change-ID: I96829bbc77935e0b57c6f0fc1439fb4152b2960a
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_ethtool.c | 11 ++++-------
 drivers/net/ethernet/intel/i40e/i40e_txrx.h    |  5 +----
 drivers/net/ethernet/intel/i40evf/i40e_txrx.h  |  5 +----
 3 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
index b6e745f..afad5aa 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_ethtool.c
@@ -1575,11 +1575,9 @@ static int i40e_set_coalesce(struct net_device *netdev,
 	} else if (ec->rx_coalesce_usecs == 0) {
 		vsi->rx_itr_setting = ec->rx_coalesce_usecs;
 		if (ec->use_adaptive_rx_coalesce)
-			netif_info(pf, drv, netdev,
-				   "Rx-secs=0, need to disable adaptive-Rx for a complete disable\n");
+			netif_info(pf, drv, netdev, "rx-usecs=0, need to disable adaptive-rx for a complete disable\n");
 	} else {
-		netif_info(pf, drv, netdev,
-			   "Invalid value, Rx-usecs range is 0, 8-8160\n");
+		netif_info(pf, drv, netdev, "Invalid value, rx-usecs range is 0-8160\n");
 		return -EINVAL;
 	}
 
@@ -1589,11 +1587,10 @@ static int i40e_set_coalesce(struct net_device *netdev,
 	} else if (ec->tx_coalesce_usecs == 0) {
 		vsi->tx_itr_setting = ec->tx_coalesce_usecs;
 		if (ec->use_adaptive_tx_coalesce)
-			netif_info(pf, drv, netdev,
-				   "Tx-secs=0, need to disable adaptive-Tx for a complete disable\n");
+			netif_info(pf, drv, netdev, "tx-usecs=0, need to disable adaptive-tx for a complete disable\n");
 	} else {
 		netif_info(pf, drv, netdev,
-			   "Invalid value, Tx-usecs range is 0, 8-8160\n");
+			   "Invalid value, tx-usecs range is 0-8160\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index d7a625a..e60d3ac 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -30,10 +30,7 @@
 /* Interrupt Throttling and Rate Limiting Goodies */
 
 #define I40E_MAX_ITR               0x0FF0  /* reg uses 2 usec resolution */
-#define I40E_MIN_ITR               0x0004  /* reg uses 2 usec resolution */
-#define I40E_MAX_IRATE             0x03F
-#define I40E_MIN_IRATE             0x001
-#define I40E_IRATE_USEC_RESOLUTION 4
+#define I40E_MIN_ITR               0x0001  /* reg uses 2 usec resolution */
 #define I40E_ITR_100K              0x0005
 #define I40E_ITR_20K               0x0019
 #define I40E_ITR_8K                0x003E
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
index f6dcf9d..c7f2962 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.h
@@ -30,10 +30,7 @@
 /* Interrupt Throttling and Rate Limiting Goodies */
 
 #define I40E_MAX_ITR               0x0FF0  /* reg uses 2 usec resolution */
-#define I40E_MIN_ITR               0x0004  /* reg uses 2 usec resolution */
-#define I40E_MAX_IRATE             0x03F
-#define I40E_MIN_IRATE             0x001
-#define I40E_IRATE_USEC_RESOLUTION 4
+#define I40E_MIN_ITR               0x0001  /* reg uses 2 usec resolution */
 #define I40E_ITR_100K              0x0005
 #define I40E_ITR_20K               0x0019
 #define I40E_ITR_8K                0x003E
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 02/10] i40e: don't do link_status or stats collection on every ARQ
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Shannon Nelson, netdev, nhorman, sassmann, jogreene, Patrick Lu,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Shannon Nelson <shannon.nelson@intel.com>

The ARQ events cause a service_task execution, and we do a link_status
check and full stats gathering for each service_task.  However, when
there are a lot of ARQ events, such as when doing an NVM update, we end up
doing 10's if not 100's of these per second, thereby heavily abusing the
PCI bus and especially the Firmware.  This patch adds a check to keep the
service_task from running these periodic tasks more than once per second,
while still allowing quick action to service the events.

Change-ID: Iec7670c37bfae9791c43fec26df48aea7f70b33e
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e.h      |  3 ++-
 drivers/net/ethernet/intel/i40e/i40e_main.c | 14 ++++++++++----
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index f1e33f8..b7a807b 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -269,7 +269,8 @@ struct i40e_pf {
 	u16 msg_enable;
 	char misc_int_name[IFNAMSIZ + 9];
 	u16 adminq_work_limit; /* num of admin receive queue desc to process */
-	int service_timer_period;
+	unsigned long service_timer_period;
+	unsigned long service_timer_previous;
 	struct timer_list service_timer;
 	struct work_struct service_task;
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 1a98e23..de66463 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5449,7 +5449,7 @@ static void i40e_check_hang_subtask(struct i40e_pf *pf)
 }
 
 /**
- * i40e_watchdog_subtask - Check and bring link up
+ * i40e_watchdog_subtask - periodic checks not using event driven response
  * @pf: board private structure
  **/
 static void i40e_watchdog_subtask(struct i40e_pf *pf)
@@ -5461,6 +5461,15 @@ static void i40e_watchdog_subtask(struct i40e_pf *pf)
 	    test_bit(__I40E_CONFIG_BUSY, &pf->state))
 		return;
 
+	/* make sure we don't do these things too often */
+	if (time_before(jiffies, (pf->service_timer_previous +
+				  pf->service_timer_period)))
+		return;
+	pf->service_timer_previous = jiffies;
+
+	i40e_check_hang_subtask(pf);
+	i40e_link_event(pf);
+
 	/* Update the stats for active netdevs so the network stack
 	 * can look at updated numbers whenever it cares to
 	 */
@@ -6325,15 +6334,12 @@ static void i40e_service_task(struct work_struct *work)
 	i40e_vc_process_vflr_event(pf);
 	i40e_watchdog_subtask(pf);
 	i40e_fdir_reinit_subtask(pf);
-	i40e_check_hang_subtask(pf);
 	i40e_sync_filters_subtask(pf);
 #ifdef CONFIG_I40E_VXLAN
 	i40e_sync_vxlan_filters_subtask(pf);
 #endif
 	i40e_clean_adminq_subtask(pf);
 
-	i40e_link_event(pf);
-
 	i40e_service_event_complete(pf);
 
 	/* If the tasks have taken longer than one timer cycle or there
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 01/10] i40e: poll firmware slower
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Kamil Krawczyk, netdev, nhorman, sassmann, jogreene, David Laight,
	Or Gerlitz, Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Kamil Krawczyk <kamil.krawczyk@intel.com>

The code was polling the firmware tail register for completion every
10 microseconds, which is way faster than the firmware can respond.
This changes the poll interval to 1ms, which reduces polling CPU
utilization, and the number of times we loop.

The maximum delay is still 100ms.

CC: David Laight <David.Laight@aculab.com>
CC: Or Gerlitz <gerlitz.or@gmail.com>
Change-ID: I4bbfa6b66d802890baf8b4154061e55942b90958
Signed-off-by: Kamil Krawczyk <kamil.krawczyk@intel.com>
Acked-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
v2: Remove code comment that did not apply any longer, based on feedback
    from David Laight

 drivers/net/ethernet/intel/i40e/i40e_adminq.c   | 6 ++----
 drivers/net/ethernet/intel/i40e/i40e_adminq.h   | 2 +-
 drivers/net/ethernet/intel/i40evf/i40e_adminq.c | 6 ++----
 drivers/net/ethernet/intel/i40evf/i40e_adminq.h | 2 +-
 4 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index 72f5d25..f7f6206 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -853,7 +853,6 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
 	 */
 	if (!details->async && !details->postpone) {
 		u32 total_delay = 0;
-		u32 delay_len = 10;
 
 		do {
 			/* AQ designers suggest use of head for better
@@ -861,9 +860,8 @@ i40e_status i40e_asq_send_command(struct i40e_hw *hw,
 			 */
 			if (i40e_asq_done(hw))
 				break;
-			/* ugh! delay while spin_lock */
-			udelay(delay_len);
-			total_delay += delay_len;
+			usleep_range(1000, 2000);
+			total_delay++;
 		} while (total_delay < hw->aq.asq_cmd_timeout);
 	}
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.h b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
index ba38a89..df0bd09 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.h
@@ -141,7 +141,7 @@ static inline int i40e_aq_rc_to_posix(u16 aq_rc)
 
 /* general information */
 #define I40E_AQ_LARGE_BUF	512
-#define I40E_ASQ_CMD_TIMEOUT	100000  /* usecs */
+#define I40E_ASQ_CMD_TIMEOUT	100  /* msecs */
 
 void i40e_fill_default_direct_cmd_desc(struct i40e_aq_desc *desc,
 				       u16 opcode);
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
index f206be9..500ca21 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.c
@@ -801,7 +801,6 @@ i40e_status i40evf_asq_send_command(struct i40e_hw *hw,
 	 */
 	if (!details->async && !details->postpone) {
 		u32 total_delay = 0;
-		u32 delay_len = 10;
 
 		do {
 			/* AQ designers suggest use of head for better
@@ -809,9 +808,8 @@ i40e_status i40evf_asq_send_command(struct i40e_hw *hw,
 			 */
 			if (i40evf_asq_done(hw))
 				break;
-			/* ugh! delay while spin_lock */
-			udelay(delay_len);
-			total_delay += delay_len;
+			usleep_range(1000, 2000);
+			total_delay++;
 		} while (total_delay < hw->aq.asq_cmd_timeout);
 	}
 
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
index 91a5c5b..f40cfac 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
+++ b/drivers/net/ethernet/intel/i40evf/i40e_adminq.h
@@ -141,7 +141,7 @@ static inline int i40e_aq_rc_to_posix(u16 aq_rc)
 
 /* general information */
 #define I40E_AQ_LARGE_BUF	512
-#define I40E_ASQ_CMD_TIMEOUT	100000  /* usecs */
+#define I40E_ASQ_CMD_TIMEOUT	100  /* msecs */
 
 void i40evf_fill_default_direct_cmd_desc(struct i40e_aq_desc *desc,
 				       u16 opcode);
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 04/10] i40evf: make early init processing more robust
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Patrick Lu,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

In early init, if we get an unexpected message from the PF (such as link
status), we just kick an error back to the init task, causing it to
restart its state machine and delaying initialization.

Make the early init AQ message receive code more robust by handling
messages in a loop, and ignoring those that we aren't interested in.
This also gets rid of some scary log messages that really didn't
indicate a problem.

Change-ID: I620e8c72e49c49c665ef33eeab2425dd10e721cf
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    | 59 ++++++++++++----------
 1 file changed, 31 insertions(+), 28 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index 66d12f5..ff86761 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -89,6 +89,7 @@ int i40evf_verify_api_ver(struct i40evf_adapter *adapter)
 	struct i40e_virtchnl_version_info *pf_vvi;
 	struct i40e_hw *hw = &adapter->hw;
 	struct i40e_arq_event_info event;
+	enum i40e_virtchnl_ops op;
 	i40e_status err;
 
 	event.msg_size = I40EVF_MAX_AQ_BUF_SIZE;
@@ -98,18 +99,27 @@ int i40evf_verify_api_ver(struct i40evf_adapter *adapter)
 		goto out;
 	}
 
-	err = i40evf_clean_arq_element(hw, &event, NULL);
-	if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK)
-		goto out_alloc;
+	while (1) {
+		err = i40evf_clean_arq_element(hw, &event, NULL);
+		/* When the AQ is empty, i40evf_clean_arq_element will return
+		 * nonzero and this loop will terminate.
+		 */
+		if (err)
+			goto out_alloc;
+		op =
+		    (enum i40e_virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
+		if (op == I40E_VIRTCHNL_OP_VERSION)
+			break;
+	}
+
 
 	err = (i40e_status)le32_to_cpu(event.desc.cookie_low);
 	if (err)
 		goto out_alloc;
 
-	if ((enum i40e_virtchnl_ops)le32_to_cpu(event.desc.cookie_high) !=
-	    I40E_VIRTCHNL_OP_VERSION) {
+	if (op != I40E_VIRTCHNL_OP_VERSION) {
 		dev_info(&adapter->pdev->dev, "Invalid reply type %d from PF\n",
-			 le32_to_cpu(event.desc.cookie_high));
+			op);
 		err = -EIO;
 		goto out_alloc;
 	}
@@ -153,8 +163,9 @@ int i40evf_get_vf_config(struct i40evf_adapter *adapter)
 {
 	struct i40e_hw *hw = &adapter->hw;
 	struct i40e_arq_event_info event;
-	u16 len;
+	enum i40e_virtchnl_ops op;
 	i40e_status err;
+	u16 len;
 
 	len =  sizeof(struct i40e_virtchnl_vf_resource) +
 		I40E_MAX_VF_VSI * sizeof(struct i40e_virtchnl_vsi_resource);
@@ -165,29 +176,21 @@ int i40evf_get_vf_config(struct i40evf_adapter *adapter)
 		goto out;
 	}
 
-	err = i40evf_clean_arq_element(hw, &event, NULL);
-	if (err == I40E_ERR_ADMIN_QUEUE_NO_WORK)
-		goto out_alloc;
-
-	err = (i40e_status)le32_to_cpu(event.desc.cookie_low);
-	if (err) {
-		dev_err(&adapter->pdev->dev,
-			"%s: Error returned from PF, %d, %d\n", __func__,
-			le32_to_cpu(event.desc.cookie_high),
-			le32_to_cpu(event.desc.cookie_low));
-		err = -EIO;
-		goto out_alloc;
+	while (1) {
+		event.msg_size = len;
+		/* When the AQ is empty, i40evf_clean_arq_element will return
+		 * nonzero and this loop will terminate.
+		 */
+		err = i40evf_clean_arq_element(hw, &event, NULL);
+		if (err)
+			goto out_alloc;
+		op =
+		    (enum i40e_virtchnl_ops)le32_to_cpu(event.desc.cookie_high);
+		if (op == I40E_VIRTCHNL_OP_GET_VF_RESOURCES)
+			break;
 	}
 
-	if ((enum i40e_virtchnl_ops)le32_to_cpu(event.desc.cookie_high) !=
-	    I40E_VIRTCHNL_OP_GET_VF_RESOURCES) {
-		dev_err(&adapter->pdev->dev,
-			"%s: Invalid response from PF, %d, %d\n", __func__,
-			le32_to_cpu(event.desc.cookie_high),
-			le32_to_cpu(event.desc.cookie_low));
-		err = -EIO;
-		goto out_alloc;
-	}
+	err = (i40e_status)le32_to_cpu(event.desc.cookie_low);
 	memcpy(adapter->vf_res, event.msg_buf, min(event.msg_size, len));
 
 	i40e_vf_parse_hw_config(hw, adapter->vf_res);
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 05/10] i40evf: don't use more queues than CPUs
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Mitch Williams, netdev, nhorman, sassmann, jogreene, Patrick Lu,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Mitch Williams <mitch.a.williams@intel.com>

It's kind of silly to configure and attempt to use a bunch of queue
pairs when you're running on a single (virtual) CPU. Instead of
unconditionally configuring all of the queues that the PF gives us,
clamp the number of queue pairs to the number of CPUs.

Change-ID: I321714c9e15072ee76de8f95ab9a81f86ed347d1
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Signed-off-by: Patrick Lu <patrick.lu@intel.com>
Tested-by: Jim Young <jamesx.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/i40evf/i40evf.h         |  1 +
 drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c | 16 +++----
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    | 50 +++++++++++++---------
 .../net/ethernet/intel/i40evf/i40evf_virtchnl.c    |  6 +--
 4 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf.h b/drivers/net/ethernet/intel/i40evf/i40evf.h
index 30ef519..1113f8a 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf.h
+++ b/drivers/net/ethernet/intel/i40evf/i40evf.h
@@ -191,6 +191,7 @@ struct i40evf_adapter {
 	struct i40e_q_vector *q_vector[MAX_MSIX_Q_VECTORS];
 	struct list_head vlan_filter_list;
 	char misc_vector_name[IFNAMSIZ + 9];
+	int num_active_queues;
 
 	/* TX */
 	struct i40e_ring *tx_rings[I40E_MAX_VSI_QP];
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
index efee6b2..876411c 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_ethtool.c
@@ -59,7 +59,7 @@ static const struct i40evf_stats i40evf_gstrings_stats[] = {
 #define I40EVF_GLOBAL_STATS_LEN ARRAY_SIZE(i40evf_gstrings_stats)
 #define I40EVF_QUEUE_STATS_LEN(_dev) \
 	(((struct i40evf_adapter *) \
-		netdev_priv(_dev))->vsi_res->num_queue_pairs \
+		netdev_priv(_dev))->num_active_queues \
 		  * 2 * (sizeof(struct i40e_queue_stats) / sizeof(u64)))
 #define I40EVF_STATS_LEN(_dev) \
 	(I40EVF_GLOBAL_STATS_LEN + I40EVF_QUEUE_STATS_LEN(_dev))
@@ -121,11 +121,11 @@ static void i40evf_get_ethtool_stats(struct net_device *netdev,
 		p = (char *)adapter + i40evf_gstrings_stats[i].stat_offset;
 		data[i] =  *(u64 *)p;
 	}
-	for (j = 0; j < adapter->vsi_res->num_queue_pairs; j++) {
+	for (j = 0; j < adapter->num_active_queues; j++) {
 		data[i++] = adapter->tx_rings[j]->stats.packets;
 		data[i++] = adapter->tx_rings[j]->stats.bytes;
 	}
-	for (j = 0; j < adapter->vsi_res->num_queue_pairs; j++) {
+	for (j = 0; j < adapter->num_active_queues; j++) {
 		data[i++] = adapter->rx_rings[j]->stats.packets;
 		data[i++] = adapter->rx_rings[j]->stats.bytes;
 	}
@@ -151,13 +151,13 @@ static void i40evf_get_strings(struct net_device *netdev, u32 sset, u8 *data)
 			       ETH_GSTRING_LEN);
 			p += ETH_GSTRING_LEN;
 		}
-		for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+		for (i = 0; i < adapter->num_active_queues; i++) {
 			snprintf(p, ETH_GSTRING_LEN, "tx-%u.packets", i);
 			p += ETH_GSTRING_LEN;
 			snprintf(p, ETH_GSTRING_LEN, "tx-%u.bytes", i);
 			p += ETH_GSTRING_LEN;
 		}
-		for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+		for (i = 0; i < adapter->num_active_queues; i++) {
 			snprintf(p, ETH_GSTRING_LEN, "rx-%u.packets", i);
 			p += ETH_GSTRING_LEN;
 			snprintf(p, ETH_GSTRING_LEN, "rx-%u.bytes", i);
@@ -430,7 +430,7 @@ static int i40evf_get_rxnfc(struct net_device *netdev,
 
 	switch (cmd->cmd) {
 	case ETHTOOL_GRXRINGS:
-		cmd->data = adapter->vsi_res->num_queue_pairs;
+		cmd->data = adapter->num_active_queues;
 		ret = 0;
 		break;
 	case ETHTOOL_GRXFH:
@@ -598,12 +598,12 @@ static void i40evf_get_channels(struct net_device *netdev,
 	struct i40evf_adapter *adapter = netdev_priv(netdev);
 
 	/* Report maximum channels */
-	ch->max_combined = adapter->vsi_res->num_queue_pairs;
+	ch->max_combined = adapter->num_active_queues;
 
 	ch->max_other = NONQ_VECS;
 	ch->other_count = NONQ_VECS;
 
-	ch->combined_count = adapter->vsi_res->num_queue_pairs;
+	ch->combined_count = adapter->num_active_queues;
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index b2f01eb..f0d07ad 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -397,8 +397,8 @@ static int i40evf_map_rings_to_vectors(struct i40evf_adapter *adapter)
 	int q_vectors;
 	int v_start = 0;
 	int rxr_idx = 0, txr_idx = 0;
-	int rxr_remaining = adapter->vsi_res->num_queue_pairs;
-	int txr_remaining = adapter->vsi_res->num_queue_pairs;
+	int rxr_remaining = adapter->num_active_queues;
+	int txr_remaining = adapter->num_active_queues;
 	int i, j;
 	int rqpv, tqpv;
 	int err = 0;
@@ -584,7 +584,7 @@ static void i40evf_configure_tx(struct i40evf_adapter *adapter)
 {
 	struct i40e_hw *hw = &adapter->hw;
 	int i;
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++)
+	for (i = 0; i < adapter->num_active_queues; i++)
 		adapter->tx_rings[i]->tail = hw->hw_addr + I40E_QTX_TAIL1(i);
 }
 
@@ -629,7 +629,7 @@ static void i40evf_configure_rx(struct i40evf_adapter *adapter)
 			rx_buf_len = ALIGN(max_frame, 1024);
 	}
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		adapter->rx_rings[i]->tail = hw->hw_addr + I40E_QRX_TAIL1(i);
 		adapter->rx_rings[i]->rx_buf_len = rx_buf_len;
 	}
@@ -918,7 +918,7 @@ static void i40evf_configure(struct i40evf_adapter *adapter)
 	i40evf_configure_rx(adapter);
 	adapter->aq_required |= I40EVF_FLAG_AQ_CONFIGURE_QUEUES;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		struct i40e_ring *ring = adapter->rx_rings[i];
 		i40evf_alloc_rx_buffers(ring, ring->count);
 		ring->next_to_use = ring->count - 1;
@@ -950,7 +950,7 @@ static void i40evf_clean_all_rx_rings(struct i40evf_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++)
+	for (i = 0; i < adapter->num_active_queues; i++)
 		i40evf_clean_rx_ring(adapter->rx_rings[i]);
 }
 
@@ -962,7 +962,7 @@ static void i40evf_clean_all_tx_rings(struct i40evf_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++)
+	for (i = 0; i < adapter->num_active_queues; i++)
 		i40evf_clean_tx_ring(adapter->tx_rings[i]);
 }
 
@@ -1064,7 +1064,7 @@ static void i40evf_free_queues(struct i40evf_adapter *adapter)
 
 	if (!adapter->vsi_res)
 		return;
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		if (adapter->tx_rings[i])
 			kfree_rcu(adapter->tx_rings[i], rcu);
 		adapter->tx_rings[i] = NULL;
@@ -1084,7 +1084,7 @@ static int i40evf_alloc_queues(struct i40evf_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		struct i40e_ring *tx_ring;
 		struct i40e_ring *rx_ring;
 
@@ -1130,7 +1130,7 @@ static int i40evf_set_interrupt_capability(struct i40evf_adapter *adapter)
 		err = -EIO;
 		goto out;
 	}
-	pairs = adapter->vsi_res->num_queue_pairs;
+	pairs = adapter->num_active_queues;
 
 	/* It's easy to be greedy for MSI-X vectors, but it really
 	 * doesn't do us much good if we have a lot more vectors
@@ -1210,7 +1210,7 @@ static void i40evf_free_q_vectors(struct i40evf_adapter *adapter)
 	int napi_vectors;
 
 	num_q_vectors = adapter->num_msix_vectors - NONQ_VECS;
-	napi_vectors = adapter->vsi_res->num_queue_pairs;
+	napi_vectors = adapter->num_active_queues;
 
 	for (q_idx = 0; q_idx < num_q_vectors; q_idx++) {
 		struct i40e_q_vector *q_vector = adapter->q_vector[q_idx];
@@ -1265,8 +1265,8 @@ int i40evf_init_interrupt_scheme(struct i40evf_adapter *adapter)
 	}
 
 	dev_info(&adapter->pdev->dev, "Multiqueue %s: Queue pair count = %u",
-		(adapter->vsi_res->num_queue_pairs > 1) ? "Enabled" :
-		"Disabled", adapter->vsi_res->num_queue_pairs);
+		(adapter->num_active_queues > 1) ? "Enabled" :
+		"Disabled", adapter->num_active_queues);
 
 	return 0;
 err_alloc_queues:
@@ -1425,7 +1425,7 @@ static int next_queue(struct i40evf_adapter *adapter, int j)
 {
 	j += 1;
 
-	return j >= adapter->vsi_res->num_queue_pairs ? 0 : j;
+	return j >= adapter->num_active_queues ? 0 : j;
 }
 
 /**
@@ -1446,9 +1446,14 @@ static void i40evf_configure_rss(struct i40evf_adapter *adapter)
 			0xc135cafa, 0x7a6f7e2d, 0xe7102d28, 0x163cd12e,
 			0x4954b126 };
 
-	/* Hash type is configured by the PF - we just supply the key */
+	/* No RSS for single queue. */
+	if (adapter->num_active_queues == 1) {
+		wr32(hw, I40E_VFQF_HENA(0), 0);
+		wr32(hw, I40E_VFQF_HENA(1), 0);
+		return;
+	}
 
-	/* Fill out hash function seed */
+	/* Hash type is configured by the PF - we just supply the key */
 	for (i = 0; i <= I40E_VFQF_HKEY_MAX_INDEX; i++)
 		wr32(hw, I40E_VFQF_HKEY(i), seed[i]);
 
@@ -1458,7 +1463,7 @@ static void i40evf_configure_rss(struct i40evf_adapter *adapter)
 	wr32(hw, I40E_VFQF_HENA(1), (u32)(hena >> 32));
 
 	/* Populate the LUT with max no. of queues in round robin fashion */
-	j = adapter->vsi_res->num_queue_pairs;
+	j = adapter->num_active_queues;
 	for (i = 0; i <= I40E_VFQF_HLUT_MAX_INDEX; i++) {
 		j = next_queue(adapter, j);
 		lut = j;
@@ -1703,7 +1708,7 @@ static void i40evf_free_all_tx_resources(struct i40evf_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++)
+	for (i = 0; i < adapter->num_active_queues; i++)
 		if (adapter->tx_rings[i]->desc)
 			i40evf_free_tx_resources(adapter->tx_rings[i]);
 
@@ -1723,7 +1728,7 @@ static int i40evf_setup_all_tx_resources(struct i40evf_adapter *adapter)
 {
 	int i, err = 0;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		adapter->tx_rings[i]->count = adapter->tx_desc_count;
 		err = i40evf_setup_tx_descriptors(adapter->tx_rings[i]);
 		if (!err)
@@ -1751,7 +1756,7 @@ static int i40evf_setup_all_rx_resources(struct i40evf_adapter *adapter)
 {
 	int i, err = 0;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++) {
+	for (i = 0; i < adapter->num_active_queues; i++) {
 		adapter->rx_rings[i]->count = adapter->rx_desc_count;
 		err = i40evf_setup_rx_descriptors(adapter->rx_rings[i]);
 		if (!err)
@@ -1774,7 +1779,7 @@ static void i40evf_free_all_rx_resources(struct i40evf_adapter *adapter)
 {
 	int i;
 
-	for (i = 0; i < adapter->vsi_res->num_queue_pairs; i++)
+	for (i = 0; i < adapter->num_active_queues; i++)
 		if (adapter->rx_rings[i]->desc)
 			i40evf_free_rx_resources(adapter->rx_rings[i]);
 }
@@ -2150,6 +2155,9 @@ static void i40evf_init_task(struct work_struct *work)
 	adapter->watchdog_timer.data = (unsigned long)adapter;
 	mod_timer(&adapter->watchdog_timer, jiffies + 1);
 
+	adapter->num_active_queues = min_t(int,
+					   adapter->vsi_res->num_queue_pairs,
+					   (int)(num_online_cpus()));
 	adapter->tx_desc_count = I40EVF_DEFAULT_TXD;
 	adapter->rx_desc_count = I40EVF_DEFAULT_RXD;
 	err = i40evf_init_interrupt_scheme(adapter);
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
index ff86761..49bfdb5 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_virtchnl.c
@@ -210,7 +210,7 @@ void i40evf_configure_queues(struct i40evf_adapter *adapter)
 {
 	struct i40e_virtchnl_vsi_queue_config_info *vqci;
 	struct i40e_virtchnl_queue_pair_info *vqpi;
-	int pairs = adapter->vsi_res->num_queue_pairs;
+	int pairs = adapter->num_active_queues;
 	int i, len;
 
 	if (adapter->current_op != I40E_VIRTCHNL_OP_UNKNOWN) {
@@ -276,7 +276,7 @@ void i40evf_enable_queues(struct i40evf_adapter *adapter)
 	}
 	adapter->current_op = I40E_VIRTCHNL_OP_ENABLE_QUEUES;
 	vqs.vsi_id = adapter->vsi_res->vsi_id;
-	vqs.tx_queues = (1 << adapter->vsi_res->num_queue_pairs) - 1;
+	vqs.tx_queues = (1 << adapter->num_active_queues) - 1;
 	vqs.rx_queues = vqs.tx_queues;
 	adapter->aq_pending |= I40EVF_FLAG_AQ_ENABLE_QUEUES;
 	adapter->aq_required &= ~I40EVF_FLAG_AQ_ENABLE_QUEUES;
@@ -302,7 +302,7 @@ void i40evf_disable_queues(struct i40evf_adapter *adapter)
 	}
 	adapter->current_op = I40E_VIRTCHNL_OP_DISABLE_QUEUES;
 	vqs.vsi_id = adapter->vsi_res->vsi_id;
-	vqs.tx_queues = (1 << adapter->vsi_res->num_queue_pairs) - 1;
+	vqs.tx_queues = (1 << adapter->num_active_queues) - 1;
 	vqs.rx_queues = vqs.tx_queues;
 	adapter->aq_pending |= I40EVF_FLAG_AQ_DISABLE_QUEUES;
 	adapter->aq_required &= ~I40EVF_FLAG_AQ_DISABLE_QUEUES;
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 06/10] ixgbe: fix X540 Completion timeout
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Don Skidmore, netdev, nhorman, sassmann, jogreene,
	Sergei Shtylyov, Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

On topologies including few levels of PCIe switching X540 can run into an
unexpected completion error.  We get around this by waiting after enabling
loopback a sufficient amount of time until Tx Data Fetch is sent.  We then
poll the pending transaction bit to ensure we received the completion.  Only
then do we go on to clear the buffers.

CC: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-of-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
v2: update the "goto" to "break" based on feedback from Sergei Shtylyov

 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index b5f484b..0406708 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -3583,7 +3583,8 @@ s32 ixgbe_set_fw_drv_ver_generic(struct ixgbe_hw *hw, u8 maj, u8 min,
  **/
 void ixgbe_clear_tx_pending(struct ixgbe_hw *hw)
 {
-	u32 gcr_ext, hlreg0;
+	u32 gcr_ext, hlreg0, i, poll;
+	u16 value;
 
 	/*
 	 * If double reset is not requested then all transactions should
@@ -3600,6 +3601,23 @@ void ixgbe_clear_tx_pending(struct ixgbe_hw *hw)
 	hlreg0 = IXGBE_READ_REG(hw, IXGBE_HLREG0);
 	IXGBE_WRITE_REG(hw, IXGBE_HLREG0, hlreg0 | IXGBE_HLREG0_LPBK);
 
+	/* wait for a last completion before clearing buffers */
+	IXGBE_WRITE_FLUSH(hw);
+	usleep_range(3000, 6000);
+
+	/* Before proceeding, make sure that the PCIe block does not have
+	 * transactions pending.
+	 */
+	poll = ixgbe_pcie_timeout_poll(hw);
+	for (i = 0; i < poll; i++) {
+		usleep_range(100, 200);
+		value = ixgbe_read_pci_cfg_word(hw, IXGBE_PCI_DEVICE_STATUS);
+		if (ixgbe_removed(hw->hw_addr))
+			break;
+		if (!(value & IXGBE_PCI_DEVICE_STATUS_TRANSACTION_PENDING))
+			break;
+	}
+
 	/* initiate cleaning flow for buffers in the PCIe transaction layer */
 	gcr_ext = IXGBE_READ_REG(hw, IXGBE_GCR_EXT);
 	IXGBE_WRITE_REG(hw, IXGBE_GCR_EXT,
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 09/10] ixgbe: Add new support for X550 MAC's
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Don Skidmore, netdev, nhorman, sassmann, jogreene, John Ronciak,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

This patch will add in the new MAC defines and fit it into the switch
cases throughout the driver.  New functionality and enablement support will
be added in following patches.

CC: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_common.c  | 30 +++++---
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c     |  8 +++
 drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c  |  1 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c | 37 ++++++++--
 drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c     |  2 +
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c    | 90 ++++++++++++++++++++----
 drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c     |  4 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c     | 64 +++++++++--------
 drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c   |  6 ++
 drivers/net/ethernet/intel/ixgbe/ixgbe_type.h    | 45 +++++++++---
 10 files changed, 223 insertions(+), 64 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
index 0406708..0e754b4 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_common.c
@@ -2799,6 +2799,8 @@ u16 ixgbe_get_pcie_msix_count_generic(struct ixgbe_hw *hw)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		pcie_offset = IXGBE_PCIE_MSIX_82599_CAPS;
 		max_msix_count = IXGBE_MAX_MSIX_VECTORS_82599;
 		break;
@@ -3192,17 +3194,27 @@ s32 ixgbe_check_mac_link_generic(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
 			*link_up = false;
 	}
 
-	if ((links_reg & IXGBE_LINKS_SPEED_82599) ==
-	    IXGBE_LINKS_SPEED_10G_82599)
-		*speed = IXGBE_LINK_SPEED_10GB_FULL;
-	else if ((links_reg & IXGBE_LINKS_SPEED_82599) ==
-		 IXGBE_LINKS_SPEED_1G_82599)
+	switch (links_reg & IXGBE_LINKS_SPEED_82599) {
+	case IXGBE_LINKS_SPEED_10G_82599:
+		if ((hw->mac.type >= ixgbe_mac_X550) &&
+		    (links_reg & IXGBE_LINKS_SPEED_NON_STD))
+			*speed = IXGBE_LINK_SPEED_2_5GB_FULL;
+		else
+			*speed = IXGBE_LINK_SPEED_10GB_FULL;
+		break;
+	case IXGBE_LINKS_SPEED_1G_82599:
 		*speed = IXGBE_LINK_SPEED_1GB_FULL;
-	else if ((links_reg & IXGBE_LINKS_SPEED_82599) ==
-		 IXGBE_LINKS_SPEED_100_82599)
-		*speed = IXGBE_LINK_SPEED_100_FULL;
-	else
+		break;
+	case IXGBE_LINKS_SPEED_100_82599:
+		if ((hw->mac.type >= ixgbe_mac_X550) &&
+		    (links_reg & IXGBE_LINKS_SPEED_NON_STD))
+			*speed = IXGBE_LINK_SPEED_5GB_FULL;
+		else
+			*speed = IXGBE_LINK_SPEED_100_FULL;
+		break;
+	default:
 		*speed = IXGBE_LINK_SPEED_UNKNOWN;
+	}
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c
index 48f35fc..a507a6f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb.c
@@ -286,6 +286,8 @@ s32 ixgbe_dcb_hw_config(struct ixgbe_hw *hw,
 						 bwgid, ptype);
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		return ixgbe_dcb_hw_config_82599(hw, pfc_en, refill, max,
 						 bwgid, ptype, prio_tc);
 	default:
@@ -302,6 +304,8 @@ s32 ixgbe_dcb_hw_pfc_config(struct ixgbe_hw *hw, u8 pfc_en, u8 *prio_tc)
 		return ixgbe_dcb_config_pfc_82598(hw, pfc_en);
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		return ixgbe_dcb_config_pfc_82599(hw, pfc_en, prio_tc);
 	default:
 		break;
@@ -357,6 +361,8 @@ s32 ixgbe_dcb_hw_ets_config(struct ixgbe_hw *hw,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		ixgbe_dcb_config_rx_arbiter_82599(hw, refill, max,
 						  bwg_id, prio_type, prio_tc);
 		ixgbe_dcb_config_tx_desc_arbiter_82599(hw, refill, max,
@@ -385,6 +391,8 @@ void ixgbe_dcb_read_rtrup2tc(struct ixgbe_hw *hw, u8 *map)
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		ixgbe_dcb_read_rtrup2tc_82599(hw, map);
 		break;
 	default:
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
index 58a7f53..2707bda 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_dcb_nl.c
@@ -180,6 +180,7 @@ static void ixgbe_dcbnl_get_perm_hw_addr(struct net_device *netdev,
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
 		for (j = 0; j < netdev->addr_len; j++, i++)
 			perm_addr[i] = adapter->hw.mac.san_addr[j];
 		break;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
index 0ae038b..26fd85e 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c
@@ -511,6 +511,8 @@ static void ixgbe_get_regs(struct net_device *netdev,
 			break;
 		case ixgbe_mac_82599EB:
 		case ixgbe_mac_X540:
+		case ixgbe_mac_X550:
+		case ixgbe_mac_X550EM_x:
 			regs_buff[35 + i] = IXGBE_READ_REG(hw, IXGBE_FCRTL_82599(i));
 			regs_buff[43 + i] = IXGBE_READ_REG(hw, IXGBE_FCRTH_82599(i));
 			break;
@@ -622,6 +624,8 @@ static void ixgbe_get_regs(struct net_device *netdev,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		regs_buff[830] = IXGBE_READ_REG(hw, IXGBE_RTTDCS);
 		regs_buff[832] = IXGBE_READ_REG(hw, IXGBE_RTRPCS);
 		for (i = 0; i < 8; i++)
@@ -1406,6 +1410,8 @@ static int ixgbe_reg_test(struct ixgbe_adapter *adapter, u64 *data)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		toggle = 0x7FFFF30F;
 		test = reg_test_82599;
 		break;
@@ -1644,6 +1650,8 @@ static void ixgbe_free_desc_rings(struct ixgbe_adapter *adapter)
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		reg_ctl = IXGBE_READ_REG(hw, IXGBE_DMATXCTL);
 		reg_ctl &= ~IXGBE_DMATXCTL_TE;
 		IXGBE_WRITE_REG(hw, IXGBE_DMATXCTL, reg_ctl);
@@ -1680,6 +1688,8 @@ static int ixgbe_setup_desc_rings(struct ixgbe_adapter *adapter)
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		reg_data = IXGBE_READ_REG(&adapter->hw, IXGBE_DMATXCTL);
 		reg_data |= IXGBE_DMATXCTL_TE;
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_DMATXCTL, reg_data);
@@ -1733,12 +1743,16 @@ static int ixgbe_setup_loopback_test(struct ixgbe_adapter *adapter)
 	reg_data |= IXGBE_FCTRL_BAM | IXGBE_FCTRL_SBP | IXGBE_FCTRL_MPE;
 	IXGBE_WRITE_REG(hw, IXGBE_FCTRL, reg_data);
 
-	/* X540 needs to set the MACC.FLU bit to force link up */
-	if (adapter->hw.mac.type == ixgbe_mac_X540) {
+	/* X540 and X550 needs to set the MACC.FLU bit to force link up */
+	switch (adapter->hw.mac.type) {
+	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		reg_data = IXGBE_READ_REG(hw, IXGBE_MACC);
 		reg_data |= IXGBE_MACC_FLU;
 		IXGBE_WRITE_REG(hw, IXGBE_MACC, reg_data);
-	} else {
+		break;
+	default:
 		if (hw->mac.orig_autoc) {
 			reg_data = hw->mac.orig_autoc | IXGBE_AUTOC_FLU;
 			IXGBE_WRITE_REG(hw, IXGBE_AUTOC, reg_data);
@@ -2776,7 +2790,14 @@ static int ixgbe_set_rss_hash_opt(struct ixgbe_adapter *adapter,
 	/* if we changed something we need to update flags */
 	if (flags2 != adapter->flags2) {
 		struct ixgbe_hw *hw = &adapter->hw;
-		u32 mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
+		u32 mrqc;
+		unsigned int pf_pool = adapter->num_vfs;
+
+		if ((hw->mac.type >= ixgbe_mac_X550) &&
+		    (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED))
+			mrqc = IXGBE_READ_REG(hw, IXGBE_PFVFMRQC(pf_pool));
+		else
+			mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
 
 		if ((flags2 & UDP_RSS_FLAGS) &&
 		    !(adapter->flags2 & UDP_RSS_FLAGS))
@@ -2799,7 +2820,11 @@ static int ixgbe_set_rss_hash_opt(struct ixgbe_adapter *adapter,
 		if (flags2 & IXGBE_FLAG2_RSS_FIELD_IPV6_UDP)
 			mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_UDP;
 
-		IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
+		if ((hw->mac.type >= ixgbe_mac_X550) &&
+		    (adapter->flags & IXGBE_FLAG_SRIOV_ENABLED))
+			IXGBE_WRITE_REG(hw, IXGBE_PFVFMRQC(pf_pool), mrqc);
+		else
+			IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
 	}
 
 	return 0;
@@ -2833,6 +2858,8 @@ static int ixgbe_get_ts_info(struct net_device *dev,
 	struct ixgbe_adapter *adapter = netdev_priv(dev);
 
 	switch (adapter->hw.mac.type) {
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 	case ixgbe_mac_X540:
 	case ixgbe_mac_82599EB:
 		info->so_timestamping =
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
index ce40c77..68e1e75 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c
@@ -126,6 +126,8 @@ static void ixgbe_get_first_reg_idx(struct ixgbe_adapter *adapter, u8 tc,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		if (num_tcs > 4) {
 			/*
 			 * TCs    : TC0/1 TC2/3 TC4-7
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index d2df4e3..355d1f7 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -835,6 +835,8 @@ static void ixgbe_set_ivar(struct ixgbe_adapter *adapter, s8 direction,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		if (direction == -1) {
 			/* other causes */
 			msix_vector |= IXGBE_IVAR_ALLOC_VAL;
@@ -871,6 +873,8 @@ static inline void ixgbe_irq_rearm_queues(struct ixgbe_adapter *adapter,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		mask = (qmask & 0xFFFFFFFF);
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EICS_EX(0), mask);
 		mask = (qmask >> 32);
@@ -2155,6 +2159,8 @@ static void ixgbe_configure_msix(struct ixgbe_adapter *adapter)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		ixgbe_set_ivar(adapter, -1, 1, v_idx);
 		break;
 	default:
@@ -2264,6 +2270,8 @@ void ixgbe_write_eitr(struct ixgbe_q_vector *q_vector)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		/*
 		 * set the WDIS bit to not clear the timer bits and cause an
 		 * immediate assertion of the interrupt
@@ -2467,6 +2475,8 @@ static inline void ixgbe_irq_enable_queues(struct ixgbe_adapter *adapter,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		mask = (qmask & 0xFFFFFFFF);
 		if (mask)
 			IXGBE_WRITE_REG(hw, IXGBE_EIMS_EX(0), mask);
@@ -2493,6 +2503,8 @@ static inline void ixgbe_irq_disable_queues(struct ixgbe_adapter *adapter,
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		mask = (qmask & 0xFFFFFFFF);
 		if (mask)
 			IXGBE_WRITE_REG(hw, IXGBE_EIMC_EX(0), mask);
@@ -2525,6 +2537,8 @@ static inline void ixgbe_irq_enable(struct ixgbe_adapter *adapter, bool queues,
 			mask |= IXGBE_EIMS_GPI_SDP0;
 			break;
 		case ixgbe_mac_X540:
+		case ixgbe_mac_X550:
+		case ixgbe_mac_X550EM_x:
 			mask |= IXGBE_EIMS_TS;
 			break;
 		default:
@@ -2536,7 +2550,10 @@ static inline void ixgbe_irq_enable(struct ixgbe_adapter *adapter, bool queues,
 	case ixgbe_mac_82599EB:
 		mask |= IXGBE_EIMS_GPI_SDP1;
 		mask |= IXGBE_EIMS_GPI_SDP2;
+		/* fall through */
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		mask |= IXGBE_EIMS_ECC;
 		mask |= IXGBE_EIMS_MAILBOX;
 		break;
@@ -2544,9 +2561,6 @@ static inline void ixgbe_irq_enable(struct ixgbe_adapter *adapter, bool queues,
 		break;
 	}
 
-	if (adapter->hw.mac.type == ixgbe_mac_X540)
-		mask |= IXGBE_EIMS_TIMESYNC;
-
 	if ((adapter->flags & IXGBE_FLAG_FDIR_HASH_CAPABLE) &&
 	    !(adapter->flags2 & IXGBE_FLAG2_FDIR_REQUIRES_REINIT))
 		mask |= IXGBE_EIMS_FLOW_DIR;
@@ -2592,6 +2606,8 @@ static irqreturn_t ixgbe_msix_other(int irq, void *data)
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		if (eicr & IXGBE_EICR_ECC) {
 			e_info(link, "Received ECC Err, initiating reset\n");
 			adapter->flags2 |= IXGBE_FLAG2_RESET_REQUESTED;
@@ -2811,6 +2827,8 @@ static irqreturn_t ixgbe_intr(int irq, void *data)
 		ixgbe_check_sfp_event(adapter, eicr);
 		/* Fall through */
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		if (eicr & IXGBE_EICR_ECC) {
 			e_info(link, "Received ECC Err, initiating reset\n");
 			adapter->flags2 |= IXGBE_FLAG2_RESET_REQUESTED;
@@ -2905,6 +2923,8 @@ static inline void ixgbe_irq_disable(struct ixgbe_adapter *adapter)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EIMC, 0xFFFF0000);
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EIMC_EX(0), ~0);
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_EIMC_EX(1), ~0);
@@ -3534,6 +3554,8 @@ static void ixgbe_setup_rdrxctl(struct ixgbe_adapter *adapter)
 	u32 rdrxctl = IXGBE_READ_REG(hw, IXGBE_RDRXCTL);
 
 	switch (hw->mac.type) {
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 	case ixgbe_mac_82598EB:
 		/*
 		 * For VMDq support of different descriptor types or
@@ -3657,6 +3679,8 @@ static void ixgbe_vlan_strip_disable(struct ixgbe_adapter *adapter)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		for (i = 0; i < adapter->num_rx_queues; i++) {
 			struct ixgbe_ring *ring = adapter->rx_ring[i];
 
@@ -3691,6 +3715,8 @@ static void ixgbe_vlan_strip_enable(struct ixgbe_adapter *adapter)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		for (i = 0; i < adapter->num_rx_queues; i++) {
 			struct ixgbe_ring *ring = adapter->rx_ring[i];
 
@@ -4112,6 +4138,8 @@ static int ixgbe_hpbthresh(struct ixgbe_adapter *adapter, int pb)
 	/* Calculate delay value for device */
 	switch (hw->mac.type) {
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		dv_id = IXGBE_DV_X540(link, tc);
 		break;
 	default:
@@ -4170,6 +4198,8 @@ static int ixgbe_lpbthresh(struct ixgbe_adapter *adapter, int pb)
 	/* Calculate delay value for device */
 	switch (hw->mac.type) {
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		dv_id = IXGBE_LOW_DV_X540(tc);
 		break;
 	default:
@@ -4606,6 +4636,8 @@ static void ixgbe_setup_gpie(struct ixgbe_adapter *adapter)
 			break;
 		case ixgbe_mac_82599EB:
 		case ixgbe_mac_X540:
+		case ixgbe_mac_X550:
+		case ixgbe_mac_X550EM_x:
 		default:
 			IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(0), 0xFFFFFFFF);
 			IXGBE_WRITE_REG(hw, IXGBE_EIAM_EX(1), 0xFFFFFFFF);
@@ -4948,10 +4980,12 @@ void ixgbe_down(struct ixgbe_adapter *adapter)
 		IXGBE_WRITE_REG(hw, IXGBE_TXDCTL(reg_idx), IXGBE_TXDCTL_SWFLSH);
 	}
 
-	/* Disable the Tx DMA engine on 82599 and X540 */
+	/* Disable the Tx DMA engine on 82599 and later MAC */
 	switch (hw->mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		IXGBE_WRITE_REG(hw, IXGBE_DMATXCTL,
 				(IXGBE_READ_REG(hw, IXGBE_DMATXCTL) &
 				 ~IXGBE_DMATXCTL_TE));
@@ -5071,6 +5105,12 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter)
 		if (fwsm & IXGBE_FWSM_TS_ENABLED)
 			adapter->flags2 |= IXGBE_FLAG2_TEMP_SENSOR_CAPABLE;
 		break;
+	case ixgbe_mac_X550EM_x:
+	case ixgbe_mac_X550:
+#ifdef CONFIG_IXGBE_DCA
+		adapter->flags &= ~IXGBE_FLAG_DCA_CAPABLE;
+#endif
+		break;
 	default:
 		break;
 	}
@@ -5086,6 +5126,8 @@ static int ixgbe_sw_init(struct ixgbe_adapter *adapter)
 #ifdef CONFIG_IXGBE_DCB
 	switch (hw->mac.type) {
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		adapter->dcb_cfg.num_tcs.pg_tcs = X540_TRAFFIC_CLASS;
 		adapter->dcb_cfg.num_tcs.pfc_tcs = X540_TRAFFIC_CLASS;
 		break;
@@ -5675,6 +5717,8 @@ static int __ixgbe_shutdown(struct pci_dev *pdev, bool *enable_wake)
 		break;
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		pci_wake_from_d3(pdev, !!wufc);
 		break;
 	default:
@@ -5806,6 +5850,8 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 			break;
 		case ixgbe_mac_82599EB:
 		case ixgbe_mac_X540:
+		case ixgbe_mac_X550:
+		case ixgbe_mac_X550EM_x:
 			hwstats->pxonrxc[i] +=
 				IXGBE_READ_REG(hw, IXGBE_PXONRXCNT(i));
 			break;
@@ -5819,7 +5865,9 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 		hwstats->qptc[i] += IXGBE_READ_REG(hw, IXGBE_QPTC(i));
 		hwstats->qprc[i] += IXGBE_READ_REG(hw, IXGBE_QPRC(i));
 		if ((hw->mac.type == ixgbe_mac_82599EB) ||
-		    (hw->mac.type == ixgbe_mac_X540)) {
+		    (hw->mac.type == ixgbe_mac_X540) ||
+		    (hw->mac.type == ixgbe_mac_X550) ||
+		    (hw->mac.type == ixgbe_mac_X550EM_x)) {
 			hwstats->qbtc[i] += IXGBE_READ_REG(hw, IXGBE_QBTC_L(i));
 			IXGBE_READ_REG(hw, IXGBE_QBTC_H(i)); /* to clear */
 			hwstats->qbrc[i] += IXGBE_READ_REG(hw, IXGBE_QBRC_L(i));
@@ -5842,7 +5890,9 @@ void ixgbe_update_stats(struct ixgbe_adapter *adapter)
 		hwstats->tor += IXGBE_READ_REG(hw, IXGBE_TORH);
 		break;
 	case ixgbe_mac_X540:
-		/* OS2BMC stats are X540 only*/
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
+		/* OS2BMC stats are X540 and later */
 		hwstats->o2bgptc += IXGBE_READ_REG(hw, IXGBE_O2BGPTC);
 		hwstats->o2bspc += IXGBE_READ_REG(hw, IXGBE_O2BSPC);
 		hwstats->b2ospc += IXGBE_READ_REG(hw, IXGBE_B2OSPC);
@@ -6110,6 +6160,8 @@ static void ixgbe_watchdog_link_is_up(struct ixgbe_adapter *adapter)
 	}
 		break;
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 	case ixgbe_mac_82599EB: {
 		u32 mflcn = IXGBE_READ_REG(hw, IXGBE_MFLCN);
 		u32 fccfg = IXGBE_READ_REG(hw, IXGBE_FCCFG);
@@ -6221,6 +6273,10 @@ static bool ixgbe_vf_tx_pending(struct ixgbe_adapter *adapter)
 	if (!adapter->num_vfs)
 		return false;
 
+	/* resetting the PF is only needed for MAC before X550 */
+	if (hw->mac.type >= ixgbe_mac_X550)
+		return false;
+
 	for (i = 0; i < adapter->num_vfs; i++) {
 		for (j = 0; j < q_per_pool; j++) {
 			u32 h, t;
@@ -6430,11 +6486,11 @@ static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
 		ciaa = (vf << 16) | 0x80000000;
 		/* 32 bit read so align, we really want status at offset 6 */
 		ciaa |= PCI_COMMAND;
-		IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
-		ciad = IXGBE_READ_REG(hw, IXGBE_CIAD_82599);
+		IXGBE_WRITE_REG(hw, IXGBE_CIAA_BY_MAC(hw), ciaa);
+		ciad = IXGBE_READ_REG(hw, IXGBE_CIAD_BY_MAC(hw));
 		ciaa &= 0x7FFFFFFF;
 		/* disable debug mode asap after reading data */
-		IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+		IXGBE_WRITE_REG(hw, IXGBE_CIAA_BY_MAC(hw), ciaa);
 		/* Get the upper 16 bits which will be the PCI status reg */
 		ciad >>= 16;
 		if (ciad & PCI_STATUS_REC_MASTER_ABORT) {
@@ -6442,11 +6498,11 @@ static void ixgbe_check_for_bad_vf(struct ixgbe_adapter *adapter)
 			/* Issue VFLR */
 			ciaa = (vf << 16) | 0x80000000;
 			ciaa |= 0xA8;
-			IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+			IXGBE_WRITE_REG(hw, IXGBE_CIAA_BY_MAC(hw), ciaa);
 			ciad = 0x00008000;  /* VFLR */
-			IXGBE_WRITE_REG(hw, IXGBE_CIAD_82599, ciad);
+			IXGBE_WRITE_REG(hw, IXGBE_CIAD_BY_MAC(hw), ciad);
 			ciaa &= 0x7FFFFFFF;
-			IXGBE_WRITE_REG(hw, IXGBE_CIAA_82599, ciaa);
+			IXGBE_WRITE_REG(hw, IXGBE_CIAA_BY_MAC(hw), ciaa);
 		}
 	}
 }
@@ -8098,6 +8154,8 @@ static int ixgbe_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		IXGBE_WRITE_REG(&adapter->hw, IXGBE_WUS, ~0);
 		break;
 	default:
@@ -8161,6 +8219,8 @@ skip_sriov:
 	switch (adapter->hw.mac.type) {
 	case ixgbe_mac_82599EB:
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		netdev->features |= NETIF_F_SCTP_CSUM;
 		netdev->hw_features |= NETIF_F_SCTP_CSUM |
 				       NETIF_F_NTUPLE;
@@ -8514,6 +8574,12 @@ static pci_ers_result_t ixgbe_io_error_detected(struct pci_dev *pdev,
 		case ixgbe_mac_X540:
 			device_id = IXGBE_X540_VF_DEVICE_ID;
 			break;
+		case ixgbe_mac_X550:
+			device_id = IXGBE_DEV_ID_X550_VF;
+			break;
+		case ixgbe_mac_X550EM_x:
+			device_id = IXGBE_DEV_ID_X550EM_X_VF;
+			break;
 		default:
 			device_id = 0;
 			break;
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c
index cc8f012..9993a47 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_mbx.c
@@ -305,6 +305,8 @@ static s32 ixgbe_check_for_rst_pf(struct ixgbe_hw *hw, u16 vf_number)
 		vflre = IXGBE_READ_REG(hw, IXGBE_VFLRE(reg_offset));
 		break;
 	case ixgbe_mac_X540:
+	case ixgbe_mac_X550:
+	case ixgbe_mac_X550EM_x:
 		vflre = IXGBE_READ_REG(hw, IXGBE_VFLREC(reg_offset));
 		break;
 	default:
@@ -426,6 +428,8 @@ void ixgbe_init_mbx_params_pf(struct ixgbe_hw *hw)
 	struct ixgbe_mbx_info *mbx = &hw->mbx;
 
 	if (hw->mac.type != ixgbe_mac_82599EB &&
+	    hw->mac.type != ixgbe_mac_X550 &&
+	    hw->mac.type != ixgbe_mac_X550EM_x &&
 	    hw->mac.type != ixgbe_mac_X540)
 		return;
 
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
index d47b19f..dc97c03 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c
@@ -43,7 +43,7 @@ static s32 ixgbe_clock_out_i2c_bit(struct ixgbe_hw *hw, bool data);
 static void ixgbe_raise_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl);
 static void ixgbe_lower_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl);
 static s32 ixgbe_set_i2c_data(struct ixgbe_hw *hw, u32 *i2cctl, bool data);
-static bool ixgbe_get_i2c_data(u32 *i2cctl);
+static bool ixgbe_get_i2c_data(struct ixgbe_hw *hw, u32 *i2cctl);
 static void ixgbe_i2c_bus_clear(struct ixgbe_hw *hw);
 static enum ixgbe_phy_type ixgbe_get_phy_type_from_id(u32 phy_id);
 static s32 ixgbe_get_phy_id(struct ixgbe_hw *hw);
@@ -576,6 +576,10 @@ s32 ixgbe_get_copper_link_capabilities_generic(struct ixgbe_hw *hw,
 			*speed |= IXGBE_LINK_SPEED_100_FULL;
 	}
 
+	/* Internal PHY does not support 100 Mbps */
+	if (hw->mac.type == ixgbe_mac_X550EM_x)
+		*speed &= ~IXGBE_LINK_SPEED_100_FULL;
+
 	return status;
 }
 
@@ -632,10 +636,12 @@ s32 ixgbe_check_phy_link_tnx(struct ixgbe_hw *hw, ixgbe_link_speed *speed,
  *	@hw: pointer to hardware structure
  *
  *	Restart autonegotiation and PHY and waits for completion.
+ *      This function always returns success, this is nessary since
+ *	it is called via a function pointer that could call other
+ *	functions that could return an error.
  **/
 s32 ixgbe_setup_phy_link_tnx(struct ixgbe_hw *hw)
 {
-	s32 status;
 	u16 autoneg_reg = IXGBE_MII_AUTONEG_REG;
 	bool autoneg = false;
 	ixgbe_link_speed speed;
@@ -701,7 +707,7 @@ s32 ixgbe_setup_phy_link_tnx(struct ixgbe_hw *hw)
 	hw->phy.ops.write_reg(hw, MDIO_CTRL1,
 			      MDIO_MMD_AN, autoneg_reg);
 
-	return status;
+	return 0;
 }
 
 /**
@@ -1612,7 +1618,7 @@ fail:
  **/
 static void ixgbe_i2c_start(struct ixgbe_hw *hw)
 {
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 
 	/* Start condition must begin with data and clock high */
 	ixgbe_set_i2c_data(hw, &i2cctl, 1);
@@ -1641,7 +1647,7 @@ static void ixgbe_i2c_start(struct ixgbe_hw *hw)
  **/
 static void ixgbe_i2c_stop(struct ixgbe_hw *hw)
 {
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 
 	/* Stop condition must begin with data low and clock high */
 	ixgbe_set_i2c_data(hw, &i2cctl, 0);
@@ -1699,9 +1705,9 @@ static s32 ixgbe_clock_out_i2c_byte(struct ixgbe_hw *hw, u8 data)
 	}
 
 	/* Release SDA line (set high) */
-	i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-	i2cctl |= IXGBE_I2C_DATA_OUT;
-	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, i2cctl);
+	i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+	i2cctl |= IXGBE_I2C_DATA_OUT_BY_MAC(hw);
+	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), i2cctl);
 	IXGBE_WRITE_FLUSH(hw);
 
 	return status;
@@ -1717,7 +1723,7 @@ static s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
 {
 	s32 status = 0;
 	u32 i = 0;
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 	u32 timeout = 10;
 	bool ack = true;
 
@@ -1730,8 +1736,8 @@ static s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
 	/* Poll for ACK.  Note that ACK in I2C spec is
 	 * transition from 1 to 0 */
 	for (i = 0; i < timeout; i++) {
-		i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-		ack = ixgbe_get_i2c_data(&i2cctl);
+		i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+		ack = ixgbe_get_i2c_data(hw, &i2cctl);
 
 		udelay(1);
 		if (ack == 0)
@@ -1760,15 +1766,15 @@ static s32 ixgbe_get_i2c_ack(struct ixgbe_hw *hw)
  **/
 static s32 ixgbe_clock_in_i2c_bit(struct ixgbe_hw *hw, bool *data)
 {
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 
 	ixgbe_raise_i2c_clk(hw, &i2cctl);
 
 	/* Minimum high period of clock is 4us */
 	udelay(IXGBE_I2C_T_HIGH);
 
-	i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-	*data = ixgbe_get_i2c_data(&i2cctl);
+	i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+	*data = ixgbe_get_i2c_data(hw, &i2cctl);
 
 	ixgbe_lower_i2c_clk(hw, &i2cctl);
 
@@ -1788,7 +1794,7 @@ static s32 ixgbe_clock_in_i2c_bit(struct ixgbe_hw *hw, bool *data)
 static s32 ixgbe_clock_out_i2c_bit(struct ixgbe_hw *hw, bool data)
 {
 	s32 status;
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 
 	status = ixgbe_set_i2c_data(hw, &i2cctl, data);
 	if (status == 0) {
@@ -1824,14 +1830,14 @@ static void ixgbe_raise_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl)
 	u32 i2cctl_r = 0;
 
 	for (i = 0; i < timeout; i++) {
-		*i2cctl |= IXGBE_I2C_CLK_OUT;
-		IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, *i2cctl);
+		*i2cctl |= IXGBE_I2C_CLK_OUT_BY_MAC(hw);
+		IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), *i2cctl);
 		IXGBE_WRITE_FLUSH(hw);
 		/* SCL rise time (1000ns) */
 		udelay(IXGBE_I2C_T_RISE);
 
-		i2cctl_r = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-		if (i2cctl_r & IXGBE_I2C_CLK_IN)
+		i2cctl_r = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+		if (i2cctl_r & IXGBE_I2C_CLK_IN_BY_MAC(hw))
 			break;
 	}
 }
@@ -1846,9 +1852,9 @@ static void ixgbe_raise_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl)
 static void ixgbe_lower_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl)
 {
 
-	*i2cctl &= ~IXGBE_I2C_CLK_OUT;
+	*i2cctl &= ~IXGBE_I2C_CLK_OUT_BY_MAC(hw);
 
-	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, *i2cctl);
+	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), *i2cctl);
 	IXGBE_WRITE_FLUSH(hw);
 
 	/* SCL fall time (300ns) */
@@ -1866,19 +1872,19 @@ static void ixgbe_lower_i2c_clk(struct ixgbe_hw *hw, u32 *i2cctl)
 static s32 ixgbe_set_i2c_data(struct ixgbe_hw *hw, u32 *i2cctl, bool data)
 {
 	if (data)
-		*i2cctl |= IXGBE_I2C_DATA_OUT;
+		*i2cctl |= IXGBE_I2C_DATA_OUT_BY_MAC(hw);
 	else
-		*i2cctl &= ~IXGBE_I2C_DATA_OUT;
+		*i2cctl &= ~IXGBE_I2C_DATA_OUT_BY_MAC(hw);
 
-	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL, *i2cctl);
+	IXGBE_WRITE_REG(hw, IXGBE_I2CCTL_BY_MAC(hw), *i2cctl);
 	IXGBE_WRITE_FLUSH(hw);
 
 	/* Data rise/fall (1000ns/300ns) and set-up time (250ns) */
 	udelay(IXGBE_I2C_T_RISE + IXGBE_I2C_T_FALL + IXGBE_I2C_T_SU_DATA);
 
 	/* Verify data was set correctly */
-	*i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
-	if (data != ixgbe_get_i2c_data(i2cctl)) {
+	*i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
+	if (data != ixgbe_get_i2c_data(hw, i2cctl)) {
 		hw_dbg(hw, "Error - I2C data was not set to %X.\n", data);
 		return IXGBE_ERR_I2C;
 	}
@@ -1893,9 +1899,9 @@ static s32 ixgbe_set_i2c_data(struct ixgbe_hw *hw, u32 *i2cctl, bool data)
  *
  *  Returns the I2C data bit value
  **/
-static bool ixgbe_get_i2c_data(u32 *i2cctl)
+static bool ixgbe_get_i2c_data(struct ixgbe_hw *hw, u32 *i2cctl)
 {
-	if (*i2cctl & IXGBE_I2C_DATA_IN)
+	if (*i2cctl & IXGBE_I2C_DATA_IN_BY_MAC(hw))
 		return true;
 	return false;
 }
@@ -1909,7 +1915,7 @@ static bool ixgbe_get_i2c_data(u32 *i2cctl)
  **/
 static void ixgbe_i2c_bus_clear(struct ixgbe_hw *hw)
 {
-	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL);
+	u32 i2cctl = IXGBE_READ_REG(hw, IXGBE_I2CCTL_BY_MAC(hw));
 	u32 i;
 
 	ixgbe_i2c_start(hw);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
index 0c25df5..04eee7c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c
@@ -1109,6 +1109,12 @@ static int ixgbe_enable_port_vlan(struct ixgbe_adapter *adapter, int vf,
 	if (adapter->vfinfo[vf].spoofchk_enabled)
 		hw->mac.ops.set_vlan_anti_spoofing(hw, true, vf);
 	adapter->vfinfo[vf].vlan_count++;
+
+	/* enable hide vlan on X550 */
+	if (hw->mac.type >= ixgbe_mac_X550)
+		ixgbe_write_qde(adapter, vf, IXGBE_QDE_ENABLE |
+				IXGBE_QDE_HIDE_VLAN);
+
 	adapter->vfinfo[vf].pf_vlan = vlan;
 	adapter->vfinfo[vf].pf_qos = qos;
 	dev_info(&adapter->pdev->dev,
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
index dfd55d8..64de20d 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_type.h
@@ -77,6 +77,8 @@
 /* VF Device IDs */
 #define IXGBE_DEV_ID_82599_VF           0x10ED
 #define IXGBE_DEV_ID_X540_VF            0x1515
+#define IXGBE_DEV_ID_X550_VF		0x1565
+#define IXGBE_DEV_ID_X550EM_X_VF	0x15A8
 
 /* General Registers */
 #define IXGBE_CTRL      0x00000
@@ -84,7 +86,8 @@
 #define IXGBE_CTRL_EXT  0x00018
 #define IXGBE_ESDP      0x00020
 #define IXGBE_EODSDP    0x00028
-#define IXGBE_I2CCTL    0x00028
+#define IXGBE_I2CCTL_BY_MAC(_hw)((((_hw)->mac.type >= ixgbe_mac_X550) ? \
+					0x15F5C : 0x00028))
 #define IXGBE_LEDCTL    0x00200
 #define IXGBE_FRTIMER   0x00048
 #define IXGBE_TCPTIMER  0x0004C
@@ -112,10 +115,14 @@
 #define IXGBE_VPDDIAG1  0x10208
 
 /* I2CCTL Bit Masks */
-#define IXGBE_I2C_CLK_IN    0x00000001
-#define IXGBE_I2C_CLK_OUT   0x00000002
-#define IXGBE_I2C_DATA_IN   0x00000004
-#define IXGBE_I2C_DATA_OUT  0x00000008
+#define IXGBE_I2C_CLK_IN_BY_MAC(_hw)(((_hw)->mac.type) >= ixgbe_mac_X550 ? \
+					0x00004000 : 0x00000001)
+#define IXGBE_I2C_CLK_OUT_BY_MAC(_hw)(((_hw)->mac.type) >= ixgbe_mac_X550 ? \
+					0x00000200 : 0x00000002)
+#define IXGBE_I2C_DATA_IN_BY_MAC(_hw)(((_hw)->mac.type) >= ixgbe_mac_X550 ? \
+					0x00001000 : 0x00000004)
+#define IXGBE_I2C_DATA_OUT_BY_MAC(_hw)(((_hw)->mac.type) >= ixgbe_mac_X550 ? \
+					0x00000400 : 0x00000008)
 #define IXGBE_I2C_CLOCK_STRETCHING_TIMEOUT	500
 
 #define IXGBE_I2C_THERMAL_SENSOR_ADDR	0xF8
@@ -292,6 +299,14 @@ struct ixgbe_thermal_sensor_data {
 #define IXGBE_RETA(_i)  (0x05C00 + ((_i) * 4))  /* 32 of these (0-31) */
 #define IXGBE_RSSRK(_i) (0x05C80 + ((_i) * 4))  /* 10 of these (0-9) */
 
+/* Registers for setting up RSS on X550 with SRIOV
+ * _p - pool number (0..63)
+ * _i - index (0..10 for PFVFRSSRK, 0..15 for PFVFRETA)
+ */
+#define IXGBE_PFVFMRQC(_p)	(0x03400 + ((_p) * 4))
+#define IXGBE_PFVFRSSRK(_i, _p)	(0x018000 + ((_i) * 4) + ((_p) * 0x40))
+#define IXGBE_PFVFRETA(_i, _p)	(0x019000 + ((_i) * 4) + ((_p) * 0x40))
+
 /* Flow Director registers */
 #define IXGBE_FDIRCTRL  0x0EE00
 #define IXGBE_FDIRHKEY  0x0EE68
@@ -798,6 +813,12 @@ struct ixgbe_thermal_sensor_data {
 #define IXGBE_PBACLR_82599      0x11068
 #define IXGBE_CIAA_82599        0x11088
 #define IXGBE_CIAD_82599        0x1108C
+#define IXGBE_CIAA_X550         0x11508
+#define IXGBE_CIAD_X550         0x11510
+#define IXGBE_CIAA_BY_MAC(_hw)  ((((_hw)->mac.type >= ixgbe_mac_X550) ? \
+				IXGBE_CIAA_X550 : IXGBE_CIAA_82599))
+#define IXGBE_CIAD_BY_MAC(_hw)  ((((_hw)->mac.type >= ixgbe_mac_X550) ? \
+				IXGBE_CIAD_X550 : IXGBE_CIAD_82599))
 #define IXGBE_PICAUSE           0x110B0
 #define IXGBE_PIENA             0x110B8
 #define IXGBE_CDQ_MBR_82599     0x110B4
@@ -1632,6 +1653,7 @@ enum {
 #define IXGBE_LINKS_TL_FAULT    0x00001000
 #define IXGBE_LINKS_SIGNAL      0x00000F00
 
+#define IXGBE_LINKS_SPEED_NON_STD   0x08000000
 #define IXGBE_LINKS_SPEED_82599     0x30000000
 #define IXGBE_LINKS_SPEED_10G_82599 0x30000000
 #define IXGBE_LINKS_SPEED_1G_82599  0x20000000
@@ -2000,6 +2022,7 @@ enum {
 
 /* Queue Drop Enable */
 #define IXGBE_QDE_ENABLE	0x00000001
+#define IXGBE_QDE_HIDE_VLAN	0x00000002
 #define IXGBE_QDE_IDX_MASK	0x00007F00
 #define IXGBE_QDE_IDX_SHIFT	8
 #define IXGBE_QDE_WRITE		0x00010000
@@ -2437,10 +2460,12 @@ struct ixgbe_adv_tx_context_desc {
 typedef u32 ixgbe_autoneg_advertised;
 /* Link speed */
 typedef u32 ixgbe_link_speed;
-#define IXGBE_LINK_SPEED_UNKNOWN   0
-#define IXGBE_LINK_SPEED_100_FULL  0x0008
-#define IXGBE_LINK_SPEED_1GB_FULL  0x0020
-#define IXGBE_LINK_SPEED_10GB_FULL 0x0080
+#define IXGBE_LINK_SPEED_UNKNOWN	0
+#define IXGBE_LINK_SPEED_100_FULL	0x0008
+#define IXGBE_LINK_SPEED_1GB_FULL	0x0020
+#define IXGBE_LINK_SPEED_2_5GB_FULL	0x0400
+#define IXGBE_LINK_SPEED_5GB_FULL	0x0800
+#define IXGBE_LINK_SPEED_10GB_FULL	0x0080
 #define IXGBE_LINK_SPEED_82598_AUTONEG (IXGBE_LINK_SPEED_1GB_FULL | \
 					IXGBE_LINK_SPEED_10GB_FULL)
 #define IXGBE_LINK_SPEED_82599_AUTONEG (IXGBE_LINK_SPEED_100_FULL | \
@@ -2588,6 +2613,8 @@ enum ixgbe_mac_type {
 	ixgbe_mac_82598EB,
 	ixgbe_mac_82599EB,
 	ixgbe_mac_X540,
+	ixgbe_mac_X550,
+	ixgbe_mac_X550EM_x,
 	ixgbe_num_macs
 };
 
-- 
1.9.3

^ permalink raw reply related

* [net-next v2 10/10] ixgbe: add helper function for setting RSS key in preparation of X550
From: Jeff Kirsher @ 2014-11-11 14:44 UTC (permalink / raw)
  To: davem
  Cc: Don Skidmore, netdev, nhorman, sassmann, jogreene, John Ronciak,
	Jeff Kirsher
In-Reply-To: <1415717086-3441-1-git-send-email-jeffrey.t.kirsher@intel.com>

From: Don Skidmore <donald.c.skidmore@intel.com>

Split off the setting of the RSS key into its own function.  This
will help when we add support for X550 which can have different
RSS keys per pool.

CC: John Ronciak <john.ronciak@intel.com>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 32 +++++++++++++++++----------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 355d1f7..a5ca877 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -3210,14 +3210,10 @@ static void ixgbe_configure_srrctl(struct ixgbe_adapter *adapter,
 	IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(reg_idx), srrctl);
 }
 
-static void ixgbe_setup_mrqc(struct ixgbe_adapter *adapter)
+static void ixgbe_setup_reta(struct ixgbe_adapter *adapter, const u32 *seed)
 {
 	struct ixgbe_hw *hw = &adapter->hw;
-	static const u32 seed[10] = { 0xE291D73D, 0x1805EC6C, 0x2A94B30D,
-			  0xA54F2BEC, 0xEA49AF7C, 0xE214AD3D, 0xB855AABE,
-			  0x6A3E67EA, 0x14364D17, 0x3BED200D};
-	u32 mrqc = 0, reta = 0;
-	u32 rxcsum;
+	u32 reta = 0;
 	int i, j;
 	u16 rss_i = adapter->ring_feature[RING_F_RSS].indices;
 
@@ -3243,6 +3239,16 @@ static void ixgbe_setup_mrqc(struct ixgbe_adapter *adapter)
 		if ((i & 3) == 3)
 			IXGBE_WRITE_REG(hw, IXGBE_RETA(i >> 2), reta);
 	}
+}
+
+static void ixgbe_setup_mrqc(struct ixgbe_adapter *adapter)
+{
+	struct ixgbe_hw *hw = &adapter->hw;
+	static const u32 seed[10] = { 0xE291D73D, 0x1805EC6C, 0x2A94B30D,
+			  0xA54F2BEC, 0xEA49AF7C, 0xE214AD3D, 0xB855AABE,
+			  0x6A3E67EA, 0x14364D17, 0x3BED200D};
+	u32 mrqc = 0, rss_field = 0;
+	u32 rxcsum;
 
 	/* Disable indicating checksum in descriptor, enables RSS hash */
 	rxcsum = IXGBE_READ_REG(hw, IXGBE_RXCSUM);
@@ -3275,16 +3281,18 @@ static void ixgbe_setup_mrqc(struct ixgbe_adapter *adapter)
 	}
 
 	/* Perform hash on these packet types */
-	mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4 |
-		IXGBE_MRQC_RSS_FIELD_IPV4_TCP |
-		IXGBE_MRQC_RSS_FIELD_IPV6 |
-		IXGBE_MRQC_RSS_FIELD_IPV6_TCP;
+	rss_field |= IXGBE_MRQC_RSS_FIELD_IPV4 |
+		     IXGBE_MRQC_RSS_FIELD_IPV4_TCP |
+		     IXGBE_MRQC_RSS_FIELD_IPV6 |
+		     IXGBE_MRQC_RSS_FIELD_IPV6_TCP;
 
 	if (adapter->flags2 & IXGBE_FLAG2_RSS_FIELD_IPV4_UDP)
-		mrqc |= IXGBE_MRQC_RSS_FIELD_IPV4_UDP;
+		rss_field |= IXGBE_MRQC_RSS_FIELD_IPV4_UDP;
 	if (adapter->flags2 & IXGBE_FLAG2_RSS_FIELD_IPV6_UDP)
-		mrqc |= IXGBE_MRQC_RSS_FIELD_IPV6_UDP;
+		rss_field |= IXGBE_MRQC_RSS_FIELD_IPV6_UDP;
 
+	ixgbe_setup_reta(adapter, seed);
+	mrqc |= rss_field;
 	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
 }
 
-- 
1.9.3

^ permalink raw reply related

* [PATCH]  net: phy: Correctly handle MII ioctl which changes autonegotiation.
From: Brian Hill @ 2014-11-11 14:53 UTC (permalink / raw)
  To: netdev
In-Reply-To: <5462210F.7040306@houston-radar.com>


When advertised capabilities are changed with mii-tool, such as:
mii-tool -A 10baseT
the existing handler has two errors.

- An actual PHY register value is provided by mii-tool, and this
  must be mapped to internal state with mii_adv_to_ethtool_adv_t().
- The PHY state machine needs to be told that autonegotiation has
  again been performed.  If not, the MAC will not be notified of
  the new link speed and duplex, resulting in a possible config
  mismatch.

Signed-off-by: Brian Hill <Brian@houston-radar.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
---
 drivers/net/phy/phy.c |   36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index c94e2a2..ee9f0c9 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -352,6 +352,7 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd)
 {
 	struct mii_ioctl_data *mii_data = if_mii(ifr);
 	u16 val = mii_data->val_in;
+	int change_autoneg = 0;
 
 	switch (cmd) {
 	case SIOCGMIIPHY:
@@ -367,22 +368,29 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd)
 		if (mii_data->phy_id == phydev->addr) {
 			switch (mii_data->reg_num) {
 			case MII_BMCR:
-				if ((val & (BMCR_RESET | BMCR_ANENABLE)) == 0)
+				if ((val & (BMCR_RESET | BMCR_ANENABLE)) == 0) {
+					if (phydev->autoneg == AUTONEG_ENABLE)
+						change_autoneg = 1;
 					phydev->autoneg = AUTONEG_DISABLE;
-				else
+					if (val & BMCR_FULLDPLX)
+						phydev->duplex = DUPLEX_FULL;
+					else
+						phydev->duplex = DUPLEX_HALF;
+					if (val & BMCR_SPEED1000)
+						phydev->speed = SPEED_1000;
+					else if (val & BMCR_SPEED100)
+						phydev->speed = SPEED_100;
+					else phydev->speed = SPEED_10;
+				}
+				else {
+					if (phydev->autoneg == AUTONEG_DISABLE)
+						change_autoneg = 1;
 					phydev->autoneg = AUTONEG_ENABLE;
-				if (!phydev->autoneg && (val & BMCR_FULLDPLX))
-					phydev->duplex = DUPLEX_FULL;
-				else
-					phydev->duplex = DUPLEX_HALF;
-				if (!phydev->autoneg && (val & BMCR_SPEED1000))
-					phydev->speed = SPEED_1000;
-				else if (!phydev->autoneg &&
-					 (val & BMCR_SPEED100))
-					phydev->speed = SPEED_100;
+				}
 				break;
 			case MII_ADVERTISE:
-				phydev->advertising = val;
+				phydev->advertising = mii_adv_to_ethtool_adv_t(val);
+				change_autoneg = 1;
 				break;
 			default:
 				/* do nothing */
@@ -396,6 +404,10 @@ int phy_mii_ioctl(struct phy_device *phydev, struct ifreq *ifr, int cmd)
 		if (mii_data->reg_num == MII_BMCR &&
 		    val & BMCR_RESET)
 			return phy_init_hw(phydev);
+
+		if (change_autoneg)
+			return phy_start_aneg(phydev);
+
 		return 0;
 
 	case SIOCSHWTSTAMP:
-- 
1.7.9.5

^ permalink raw reply related

* Re: [patch net-next 1/2] net: move vlan pop/push functions into common code
From: Jiri Pirko @ 2014-11-11 15:00 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, davem, jhs, pshelar, therbert, edumazet, willemb,
	dborkman, mst, fw, Paul.Durrant, tgraf
In-Reply-To: <1415711181.9613.32.camel@edumazet-glaptop2.roam.corp.google.com>

Tue, Nov 11, 2014 at 02:06:21PM CET, eric.dumazet@gmail.com wrote:
>On Tue, 2014-11-11 at 11:13 +0100, Jiri Pirko wrote:
>> So it can be used from out of openvswitch code.
>> Did couple of cosmetic changes on the way, namely variable naming and
>> adding support for 8021AD proto.
>> 
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>> ---
>>  include/linux/skbuff.h    |  2 ++
>>  net/core/skbuff.c         | 86 +++++++++++++++++++++++++++++++++++++++++++++++
>>  net/openvswitch/actions.c | 76 ++---------------------------------------
>>  3 files changed, 90 insertions(+), 74 deletions(-)
>> 
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 103fbe8..3b0445c 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -2673,6 +2673,8 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet);
>>  unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
>>  struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
>>  struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
>> +int skb_vlan_pop(struct sk_buff *skb);
>> +int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
>>  
>>  struct skb_checksum_ops {
>>  	__wsum (*update)(const void *mem, int len, __wsum wsum);
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 7001896..75e53d4 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -4151,6 +4151,92 @@ err_free:
>>  }
>>  EXPORT_SYMBOL(skb_vlan_untag);
>>  
>> +/* remove VLAN header from packet and update csum accordingly. */
>> +static int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci)
>> +{
>> +	struct vlan_hdr *vhdr;
>> +	int err;
>> +
>
>
>
>> +	if (!pskb_may_pull(skb, VLAN_ETH_HLEN))
>> +		return -ENOMEM;
>> +
>> +	if (!skb_cloned(skb) || skb_clone_writable(skb, VLAN_ETH_HLEN))
>> +		return 0;
>> +
>> +	err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC);
>> +	if (unlikely(err))
>> +		return err;
>
>All this should be a function of its own (OVS calls this
>make_writable()).
>
>Its too bad netfilter has a different skb_make_writable()

How different is that? Looking at it it seems it is doing the same
thing. Not sure though...

>
>
>> +
>> +	if (skb->ip_summed == CHECKSUM_COMPLETE)
>> +		skb->csum = csum_sub(skb->csum, csum_partial(skb->data
>> +					+ (2 * ETH_ALEN), VLAN_HLEN, 0));
>
>This looks like skb_postpull_rcsum()
>
>BTW, calling csum_partial() for 4 bytes is quite expensive.
>
>
>

^ permalink raw reply

* Re: [patch net-next v2 02/10] net: introduce generic switch devices support
From: Jiri Pirko @ 2014-11-11 15:11 UTC (permalink / raw)
  To: John Fastabend
  Cc: netdev, davem, nhorman, andy, tgraf, dborkman, ogerlitz, jesse,
	pshelar, azhou, ben, stephen, jeffrey.t.kirsher, vyasevic,
	xiyou.wangcong, john.r.fastabend, edumazet, jhs, sfeldma,
	f.fainelli, roopa, linville, jasowang, ebiederm, nicolas.dichtel,
	ryazanov.s.a, buytenh, aviadr, nbd, alexei.starovoitov,
	Neil.Jerram, ronye, simon.horman, alexander.h.duyck, john.ronciak,
	mleitner, shrijeet, gospo, bcrl
In-Reply-To: <5461354A.3020906@gmail.com>

Mon, Nov 10, 2014 at 10:59:38PM CET, john.fastabend@gmail.com wrote:
>On 11/09/2014 02:51 AM, Jiri Pirko wrote:
>>The goal of this is to provide a possibility to support various switch
>>chips. Drivers should implement relevant ndos to do so. Now there is
>>only one ndo defined:
>>- for getting physical switch id is in place.
>>
>>Note that user can use random port netdevice to access the switch.
>>
>>Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>---
>>  Documentation/networking/switchdev.txt | 59 ++++++++++++++++++++++++++++++++++
>>  MAINTAINERS                            |  7 ++++
>>  include/linux/netdevice.h              | 10 ++++++
>>  include/net/switchdev.h                | 30 +++++++++++++++++
>>  net/Kconfig                            |  1 +
>>  net/Makefile                           |  3 ++
>>  net/switchdev/Kconfig                  | 13 ++++++++
>>  net/switchdev/Makefile                 |  5 +++
>>  net/switchdev/switchdev.c              | 33 +++++++++++++++++++
>>  9 files changed, 161 insertions(+)
>>  create mode 100644 Documentation/networking/switchdev.txt
>>  create mode 100644 include/net/switchdev.h
>>  create mode 100644 net/switchdev/Kconfig
>>  create mode 100644 net/switchdev/Makefile
>>  create mode 100644 net/switchdev/switchdev.c
>>
>>diff --git a/Documentation/networking/switchdev.txt b/Documentation/networking/switchdev.txt
>>new file mode 100644
>>index 0000000..98be76c
>>--- /dev/null
>>+++ b/Documentation/networking/switchdev.txt
>>@@ -0,0 +1,59 @@
>>+Switch (and switch-ish) device drivers HOWTO
>>+===========================
>>+
>>+Please note that the word "switch" is here used in very generic meaning.
>>+This include devices supporting L2/L3 but also various flow offloading chips,
>>+including switches embedded into SR-IOV NICs.
>>+
>>+Lets describe a topology a bit. Imagine the following example:
>>+
>>+       +----------------------------+    +---------------+
>>+       |     SOME switch chip       |    |      CPU      |
>>+       +----------------------------+    +---------------+
>>+       port1 port2 port3 port4 MNGMNT    |     PCI-E     |
>>+         |     |     |     |     |       +---------------+
>>+        PHY   PHY    |     |     |         |  NIC0 NIC1
>>+                     |     |     |         |   |    |
>>+                     |     |     +- PCI-E -+   |    |
>>+                     |     +------- MII -------+    |
>>+                     +------------- MII ------------+
>>+
>>+In this example, there are two independent lines between the switch silicon
>>+and CPU. NIC0 and NIC1 drivers are not aware of a switch presence. They are
>>+separate from the switch driver. SOME switch chip is by managed by a driver
>>+via PCI-E device MNGMNT. Note that MNGMNT device, NIC0 and NIC1 may be
>>+connected to some other type of bus.
>>+
>>+Now, for the previous example show the representation in kernel:
>>+
>>+       +----------------------------+    +---------------+
>>+       |     SOME switch chip       |    |      CPU      |
>>+       +----------------------------+    +---------------+
>>+       sw0p0 sw0p1 sw0p2 sw0p3 MNGMNT    |     PCI-E     |
>>+         |     |     |     |     |       +---------------+
>>+        PHY   PHY    |     |     |         |  eth0 eth1
>>+                     |     |     |         |   |    |
>>+                     |     |     +- PCI-E -+   |    |
>>+                     |     +------- MII -------+    |
>>+                     +------------- MII ------------+
>>+
>>+Lets call the example switch driver for SOME switch chip "SOMEswitch". This
>>+driver takes care of PCI-E device MNGMNT. There is a netdevice instance sw0pX
>>+created for each port of a switch. These netdevices are instances
>>+of "SOMEswitch" driver. sw0pX netdevices serve as a "representation"
>>+of the switch chip. eth0 and eth1 are instances of some other existing driver.
>>+
>>+The only difference of the switch-port netdevice from the ordinary netdevice
>>+is that is implements couple more NDOs:
>>+
>>+	ndo_sw_parent_get_id - This returns the same ID for two port netdevices
>>+			       of the same physical switch chip. This is
>>+			       mandatory to be implemented by all switch drivers
>>+			       and serves the caller for recognition of a port
>>+			       netdevice.
>
>What is the connection between ndo_sw_parent_get_id and
>ndo_get_phys_port_id(). I'm having a bit of trouble teasing
>this out.
>
>For example here is my ascii art for a SR-IOV NIC,
>
>       eth0     eth1     eth2
>        |         |        |
>        |         |        |
>        PF        VF       VF
>   +----+---------+--------+----+
>   |       embedded bridge      |
>   +-------------+--------------+
>                 |
>                port
>
>that can do switching between the various uplinks and downlinks.
>In IEEE 802.1Q language the embedded bridge acts like an edge
>relay. At least that seems to be the current state of the art
>for SR-IOV. Edge relay just means it has a single uplink port
>to the network and multiple downlinks and also isn't required
>to do learning and run loop detection protocols STP, et. al.
>
>Also there are multi-function devices that look the same except
>replace the VFs with PFs. It seems to be a common mode for NICs
>that do the iSCSI offloads with storage functions.
>
>When something is an embedded bridge vs a SOME switch chip is
>not entirely clear.
>
>My understanding is use ndo_sw_parent_get_id() when you have
>multiple physical ports all connected to a single switch object.
>When you have a single port connected to multiple PCIE functions
>or queues representing a netdev (e.g. macvlan offload) use the
>ndo_get_phys_port_id(). Just want to be sure we are on the
>same page here.

Nod. You described that right.


>
>Otherwise patch looks good. I think we can clear the above up
>with an addition to the documentation. Could go in after the
>initial set and be OK with me.
>
>IMO this patch is needed otherwise user space is at a complete
>loss on trying to figure out how netdevs map to switch silicon.
>You could have reused ndo_get_phys_port_id() perhaps but then
>I think user space may get confused by SR-IOV/VMDQ/etc ports
>attached to a switch silicon. For .02$ having a new distinct
>identifier is cleaner.

It most definitelly is. Therefore I went that way.


>
>
>>+	ndo_sw_parent_* - Functions that serve for a manipulation of the switch
>>+			  chip itself (it can be though of as a "parent" of the
>>+			  port, therefore the name). They are not port-specific.
>>+			  Caller might use arbitrary port netdevice of the same
>>+			  switch and it will make no difference.
>>+	ndo_sw_port_* - Functions that serve for a port-specific manipulation.
>
>[...]
>
>Thanks,
>John
>
>
>-- 
>John Fastabend         Intel Corporation

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox