Netdev List
* [PATCH iproute2 3/3] link_vti6: Always add local/remote endpoint attributes
From: Serhey Popovych @ 2017-12-18 17:48 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1513619285-23187-1-git-send-email-serhe.popovych@gmail.com>

All tunnels already support parsing/adding zero
endpoints, and vti6 isn't an exception.

This check was added as part of commit 2a80154fde40
(vti6: fix local/remote any addr handling) and looks
too restrictive, as the purpose of that change was to avoid
configuring endpoints from uninitialized data.

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
---
 ip/link_vti6.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/ip/link_vti6.c b/ip/link_vti6.c
index f631839..4136b0e 100644
--- a/ip/link_vti6.c
+++ b/ip/link_vti6.c
@@ -154,10 +154,8 @@ get_failed:
 	addattr32(n, 1024, IFLA_VTI_IKEY, ikey);
 	addattr32(n, 1024, IFLA_VTI_OKEY, okey);
 
-	if (memcmp(&saddr, &in6addr_any, sizeof(in6addr_any)))
-	    addattr_l(n, 1024, IFLA_VTI_LOCAL, &saddr, sizeof(saddr));
-	if (memcmp(&daddr, &in6addr_any, sizeof(in6addr_any)))
-	    addattr_l(n, 1024, IFLA_VTI_REMOTE, &daddr, sizeof(daddr));
+	addattr_l(n, 1024, IFLA_VTI_LOCAL, &saddr, sizeof(saddr));
+	addattr_l(n, 1024, IFLA_VTI_REMOTE, &daddr, sizeof(daddr));
 	addattr32(n, 1024, IFLA_VTI_FWMARK, fwmark);
 	if (link)
 		addattr32(n, 1024, IFLA_VTI_LINK, link);
-- 
1.7.10.4
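
The effect of dropping the memcmp() guard can be shown with a minimal
user-space sketch. This is not iproute2 code: `struct fake_req` and
`fake_addattr()` are hypothetical stand-ins for the netlink request and
for addattr_l(), and they only count attributes. The point is that the
old logic never emitted an attribute for the "any" address (::), while
the new logic always does, which is what lets a zero endpoint reach the
kernel as an explicit value.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical stand-in for a netlink request: only counts attributes. */
struct fake_req {
	int nattrs;
};

/* Hypothetical stand-in for addattr_l(); the payload is ignored here. */
static void fake_addattr(struct fake_req *req, const struct in6_addr *addr)
{
	assert(req && addr);
	req->nattrs++;
}

/* Old behaviour: skip the attribute when the endpoint is the any address. */
int old_add_endpoint(struct fake_req *req, const struct in6_addr *addr)
{
	if (memcmp(addr, &in6addr_any, sizeof(in6addr_any)))
		fake_addattr(req, addr);
	return req->nattrs;
}

/* New behaviour: always add the attribute, even for the any address. */
int new_add_endpoint(struct fake_req *req, const struct in6_addr *addr)
{
	fake_addattr(req, addr);
	return req->nattrs;
}
```

With an address initialized from IN6ADDR_ANY_INIT, old_add_endpoint()
leaves the attribute count unchanged, while new_add_endpoint() increments
it.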


* [PATCH iproute2 2/3] link_ip6tnl: Use IN6ADDR_ANY_INIT to initialize local/remote endpoints
From: Serhey Popovych @ 2017-12-18 17:48 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1513619285-23187-1-git-send-email-serhe.popovych@gmail.com>

Use specialized helper to initialize endpoint addresses with
zeros instead of open coding this. This unifies initialization
style with other ipv6 tunnel variants (i.e. gre6 and vti6).

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
---
 ip/link_ip6tnl.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ip/link_ip6tnl.c b/ip/link_ip6tnl.c
index 83a4320..f11ddd5 100644
--- a/ip/link_ip6tnl.c
+++ b/ip/link_ip6tnl.c
@@ -88,8 +88,8 @@ static int ip6tunnel_parse_opt(struct link_util *lu, int argc, char **argv,
 	struct rtattr *linkinfo[IFLA_INFO_MAX+1];
 	struct rtattr *iptuninfo[IFLA_IPTUN_MAX + 1];
 	int len;
-	struct in6_addr laddr = {};
-	struct in6_addr raddr = {};
+	struct in6_addr laddr = IN6ADDR_ANY_INIT;
+	struct in6_addr raddr = IN6ADDR_ANY_INIT;
 	__u8 hop_limit = DEFAULT_TNL_HOP_LIMIT;
 	__u8 encap_limit = IPV6_DEFAULT_TNL_ENCAP_LIMIT;
 	__u32 flowinfo = 0;
-- 
1.7.10.4
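
As a small illustration of why the two initializations are
interchangeable in effect, the sketch below checks that an address
initialized with IN6ADDR_ANY_INIT compares equal to in6addr_any. The
helper names `is_any()` and `make_any_endpoint()` are made up for this
example. A side note on style: an empty initializer (`= {}`) is accepted
by GCC as an extension and was only standardized in C23, while
IN6ADDR_ANY_INIT comes from <netinet/in.h>.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Returns 1 if addr is the unspecified address (::), 0 otherwise. */
int is_any(const struct in6_addr *addr)
{
	assert(addr);
	return memcmp(addr, &in6addr_any, sizeof(*addr)) == 0;
}

/* Endpoint initialized with the macro, in the style the patch adopts. */
struct in6_addr make_any_endpoint(void)
{
	struct in6_addr a = IN6ADDR_ANY_INIT;

	return a;
}
```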


* [PATCH iproute2 1/3] ip/tunnel: Use tnl_parse_key() to parse tunnel key
From: Serhey Popovych @ 2017-12-18 17:48 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1513619285-23187-1-git-send-email-serhe.popovych@gmail.com>

tnl_parse_key() was added with commit a7ed1520ee96 (ip/tunnel:
introduce tnl_parse_key()) to avoid code duplication
in ip6?tunnel.c.

Reuse it in the gre/gre6 and vti/vti6 tunnel rtnl
configuration interface for the same purpose it
serves in the tunnel ioctl interface in ip6?tunnel.c.

While there, change the type of the key variables from
unsigned integer to __be32 to reflect the nature of the
value they store, and place the error message in
tnl_parse_key() on a single line so that it takes a single
call to fprintf().

Signed-off-by: Serhey Popovych <serhe.popovych@gmail.com>
---
 ip/link_gre.c  |   45 +++++----------------------------------------
 ip/link_gre6.c |   45 +++++----------------------------------------
 ip/link_vti.c  |   45 +++++----------------------------------------
 ip/link_vti6.c |   45 +++++----------------------------------------
 ip/tunnel.c    |    5 +++--
 5 files changed, 23 insertions(+), 162 deletions(-)

diff --git a/ip/link_gre.c b/ip/link_gre.c
index 09f1e44..2397920 100644
--- a/ip/link_gre.c
+++ b/ip/link_gre.c
@@ -81,8 +81,8 @@ static int gre_parse_opt(struct link_util *lu, int argc, char **argv,
 	struct rtattr *greinfo[IFLA_GRE_MAX + 1];
 	__u16 iflags = 0;
 	__u16 oflags = 0;
-	unsigned int ikey = 0;
-	unsigned int okey = 0;
+	__be32 ikey = 0;
+	__be32 okey = 0;
 	unsigned int saddr = 0;
 	unsigned int daddr = 0;
 	unsigned int link = 0;
@@ -184,53 +184,18 @@ get_failed:
 
 	while (argc > 0) {
 		if (!matches(*argv, "key")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			iflags |= GRE_KEY;
 			oflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr,
-						"Invalid value for \"key\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-
-			ikey = okey = uval;
+			ikey = okey = tnl_parse_key("key", *argv);
 		} else if (!matches(*argv, "ikey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			iflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"ikey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			ikey = uval;
+			ikey = tnl_parse_key("ikey", *argv);
 		} else if (!matches(*argv, "okey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			oflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"okey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			okey = uval;
+			okey = tnl_parse_key("okey", *argv);
 		} else if (!matches(*argv, "seq")) {
 			iflags |= GRE_SEQ;
 			oflags |= GRE_SEQ;
diff --git a/ip/link_gre6.c b/ip/link_gre6.c
index c22fded..7190ada 100644
--- a/ip/link_gre6.c
+++ b/ip/link_gre6.c
@@ -92,8 +92,8 @@ static int gre_parse_opt(struct link_util *lu, int argc, char **argv,
 	struct rtattr *greinfo[IFLA_GRE_MAX + 1];
 	__u16 iflags = 0;
 	__u16 oflags = 0;
-	unsigned int ikey = 0;
-	unsigned int okey = 0;
+	__be32 ikey = 0;
+	__be32 okey = 0;
 	struct in6_addr raddr = IN6ADDR_ANY_INIT;
 	struct in6_addr laddr = IN6ADDR_ANY_INIT;
 	unsigned int link = 0;
@@ -192,53 +192,18 @@ get_failed:
 
 	while (argc > 0) {
 		if (!matches(*argv, "key")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			iflags |= GRE_KEY;
 			oflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr,
-						"Invalid value for \"key\"\n");
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-
-			ikey = okey = uval;
+			ikey = okey = tnl_parse_key("key", *argv);
 		} else if (!matches(*argv, "ikey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			iflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value of \"ikey\"\n");
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			ikey = uval;
+			ikey = tnl_parse_key("ikey", *argv);
 		} else if (!matches(*argv, "okey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
 			oflags |= GRE_KEY;
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value of \"okey\"\n");
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			okey = uval;
+			okey = tnl_parse_key("okey", *argv);
 		} else if (!matches(*argv, "seq")) {
 			iflags |= GRE_SEQ;
 			oflags |= GRE_SEQ;
diff --git a/ip/link_vti.c b/ip/link_vti.c
index 05aefa3..6c5469f 100644
--- a/ip/link_vti.c
+++ b/ip/link_vti.c
@@ -64,8 +64,8 @@ static int vti_parse_opt(struct link_util *lu, int argc, char **argv,
 	struct rtattr *tb[IFLA_MAX + 1];
 	struct rtattr *linkinfo[IFLA_INFO_MAX+1];
 	struct rtattr *vtiinfo[IFLA_VTI_MAX + 1];
-	unsigned int ikey = 0;
-	unsigned int okey = 0;
+	__be32 ikey = 0;
+	__be32 okey = 0;
 	unsigned int saddr = 0;
 	unsigned int daddr = 0;
 	unsigned int link = 0;
@@ -122,49 +122,14 @@ get_failed:
 
 	while (argc > 0) {
 		if (!matches(*argv, "key")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr,
-						"Invalid value for \"key\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-
-			ikey = okey = uval;
+			ikey = okey = tnl_parse_key("key", *argv);
 		} else if (!matches(*argv, "ikey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"ikey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			ikey = uval;
+			ikey = tnl_parse_key("ikey", *argv);
 		} else if (!matches(*argv, "okey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"okey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			okey = uval;
+			okey = tnl_parse_key("okey", *argv);
 		} else if (!matches(*argv, "remote")) {
 			NEXT_ARG();
 			daddr = get_addr32(*argv);
diff --git a/ip/link_vti6.c b/ip/link_vti6.c
index 84824a5..f631839 100644
--- a/ip/link_vti6.c
+++ b/ip/link_vti6.c
@@ -61,8 +61,8 @@ static int vti6_parse_opt(struct link_util *lu, int argc, char **argv,
 	struct rtattr *vtiinfo[IFLA_VTI_MAX + 1];
 	struct in6_addr saddr = IN6ADDR_ANY_INIT;
 	struct in6_addr daddr = IN6ADDR_ANY_INIT;
-	unsigned int ikey = 0;
-	unsigned int okey = 0;
+	__be32 ikey = 0;
+	__be32 okey = 0;
 	unsigned int link = 0;
 	__u32 fwmark = 0;
 	int len;
@@ -117,49 +117,14 @@ get_failed:
 
 	while (argc > 0) {
 		if (!matches(*argv, "key")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr,
-						"Invalid value for \"key\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-
-			ikey = okey = uval;
+			ikey = okey = tnl_parse_key("key", *argv);
 		} else if (!matches(*argv, "ikey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"ikey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			ikey = uval;
+			ikey = tnl_parse_key("ikey", *argv);
 		} else if (!matches(*argv, "okey")) {
-			unsigned int uval;
-
 			NEXT_ARG();
-			if (strchr(*argv, '.'))
-				uval = get_addr32(*argv);
-			else {
-				if (get_unsigned(&uval, *argv, 0) < 0) {
-					fprintf(stderr, "invalid value for \"okey\": \"%s\"; it should be an unsigned integer\n", *argv);
-					exit(-1);
-				}
-				uval = htonl(uval);
-			}
-			okey = uval;
+			okey = tnl_parse_key("okey", *argv);
 		} else if (!matches(*argv, "remote")) {
 			inet_prefix addr;
 
diff --git a/ip/tunnel.c b/ip/tunnel.c
index d359eb9..f860103 100644
--- a/ip/tunnel.c
+++ b/ip/tunnel.c
@@ -192,8 +192,9 @@ __be32 tnl_parse_key(const char *name, const char *key)
 		return get_addr32(key);
 
 	if (get_unsigned(&uval, key, 0) < 0) {
-		fprintf(stderr, "invalid value for \"%s\": \"%s\";", name, key);
-		fprintf(stderr, " it should be an unsigned integer\n");
+		fprintf(stderr,
+			"invalid value for \"%s\": \"%s\"; it should be an unsigned integer\n",
+			name, key);
 		exit(-1);
 	}
 	return htonl(uval);
-- 
1.7.10.4
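
The parsing rule that the four removed copies shared, and that
tnl_parse_key() centralizes, can be sketched in user space as follows.
This is an approximation under stated assumptions, not the iproute2
implementation: `parse_key_sketch()` is a made-up name, inet_pton()
stands in for get_addr32(), strtoul() stands in for get_unsigned(), and
errors are returned instead of calling exit(). The logic mirrors the
diff above: a key containing a '.' is treated as a dotted quad, anything
else as an unsigned integer that is converted to network byte order.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse a tunnel key into network byte order.
 * Returns 0 on success, -1 on invalid input.
 */
int parse_key_sketch(const char *key, uint32_t *out_be)
{
	struct in_addr a;
	unsigned long v;
	char *end;

	assert(key && out_be);

	/* Dotted-quad form, e.g. "0.0.0.1". */
	if (strchr(key, '.')) {
		if (inet_pton(AF_INET, key, &a) != 1)
			return -1;
		*out_be = a.s_addr;	/* already in network byte order */
		return 0;
	}

	/* Plain unsigned integer; base 0 accepts decimal, hex and octal. */
	if (*key == '\0' || *key == '-')
		return -1;
	errno = 0;
	v = strtoul(key, &end, 0);
	if (errno || *end != '\0' || v > UINT32_MAX)
		return -1;
	*out_be = htonl((uint32_t)v);
	return 0;
}
```

Under these assumptions "1" and "0.0.0.1" yield the same key, which is
the reason the key variables are better typed as __be32 than as plain
unsigned integers.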


* [PATCH iproute2 0/3] ip/tunnels: Reuse code, vti6 zero endpoint support and minor cleanup
From: Serhey Popovych @ 2017-12-18 17:48 UTC (permalink / raw)
  To: netdev

In this series I present the next set of improvements:

  1) Use tnl_parse_key() to avoid code duplication in tunnel
     configuration via netlink code.

  2) Trivial: use IN6ADDR_ANY_INIT instead of open coded
     initialization of local/remote endpoint in ip6tnl code.

  3) Trivial: drop additional checks for zero endpoints
     in vti6 code. This completes and unifies support for
     unconfiguring the local/remote endpoints of a tunnel.

See the individual patch descriptions for details.

Thanks,
Serhii

Serhey Popovych (3):
  ip/tunnel: Use tnl_parse_key() to parse tunnel key
  link_ip6tnl: Use IN6ADDR_ANY_INIT to initialize local/remote
    endpoints
  link_vti6: Always add local/remote endpoint attributes

 ip/link_gre.c    |   45 +++++----------------------------------------
 ip/link_gre6.c   |   45 +++++----------------------------------------
 ip/link_ip6tnl.c |    4 ++--
 ip/link_vti.c    |   45 +++++----------------------------------------
 ip/link_vti6.c   |   51 +++++++--------------------------------------------
 ip/tunnel.c      |    5 +++--
 6 files changed, 27 insertions(+), 168 deletions(-)

-- 
1.7.10.4


* [PATCH][next] bpf: make function skip_callee static and return NULL rather than 0
From: Colin King @ 2017-12-18 17:47 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Function skip_callee is local to the source and does not need to
be in global scope, so make it static. Also return NULL rather than 0.
Cleans up two sparse warnings:

symbol 'skip_callee' was not declared. Should it be static?
Using plain integer as NULL pointer

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 kernel/bpf/verifier.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2f6f09cd1925..52689f2abbcb 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -823,6 +823,7 @@ static int check_subprogs(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static
 struct bpf_verifier_state *skip_callee(struct bpf_verifier_env *env,
 				       const struct bpf_verifier_state *state,
 				       struct bpf_verifier_state *parent,
@@ -867,7 +868,7 @@ struct bpf_verifier_state *skip_callee(struct bpf_verifier_env *env,
 	verbose(env, "verifier bug regno %d tmp %p\n", regno, tmp);
 	verbose(env, "regno %d parent frame %d current frame %d\n",
 		regno, parent->curframe, state->curframe);
-	return 0;
+	return NULL;
 }
 
 static int mark_reg_read(struct bpf_verifier_env *env,
-- 
2.14.1


* Re: [PATCH v2 0/5] Support for generalized use of make C={1,2} via a wrapper program
From: Jason Gunthorpe @ 2017-12-18 17:46 UTC (permalink / raw)
  To: Joe Perches
  Cc: Knut Omang, Stephen Hemminger,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Mauro Carvalho Chehab,
	Nicolas Palix, Jonathan Corbet, Santosh Shilimkar, Matthew Wilcox,
	cocci-/FJkirnvOdkvYVN+rsErww, rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-doc-u79uwXL29TY76Z2rM5mHXA, Doug Ledford,
	Mickaël Salaün, Shuah Khan,
	linux-kbuild-u79uwXL29TY76Z2rM5mHXA, Michal Marek, Julia Lawall,
	John Haxby, Åsmund Østvold
In-Reply-To: <1513576817.31581.58.camel-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>

On Sun, Dec 17, 2017 at 10:00:17PM -0800, Joe Perches wrote:

> > Today when we run checkers we get so many warnings it is too hard to
> > make any sense of it.
> 
> Here is a list of the checkpatch messages for drivers/infiniband
> sorted by type.
> 
> Many of these might be corrected by using
> 
> $ ./scripts/checkpatch.pl -f --fix-inplace --types=<TYPE> \
>   $(git ls-files drivers/infiniband/)

How many of these do you think are worth fixing?

We do get a steady trickle of changes on this topic every cycle.

Is it better to just do a big batch of them all at once? Do you have
an idea how disruptive this kind of work is to the whole patch flow,
e.g. new patches no longer applying to for-next, backports no longer
applying, merge conflicts?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
From: Sowmini Varadhan @ 2017-12-18 17:22 UTC (permalink / raw)
  To: David Miller
  Cc: santosh.shilimkar, rds-devel,
	bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-rdma, netdev,
	syzkaller-bugs, linux-kernel
In-Reply-To: <20171218.121213.289437104214632276.davem@davemloft.net>

> From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Date: Mon, 18 Dec 2017 08:28:05 -0800
  :
> > Looks like another one tripping on empty transport. Mostly below
> > should
> > address it but we will test it if it does.

That was my first thought, but it cannot be the case here: rds_sendmsg
etc. itself would have bombed if that were the case, and the packet
would never have gotten queued.

This is unlike f3069c6d33, where an application skips the transport
binding (either misses the explicit bind, or gets the wrong transport
due to an implicit bind) before it triggers the setsockopt.

I suspect that the problem is that the conn (and thus c_trans)
has been destroyed, but the cp_send_w work got incorrectly
re-queued. For example, rds_cong_queue_updates() (because the
peer sent a congestion update) can happen in softirq context,
and would end up requeuing work in the middle of rds_conn_destroy,
after we have assumed that everything is quiesced.

On (12/18/17 12:12), David Miller wrote:
> 
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?

Yeah, I was thinking about this. Let me try to reproduce this in-house
and get back with a patchset.

--Sowmini


* Re: BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
From: Santosh Shilimkar @ 2017-12-18 17:16 UTC (permalink / raw)
  To: David Miller
  Cc: bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63-Pl5Pbv+GP7P466ipTTIvnc23WoclnBCfAL8bYrjMMd8,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	syzkaller-bugs-/JYPxA39Uh5TLH3MbocFFw
In-Reply-To: <20171218.121213.289437104214632276.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On 12/18/2017 9:12 AM, David Miller wrote:
> From: Santosh Shilimkar <santosh.shilimkar-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
> Date: Mon, 18 Dec 2017 08:28:05 -0800
> 
>> On 12/18/2017 12:43 AM, syzbot wrote:
>>> Hello,
>>> syzkaller hit the following crash on
>>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>> Unfortunately, I don't have any reproducer for this bug yet.
>>> BUG: unable to handle kernel NULL pointer dereference at
>>> 0000000000000028
>>> program syz-executor6 is using a deprecated SCSI ioctl, please convert
>>> it to SG_IO
>>> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
>>
>> Looks like another one tripping on empty transport. Mostly below
>> should
>> address it but we will test it if it does.
>>
>> diff --git a/net/rds/send.c b/net/rds/send.c
>> index 7244d2e..e2d0eaa 100644
>> --- a/net/rds/send.c
>> +++ b/net/rds/send.c
>> @@ -183,7 +183,7 @@ int rds_send_xmit(struct rds_conn_path *cp)
>>                  goto out;
>>          }
>>
>> -       if (conn->c_trans->xmit_path_prepare)
>> +       if (conn->c_trans && conn->c_trans->xmit_path_prepare)
>>                  conn->c_trans->xmit_path_prepare(cp);
> 
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?
> 
Agreed. Some of these additional transport hooks were added later
for the specific transports which need them. I will review this overall
and see if it can be addressed generically.

Regards,
Santosh
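
One possible shape for a generic approach is a single helper that carries
both NULL checks, so that call sites invoke the optional hook
unconditionally. The sketch below is a user-space model only: the
structures are simplified, hypothetical stand-ins for the RDS ones, and
`conn_xmit_path_prepare()` and `mark_prepared()` are made-up names. It is
not a proposed patch, and it would not help if the underlying cause is
work being requeued on a connection that is already being destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the RDS structures. */
struct conn_path {
	int prepared;
};

struct transport {
	void (*xmit_path_prepare)(struct conn_path *cp);	/* optional hook */
};

struct connection {
	struct transport *c_trans;	/* NULL in the crashing case */
};

/*
 * Call the optional hook if both the transport and the hook exist.
 * Returns 1 if the hook ran, 0 if there was nothing to call.
 */
int conn_xmit_path_prepare(struct connection *conn, struct conn_path *cp)
{
	assert(conn && cp);
	if (!conn->c_trans || !conn->c_trans->xmit_path_prepare)
		return 0;
	conn->c_trans->xmit_path_prepare(cp);
	return 1;
}

/* Example hook for demonstration purposes. */
void mark_prepared(struct conn_path *cp)
{
	cp->prepared = 1;
}
```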


* Re: Linux 4.14 - regression: broken tun/tap / bridge network with virtio - bisected
From: Andreas Hartmann @ 2017-12-18 17:11 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Michal Kubecek, Jason Wang, David Miller, Network Development
In-Reply-To: <CAF=yD-LWyCD4Y0aJ9O0e_CHLR+3JOeKicRRTEVCPxgw4XOcqGQ@mail.gmail.com>

On 12/17/2017 at 11:33 PM Willem de Bruijn wrote:
> On Fri, Dec 15, 2017 at 1:05 AM, Andreas Hartmann
> <andihartmann@01019freenet.de> wrote:
>> On 12/14/2017 at 11:17 PM Willem de Bruijn wrote:
>>>>> Well, the patch does not fix hanging VMs, which have been shutdown and
>>>>> can't be killed any more.
>>>>> Because of the stack trace
>>>>>
>>>>> [<ffffffffc0d0e3c5>] vhost_net_ubuf_put_and_wait+0x35/0x60 [vhost_net]
>>>>> [<ffffffffc0d0f264>] vhost_net_ioctl+0x304/0x870 [vhost_net]
>>>>> [<ffffffff9b25460f>] do_vfs_ioctl+0x8f/0x5c0
>>>>> [<ffffffff9b254bb4>] SyS_ioctl+0x74/0x80
>>>>> [<ffffffff9b00365b>] do_syscall_64+0x5b/0x100
>>>>> [<ffffffff9b78e7ab>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>
>>>>> I was hoping, that the problems could be related - but that seems not to
>>>>> be true.
>>>>
>>>> However, it turned out, that reverting the complete patchset "Remove UDP
>>>> Fragmentation Offload support" prevent hanging qemu processes.
>>>
>>> That implies a combination of UFO and vhost zerocopy. Disabling
>>> experimental_zcopytx in vhost_net will probably work around the bug
>>> then.
> 
> I have been able to reproduce the hang by sending a UFO packet
> between two guests running v4.13 on a host running v4.15-rc1.
> 
> The vhost_net_ubuf_ref refcount indeed hits overflow (-1) from
> vhost_zerocopy_callback being called for each segment of a
> segmented UFO skb. This refcount is decremented then on each
> segment, but incremented only once for the entire UFO skb.
> 
> Before v4.14, these packets would be converted in skb_segment to
> regular copy packets with skb_orphan_frags and the callback function
> called once at this point. v4.14 added support for reference counted
> zerocopy skb that can pass through skb_orphan_frags unmodified and
> have their zerocopy state safely cloned with skb_zerocopy_clone.
> 
> The call to skb_zerocopy_clone must come after skb_orphan_frags
> to limit cloning of this state to those skbs that can do so safely.
> 
> Please try a host with the following patch. This fixes it for me. I intend to
> send it to net.
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index a592ca025fc4..d2d985418819 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3654,8 +3654,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
> 
>                 skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags &
>                                               SKBTX_SHARED_FRAG;
> -               if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC))
> -                       goto err;
> 
>                 while (pos < offset + len) {
>                         if (i >= nfrags) {
> @@ -3681,6 +3679,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
> 
>                         if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC)))
>                                 goto err;
> +                       if (skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC))
> +                               goto err;
> 
>                         *nskb_frag = *frag;
>                         __skb_frag_ref(nskb_frag);
> 
> 
> This is relatively inefficient, as it calls skb_zerocopy_clone for each frag
> in the frags[] array. I will follow-up with a patch to net-next that only
> checks once per skb:
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 466581cf4cdc..a293a33604ec 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3662,7 +3662,8 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
> 
>                 skb_shinfo(nskb)->tx_flags |= skb_shinfo(head_skb)->tx_flags &
>                                               SKBTX_SHARED_FRAG;
> -               if (skb_zerocopy_clone(nskb, head_skb, GFP_ATOMIC))
> +               if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
> +                   skb_zerocopy_clone(nskb, frag_skb, GFP_ATOMIC))
>                         goto err;
> 
>                 while (pos < offset + len) {
> @@ -3676,6 +3677,11 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
> 
>                                 BUG_ON(!nfrags);
> 
> +                               if (skb_orphan_frags(frag_skb, GFP_ATOMIC) ||
> +                                   skb_zerocopy_clone(nskb, frag_skb,
> +                                                      GFP_ATOMIC))
> +                                       goto err;
> +
>                                 list_skb = list_skb->next;
>                         }
> 
> @@ -3687,9 +3693,6 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>                                 goto err;
>                         }
> 
> -                       if (unlikely(skb_orphan_frags(frag_skb, GFP_ATOMIC)))
> -                               goto err;
> -

I'm currently testing this one.

> 
> I'll also send to net-next
> 
> (1) a patch to convert its vhost_net_ ubuf_ref refcnt to refcount_t
> 
> (2) a path to skb_zerocopy_clone to warn on clone if not
>      sock_zerocopy_callback
> 
>> I already tested it w/ options vhost_net experimental_zcopytx=0 - but
>> this didn't "resolve" anything. See
>> https://www.mail-archive.com/netdev@vger.kernel.org/msg203197.html
>>
>> Therefore, I think your following thoughts are lapsed unfortunately,
>> aren't they?
> 
> That experiment was perhaps run before commit 0c19f846d582 ("net:
> accept UFO datagrams from tuntap and packet") and hit the other UFO
> bug.

That's probably true.


Thanks,
Andreas
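
The refcount imbalance described in the quoted analysis (one increment
for the whole UFO skb, one decrement per segment) can be modelled with a
few lines of user-space C. Everything here is hypothetical and
simplified: `struct ubuf_ref_model` is not the vhost_net structure, and
the initial count of 1 used in the usage below is an assumption made for
illustration.

```c
#include <assert.h>

/* Hypothetical model of a zerocopy completion counter; not vhost code. */
struct ubuf_ref_model {
	int refcount;
};

/* Taken once when the unsegmented skb is queued for zerocopy transmit. */
void model_get(struct ubuf_ref_model *r)
{
	assert(r);
	r->refcount++;
}

/* Dropped from the completion callback. */
void model_put(struct ubuf_ref_model *r)
{
	assert(r);
	r->refcount--;
}

/* Buggy flow: one get for the whole skb, one put per segment. */
int balance_after_segmentation(int initial, int nsegs)
{
	struct ubuf_ref_model r = { initial };
	int i;

	model_get(&r);
	for (i = 0; i < nsegs; i++)
		model_put(&r);
	return r.refcount;
}
```

With an initial count of 1, a single segment leaves the counter at 1 as
intended, while three segments drive it to -1, matching the reported
value.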


* [net  1/1] tipc: remove leaving group member from all lists
From: Jon Maloy @ 2017-12-18 17:13 UTC (permalink / raw)
  To: davem, netdev
  Cc: mohan.krishna.ghanta.krishnamurthy, tung.q.nguyen, hoang.h.le,
	jon.maloy, canh.d.luu, ying.xue, tipc-discussion

A group member going into state LEAVING should never go back to any
other state before it is finally deleted. However, this might happen
if the socket needs to send out a RECLAIM message during this interval.
Since we forget to remove the leaving member from the group's 'active'
or 'pending' list, the member might be selected for reclaiming, change
state to RECLAIMING, and get stuck in this state instead of being
deleted. This might lead to suppression of the expected 'member down'
event to the receiver.

We fix this by removing the member from all lists, except the RB tree,
at the moment it goes into state LEAVING.

Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
---
 net/tipc/group.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/tipc/group.c b/net/tipc/group.c
index efb5714..b96ec42 100644
--- a/net/tipc/group.c
+++ b/net/tipc/group.c
@@ -699,6 +699,9 @@ void tipc_group_proto_rcv(struct tipc_group *grp, bool *usr_wakeup,
 		if (!m)
 			return;
 		m->bc_syncpt = msg_grp_bc_syncpt(hdr);
+		list_del_init(&m->list);
+		list_del_init(&m->congested);
+		*usr_wakeup = true;
 
 		/* Wait until WITHDRAW event is received */
 		if (m->state != MBR_LEAVING) {
@@ -710,8 +713,6 @@ void tipc_group_proto_rcv(struct tipc_group *grp, bool *usr_wakeup,
 		ehdr = buf_msg(m->event_msg);
 		msg_set_grp_bc_seqno(ehdr, m->bc_syncpt);
 		__skb_queue_tail(inputq, m->event_msg);
-		*usr_wakeup = true;
-		list_del_init(&m->congested);
 		return;
 	case GRP_ADV_MSG:
 		if (!m)
@@ -863,6 +864,7 @@ void tipc_group_member_evt(struct tipc_group *grp,
 				msg_set_grp_bc_seqno(hdr, m->bc_rcv_nxt);
 			__skb_queue_tail(inputq, skb);
 		}
+		list_del_init(&m->list);
 		list_del_init(&m->congested);
 	}
 	*sk_rcvbuf = tipc_group_rcvbuf_limit(grp);
-- 
2.1.4
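
The patch relies on list_del_init() being safe to call on an entry
whether or not it is currently linked. The following user-space imitation
of the kernel's circular list helpers shows that property. The names
`list_node`, `list_add_head()` and `list_del_reinit()` are invented for
this sketch, and it models only the list mechanics, not the TIPC group
code.

```c
#include <assert.h>

/* Minimal user-space imitation of a kernel-style circular list. */
struct list_node {
	struct list_node *next, *prev;
};

/* An initialized node points at itself, i.e. it is on no list. */
void list_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head. */
void list_add_head(struct list_node *n, struct list_node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and reinitialize it, in the manner of list_del_init(). */
void list_del_reinit(struct list_node *n)
{
	assert(n && n->next && n->prev);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

int list_is_empty(const struct list_node *head)
{
	return head->next == head;
}
```

Because an unlinked node points at itself, a second list_del_reinit() on
the same node changes nothing, so a leaving member can be removed from
its lists on more than one code path without extra state checks.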


* Re: BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
From: David Miller @ 2017-12-18 17:12 UTC (permalink / raw)
  To: santosh.shilimkar
  Cc: bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-kernel,
	linux-rdma, netdev, rds-devel, syzkaller-bugs
In-Reply-To: <5ba83a68-0103-d514-1b22-900f755f5aa4@oracle.com>

From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Date: Mon, 18 Dec 2017 08:28:05 -0800

> On 12/18/2017 12:43 AM, syzbot wrote:
>> Hello,
>> syzkaller hit the following crash on
>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> Unfortunately, I don't have any reproducer for this bug yet.
>> BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000028
>> program syz-executor6 is using a deprecated SCSI ioctl, please convert
>> it to SG_IO
>> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
> 
> Looks like another one tripping on empty transport. Mostly below
> should
> address it but we will test it if it does.
> 
> diff --git a/net/rds/send.c b/net/rds/send.c
> index 7244d2e..e2d0eaa 100644
> --- a/net/rds/send.c
> +++ b/net/rds/send.c
> @@ -183,7 +183,7 @@ int rds_send_xmit(struct rds_conn_path *cp)
>                 goto out;
>         }
> 
> -       if (conn->c_trans->xmit_path_prepare)
> +       if (conn->c_trans && conn->c_trans->xmit_path_prepare)
>                 conn->c_trans->xmit_path_prepare(cp);

We're seeming to accumulate a lot of checks like this, maybe there
is a more general way to deal with this problem?


* Re: [PATCH v3 net-next 0/6] tls: Add generic NIC offload infrastructure
From: Jiri Pirko @ 2017-12-18 17:10 UTC (permalink / raw)
  To: Ilya Lesokhin
  Cc: netdev, davem, davejwatson, tom, hannes, borisp, aviadye, liranl
In-Reply-To: <20171218111033.13256-1-ilyal@mellanox.com>

Mon, Dec 18, 2017 at 12:10:27PM CET, ilyal@mellanox.com wrote:
>Changes from v2:
>- Fix sk use after free and possible netdev use after free
>- tls device now keeps a refernce on the offloading netdev
>- tls device registers to the netdev notifer. 
>  Upon a NETDEV_DOWN event, offload is stopped and
>  the reference on the netdev is dropped.
>- SW fallback support for skb->ip_summed != CHECKSUM_PARTIAL 
>- Merged TLS patches are no longer part of this series.
>
>Changes from v1:
>- Remove the binding of the socket to a specific netdev 
>  through sk->sk_bound_dev_if.
>  Add a check in validate_xmit_skb to detect route changes
>  and call SW fallback code to do the crypto in software.
>- tls_get_record now returns the tls record sequence number.
>  This is required to support connections with rcd_sn != iv.
>- Bug fixes to the TLS code.
>
>This patchset adds a generic infrastructure to offload TLS crypto to a
>network devices.
>
>patches 1-2 Export functions that we need
>patch 3 adds infrastructue for offloaded socket fallback
>patches 4-5 add new NDOs and capabilities.
>patch 6 adds the TLS NIC offload infrastructure.
>
>Github with mlx5e TLS offload support:
>https://github.com/Mellanox/tls-offload/tree/tls_device_v3

I don't get it. You are pushing the infra but not the actual driver part
that consumes it? Why?


* RE: [Intel-wired-lan] v4.15-rc2 on thinkpad x60: ethernet stopped working
From: Fujinaka, Todd @ 2017-12-18 17:07 UTC (permalink / raw)
  To: Neftin, Sasha, Pavel Machek, Keller, Jacob E
  Cc: bpoirier@suse.com, nix.or.die@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	lsorense@csclub.uwaterloo.ca, David Miller
In-Reply-To: <077087f2-551a-c045-6b07-b1b661e53dad@intel.com>

Jeff was out sick last week. It might take him a bit to catch up.

I'll remind him when I see him next (which I hope is soon).

Todd Fujinaka
Software Application Engineer
Datacenter Engineering Group
Intel Corporation
todd.fujinaka@intel.com

-----Original Message-----
From: Intel-wired-lan [mailto:intel-wired-lan-bounces@osuosl.org] On Behalf Of Neftin, Sasha
Sent: Monday, December 18, 2017 7:50 AM
To: Pavel Machek <pavel@ucw.cz>; Keller, Jacob E <jacob.e.keller@intel.com>
Cc: bpoirier@suse.com; nix.or.die@gmail.com; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; intel-wired-lan@lists.osuosl.org; lsorense@csclub.uwaterloo.ca; David Miller <davem@davemloft.net>
Subject: Re: [Intel-wired-lan] v4.15-rc2 on thinkpad x60: ethernet stopped working

On 12/18/2017 13:58, Pavel Machek wrote:
> On Mon 2017-12-18 13:24:40, Neftin, Sasha wrote:
>> On 12/18/2017 12:26, Pavel Machek wrote:
>>> Hi!
>>>
>>>>>>> In v4.15-rc2+, network manager can not see my ethernet card, and 
>>>>>>> manual attempts to ifconfig it up did not really help, either.
>>>>>>>
>>>>>>> Card is:
>>>>>>>
>>>>>>> 02:00.0 Ethernet controller: Intel Corporation 82573L Gigabit 
>>>>>>> Ethernet Controller
>>>>> ....
>>>>>>> Any ideas ?
>>>>>> Yes , 19110cfbb34d4af0cdfe14cd243f3b09dc95b013 broke it.
>>>>>>
>>>>>> See:
>>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=198047
>>>>>>
>>>>>> Fix there :
>>>>>> https://marc.info/?l=linux-kernel&m=151272209903675&w=2
>>>>> I don't see the patch in latest mainline. Not having ethernet 
>>>>> is... somehow annoying. What is going on there?
>>>> Generally speaking, e1000 maintenance has been handled very poorly
>>>> over the past few years, I have to say.
>>>>
>>>> Fixes take forever to propagate even when someone other than the 
>>>> maintainer provides a working and tested fix, just like this case.
>>>>
>>>> Jeff, please take e1000 maintenance seriously and get these
>>>> critical bug fixes propagated.
>>> No response AFAICT. I guess I should test reverting 
>>> 19110cfbb34d4af0cdfe14cd243f3b09dc95b013, then ask you for revert?
>> Hello Pavel,
>>
>> Before asking to revert 19110cfbb..., please check whether the following
>> patch from Benjamin works for you: http://patchwork.ozlabs.org/patch/846825/
> Jacob said, in another email:
>
> # Digging into this, the problem is complicated. The original bug
> # assumed behavior of the .check_for_link call, which is universally not
> # implemented.
> #
> # I think the correct fix is to revert 19110cfbb34d ("e1000e: Separate
> # signaling for link check/link up", 2017-10-10) and find a more proper solution.
>
> ...which makes me think that revert is preferred?
>
> 									Pavel
>
Pavel, before asking for a revert, let's check Benjamin's patch, which follows up on his previous patch. The previous patch was incomplete, and the latest one completes the changes.

_______________________________________________
Intel-wired-lan mailing list
Intel-wired-lan@osuosl.org
https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply

* Re: [v2 PATCH -tip 3/6] net: sctp: Add SCTP ACK tracking trace event
From: Steven Rostedt @ 2017-12-18 17:05 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, Ian McDonald, Vlad Yasevich, Stephen Hemminger,
	Peter Zijlstra, Thomas Gleixner, LKML, H . Peter Anvin,
	Gerrit Renker, David S . Miller, Neil Horman, dccp, netdev,
	linux-sctp, Stephen Rothwell
In-Reply-To: <151358473510.28850.10475072993963389604.stgit@devbox>

On Mon, 18 Dec 2017 17:12:15 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Add SCTP ACK tracking trace event to trace the changes of SCTP
> association state in response to incoming packets.
> It is used for debugging SCTP congestion control algorithms,
> and will replace sctp_probe module.
> 
> Note that this event is a bit tricky. Since it consists of 2
> events (sctp_probe and sctp_probe_path), you have to enable
> both events as below.
> 
>   # cd /sys/kernel/debug/tracing
>   # echo 1 > events/sctp/sctp_probe/enable
>   # echo 1 > events/sctp/sctp_probe_path/enable
> 
> Or, you can enable all the events under sctp.
> 
>   # echo 1 > events/sctp/enable
> 
> Since sctp_probe_path event is always invoked from sctp_probe
> event, you can not see any output if you only enable
> sctp_probe_path.

I have to ask, why did you do it this way?


> +#include <trace/define_trace.h>
> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index 8f8ccded13e4..c5f92b2cc5c3 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -59,6 +59,9 @@
>  #include <net/sctp/sm.h>
>  #include <net/sctp/structs.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/sctp.h>
> +
>  static struct sctp_packet *sctp_abort_pkt_new(
>  					struct net *net,
>  					const struct sctp_endpoint *ep,
> @@ -3219,6 +3222,8 @@ enum sctp_disposition sctp_sf_eat_sack_6_2(struct net *net,
>  	struct sctp_sackhdr *sackh;
>  	__u32 ctsn;
>  
> +	trace_sctp_probe(ep, asoc, chunk);

What about doing this right after this probe:

	if (trace_sctp_probe_path_enabled()) {
		struct sctp_transport *sp;

		list_for_each_entry(sp, &asoc->peer.transport_addr_list,
				    transports) {
			trace_sctp_probe_path(sp, asoc);
		}
	}

The "trace_sctp_probe_path_enabled()" is a static branch, which means
it's a nop just like a tracepoint is, and will not add any overhead if
the trace_sctp_probe_path is not enabled.

-- Steve

> +
>  	if (!sctp_vtag_verify(chunk, asoc))
>  		return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
>  

^ permalink raw reply

* Re: BUG: spinlock bad magic (2)
From: Dmitry Vyukov @ 2017-12-18 17:01 UTC (permalink / raw)
  To: Santosh Shilimkar
  Cc: syzbot, David Miller, LKML, linux-rdma, netdev, rds-devel,
	syzkaller-bugs
In-Reply-To: <6dbd4f85-f3f2-97f3-5b82-451276fbf877@oracle.com>

On Mon, Dec 18, 2017 at 5:46 PM, Santosh Shilimkar
<santosh.shilimkar@oracle.com> wrote:
> On 12/18/2017 4:36 AM, syzbot wrote:
>>
>> Hello,
>>
>> syzkaller hit the following crash on
>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>>
>> Unfortunately, I don't have any reproducer for this bug yet.
>>
> [...]
>
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
>> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
>
>
> This one seems to be the same bug as the one reported below.
>
> BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

Hi Santosh,

The proper syntax to tell syzbot about dups is this (from email footer):

> See https://goo.gl/tpsmEJ for details.
> Please credit me with: Reported-by: syzbot <syzkaller@googlegroups.com>
> syzbot will keep track of this bug report.
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> Note: all commands must start from beginning of the line in the email body.

^ permalink raw reply

* [PATCH net-next 6/6] sfc: populate the timer reload field
From: Edward Cree @ 2017-12-18 16:57 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

From: Bert Kenward <bkenward@solarflare.com>

The timer mode register now has a separate field for the reload value.
Since we always use this timer with the reload (for interrupt moderation)
we set this to the same as the initial value.

Previous hardware ignores this field, so we can safely set these bits
on all hardware that uses this register.
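
As a rough illustration of the layout described above, here is a minimal
user-space sketch (not driver code) of how the three fields share one 32-bit
timer command. The LBN/WIDTH values for the mode and reload fields are the
ones defined in ef10_regs.h by patch 5/6; the 14-bit width of the initial
value field and the helper names field() and timer_cmd_pack() are assumptions
made for this example only.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions as in ef10_regs.h (patch 5/6 of this series). */
#define TC_TIMER_VAL_LBN      0
#define TC_TIMER_VAL_WIDTH    14 /* assumed; not visible in the quoted hunk */
#define TC_TIMER_MODE_LBN     14
#define TC_TIMER_MODE_WIDTH   2
#define TC_TMR_REL_VAL_LBN    16
#define TC_TMR_REL_VAL_WIDTH  14

/* Hypothetical helper: truncate value to width bits and shift it to lbn. */
static uint32_t field(uint32_t value, unsigned int lbn, unsigned int width)
{
	uint32_t mask = (UINT32_C(1) << width) - 1;

	return (value & mask) << lbn;
}

/* Hypothetical helper: build the timer command with reload == initial value. */
uint32_t timer_cmd_pack(uint32_t mode, uint32_t ticks)
{
	/* The tick count must fit in both 14-bit fields. */
	assert(ticks < (UINT32_C(1) << TC_TIMER_VAL_WIDTH));

	return field(mode, TC_TIMER_MODE_LBN, TC_TIMER_MODE_WIDTH) |
	       field(ticks, TC_TIMER_VAL_LBN, TC_TIMER_VAL_WIDTH) |
	       field(ticks, TC_TMR_REL_VAL_LBN, TC_TMR_REL_VAL_WIDTH);
}
```

In this sketch the reload value occupies bits 16-29, which is the range that
previous hardware ignores, so the low 16 bits are unchanged from what the
two-field version would have written.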

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 56a6bc60dac1..1f64c7f60943 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -2010,8 +2010,9 @@ static void efx_ef10_push_irq_moderation(struct efx_channel *channel)
 	} else {
 		unsigned int ticks = efx_usecs_to_ticks(efx, usecs);
 
-		EFX_POPULATE_DWORD_2(timer_cmd, ERF_DZ_TC_TIMER_MODE, mode,
-				     ERF_DZ_TC_TIMER_VAL, ticks);
+		EFX_POPULATE_DWORD_3(timer_cmd, ERF_DZ_TC_TIMER_MODE, mode,
+				     ERF_DZ_TC_TIMER_VAL, ticks,
+				     ERF_FZ_TC_TMR_REL_VAL, ticks);
 		efx_writed_page(efx, &timer_cmd, ER_DZ_EVQ_TMR,
 				channel->channel);
 	}

^ permalink raw reply related

* [PATCH net-next 5/6] sfc: update EF10 register definitions
From: Edward Cree @ 2017-12-18 16:57 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

From: Bert Kenward <bkenward@solarflare.com>

The RX_L4_CLASS field has shrunk from 3 bits to 2 bits. The upper
bit was never used in previous hardware, so we can use the new
definition throughout.
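
To make the effect of the narrower field concrete, the following user-space
sketch extracts RX_L4_CLASS from a 64-bit event word using both the old
(3-bit) and new (2-bit) widths. The LBN, widths and class values are the ones
in the ef10_regs.h hunk below; the helper qword_field() and the sample event
are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define RX_L4_CLASS_LBN        45
#define RX_L4_CLASS_WIDTH_OLD  3  /* ESF_DE_RX_L4_CLASS_WIDTH */
#define RX_L4_CLASS_WIDTH_NEW  2  /* ESF_FZ_RX_L4_CLASS_WIDTH */
#define RX_FASTPD_INDCTR_LBN   47 /* bit 47 now carries a separate flag */
#define L4_CLASS_TCP           1

/* Hypothetical helper: extract width bits starting at bit lbn. */
uint64_t qword_field(uint64_t event, unsigned int lbn, unsigned int width)
{
	assert(width > 0 && width < 64);

	return (event >> lbn) & ((UINT64_C(1) << width) - 1);
}
```

For an event with class TCP and bit 47 set, the 2-bit read still yields TCP,
while the 3-bit read would yield 5, which matches no known class. With bit 47
clear (as on previous hardware, where the upper bit was never used) both
widths give the same answer, which is why the new definition can be used
throughout.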

The TSO OUTER_IPID field was previously spelt differently from the
external definitions.

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c      | 16 ++++++-------
 drivers/net/ethernet/sfc/ef10_regs.h | 46 +++++++++++++++++++++++-------------
 2 files changed, 37 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 009bf28bdba5..56a6bc60dac1 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -3292,8 +3292,8 @@ static u16 efx_ef10_handle_rx_event_errors(struct efx_channel *channel,
 		if (unlikely(rx_encap_hdr != ESE_EZ_ENCAP_HDR_VXLAN &&
 			     ((rx_l3_class != ESE_DZ_L3_CLASS_IP4 &&
 			       rx_l3_class != ESE_DZ_L3_CLASS_IP6) ||
-			      (rx_l4_class != ESE_DZ_L4_CLASS_TCP &&
-			       rx_l4_class != ESE_DZ_L4_CLASS_UDP))))
+			      (rx_l4_class != ESE_FZ_L4_CLASS_TCP &&
+			       rx_l4_class != ESE_FZ_L4_CLASS_UDP))))
 			netdev_WARN(efx->net_dev,
 				    "invalid class for RX_TCPUDP_CKSUM_ERR: event="
 				    EFX_QWORD_FMT "\n",
@@ -3330,8 +3330,8 @@ static u16 efx_ef10_handle_rx_event_errors(struct efx_channel *channel,
 				    EFX_QWORD_VAL(*event));
 		else if (unlikely((rx_l3_class != ESE_DZ_L3_CLASS_IP4 &&
 				   rx_l3_class != ESE_DZ_L3_CLASS_IP6) ||
-				  (rx_l4_class != ESE_DZ_L4_CLASS_TCP &&
-				   rx_l4_class != ESE_DZ_L4_CLASS_UDP)))
+				  (rx_l4_class != ESE_FZ_L4_CLASS_TCP &&
+				   rx_l4_class != ESE_FZ_L4_CLASS_UDP)))
 			netdev_WARN(efx->net_dev,
 				    "invalid class for RX_TCP_UDP_INNER_CHKSUM_ERR: event="
 				    EFX_QWORD_FMT "\n",
@@ -3366,7 +3366,7 @@ static int efx_ef10_handle_rx_event(struct efx_channel *channel,
 	next_ptr_lbits = EFX_QWORD_FIELD(*event, ESF_DZ_RX_DSC_PTR_LBITS);
 	rx_queue_label = EFX_QWORD_FIELD(*event, ESF_DZ_RX_QLABEL);
 	rx_l3_class = EFX_QWORD_FIELD(*event, ESF_DZ_RX_L3_CLASS);
-	rx_l4_class = EFX_QWORD_FIELD(*event, ESF_DZ_RX_L4_CLASS);
+	rx_l4_class = EFX_QWORD_FIELD(*event, ESF_FZ_RX_L4_CLASS);
 	rx_cont = EFX_QWORD_FIELD(*event, ESF_DZ_RX_CONT);
 	rx_encap_hdr =
 		nic_data->datapath_caps &
@@ -3444,8 +3444,8 @@ static int efx_ef10_handle_rx_event(struct efx_channel *channel,
 							 rx_l3_class, rx_l4_class,
 							 event);
 	} else {
-		bool tcpudp = rx_l4_class == ESE_DZ_L4_CLASS_TCP ||
-			      rx_l4_class == ESE_DZ_L4_CLASS_UDP;
+		bool tcpudp = rx_l4_class == ESE_FZ_L4_CLASS_TCP ||
+			      rx_l4_class == ESE_FZ_L4_CLASS_UDP;
 
 		switch (rx_encap_hdr) {
 		case ESE_EZ_ENCAP_HDR_VXLAN: /* VxLAN or GENEVE */
@@ -3466,7 +3466,7 @@ static int efx_ef10_handle_rx_event(struct efx_channel *channel,
 		}
 	}
 
-	if (rx_l4_class == ESE_DZ_L4_CLASS_TCP)
+	if (rx_l4_class == ESE_FZ_L4_CLASS_TCP)
 		flags |= EFX_RX_PKT_TCP;
 
 	channel->irq_mod_score += 2 * n_packets;
diff --git a/drivers/net/ethernet/sfc/ef10_regs.h b/drivers/net/ethernet/sfc/ef10_regs.h
index 2c4bf9476c37..6a56778cf06c 100644
--- a/drivers/net/ethernet/sfc/ef10_regs.h
+++ b/drivers/net/ethernet/sfc/ef10_regs.h
@@ -1,6 +1,6 @@
 /****************************************************************************
  * Driver for Solarflare network controllers and boards
- * Copyright 2012-2015 Solarflare Communications Inc.
+ * Copyright 2012-2017 Solarflare Communications Inc.
  *
  * This program is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License version 2 as published
@@ -79,6 +79,8 @@
 #define	ER_DZ_EVQ_TMR 0x00000420
 #define	ER_DZ_EVQ_TMR_STEP 8192
 #define	ER_DZ_EVQ_TMR_ROWS 2048
+#define	ERF_FZ_TC_TMR_REL_VAL_LBN 16
+#define	ERF_FZ_TC_TMR_REL_VAL_WIDTH 14
 #define	ERF_DZ_TC_TIMER_MODE_LBN 14
 #define	ERF_DZ_TC_TIMER_MODE_WIDTH 2
 #define	ERF_DZ_TC_TIMER_VAL_LBN 0
@@ -159,16 +161,24 @@
 #define	ESF_DZ_RX_EV_SOFT2_WIDTH 2
 #define	ESF_DZ_RX_DSC_PTR_LBITS_LBN 48
 #define	ESF_DZ_RX_DSC_PTR_LBITS_WIDTH 4
-#define	ESF_DZ_RX_L4_CLASS_LBN 45
-#define	ESF_DZ_RX_L4_CLASS_WIDTH 3
-#define	ESE_DZ_L4_CLASS_RSVD7 7
-#define	ESE_DZ_L4_CLASS_RSVD6 6
-#define	ESE_DZ_L4_CLASS_RSVD5 5
-#define	ESE_DZ_L4_CLASS_RSVD4 4
-#define	ESE_DZ_L4_CLASS_RSVD3 3
-#define	ESE_DZ_L4_CLASS_UDP 2
-#define	ESE_DZ_L4_CLASS_TCP 1
-#define	ESE_DZ_L4_CLASS_UNKNOWN 0
+#define	ESF_DE_RX_L4_CLASS_LBN 45
+#define	ESF_DE_RX_L4_CLASS_WIDTH 3
+#define	ESE_DE_L4_CLASS_RSVD7 7
+#define	ESE_DE_L4_CLASS_RSVD6 6
+#define	ESE_DE_L4_CLASS_RSVD5 5
+#define	ESE_DE_L4_CLASS_RSVD4 4
+#define	ESE_DE_L4_CLASS_RSVD3 3
+#define	ESE_DE_L4_CLASS_UDP 2
+#define	ESE_DE_L4_CLASS_TCP 1
+#define	ESE_DE_L4_CLASS_UNKNOWN 0
+#define	ESF_FZ_RX_FASTPD_INDCTR_LBN 47
+#define	ESF_FZ_RX_FASTPD_INDCTR_WIDTH 1
+#define	ESF_FZ_RX_L4_CLASS_LBN 45
+#define	ESF_FZ_RX_L4_CLASS_WIDTH 2
+#define	ESE_FZ_L4_CLASS_RSVD3 3
+#define	ESE_FZ_L4_CLASS_UDP 2
+#define	ESE_FZ_L4_CLASS_TCP 1
+#define	ESE_FZ_L4_CLASS_UNKNOWN 0
 #define	ESF_DZ_RX_L3_CLASS_LBN 42
 #define	ESF_DZ_RX_L3_CLASS_WIDTH 3
 #define	ESE_DZ_L3_CLASS_RSVD7 7
@@ -215,6 +225,8 @@
 #define	ESF_EZ_RX_ABORT_WIDTH 1
 #define	ESF_DZ_RX_ECC_ERR_LBN 29
 #define	ESF_DZ_RX_ECC_ERR_WIDTH 1
+#define	ESF_DZ_RX_TRUNC_ERR_LBN 29
+#define	ESF_DZ_RX_TRUNC_ERR_WIDTH 1
 #define	ESF_DZ_RX_CRC1_ERR_LBN 28
 #define	ESF_DZ_RX_CRC1_ERR_WIDTH 1
 #define	ESF_DZ_RX_CRC0_ERR_LBN 27
@@ -332,6 +344,8 @@
 #define	ESE_DZ_TX_OPTION_DESC_CRC_CSUM 0
 #define	ESF_DZ_TX_TSO_OPTION_TYPE_LBN 56
 #define	ESF_DZ_TX_TSO_OPTION_TYPE_WIDTH 4
+#define	ESE_DZ_TX_TSO_OPTION_DESC_FATSO2B 3
+#define	ESE_DZ_TX_TSO_OPTION_DESC_FATSO2A 2
 #define	ESE_DZ_TX_TSO_OPTION_DESC_ENCAP 1
 #define	ESE_DZ_TX_TSO_OPTION_DESC_NORMAL 0
 #define	ESF_DZ_TX_TSO_TCP_FLAGS_LBN 48
@@ -341,7 +355,7 @@
 #define	ESF_DZ_TX_TSO_TCP_SEQNO_LBN 0
 #define	ESF_DZ_TX_TSO_TCP_SEQNO_WIDTH 32
 
-/* TX_TSO_FATSO2A_DESC */
+/* TX_TSO_V2_DESC_A */
 #define	ESF_DZ_TX_DESC_IS_OPT_LBN 63
 #define	ESF_DZ_TX_DESC_IS_OPT_WIDTH 1
 #define	ESF_DZ_TX_OPTION_TYPE_LBN 60
@@ -360,8 +374,7 @@
 #define	ESF_DZ_TX_TSO_TCP_SEQNO_LBN 0
 #define	ESF_DZ_TX_TSO_TCP_SEQNO_WIDTH 32
 
-
-/* TX_TSO_FATSO2B_DESC */
+/* TX_TSO_V2_DESC_B */
 #define	ESF_DZ_TX_DESC_IS_OPT_LBN 63
 #define	ESF_DZ_TX_DESC_IS_OPT_WIDTH 1
 #define	ESF_DZ_TX_OPTION_TYPE_LBN 60
@@ -375,11 +388,10 @@
 #define	ESE_DZ_TX_TSO_OPTION_DESC_FATSO2A 2
 #define	ESE_DZ_TX_TSO_OPTION_DESC_ENCAP 1
 #define	ESE_DZ_TX_TSO_OPTION_DESC_NORMAL 0
-#define	ESF_DZ_TX_TSO_OUTER_IP_ID_LBN 0
-#define	ESF_DZ_TX_TSO_OUTER_IP_ID_WIDTH 16
 #define	ESF_DZ_TX_TSO_TCP_MSS_LBN 32
 #define	ESF_DZ_TX_TSO_TCP_MSS_WIDTH 16
-
+#define	ESF_DZ_TX_TSO_OUTER_IPID_LBN 0
+#define	ESF_DZ_TX_TSO_OUTER_IPID_WIDTH 16
 
 /*************************************************************************/
 

^ permalink raw reply related

* [PATCH net-next 4/6] sfc: improve PTP error reporting
From: Edward Cree @ 2017-12-18 16:56 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

Log a message if PTP probing fails; if we then, unexpectedly, get PTP
events, only log a message for the first one on each device.
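
The warn-once-per-device behaviour can be sketched in user space as below.
This is a simplified model, not the driver code: struct nic, its fields other
than ptp_warned, and handle_ptp_event() are hypothetical names, and fprintf()
stands in for netif_warn().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-device state in struct efx_nic. */
struct nic {
	const char *name;
	bool ptp_ready;   /* models ptp_data != NULL */
	bool ptp_warned;  /* set once the first unexpected event is reported */
};

/* Returns true if this call printed a warning. */
bool handle_ptp_event(struct nic *nic)
{
	assert(nic != NULL);

	if (nic->ptp_ready)
		return false;	/* normal event handling would happen here */

	if (nic->ptp_warned)
		return false;	/* already reported on this device */

	fprintf(stderr, "%s: Received PTP event but PTP not set up\n",
		nic->name);
	nic->ptp_warned = true;
	return true;
}
```

Unlike a global rate limit, the flag lives in the per-device structure, so a
second device still gets its own single warning.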

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 9 ++++++++-
 drivers/net/ethernet/sfc/net_driver.h | 2 ++
 drivers/net/ethernet/sfc/ptp.c        | 4 +++-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index dcd6be14a430..009bf28bdba5 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -747,7 +747,14 @@ static int efx_ef10_probe(struct efx_nic *efx)
 	if (rc && rc != -EPERM)
 		goto fail5;
 
-	efx_ptp_probe(efx, NULL);
+	rc = efx_ptp_probe(efx, NULL);
+	/* Failure to probe PTP is not fatal.
+	 * In the case of EPERM, efx_ptp_probe will print its own message (in
+	 * efx_ptp_get_attributes()), so we don't need to.
+	 */
+	if (rc && rc != -EPERM)
+		netif_warn(efx, drv, efx->net_dev,
+			   "Failed to probe PTP, rc=%d\n", rc);
 
 #ifdef CONFIG_SFC_SRIOV
 	if ((efx->pci_dev->physfn) && (!efx->pci_dev->is_physfn)) {
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 2e41f2c39c4a..6b8730a24513 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -813,6 +813,7 @@ struct vfdi_status;
  * @vf_init_count: Number of VFs that have been fully initialised.
  * @vi_scale: log2 number of vnics per VF.
  * @ptp_data: PTP state data
+ * @ptp_warned: has this NIC seen and warned about unexpected PTP events?
  * @vpd_sn: Serial number read from VPD
  * @monitor_work: Hardware monitor workitem
  * @biu_lock: BIU (bus interface unit) lock
@@ -968,6 +969,7 @@ struct efx_nic {
 #endif
 
 	struct efx_ptp_data *ptp_data;
+	bool ptp_warned;
 
 	char *vpd_sn;
 
diff --git a/drivers/net/ethernet/sfc/ptp.c b/drivers/net/ethernet/sfc/ptp.c
index caa89bf7603e..3b37d7ded3c4 100644
--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -1662,9 +1662,11 @@ void efx_ptp_event(struct efx_nic *efx, efx_qword_t *ev)
 	int code = EFX_QWORD_FIELD(*ev, MCDI_EVENT_CODE);
 
 	if (!ptp) {
-		if (net_ratelimit())
+		if (!efx->ptp_warned) {
 			netif_warn(efx, drv, efx->net_dev,
 				   "Received PTP event but PTP not set up\n");
+			efx->ptp_warned = true;
+		}
 		return;
 	}
 

^ permalink raw reply related

* [PATCH net-next 3/6] sfc: add Medford2 (SFC9250) PCI Device IDs
From: Edward Cree @ 2017-12-18 16:56 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/efx.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index e50049cba50b..7bcbedce07a5 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -2910,6 +2910,10 @@ static const struct pci_device_id efx_pci_table[] = {
 	 .driver_data = (unsigned long) &efx_hunt_a0_nic_type},
 	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1a03),  /* SFC9220 VF */
 	 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
+	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x0b03),  /* SFC9250 PF */
+	 .driver_data = (unsigned long) &efx_hunt_a0_nic_type},
+	{PCI_DEVICE(PCI_VENDOR_ID_SOLARFLARE, 0x1b03),  /* SFC9250 VF */
+	 .driver_data = (unsigned long) &efx_hunt_a0_vf_nic_type},
 	{0}			/* end of list */
 };
 

^ permalink raw reply related

* [PATCH net-next 2/6] sfc: support VI strides other than 8k
From: Edward Cree @ 2017-12-18 16:56 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

Medford2 can also have 16k or 64k VI stride.  This is reported by MCDI in
GET_CAPABILITIES, which fortunately is called before the driver does
anything sensitive to the VI stride (such as accessing or even allocating
VIs past the zeroth).
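
A small user-space sketch of why the stride matters: both the offset of a
per-VI register and the number of VIs that fit in the memory BAR depend on
it. struct nic, paged_reg() and max_channels() are hypothetical stand-ins
modelled on efx_paged_reg() and the max_channels calculation in the diff
below; the txq_types and cap arguments are illustrative inputs, not the
driver's actual constants.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_VI_STRIDE 0x2000u /* 8K, as EFX_DEFAULT_VI_STRIDE */

/* Hypothetical stand-in for the relevant part of struct efx_nic. */
struct nic {
	unsigned int vi_stride;
};

/* Offset of register reg within VI number page. */
unsigned int paged_reg(const struct nic *nic, unsigned int page,
		       unsigned int reg)
{
	return page * nic->vi_stride + reg;
}

/* Number of channels that fit in the BAR, limited to cap. */
unsigned int max_channels(const struct nic *nic, size_t mem_map_size,
			  unsigned int txq_types, unsigned int cap)
{
	size_t n;

	assert(nic->vi_stride != 0 && txq_types != 0);

	n = mem_map_size / ((size_t)nic->vi_stride * txq_types);
	return n < cap ? (unsigned int)n : cap;
}
```

With the same BAR size, a 64k stride leaves room for one eighth as many VIs
as the 8k default, which is why the calculation has to wait until the stride
is known.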

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 70 +++++++++++++++++++++++++----------
 drivers/net/ethernet/sfc/efx.c        |  2 +
 drivers/net/ethernet/sfc/io.h         | 19 ++++++----
 drivers/net/ethernet/sfc/mcdi.h       |  3 ++
 drivers/net/ethernet/sfc/net_driver.h |  3 ++
 5 files changed, 70 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 5cc786aec7c4..dcd6be14a430 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -233,7 +233,7 @@ static int efx_ef10_get_vf_index(struct efx_nic *efx)
 
 static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 {
-	MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_CAPABILITIES_V2_OUT_LEN);
+	MCDI_DECLARE_BUF(outbuf, MC_CMD_GET_CAPABILITIES_V3_OUT_LEN);
 	struct efx_ef10_nic_data *nic_data = efx->nic_data;
 	size_t outlen;
 	int rc;
@@ -277,6 +277,35 @@ static int efx_ef10_init_datapath_caps(struct efx_nic *efx)
 		return -ENODEV;
 	}
 
+	if (outlen >= MC_CMD_GET_CAPABILITIES_V3_OUT_LEN) {
+		u8 vi_window_mode = MCDI_BYTE(outbuf,
+				GET_CAPABILITIES_V3_OUT_VI_WINDOW_MODE);
+
+		switch (vi_window_mode) {
+		case MC_CMD_GET_CAPABILITIES_V3_OUT_VI_WINDOW_MODE_8K:
+			efx->vi_stride = 8192;
+			break;
+		case MC_CMD_GET_CAPABILITIES_V3_OUT_VI_WINDOW_MODE_16K:
+			efx->vi_stride = 16384;
+			break;
+		case MC_CMD_GET_CAPABILITIES_V3_OUT_VI_WINDOW_MODE_64K:
+			efx->vi_stride = 65536;
+			break;
+		default:
+			netif_err(efx, probe, efx->net_dev,
+				  "Unrecognised VI window mode %d\n",
+				  vi_window_mode);
+			return -EIO;
+		}
+		netif_dbg(efx, probe, efx->net_dev, "vi_stride = %u\n",
+			  efx->vi_stride);
+	} else {
+		/* keep default VI stride */
+		netif_dbg(efx, probe, efx->net_dev,
+			  "firmware did not report VI window mode, assuming vi_stride = %u\n",
+			  efx->vi_stride);
+	}
+
 	return 0;
 }
 
@@ -609,17 +638,6 @@ static int efx_ef10_probe(struct efx_nic *efx)
 	struct efx_ef10_nic_data *nic_data;
 	int i, rc;
 
-	/* We can have one VI for each 8K region.  However, until we
-	 * use TX option descriptors we need two TX queues per channel.
-	 */
-	efx->max_channels = min_t(unsigned int,
-				  EFX_MAX_CHANNELS,
-				  efx_ef10_mem_map_size(efx) /
-				  (EFX_VI_PAGE_SIZE * EFX_TXQ_TYPES));
-	efx->max_tx_channels = efx->max_channels;
-	if (WARN_ON(efx->max_channels == 0))
-		return -EIO;
-
 	nic_data = kzalloc(sizeof(*nic_data), GFP_KERNEL);
 	if (!nic_data)
 		return -ENOMEM;
@@ -691,6 +709,20 @@ static int efx_ef10_probe(struct efx_nic *efx)
 	if (rc < 0)
 		goto fail5;
 
+	/* We can have one VI for each vi_stride-byte region.
+	 * However, until we use TX option descriptors we need two TX queues
+	 * per channel.
+	 */
+	efx->max_channels = min_t(unsigned int,
+				  EFX_MAX_CHANNELS,
+				  efx_ef10_mem_map_size(efx) /
+				  (efx->vi_stride * EFX_TXQ_TYPES));
+	efx->max_tx_channels = efx->max_channels;
+	if (WARN_ON(efx->max_channels == 0)) {
+		rc = -EIO;
+		goto fail5;
+	}
+
 	efx->rx_packet_len_offset =
 		ES_DZ_RX_PREFIX_PKTLEN_OFST - ES_DZ_RX_PREFIX_SIZE;
 
@@ -927,7 +959,7 @@ static int efx_ef10_link_piobufs(struct efx_nic *efx)
 			} else {
 				tx_queue->piobuf =
 					nic_data->pio_write_base +
-					index * EFX_VI_PAGE_SIZE + offset;
+					index * efx->vi_stride + offset;
 				tx_queue->piobuf_offset = offset;
 				netif_dbg(efx, probe, efx->net_dev,
 					  "linked VI %u to PIO buffer %u offset %x addr %p\n",
@@ -1273,19 +1305,19 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 	 * for writing PIO buffers through.
 	 *
 	 * The UC mapping contains (channel_vis - 1) complete VIs and the
-	 * first half of the next VI.  Then the WC mapping begins with
-	 * the second half of this last VI.
+	 * first 4K of the next VI.  Then the WC mapping begins with
+	 * the remainder of this last VI.
 	 */
-	uc_mem_map_size = PAGE_ALIGN((channel_vis - 1) * EFX_VI_PAGE_SIZE +
+	uc_mem_map_size = PAGE_ALIGN((channel_vis - 1) * efx->vi_stride +
 				     ER_DZ_TX_PIOBUF);
 	if (nic_data->n_piobufs) {
 		/* pio_write_vi_base rounds down to give the number of complete
 		 * VIs inside the UC mapping.
 		 */
-		pio_write_vi_base = uc_mem_map_size / EFX_VI_PAGE_SIZE;
+		pio_write_vi_base = uc_mem_map_size / efx->vi_stride;
 		wc_mem_map_size = (PAGE_ALIGN((pio_write_vi_base +
 					       nic_data->n_piobufs) *
-					      EFX_VI_PAGE_SIZE) -
+					      efx->vi_stride) -
 				   uc_mem_map_size);
 		max_vis = pio_write_vi_base + nic_data->n_piobufs;
 	} else {
@@ -1357,7 +1389,7 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx)
 		nic_data->pio_write_vi_base = pio_write_vi_base;
 		nic_data->pio_write_base =
 			nic_data->wc_membase +
-			(pio_write_vi_base * EFX_VI_PAGE_SIZE + ER_DZ_TX_PIOBUF -
+			(pio_write_vi_base * efx->vi_stride + ER_DZ_TX_PIOBUF -
 			 uc_mem_map_size);
 
 		rc = efx_ef10_link_piobufs(efx);
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index bbe4ace7dd9d..e50049cba50b 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -27,6 +27,7 @@
 #include <net/udp_tunnel.h>
 #include "efx.h"
 #include "nic.h"
+#include "io.h"
 #include "selftest.h"
 #include "sriov.h"
 
@@ -2977,6 +2978,7 @@ static int efx_init_struct(struct efx_nic *efx,
 	efx->rx_packet_ts_offset =
 		efx->type->rx_ts_offset - efx->type->rx_prefix_size;
 	spin_lock_init(&efx->stats_lock);
+	efx->vi_stride = EFX_DEFAULT_VI_STRIDE;
 	mutex_init(&efx->mac_lock);
 	efx->phy_op = &efx_dummy_phy_operations;
 	efx->mdio.dev = net_dev;
diff --git a/drivers/net/ethernet/sfc/io.h b/drivers/net/ethernet/sfc/io.h
index afb94aa2c15e..89563170af52 100644
--- a/drivers/net/ethernet/sfc/io.h
+++ b/drivers/net/ethernet/sfc/io.h
@@ -222,18 +222,21 @@ static inline void efx_reado_table(struct efx_nic *efx, efx_oword_t *value,
 	efx_reado(efx, value, reg + index * sizeof(efx_oword_t));
 }
 
-/* Page size used as step between per-VI registers */
-#define EFX_VI_PAGE_SIZE 0x2000
+/* default VI stride (step between per-VI registers) is 8K */
+#define EFX_DEFAULT_VI_STRIDE 0x2000
 
 /* Calculate offset to page-mapped register */
-#define EFX_PAGED_REG(page, reg) \
-	((page) * EFX_VI_PAGE_SIZE + (reg))
+static inline unsigned int efx_paged_reg(struct efx_nic *efx, unsigned int page,
+					 unsigned int reg)
+{
+	return page * efx->vi_stride + reg;
+}
 
 /* Write the whole of RX_DESC_UPD or TX_DESC_UPD */
 static inline void _efx_writeo_page(struct efx_nic *efx, efx_oword_t *value,
 				    unsigned int reg, unsigned int page)
 {
-	reg = EFX_PAGED_REG(page, reg);
+	reg = efx_paged_reg(efx, page, reg);
 
 	netif_vdbg(efx, hw, efx->net_dev,
 		   "writing register %x with " EFX_OWORD_FMT "\n", reg,
@@ -262,7 +265,7 @@ static inline void
 _efx_writed_page(struct efx_nic *efx, const efx_dword_t *value,
 		 unsigned int reg, unsigned int page)
 {
-	efx_writed(efx, value, EFX_PAGED_REG(page, reg));
+	efx_writed(efx, value, efx_paged_reg(efx, page, reg));
 }
 #define efx_writed_page(efx, value, reg, page)				\
 	_efx_writed_page(efx, value,					\
@@ -288,10 +291,10 @@ static inline void _efx_writed_page_locked(struct efx_nic *efx,
 
 	if (page == 0) {
 		spin_lock_irqsave(&efx->biu_lock, flags);
-		efx_writed(efx, value, EFX_PAGED_REG(page, reg));
+		efx_writed(efx, value, efx_paged_reg(efx, page, reg));
 		spin_unlock_irqrestore(&efx->biu_lock, flags);
 	} else {
-		efx_writed(efx, value, EFX_PAGED_REG(page, reg));
+		efx_writed(efx, value, efx_paged_reg(efx, page, reg));
 	}
 }
 #define efx_writed_page_locked(efx, value, reg, page)			\
diff --git a/drivers/net/ethernet/sfc/mcdi.h b/drivers/net/ethernet/sfc/mcdi.h
index 154ef41d1927..ebd95972ae7b 100644
--- a/drivers/net/ethernet/sfc/mcdi.h
+++ b/drivers/net/ethernet/sfc/mcdi.h
@@ -208,6 +208,9 @@ void efx_mcdi_sensor_event(struct efx_nic *efx, efx_qword_t *ev);
 #define _MCDI_DWORD(_buf, _field)					\
 	((_buf) + (_MCDI_CHECK_ALIGN(MC_CMD_ ## _field ## _OFST, 4) >> 2))
 
+#define MCDI_BYTE(_buf, _field)						\
+	((void)BUILD_BUG_ON_ZERO(MC_CMD_ ## _field ## _LEN != 1),	\
+	 *MCDI_PTR(_buf, _field))
 #define MCDI_WORD(_buf, _field)						\
 	((u16)BUILD_BUG_ON_ZERO(MC_CMD_ ## _field ## _LEN != 2) +	\
 	 le16_to_cpu(*(__force const __le16 *)MCDI_PTR(_buf, _field)))
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index 2b6599f8d9fa..2e41f2c39c4a 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -708,6 +708,7 @@ struct vfdi_status;
  * @reset_work: Scheduled reset workitem
  * @membase_phys: Memory BAR value as physical address
  * @membase: Memory BAR value
+ * @vi_stride: step between per-VI registers / memory regions
  * @interrupt_mode: Interrupt mode
  * @timer_quantum_ns: Interrupt timer quantum, in nanoseconds
  * @timer_max_ns: Interrupt timer maximum value, in nanoseconds
@@ -842,6 +843,8 @@ struct efx_nic {
 	resource_size_t membase_phys;
 	void __iomem *membase;
 
+	unsigned int vi_stride;
+
 	enum efx_int_mode interrupt_mode;
 	unsigned int timer_quantum_ns;
 	unsigned int timer_max_ns;

^ permalink raw reply related

* [PATCH net-next 1/6] sfc: make mem_bar a function rather than a constant
From: Edward Cree @ 2017-12-18 16:55 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev
In-Reply-To: <f9e1279b-03d0-729c-2518-c1e204444447@solarflare.com>

Support using BAR 0 on SFC9250, even though the driver doesn't bind to such
devices yet.
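
The shape of the change, reduced to a user-space sketch: the per-type
constant becomes a callback so that the answer can depend on the individual
device. The struct and function names here are hypothetical simplifications
of struct efx_nic_type and the efx_ef10_*_mem_bar() helpers in the diff
below; only the device ID 0x0b03 and the BAR numbers come from the patch.

```c
#include <assert.h>
#include <stdint.h>

struct nic;

/* Hypothetical stand-in for struct efx_nic_type. */
struct nic_type {
	unsigned int (*mem_bar)(const struct nic *nic);
};

/* Hypothetical stand-in for struct efx_nic. */
struct nic {
	uint16_t pci_device_id;
	const struct nic_type *type;
};

/* PFs: BAR 0 on SFC9250, BAR 2 on everything earlier. */
static unsigned int pf_mem_bar(const struct nic *nic)
{
	switch (nic->pci_device_id) {
	case 0x0b03: /* SFC9250 PF */
		return 0;
	default:
		return 2;
	}
}

/* VFs always use BAR 0. */
static unsigned int vf_mem_bar(const struct nic *nic)
{
	(void)nic;
	return 0;
}

const struct nic_type pf_type = { .mem_bar = pf_mem_bar };
const struct nic_type vf_type = { .mem_bar = vf_mem_bar };

/* Callers ask the type, passing the device, instead of reading a constant. */
unsigned int nic_mem_bar(const struct nic *nic)
{
	assert(nic->type && nic->type->mem_bar);

	return nic->type->mem_bar(nic);
}
```

A single PF type can then serve both generations of hardware, which is what
lets a later patch add the new device IDs without a new nic_type.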

Signed-off-by: Edward Cree <ecree@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 26 +++++++++++++++++++++++---
 drivers/net/ethernet/sfc/efx.c        |  4 ++--
 drivers/net/ethernet/sfc/efx.h        |  5 -----
 drivers/net/ethernet/sfc/net_driver.h |  2 +-
 drivers/net/ethernet/sfc/siena.c      | 10 +++++++++-
 5 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index e566dbb3343d..5cc786aec7c4 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -160,11 +160,31 @@ static int efx_ef10_get_warm_boot_count(struct efx_nic *efx)
 		EFX_DWORD_FIELD(reg, EFX_WORD_0) : -EIO;
 }
 
+/* On all EF10s up to and including SFC9220 (Medford1), all PFs use BAR 0 for
+ * I/O space and BAR 2(&3) for memory.  On SFC9250 (Medford2), there is no I/O
+ * bar; PFs use BAR 0/1 for memory.
+ */
+static unsigned int efx_ef10_pf_mem_bar(struct efx_nic *efx)
+{
+	switch (efx->pci_dev->device) {
+	case 0x0b03: /* SFC9250 PF */
+		return 0;
+	default:
+		return 2;
+	}
+}
+
+/* All VFs use BAR 0/1 for memory */
+static unsigned int efx_ef10_vf_mem_bar(struct efx_nic *efx)
+{
+	return 0;
+}
+
 static unsigned int efx_ef10_mem_map_size(struct efx_nic *efx)
 {
 	int bar;
 
-	bar = efx->type->mem_bar;
+	bar = efx->type->mem_bar(efx);
 	return resource_size(&efx->pci_dev->resource[bar]);
 }
 
@@ -6392,7 +6412,7 @@ static int efx_ef10_udp_tnl_del_port(struct efx_nic *efx,
 
 const struct efx_nic_type efx_hunt_a0_vf_nic_type = {
 	.is_vf = true,
-	.mem_bar = EFX_MEM_VF_BAR,
+	.mem_bar = efx_ef10_vf_mem_bar,
 	.mem_map_size = efx_ef10_mem_map_size,
 	.probe = efx_ef10_probe_vf,
 	.remove = efx_ef10_remove,
@@ -6500,7 +6520,7 @@ const struct efx_nic_type efx_hunt_a0_vf_nic_type = {
 
 const struct efx_nic_type efx_hunt_a0_nic_type = {
 	.is_vf = false,
-	.mem_bar = EFX_MEM_BAR,
+	.mem_bar = efx_ef10_pf_mem_bar,
 	.mem_map_size = efx_ef10_mem_map_size,
 	.probe = efx_ef10_probe_pf,
 	.remove = efx_ef10_remove,
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index e3c492fcaff0..bbe4ace7dd9d 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1248,7 +1248,7 @@ static int efx_init_io(struct efx_nic *efx)
 
 	netif_dbg(efx, probe, efx->net_dev, "initialising I/O\n");
 
-	bar = efx->type->mem_bar;
+	bar = efx->type->mem_bar(efx);
 
 	rc = pci_enable_device(pci_dev);
 	if (rc) {
@@ -1323,7 +1323,7 @@ static void efx_fini_io(struct efx_nic *efx)
 	}
 
 	if (efx->membase_phys) {
-		bar = efx->type->mem_bar;
+		bar = efx->type->mem_bar(efx);
 		pci_release_region(efx->pci_dev, bar);
 		efx->membase_phys = 0;
 	}
diff --git a/drivers/net/ethernet/sfc/efx.h b/drivers/net/ethernet/sfc/efx.h
index 52c84b782901..16da3e9a6000 100644
--- a/drivers/net/ethernet/sfc/efx.h
+++ b/drivers/net/ethernet/sfc/efx.h
@@ -14,11 +14,6 @@
 #include "net_driver.h"
 #include "filter.h"
 
-/* All controllers use BAR 0 for I/O space and BAR 2(&3) for memory */
-/* All VFs use BAR 0/1 for memory */
-#define EFX_MEM_BAR 2
-#define EFX_MEM_VF_BAR 0
-
 int efx_net_open(struct net_device *net_dev);
 int efx_net_stop(struct net_device *net_dev);
 
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index c0537ea06c9a..2b6599f8d9fa 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1154,7 +1154,7 @@ struct efx_udp_tunnel {
  */
 struct efx_nic_type {
 	bool is_vf;
-	unsigned int mem_bar;
+	unsigned int (*mem_bar)(struct efx_nic *efx);
 	unsigned int (*mem_map_size)(struct efx_nic *efx);
 	int (*probe)(struct efx_nic *efx);
 	void (*remove)(struct efx_nic *efx);
diff --git a/drivers/net/ethernet/sfc/siena.c b/drivers/net/ethernet/sfc/siena.c
index a617f657eae3..22d49ebb347c 100644
--- a/drivers/net/ethernet/sfc/siena.c
+++ b/drivers/net/ethernet/sfc/siena.c
@@ -242,6 +242,14 @@ static int siena_dimension_resources(struct efx_nic *efx)
 	return 0;
 }
 
+/* On all Falcon-architecture NICs, PFs use BAR 0 for I/O space and BAR 2(&3)
+ * for memory.
+ */
+static unsigned int siena_mem_bar(struct efx_nic *efx)
+{
+	return 2;
+}
+
 static unsigned int siena_mem_map_size(struct efx_nic *efx)
 {
 	return FR_CZ_MC_TREG_SMEM +
@@ -950,7 +958,7 @@ static int siena_mtd_probe(struct efx_nic *efx)
 
 const struct efx_nic_type siena_a0_nic_type = {
 	.is_vf = false,
-	.mem_bar = EFX_MEM_BAR,
+	.mem_bar = siena_mem_bar,
 	.mem_map_size = siena_mem_map_size,
 	.probe = siena_probe_nic,
 	.remove = siena_remove_nic,

^ permalink raw reply related

* [PATCH net-next 0/6] sfc: Initial X2000-series (Medford2) support
From: Edward Cree @ 2017-12-18 16:54 UTC (permalink / raw)
  To: linux-net-drivers, davem; +Cc: netdev

Basic PCI-level changes to support X2000-series NICs.
Also fix unexpected-PTP-event log messages, since the timestamp format has
 been changed in these NICs and that causes us to fail to probe PTP (but we
 still get the PPS events).

Bert Kenward (2):
  sfc: update EF10 register definitions
  sfc: populate the timer reload field

Edward Cree (4):
  sfc: make mem_bar a function rather than a constant
  sfc: support VI strides other than 8k
  sfc: add Medford2 (SFC9250) PCI Device IDs
  sfc: improve PTP error reporting

 drivers/net/ethernet/sfc/ef10.c       | 126 +++++++++++++++++++++++++---------
 drivers/net/ethernet/sfc/ef10_regs.h  |  46 ++++++++-----
 drivers/net/ethernet/sfc/efx.c        |  10 ++-
 drivers/net/ethernet/sfc/efx.h        |   5 --
 drivers/net/ethernet/sfc/io.h         |  19 ++---
 drivers/net/ethernet/sfc/mcdi.h       |   3 +
 drivers/net/ethernet/sfc/net_driver.h |   7 +-
 drivers/net/ethernet/sfc/ptp.c        |   4 +-
 drivers/net/ethernet/sfc/siena.c      |  10 ++-
 9 files changed, 162 insertions(+), 68 deletions(-)
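
For readers skimming the series, the idea behind "sfc: make mem_bar a function rather than a constant" can be sketched outside the kernel as below. This is only an illustration, not the driver code: the `demo_*` names and the cut-down structures are hypothetical stand-ins for `struct efx_nic` and `struct efx_nic_type`, and the only value taken from the patch is the SFC9250 PF device ID 0x0b03.

```c
/* Hypothetical, cut-down stand-ins for struct efx_nic and
 * struct efx_nic_type; they are not the real driver structures.
 */
struct demo_nic;

struct demo_nic_type {
	/* Was a plain "unsigned int mem_bar" constant before the change */
	unsigned int (*mem_bar)(const struct demo_nic *nic);
};

struct demo_nic {
	unsigned short pci_device_id;
	const struct demo_nic_type *type;
};

/* PFs: BAR 0 on SFC9250 (device ID 0x0b03 in the patch), else BAR 2 */
static unsigned int demo_pf_mem_bar(const struct demo_nic *nic)
{
	switch (nic->pci_device_id) {
	case 0x0b03:
		return 0;
	default:
		return 2;
	}
}

/* VFs: always BAR 0, so the device argument is unused */
static unsigned int demo_vf_mem_bar(const struct demo_nic *nic)
{
	(void)nic;
	return 0;
}

static const struct demo_nic_type demo_pf_type = {
	.mem_bar = demo_pf_mem_bar,
};

static const struct demo_nic_type demo_vf_type = {
	.mem_bar = demo_vf_mem_bar,
};

/* Callers no longer read a constant; they ask the type at run time */
static unsigned int demo_lookup_bar(const struct demo_nic *nic)
{
	return nic->type->mem_bar(nic);
}
```

With this shape, one type descriptor can serve several device IDs: a PF whose ID is 0x0b03 resolves to BAR 0, while any other PF sharing the same descriptor still resolves to BAR 2.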

^ permalink raw reply

* Re: [PATCH 3/3] trace: print address if symbol not found
From: Steven Rostedt @ 2017-12-18 16:49 UTC (permalink / raw)
  To: Tobin C. Harding
  Cc: kernel-hardening, Tycho Andersen, Linus Torvalds, Kees Cook,
	Andrew Morton, Daniel Borkmann, Masahiro Yamada,
	Alexei Starovoitov, linux-kernel, Network Development
In-Reply-To: <1513554812-13014-4-git-send-email-me@tobin.cc>

On Mon, 18 Dec 2017 10:53:32 +1100
"Tobin C. Harding" <me@tobin.cc> wrote:

> Fixes behaviour modified by: commit bd6b239cdbb2 ("kallsyms: don't leak
> address when symbol not found")
> 
> Previous patch changed behaviour of kallsyms function sprint_symbol() to
> return an error code instead of printing the address if a symbol was not
> found. Ftrace relies on the original behaviour. We should not break
> tracing when applying the previous patch. We can maintain the original
> behaviour by checking the return code on calls to sprint_symbol() and
> friends.
> 
> Check return code and print actual address on error (i.e symbol not
> found).
> 
> Signed-off-by: Tobin C. Harding <me@tobin.cc>
> ---
>  kernel/trace/trace.h             | 24 ++++++++++++++++++++++++
>  kernel/trace/trace_events_hist.c |  6 +++---
>  2 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 2a6d0325a761..881b1a577d75 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -1814,4 +1814,28 @@ static inline void trace_event_eval_update(struct trace_eval_map **map, int len)
>  
>  extern struct trace_iterator *tracepoint_print_iter;
>  
> +static inline int
> +trace_sprint_symbol(char *buffer, unsigned long address)
> +{
> +	int ret;
> +
> +	ret = sprint_symbol(buffer, address);
> +	if (ret == -1)
> +		ret = sprintf(buffer, "0x%lx", address);
> +
> +	return ret;
> +}
> +
> +static inline int
> +trace_sprint_symbol_no_offset(char *buffer, unsigned long address)
> +{
> +	int ret;
> +
> +	ret = sprint_symbol_no_offset(buffer, address);
> +	if (ret == -1)
> +		ret = sprintf(buffer, "0x%lx", address);
> +
> +	return ret;
> +}
> +
>  #endif /* _LINUX_KERNEL_TRACE_H */
> diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
> index 1e1558c99d56..3e28522a76f4 100644
> --- a/kernel/trace/trace_events_hist.c
> +++ b/kernel/trace/trace_events_hist.c
> @@ -982,7 +982,7 @@ static void hist_trigger_stacktrace_print(struct seq_file *m,
>  			return;
>  
>  		seq_printf(m, "%*c", 1 + spaces, ' ');
> -		sprint_symbol(str, stacktrace_entries[i]);
> +		trace_sprint_symbol_addr(str, stacktrace_entries[i]);

Hmm, where is trace_sprint_symbol_addr() defined?

-- Steve

>  		seq_printf(m, "%s\n", str);
>  	}
>  }
> @@ -1014,12 +1014,12 @@ hist_trigger_entry_print(struct seq_file *m,
>  			seq_printf(m, "%s: %llx", field_name, uval);
>  		} else if (key_field->flags & HIST_FIELD_FL_SYM) {
>  			uval = *(u64 *)(key + key_field->offset);
> -			sprint_symbol_no_offset(str, uval);
> +			trace_sprint_symbol_no_offset(str, uval);
>  			seq_printf(m, "%s: [%llx] %-45s", field_name,
>  				   uval, str);
>  		} else if (key_field->flags & HIST_FIELD_FL_SYM_OFFSET) {
>  			uval = *(u64 *)(key + key_field->offset);
> -			sprint_symbol(str, uval);
> +			trace_sprint_symbol(str, uval);
>  			seq_printf(m, "%s: [%llx] %-55s", field_name,
>  				   uval, str);
>  		} else if (key_field->flags & HIST_FIELD_FL_EXECNAME) {

^ permalink raw reply
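
The fallback described in the quoted commit message (check the return code and print the raw address when no symbol is found) can be sketched in userspace C as follows. This is a sketch under stated assumptions, not the kernel implementation: `demo_sprint_symbol()` is a hypothetical stand-in for `sprint_symbol()` that knows a single made-up symbol and returns -1 otherwise, and `DEMO_SYM_LEN` is an arbitrary buffer size chosen for the example.

```c
#include <stdio.h>

#define DEMO_SYM_LEN 128	/* arbitrary buffer size for this sketch */

/* Hypothetical stand-in for sprint_symbol(): resolves exactly one
 * made-up address and returns -1 for everything else, mimicking the
 * error return described in the quoted commit message.
 */
static int demo_sprint_symbol(char *buffer, unsigned long address)
{
	if (address == 0x1000UL)
		return snprintf(buffer, DEMO_SYM_LEN, "demo_func+0x0/0x10");
	return -1;
}

/* Wrapper in the style of the quoted trace_sprint_symbol(): try the
 * symbol lookup first and fall back to printing the address in hex.
 */
static int demo_trace_sprint_symbol(char *buffer, unsigned long address)
{
	int ret = demo_sprint_symbol(buffer, address);

	if (ret == -1)
		ret = snprintf(buffer, DEMO_SYM_LEN, "0x%lx", address);

	return ret;
}
```

Every call site has to use the name that is actually defined. In the quoted diff the helpers are `trace_sprint_symbol()` and `trace_sprint_symbol_no_offset()`, while one hunk calls `trace_sprint_symbol_addr()`, which is what the question above is about.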

* Re: BUG: spinlock bad magic (2)
From: Santosh Shilimkar @ 2017-12-18 16:46 UTC (permalink / raw)
  To: syzbot, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	syzkaller-bugs-/JYPxA39Uh5TLH3MbocFFw
In-Reply-To: <001a113fae28c2fd6605609c97a2-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

On 12/18/2017 4:36 AM, syzbot wrote:
> Hello,
> 
> syzkaller hit the following crash on 
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> 
> Unfortunately, I don't have any reproducer for this bug yet.
> 
[...]

> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186

This one seems to be the same bug as the one reported below:

BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit

^ permalink raw reply

* Re: [PATCH v2 0/5] Support for generalized use of make C={1,2} via a wrapper program
From: Knut Omang @ 2017-12-18 16:41 UTC (permalink / raw)
  To: Joe Perches, Jason Gunthorpe
  Cc: Stephen Hemminger, linux-kernel, Mauro Carvalho Chehab,
	Nicolas Palix, Jonathan Corbet, Santosh Shilimkar, Matthew Wilcox,
	cocci, rds-devel, linux-rdma, linux-doc, Doug Ledford,
	Mickaël Salaün, Shuah Khan, linux-kbuild, Michal Marek,
	Julia Lawall, John Haxby, Åsmund Østvold,
	Masahiro Yamada
In-Reply-To: <1513611003.31581.71.camel@perches.com>

On Mon, 2017-12-18 at 07:30 -0800, Joe Perches wrote:
> On Mon, 2017-12-18 at 14:05 +0100, Knut Omang wrote:
> > > Here is a list of the checkpatch messages for drivers/infiniband
> > > sorted by type.
> > > 
> > > Many of these might be corrected by using
> > > 
> > > $ ./scripts/checkpatch.pl -f --fix-inplace --types=<TYPE> \
> > >   $(git ls-files drivers/infiniband/)
> > 
> > Yes, and I already did that work piece by piece for individual types,
> > just to test the runchecks tool, and want to post that set once the 
> > runchecks script and Makefile changes itself are in,
> 
> I think those are independent of any runcheck tool changes and
> could be posted now.  In general, don't keep patches in a local
> tree waiting on some other unrelated patch.

They become related in that the runchecks.cfg file is updated
in every patch to keep 'make C=2' running with 0 errors while
enabling more checks. I think they serve well as examples of
what a workflow with runchecks could look like.

> Just fyi:
> 
> There is a script that helps automate checkpatch "by-type" conversions
> with compilation, .o difference checking, and git commit editing.
> 
> https://lkml.org/lkml/2014/7/11/794

Oh, good to know. It seems it would have been a good help
during my little exercise.

Thanks,
Knut

^ permalink raw reply

