Netdev List
* Re: [PATCH v2 1/4] siphash: add cryptographically secure hashtable function
From: Hannes Frederic Sowa @ 2016-12-15 20:31 UTC (permalink / raw)
  To: Jason A. Donenfeld, David Laight
  Cc: Netdev, kernel-hardening@lists.openwall.com,
	Jean-Philippe Aumasson, LKML, Linux Crypto Mailing List,
	Daniel J . Bernstein, Linus Torvalds, Eric Biggers
In-Reply-To: <CAHmME9pTLFu3-4n6m_OMj5jVWGE-+yC-4CnkynD--H4Nt8_cpA@mail.gmail.com>

Hello,

On 15.12.2016 19:50, Jason A. Donenfeld wrote:
> Hi David & Hannes,
> 
> This conversation is veering off course.

Why?

> I think this doesn't really
> matter at all. Gcc converts u64 into essentially a pair of u32 on
> 32-bit platforms, so the alignment requirements for 32-bit is at a
> maximum 32 bits. On 64-bit platforms the alignment requirements are
> related at a maximum to the biggest register size, so 64-bit
> alignment. For this reason, no matter the behavior of __aligned(8),
> we're okay. Likewise, even without __aligned(8), if gcc aligns structs
> by their biggest member, then we get 4 byte alignment on 32-bit and 8
> byte alignment on 64-bit, which is fine. There's no 32-bit platform
> that will trap on a 64-bit unaligned access because there's no such
> thing as a 64-bit access there. In short, we're fine.

ARM64 and x86-64 have non-vector memory operations that operate on
128 bits of memory.

How do you know that the compiler for some architecture will not choose a
more optimized instruction to load a 64-bit memory value into two 32-bit
registers if you tell the compiler it is 8-byte aligned but it actually
isn't? I don't know the answer, but telling the compiler that some data is
8-byte aligned when it really isn't seems like asking for trouble.

Why couldn't a compiler vectorize this code if it can prove that it
doesn't conflict with other register users?
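
To make the concern concrete, here is a minimal userspace sketch (not kernel code) of a load that makes no alignment promise to the compiler. The helper names load_u64_any_alignment() and roundtrip_at_offset() are hypothetical and exist only for this illustration; the point is that memcpy() leaves the compiler free to pick whatever load is valid for the real address, whereas a cast to an 8-byte-aligned type would assert something that might not hold.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 64 bits from an address of unknown alignment.
 * memcpy() carries no alignment promise, so the compiler cannot assume
 * 8-byte alignment when it selects the load instruction. */
static uint64_t load_u64_any_alignment(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical demo: store a value at an arbitrary byte offset inside a
 * buffer and read it back through the alignment-agnostic helper. */
static uint64_t roundtrip_at_offset(size_t off, uint64_t value)
{
	unsigned char buf[32] = { 0 };

	memcpy(buf + off, &value, sizeof(value));
	return load_u64_any_alignment(buf + off);
}
```

Calling roundtrip_at_offset(1, x) returns x on any architecture, because nothing in the code claims the address is aligned.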

Bye,
Hannes


* [net-next PATCH v6 3/5] virtio_net: add dedicated XDP transmit queues
From: John Fastabend @ 2016-12-15 20:13 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

XDP requires using isolated transmit queues to avoid interference
with the normal networking stack (BQL, NETDEV_TX_BUSY, etc). This patch
adds an XDP queue per CPU when an XDP program is loaded and does not
expose the queues to the OS via the normal API call to
netif_set_real_num_tx_queues(). This way the stack will never push
an skb to these queues.

However, the virtio/vhost/qemu implementation only allows creating
TX/RX queue pairs at this time, so creating only TX queues was not
possible. Because the associated RX queues are created anyway, I
went ahead and exposed them to the stack and let the backend use
them. This makes more RX queues than TX queues visible to the network
stack, which is worth mentioning but does not cause any issues as
far as I can tell.
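
The queue accounting described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct qp_state is a hypothetical stand-in for the three struct virtnet_info fields involved, nr_cpus stands in for nr_cpu_ids, and the sketch assumes that a successful resize records the new total in curr_queue_pairs.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the queue-pair fields of struct virtnet_info. */
struct qp_state {
	uint16_t curr_queue_pairs; /* pairs in use, including XDP pairs */
	uint16_t xdp_queue_pairs;  /* pairs reserved for XDP_TX */
	uint16_t max_queue_pairs;  /* device limit */
};

/* Loading a program reserves one extra pair per CPU; unloading releases
 * them. Returns 0 on success or -ENOMEM if the device limit is exceeded. */
static int xdp_resize_queues(struct qp_state *s, int prog_loaded,
			     uint16_t nr_cpus)
{
	/* Pairs the stack uses, with any previous XDP reservation removed. */
	uint16_t curr_qp = s->curr_queue_pairs - s->xdp_queue_pairs;
	uint16_t xdp_qp = prog_loaded ? nr_cpus : 0;

	if (curr_qp + xdp_qp > s->max_queue_pairs)
		return -ENOMEM;

	/* Assumption: the resize call stores the new total on success. */
	s->curr_queue_pairs = curr_qp + xdp_qp;
	s->xdp_queue_pairs = xdp_qp;
	return 0;
}
```

With 4 stack pairs, 4 CPUs and a limit of 8, loading a program yields 8 pairs in total and unloading returns to 4; with a limit of 6 the load is rejected.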

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 93075da..992ec5f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -114,6 +114,9 @@ struct virtnet_info {
 	/* # of queue pairs currently used by the driver */
 	u16 curr_queue_pairs;
 
+	/* # of XDP queue pairs currently used by the driver */
+	u16 xdp_queue_pairs;
+
 	/* I like... big packets and I cannot lie! */
 	bool big_packets;
 
@@ -1526,7 +1529,8 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
 	struct virtnet_info *vi = netdev_priv(dev);
 	struct bpf_prog *old_prog;
-	int i;
+	u16 xdp_qp = 0, curr_qp;
+	int i, err;
 
 	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO4) ||
 	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6)) {
@@ -1544,12 +1548,34 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 		return -EINVAL;
 	}
 
+	curr_qp = vi->curr_queue_pairs - vi->xdp_queue_pairs;
+	if (prog)
+		xdp_qp = nr_cpu_ids;
+
+	/* XDP requires extra queues for XDP_TX */
+	if (curr_qp + xdp_qp > vi->max_queue_pairs) {
+		netdev_warn(dev, "request %i queues but max is %i\n",
+			    curr_qp + xdp_qp, vi->max_queue_pairs);
+		return -ENOMEM;
+	}
+
+	err = virtnet_set_queues(vi, curr_qp + xdp_qp);
+	if (err) {
+		dev_warn(&dev->dev, "XDP Device queue allocation failure.\n");
+		return err;
+	}
+
 	if (prog) {
 		prog = bpf_prog_add(prog, vi->max_queue_pairs - 1);
-		if (IS_ERR(prog))
+		if (IS_ERR(prog)) {
+			virtnet_set_queues(vi, curr_qp);
 			return PTR_ERR(prog);
+		}
 	}
 
+	vi->xdp_queue_pairs = xdp_qp;
+	netif_set_real_num_rx_queues(dev, curr_qp + xdp_qp);
+
 	for (i = 0; i < vi->max_queue_pairs; i++) {
 		old_prog = rtnl_dereference(vi->rq[i].xdp_prog);
 		rcu_assign_pointer(vi->rq[i].xdp_prog, prog);


* [PATCH v5 4/4] random: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

This duplicates the current algorithm for get_random_int/long, but uses
siphash instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
separates hashed fields into three values instead of one, in order to
increase diffusion.

The previous MD5 algorithm used a per-cpu MD5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte MD5 state, there is
instead a per-cpu previously returned value for chaining.
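
The chaining idea can be illustrated outside the kernel with a small sketch. The names struct sip_state, absorb() and next_random_long() are hypothetical, and jiffies and the entropy/pid sum are passed in as plain parameters rather than read from the kernel; the three-word hash follows the SipHash-2-4 structure with a fixed 24-byte length block.

```c
#include <assert.h>
#include <stdint.h>

struct sip_state { uint64_t v0, v1, v2, v3; };

static uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

static void sipround(struct sip_state *s)
{
	s->v0 += s->v1; s->v1 = rol64(s->v1, 13); s->v1 ^= s->v0; s->v0 = rol64(s->v0, 32);
	s->v2 += s->v3; s->v3 = rol64(s->v3, 16); s->v3 ^= s->v2;
	s->v0 += s->v3; s->v3 = rol64(s->v3, 21); s->v3 ^= s->v0;
	s->v2 += s->v1; s->v1 = rol64(s->v1, 17); s->v1 ^= s->v2; s->v2 = rol64(s->v2, 32);
}

/* Mix one 64-bit message word into the state (two compression rounds). */
static void absorb(struct sip_state *s, uint64_t m)
{
	s->v3 ^= m;
	sipround(s);
	sipround(s);
	s->v0 ^= m;
}

/* SipHash-2-4 over exactly three 64-bit words (24 bytes of input). */
static uint64_t siphash_3u64(uint64_t a, uint64_t b, uint64_t c,
			     const uint64_t key[2])
{
	struct sip_state s = {
		0x736f6d6570736575ULL ^ key[0], 0x646f72616e646f6dULL ^ key[1],
		0x6c7967656e657261ULL ^ key[0], 0x7465646279746573ULL ^ key[1],
	};

	absorb(&s, a);
	absorb(&s, b);
	absorb(&s, c);
	absorb(&s, (uint64_t)24 << 56); /* final block: length in the top byte */
	s.v2 ^= 0xff;
	sipround(&s); sipround(&s); sipround(&s); sipround(&s);
	return s.v0 ^ s.v1 ^ s.v2 ^ s.v3;
}

/* Hypothetical chained generator: the previous output is fed back in as
 * the first input word, so successive calls depend on each other. */
static uint64_t next_random_long(uint64_t *chaining, uint64_t jiffies,
				 uint64_t entropy_plus_pid,
				 const uint64_t key[2])
{
	*chaining = siphash_3u64(*chaining, jiffies, entropy_plus_pid, key);
	return *chaining;
}
```

The only per-CPU state left is the single u64 passed as chaining, which is what replaces the 64-byte MD5 state in this design.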

The speed benefits are substantial:

                | siphash | md5    | speedup |
----------------+---------+--------+---------+
get_random_long | 137130  | 415983 | 3.03x   |
get_random_int  | 86384   | 343323 | 3.97x   |

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Ted Tso <tytso@mit.edu>
---
 drivers/char/random.c | 32 +++++++++++++-------------------
 1 file changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..a51f0ff43f00 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
+#include <linux/siphash.h>
 #include <crypto/chacha20.h>
 
 #include <asm/processor.h>
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif 	/* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+static siphash_key_t random_int_secret;
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
 	return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-		__aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,16 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-	__u32 *hash;
 	unsigned int ret;
+	u64 *chaining;
 
 	if (arch_get_random_int(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = hash[0];
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2080,16 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-	__u32 *hash;
 	unsigned long ret;
+	u64 *chaining;
 
 	if (arch_get_random_long(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = *(unsigned long *)hash;
-	put_cpu_var(get_random_int_hash);
-
+	chaining = &get_cpu_var(get_random_int_chaining);
+	ret = *chaining = siphash_3u64(*chaining, jiffies, random_get_entropy() +
+				       current->pid, random_int_secret);
+	put_cpu_var(get_random_int_chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0


* [PATCH v5 3/4] secure_seq: use SipHash in place of MD5
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. SipHash is both
faster and more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, for IPv6 we simply lay out
the fields in an anonymous struct, for which gcc generates
rather efficient code. For IPv4, we pass the values directly to the
short input convenience functions.
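
As a rough userspace illustration of the struct layout idea: the input fields are placed in one struct with explicit padding so that its size is a multiple of the hash alignment and every byte is defined before hashing. The names struct addr128 (a stand-in for struct in6_addr), struct v6_tuple, HASH_ALIGNMENT and make_tuple() are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_ALIGNMENT 8 /* mirrors SIPHASH_ALIGNMENT from the patch */

struct addr128 { uint8_t bytes[16]; }; /* stand-in for struct in6_addr */

struct v6_tuple {
	alignas(HASH_ALIGNMENT) struct addr128 saddr;
	struct addr128 daddr;
	uint16_t sport;
	uint16_t dport;
	uint32_t padding; /* explicit, so no compiler-inserted padding */
};

/* 16 + 16 + 2 + 2 + 4 bytes: the struct is hashed as whole 64-bit words. */
_Static_assert(sizeof(struct v6_tuple) == 40, "unexpected implicit padding");
_Static_assert(sizeof(struct v6_tuple) % HASH_ALIGNMENT == 0,
	       "size must be a multiple of the hash alignment");

/* Build the hash input; zeroing first keeps the padding deterministic. */
static struct v6_tuple make_tuple(const uint8_t saddr[16],
				  const uint8_t daddr[16],
				  uint16_t sport, uint16_t dport)
{
	struct v6_tuple t;

	memset(&t, 0, sizeof(t));
	memcpy(t.saddr.bytes, saddr, 16);
	memcpy(t.daddr.bytes, daddr, 16);
	t.sport = sport;
	t.dport = dport;
	return t;
}
```

Two tuples built from the same inputs compare equal byte for byte, which is the property a hash over the raw struct relies on.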

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@aculab.com>
Cc: Tom Herbert <tom@herbertland.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/core/secure_seq.c | 133 ++++++++++++++++++++------------------------------
 1 file changed, 52 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..c80583bf3213 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,14 +10,14 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include <linux/in6.h>
 #include <net/tcp.h>
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static siphash_key_t net_secret;
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,42 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				 __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash(&combined, sizeof(combined), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+		u16 padding1;
+		u32 padding2;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash(&combined, sizeof(combined), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +91,17 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 			       __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	u64 hash;
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash_4u32(saddr, daddr, sport, dport, net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return siphash_4u32(saddr, daddr, dport, 0, net_secret);
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -126,21 +110,11 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash_4u32(saddr, daddr, sport, dport, net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -149,26 +123,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		u32 padding;
+	} __aligned(SIPHASH_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash(&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0


* [PATCH v5 2/4] siphash: add Nu{32,64} helpers
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

These restore parity with the jhash interface by providing high
performance helpers for common input sizes.
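
A small userspace sketch of how the u32 helpers pack their arguments: two 32-bit values are combined into one 64-bit word with the first argument in the low half, which is the same word a little-endian load of the two values stored back to back would produce. The function names pack_2u32() and pack_via_bytes() are hypothetical and used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Same expression the Nu32 helpers use: a in the low half, b in the high. */
static uint64_t pack_2u32(uint32_t a, uint32_t b)
{
	return (uint64_t)b << 32 | a;
}

/* Reference path: write a then b as little-endian bytes, then read the
 * eight bytes back as one little-endian 64-bit word. */
static uint64_t pack_via_bytes(uint32_t a, uint32_t b)
{
	uint8_t buf[8];
	uint64_t v = 0;
	int i;

	for (i = 0; i < 4; i++) {
		buf[i] = (uint8_t)(a >> (8 * i));
		buf[4 + i] = (uint8_t)(b >> (8 * i));
	}
	for (i = 0; i < 8; i++)
		v |= (uint64_t)buf[i] << (8 * i);
	return v;
}
```

For example, pack_2u32(0x03020100, 0x07060504) gives 0x0706050403020100, the first word used by the self-tests in this patch.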

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Tom Herbert <tom@herbertland.com>
---
 include/linux/siphash.h |  33 ++++++++++
 lib/siphash.c           | 157 +++++++++++++++++++++++++++++++++++++-----------
 lib/test_siphash.c      |  18 ++++++
 3 files changed, 172 insertions(+), 36 deletions(-)

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
index 145cf5667078..6f5a08a0fc7e 100644
--- a/include/linux/siphash.h
+++ b/include/linux/siphash.h
@@ -29,4 +29,37 @@ static inline u64 siphash_unaligned(const void *data, size_t len,
 u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key);
 #endif
 
+u64 siphash_1u64(const u64 a, const siphash_key_t key);
+u64 siphash_2u64(const u64 a, const u64 b, const siphash_key_t key);
+u64 siphash_3u64(const u64 a, const u64 b, const u64 c,
+		 const siphash_key_t key);
+u64 siphash_4u64(const u64 a, const u64 b, const u64 c, const u64 d,
+		 const siphash_key_t key);
+
+static inline u64 siphash_2u32(const u32 a, const u32 b, const siphash_key_t key)
+{
+	return siphash_1u64((u64)b << 32 | a, key);
+}
+
+static inline u64 siphash_4u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const siphash_key_t key)
+{
+	return siphash_2u64((u64)b << 32 | a, (u64)d << 32 | c, key);
+}
+
+static inline u64 siphash_6u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const siphash_key_t key)
+{
+	return siphash_3u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    key);
+}
+
+static inline u64 siphash_8u32(const u32 a, const u32 b, const u32 c, const u32 d,
+			       const u32 e, const u32 f, const u32 g, const u32 h,
+			       const siphash_key_t key)
+{
+	return siphash_4u64((u64)b << 32 | a, (u64)d << 32 | c, (u64)f << 32 | e,
+			    (u64)h << 32 | g, key);
+}
+
 #endif /* _LINUX_SIPHASH_H */
diff --git a/lib/siphash.c b/lib/siphash.c
index afc13cbb1b78..970c083ab06a 100644
--- a/lib/siphash.c
+++ b/lib/siphash.c
@@ -25,6 +25,29 @@
 	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
 	} while(0)
 
+#define PREAMBLE(len) \
+	u64 v0 = 0x736f6d6570736575ULL; \
+	u64 v1 = 0x646f72616e646f6dULL; \
+	u64 v2 = 0x6c7967656e657261ULL; \
+	u64 v3 = 0x7465646279746573ULL; \
+	u64 b = ((u64)len) << 56; \
+	v3 ^= key[1]; \
+	v2 ^= key[0]; \
+	v1 ^= key[1]; \
+	v0 ^= key[0];
+
+#define POSTAMBLE \
+	v3 ^= b; \
+	SIPROUND; \
+	SIPROUND; \
+	v0 ^= b; \
+	v2 ^= 0xff; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	SIPROUND; \
+	return (v0 ^ v1) ^ (v2 ^ v3);
+
 /**
  * siphash - compute 64-bit siphash PRF value
  * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT
@@ -33,18 +56,10 @@
  */
 u64 siphash(const void *data, size_t len, const siphash_key_t key)
 {
-	u64 v0 = 0x736f6d6570736575ULL;
-	u64 v1 = 0x646f72616e646f6dULL;
-	u64 v2 = 0x6c7967656e657261ULL;
-	u64 v3 = 0x7465646279746573ULL;
-	u64 b = ((u64)len) << 56;
-	u64 m;
 	const u8 *end = data + len - (len % sizeof(u64));
 	const u8 left = len & (sizeof(u64) - 1);
-	v3 ^= key[1];
-	v2 ^= key[0];
-	v1 ^= key[1];
-	v0 ^= key[0];
+	u64 m;
+	PREAMBLE(len)
 	for (; data != end; data += sizeof(u64)) {
 		m = le64_to_cpup(data);
 		v3 ^= m;
@@ -67,16 +82,7 @@ u64 siphash(const void *data, size_t len, const siphash_key_t key)
 	case 1: b |= end[0];
 	}
 #endif
-	v3 ^= b;
-	SIPROUND;
-	SIPROUND;
-	v0 ^= b;
-	v2 ^= 0xff;
-	SIPROUND;
-	SIPROUND;
-	SIPROUND;
-	SIPROUND;
-	return (v0 ^ v1) ^ (v2 ^ v3);
+	POSTAMBLE
 }
 EXPORT_SYMBOL(siphash);
 
@@ -89,18 +95,10 @@ EXPORT_SYMBOL(siphash);
  */
 u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
 {
-	u64 v0 = 0x736f6d6570736575ULL;
-	u64 v1 = 0x646f72616e646f6dULL;
-	u64 v2 = 0x6c7967656e657261ULL;
-	u64 v3 = 0x7465646279746573ULL;
-	u64 b = ((u64)len) << 56;
-	u64 m;
 	const u8 *end = data + len - (len % sizeof(u64));
 	const u8 left = len & (sizeof(u64) - 1);
-	v3 ^= key[1];
-	v2 ^= key[0];
-	v1 ^= key[1];
-	v0 ^= key[0];
+	u64 m;
+	PREAMBLE(len)
 	for (; data != end; data += sizeof(u64)) {
 		m = get_unaligned_le64(data);
 		v3 ^= m;
@@ -123,16 +121,103 @@ u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
 	case 1: b |= end[0];
 	}
 #endif
-	v3 ^= b;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_unaligned);
+#endif
+
+/**
+ * siphash_1u64 - compute 64-bit siphash PRF value of a u64
+ * @first: first u64
+ * @key: the siphash key
+ */
+u64 siphash_1u64(const u64 first, const siphash_key_t key)
+{
+	PREAMBLE(8)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_1u64);
+
+/**
+ * siphash_2u64 - compute 64-bit siphash PRF value of 2 u64
+ * @first: first u64
+ * @second: second u64
+ * @key: the siphash key
+ */
+u64 siphash_2u64(const u64 first, const u64 second, const siphash_key_t key)
+{
+	PREAMBLE(16)
+	v3 ^= first;
 	SIPROUND;
 	SIPROUND;
-	v0 ^= b;
-	v2 ^= 0xff;
+	v0 ^= first;
+	v3 ^= second;
 	SIPROUND;
 	SIPROUND;
+	v0 ^= second;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_2u64);
+
+/**
+ * siphash_3u64 - compute 64-bit siphash PRF value of 3 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @key: the siphash key
+ */
+u64 siphash_3u64(const u64 first, const u64 second, const u64 third,
+		 const siphash_key_t key)
+{
+	PREAMBLE(24)
+	v3 ^= first;
 	SIPROUND;
 	SIPROUND;
-	return (v0 ^ v1) ^ (v2 ^ v3);
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	POSTAMBLE
 }
-EXPORT_SYMBOL(siphash_unaligned);
-#endif
+EXPORT_SYMBOL(siphash_3u64);
+
+/**
+ * siphash_4u64 - compute 64-bit siphash PRF value of 4 u64
+ * @first: first u64
+ * @second: second u64
+ * @third: third u64
+ * @forth: fourth u64
+ * @key: the siphash key
+ */
+u64 siphash_4u64(const u64 first, const u64 second, const u64 third,
+		 const u64 forth, const siphash_key_t key)
+{
+	PREAMBLE(32)
+	v3 ^= first;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= first;
+	v3 ^= second;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= second;
+	v3 ^= third;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= third;
+	v3 ^= forth;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= forth;
+	POSTAMBLE
+}
+EXPORT_SYMBOL(siphash_4u64);
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
index 93549e4e22c5..1635189c171f 100644
--- a/lib/test_siphash.c
+++ b/lib/test_siphash.c
@@ -67,6 +67,24 @@ static int __init siphash_test_init(void)
 			ret = -EINVAL;
 		}
 	}
+	if (siphash_1u64(0x0706050403020100ULL, test_key) != test_vectors[8]) {
+		pr_info("self-test 1u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_2u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL, test_key) != test_vectors[16]) {
+		pr_info("self-test 2u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_3u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, test_key) != test_vectors[24]) {
+		pr_info("self-test 3u64: FAIL\n");
+		ret = -EINVAL;
+	}
+	if (siphash_4u64(0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL,
+			 0x1716151413121110ULL, 0x1f1e1d1c1b1a1918ULL, test_key) != test_vectors[32]) {
+		pr_info("self-test 4u64: FAIL\n");
+		ret = -EINVAL;
+	}
 	if (!ret)
 		pr_info("self-tests: pass\n");
 	return ret;
-- 
2.11.0


* [PATCH v5 1/4] siphash: add cryptographically secure PRF
From: Jason A. Donenfeld @ 2016-12-15 20:30 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein
In-Reply-To: <20161215203003.31989-1-Jason@zx2c4.com>

SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function, or as a
general PRF for short input use cases, such as sequence numbers or RNG
chaining.

For the first usage:

There are a variety of attacks known as "hashtable poisoning" in which an
attacker crafts many inputs that all hash to the same value, and then
proceeds to fill up all entries of a single hash bucket. This is
a realistic and well-known denial-of-service vector. Currently
hashtables use jhash, which is fast but not secure, and some kind of
rotating key scheme (or none at all, which isn't good). SipHash is meant
as a replacement for jhash in these cases.
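
For illustration only, here is a minimal userspace sketch of keyed bucket selection with SipHash-2-4. It is not the kernel implementation from this patch; siphash24() and bucket_for() are hypothetical names, the byte-at-a-time loads are chosen for portability rather than speed, and bucket_for() assumes the bucket count is a power of two. Without knowing the secret key, an attacker cannot predict which inputs land in the same bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint64_t rol64(uint64_t x, unsigned int r)
{
	return (x << r) | (x >> (64 - r));
}

#define SIPROUND(v0, v1, v2, v3) do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
} while (0)

/* SipHash-2-4 of len bytes at data under a 128-bit key (two u64 halves). */
static uint64_t siphash24(const uint8_t *data, size_t len,
			  const uint64_t key[2])
{
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56; /* length goes in the top byte */
	size_t full = len - (len % 8);
	size_t i, j;

	/* Whole 8-byte blocks, read as little-endian words. */
	for (i = 0; i < full; i += 8) {
		uint64_t m = 0;

		for (j = 0; j < 8; j++)
			m |= (uint64_t)data[i + j] << (8 * j);
		v3 ^= m;
		SIPROUND(v0, v1, v2, v3);
		SIPROUND(v0, v1, v2, v3);
		v0 ^= m;
	}
	/* Remaining 0..7 bytes share the final block with the length. */
	for (i = full; i < len; i++)
		b |= (uint64_t)data[i] << (8 * (i - full));
	v3 ^= b;
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	v0 ^= b;
	v2 ^= 0xff;
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	SIPROUND(v0, v1, v2, v3);
	return v0 ^ v1 ^ v2 ^ v3;
}

/* Pick a bucket; nbuckets is assumed to be a power of two. */
static size_t bucket_for(const uint8_t *data, size_t len,
			 const uint64_t secret[2], size_t nbuckets)
{
	return (size_t)(siphash24(data, len, secret) & (nbuckets - 1));
}
```

In a real table the secret would be generated randomly at boot or table creation and never exposed; the fixed key used in the reference test vectors is only suitable for checking the implementation.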

There are a handful of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

While SipHash is extremely fast for a cryptographically secure function,
it is likely a tiny bit slower than the insecure jhash, and so replacements
will be evaluated on a case-by-case basis based on whether or not the
difference in speed is negligible and whether or not the current jhash usage
poses a real security risk.

For the second usage:

A few places in the kernel are using MD5 for creating secure sequence
numbers, port numbers, or fast random numbers. SipHash is a faster, more
fitting, and more secure replacement for MD5 in those situations.
Replacing MD5 with SipHash for these uses is obvious and
straightforward, and so is submitted along with this patch series. There
shouldn't be much of a debate over its efficacy.

Dozens of languages are already using this internally for their hash
tables and PRFs. Some of the BSDs already use this in their kernels.
SipHash is a widely known high-speed solution to a widely known set of
problems, and it's time we catch up.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Daniel J. Bernstein <djb@cr.yp.to>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
---
 include/linux/siphash.h |  32 +++++++++++
 lib/Kconfig.debug       |   6 +--
 lib/Makefile            |   5 +-
 lib/siphash.c           | 138 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      |  83 +++++++++++++++++++++++++++++
 5 files changed, 259 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..145cf5667078
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,32 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+
+#define SIPHASH_ALIGNMENT 8
+
+typedef u64 siphash_key_t[2];
+
+u64 siphash(const void *data, size_t len, const siphash_key_t key);
+
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+static inline u64 siphash_unaligned(const void *data, size_t len,
+				    const siphash_key_t key)
+{
+	return siphash(data, len, key);
+}
+#else
+u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key);
+#endif
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7446097f72bd..86254ea99b45 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1843,9 +1843,9 @@ config TEST_HASH
 	tristate "Perform selftest on hash functions"
 	default n
 	help
-	  Enable this option to test the kernel's integer (<linux/hash,h>)
-	  and string (<linux/stringhash.h>) hash functions on boot
-	  (or module load).
+	  Enable this option to test the kernel's integer (<linux/hash.h>),
+	  string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+	  hash functions on boot (or module load).
 
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
-	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+	 earlycpio.o seq_buf.o siphash.o \
+	 nmi_backtrace.o nodemask.o win_minmax.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..afc13cbb1b78
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,138 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+	do { \
+	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+	} while(0)
+
+/**
+ * siphash - compute 64-bit siphash PRF value
+ * @data: buffer to hash, must be aligned to SIPHASH_ALIGNMENT
+ * @len: size of @data
+ * @key: the siphash key
+ */
+u64 siphash(const void *data, size_t len, const siphash_key_t key)
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= key[1];
+	v2 ^= key[0];
+	v1 ^= key[1];
+	v0 ^= key[0];
+	for (; data != end; data += sizeof(u64)) {
+		m = le64_to_cpup(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+						  bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)end[6]) << 48;
+	case 6: b |= ((u64)end[5]) << 40;
+	case 5: b |= ((u64)end[4]) << 32;
+	case 4: b |= le32_to_cpup(data); break;
+	case 3: b |= ((u64)end[2]) << 16;
+	case 2: b |= le16_to_cpup(data); break;
+	case 1: b |= end[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+/**
+ * siphash_unaligned - compute 64-bit siphash PRF value, without alignment requirements
+ * @data: buffer to hash
+ * @len: size of @data
+ * @key: the siphash key
+ */
+u64 siphash_unaligned(const void *data, size_t len, const siphash_key_t key)
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= key[1];
+	v2 ^= key[0];
+	v1 ^= key[1];
+	v0 ^= key[0];
+	for (; data != end; data += sizeof(u64)) {
+		m = get_unaligned_le64(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) &
+						  bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)end[6]) << 48;
+	case 6: b |= ((u64)end[5]) << 40;
+	case 5: b |= ((u64)end[4]) << 32;
+	case 4: b |= get_unaligned_le32(end); break;
+	case 3: b |= ((u64)end[2]) << 16;
+	case 2: b |= get_unaligned_le16(end); break;
+	case 1: b |= end[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash_unaligned);
+#endif
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 000000000000..93549e4e22c5
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,83 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ *
+ * This implementation is specifically for SipHash2-4.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from official reference source available at:
+ *     https://131002.net/siphash/siphash24.c
+ */
+static const u64 test_vectors[64] = {
+	0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+	0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+	0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+	0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+	0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+	0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+	0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+	0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+	0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+	0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+	0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+	0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+	0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+	0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+	0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+	0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+	0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+	0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+	0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+	0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+	0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+	0x958a324ceb064572ULL
+};
+static const siphash_key_t test_key =
+	{ 0x0706050403020100ULL , 0x0f0e0d0c0b0a0908ULL };
+
+static int __init siphash_test_init(void)
+{
+	u8 in[64] __aligned(SIPHASH_ALIGNMENT);
+	u8 in_unaligned[65];
+	u8 i;
+	int ret = 0;
+
+	for (i = 0; i < 64; ++i) {
+		in[i] = i;
+		in_unaligned[i + 1] = i;
+		if (siphash(in, i, test_key) != test_vectors[i]) {
+			pr_info("self-test aligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+		if (siphash_unaligned(in_unaligned + 1, i, test_key) != test_vectors[i]) {
+			pr_info("self-test unaligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+	}
+	if (!ret)
+		pr_info("self-tests: pass\n");
+	return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
-- 
2.11.0

^ permalink raw reply related

* [PATCH v5 0/4] The SipHash Patchset
From: Jason A. Donenfeld @ 2016-12-15 20:29 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto, David Laight,
	Ted Tso, Hannes Frederic Sowa, Linus Torvalds, Eric Biggers,
	Tom Herbert, George Spelvin, Vegard Nossum, ak, davem, luto
  Cc: Jason A. Donenfeld

Hey folks,

I think we're approaching the end of the review for this patchset, and it's
getting close to being ready to be queued up. At this point,
I've incorporated all of the extremely helpful and instructive suggestions
from the list.

For this v5, we now accept u64[2] as the key, so that alignment is taken
care of naturally. For other alignment issues, we have both the fast aligned
version and the unaligned version, depending on what's necessary. We've
worked out the issues for struct padding. The functions now take a void
pointer to avoid ugly casting, which also helps us shed the inline helper
functions which were not very pretty. The replacements of MD5 have been
benchmarked and show a big increase in speed. We've even come up with a
better naming scheme for dword/qword. All in all it's shaping up nicely.
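
For readers following along outside the kernel tree, here is a minimal
userspace sketch of the interface described above: a void pointer for the
data and a u64[2] key. It is an illustration only, not the code from this
series. The names siphash24_sketch and load_le64 are hypothetical, and the
byte-wise load stands in for the kernel's aligned and unaligned accessors,
so one function covers both cases at the cost of speed.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 64-bit word left by b bits (0 < b < 64). */
static inline uint64_t rol64(uint64_t x, unsigned int b)
{
	return (x << b) | (x >> (64 - b));
}

/* Hypothetical helper: assemble a little-endian u64 one byte at a time,
 * which is safe at any alignment and on any host endianness.
 */
static inline uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

#define SIPROUND \
	do { \
	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
	} while (0)

/* SipHash-2-4 over an arbitrary buffer; sketch only, not the kernel code. */
uint64_t siphash24_sketch(const void *data, size_t len, const uint64_t key[2])
{
	const uint8_t *p = data;
	const uint8_t *end = p + (len - (len % 8));
	const size_t left = len % 8;
	uint64_t v0 = 0x736f6d6570736575ULL ^ key[0];
	uint64_t v1 = 0x646f72616e646f6dULL ^ key[1];
	uint64_t v2 = 0x6c7967656e657261ULL ^ key[0];
	uint64_t v3 = 0x7465646279746573ULL ^ key[1];
	uint64_t b = (uint64_t)len << 56;
	uint64_t m;
	size_t i;

	/* Two rounds per full 8-byte little-endian word. */
	for (; p != end; p += 8) {
		m = load_le64(p);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}

	/* Fold the 0..7 trailing bytes into the final word. */
	for (i = 0; i < left; i++)
		b |= (uint64_t)end[i] << (8 * i);

	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Four finalization rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}
```

With the test key { 0x0706050403020100ULL, 0x0f0e0d0c0b0a0908ULL } and input
bytes 0, 1, 2, ..., the results can be compared against the test_vectors
table in lib/test_siphash.c, indexed by input length.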

So, if this series looks good to you, please send along your Reviewed-by,
so we can begin to get this completed. If there are still lingering issues,
let me know and I'll incorporate them into a v6 if necessary.

Thanks,
Jason

Jason A. Donenfeld (4):
  siphash: add cryptographically secure PRF
  siphash: add Nu{32,64} helpers
  secure_seq: use SipHash in place of MD5
  random: use SipHash in place of MD5

 drivers/char/random.c   |  32 +++----
 include/linux/siphash.h |  65 ++++++++++++++
 lib/Kconfig.debug       |   6 +-
 lib/Makefile            |   5 +-
 lib/siphash.c           | 223 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 101 ++++++++++++++++++++++
 net/core/secure_seq.c   | 133 +++++++++++------------------
 7 files changed, 460 insertions(+), 105 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

-- 
2.11.0

^ permalink raw reply

* [net-next PATCH v6 5/5] virtio_net: xdp, add slowpath case for non contiguous buffers
From: John Fastabend @ 2016-12-15 20:14 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem
In-Reply-To: <20161215200712.23639.53043.stgit@john-Precision-Tower-5810>

virtio_net XDP support expects receive buffers to be contiguous.
If this is not the case we enable a slowpath to allow connectivity
to continue but at a significant performance overhead associated with
linearizing data. To make users painfully aware that XDP is
running in a degraded mode we throw an xdp buffer error.

To linearize packets we allocate a page and copy the segments of
the data, including the header, into it. After this the page can be
handled by XDP code flow as normal.
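
The copy step can be sketched in plain userspace C as below. This is an
illustration under stated assumptions, not the driver code: struct
seg_sketch, linearize_sketch and SKETCH_PAGE_SIZE are hypothetical names,
the segments are ordinary memory rather than virtqueue buffers, and the
bounds check is applied to every segment including the first.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Hypothetical description of one received buffer segment. */
struct seg_sketch {
	const void *data;
	size_t len;
};

/* Copy all segments back to back into one page-sized buffer.
 * Returns 0 and stores the total length in *out_len on success,
 * or -1 (leaving *out_len untouched) if the data would not fit.
 */
int linearize_sketch(unsigned char *page, const struct seg_sketch *segs,
		     size_t nsegs, size_t *out_len)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < nsegs; i++) {
		/* Guard against a sender exceeding one page in total. */
		if (segs[i].len > SKETCH_PAGE_SIZE - off)
			return -1;
		memcpy(page + off, segs[i].data, segs[i].len);
		off += segs[i].len;
	}

	*out_len = off;
	return 0;
}
```

The single running offset mirrors page_off in the patch: each segment lands
directly after the previous one, and the final offset becomes the packet
length handed to the XDP program.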

Then depending on the return code the page is either freed or sent
to the XDP xmit path. There is no attempt to optimize this path.

This case is being handled simply as a precaution in case some
unknown backend were to generate packets in this form. To test this
I had to hack qemu and force it to generate these packets. I do not
expect this case to be generated by "real" backends.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---
 drivers/net/virtio_net.c |   75 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 1f8300b..ce4ae7f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -471,6 +471,64 @@ static struct sk_buff *receive_big(struct net_device *dev,
 	return NULL;
 }
 
+/* The conditions to enable XDP should preclude the underlying device from
+ * sending packets across multiple buffers (num_buf > 1). However per spec
+ * it does not appear to be illegal to do so but rather just against convention.
+ * So in order to avoid making a system unresponsive the packets are pushed
+ * into a page and the XDP program is run. This will be extremely slow and we
+ * push a warning to the user to fix this as soon as possible. Fixing this may
+ * require resolving the underlying hardware to determine why multiple buffers
+ * are being received or simply loading the XDP program in the ingress stack
+ * after the skb is built because there is no advantage to running it here
+ * anymore.
+ */
+static struct page *xdp_linearize_page(struct receive_queue *rq,
+				       u16 num_buf,
+				       struct page *p,
+				       int offset,
+				       unsigned int *len)
+{
+	struct page *page = alloc_page(GFP_ATOMIC);
+	unsigned int page_off = 0;
+
+	if (!page)
+		return NULL;
+
+	memcpy(page_address(page) + page_off, page_address(p) + offset, *len);
+	page_off += *len;
+
+	while (--num_buf) {
+		unsigned int buflen;
+		unsigned long ctx;
+		void *buf;
+		int off;
+
+		ctx = (unsigned long)virtqueue_get_buf(rq->vq, &buflen);
+		if (unlikely(!ctx))
+			goto err_buf;
+
+		/* guard against a misconfigured or uncooperative backend that
+		 * is sending packets larger than the MTU.
+		 */
+		if ((page_off + buflen) > PAGE_SIZE)
+			goto err_buf;
+
+		buf = mergeable_ctx_to_buf_address(ctx);
+		p = virt_to_head_page(buf);
+		off = buf - page_address(p);
+
+		memcpy(page_address(page) + page_off,
+		       page_address(p) + off, buflen);
+		page_off += buflen;
+	}
+
+	*len = page_off;
+	return page;
+err_buf:
+	__free_pages(page, 0);
+	return NULL;
+}
+
 static struct sk_buff *receive_mergeable(struct net_device *dev,
 					 struct virtnet_info *vi,
 					 struct receive_queue *rq,
@@ -491,6 +549,7 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
 	if (xdp_prog) {
+		struct page *xdp_page;
 		u32 act;
 
 		/* No known backend devices should send packets with
@@ -500,7 +559,15 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		 */
 		if (unlikely(num_buf > 1)) {
 			bpf_warn_invalid_xdp_buffer();
-			goto err_xdp;
+
+			/* linearize data for XDP */
+			xdp_page = xdp_linearize_page(rq, num_buf,
+						      page, offset, &len);
+			if (!xdp_page)
+				goto err_xdp;
+			offset = 0;
+		} else {
+			xdp_page = page;
 		}
 
 		/* Transient failure which in theory could occur if
@@ -514,12 +581,18 @@ static struct sk_buff *receive_mergeable(struct net_device *dev,
 		act = do_xdp_prog(vi, rq, xdp_prog, page, offset, len);
 		switch (act) {
 		case XDP_PASS:
+			if (unlikely(xdp_page != page))
+				__free_pages(xdp_page, 0);
 			break;
 		case XDP_TX:
+			if (unlikely(xdp_page != page))
+				goto err_xdp;
 			rcu_read_unlock();
 			goto xdp_xmit;
 		case XDP_DROP:
 		default:
+			if (unlikely(xdp_page != page))
+				__free_pages(xdp_page, 0);
 			goto err_xdp;
 		}
 	}

^ permalink raw reply related

* Re: [PATCH 8/8] Makefile: drop -D__CHECK_ENDIAN__ from cflags
From: Arend Van Spriel @ 2016-12-15 20:15 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Marcel Holtmann, Gustavo Padovan, Johan Hedberg,
	Wolfgang Grandegger, Marc Kleine-Budde, Vince Bridgers,
	Jay Cliburn, Chris Snook, Luis R. Rodriguez, Kalle Valo,
	Maya Erez, Franky Lin, Hante Meuleman, Stanislaw Gruszka,
	Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Jakub 
In-Reply-To: <1481778865-27667-9-git-send-email-mst@redhat.com>

On 15-12-2016 6:15, Michael S. Tsirkin wrote:
> That's the default now, no need for makefiles to set it.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  drivers/bluetooth/Makefile                                | 2 --
>  drivers/net/can/Makefile                                  | 1 -
>  drivers/net/ethernet/altera/Makefile                      | 1 -
>  drivers/net/ethernet/atheros/alx/Makefile                 | 1 -
>  drivers/net/ethernet/freescale/Makefile                   | 2 --
>  drivers/net/wireless/ath/Makefile                         | 2 --
>  drivers/net/wireless/ath/wil6210/Makefile                 | 2 --
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/Makefile | 2 --
>  drivers/net/wireless/broadcom/brcm80211/brcmsmac/Makefile | 1 -

For brcm80211 drivers:

Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com>

Regards,
Arend

^ permalink raw reply

* [net-next PATCH v6 0/5] XDP for virtio_net
From: John Fastabend @ 2016-12-15 20:12 UTC (permalink / raw)
  To: mst
  Cc: daniel, netdev, alexei.starovoitov, john.r.fastabend, brouer,
	tgraf, davem

This implements virtio_net for the mergeable buffers and big_packet
modes. I tested this with vhost_net running on qemu and did not see
any issues. For testing num_buf > 1 I added a hack to vhost driver
to only put 100 bytes per buffer.

There are some restrictions for XDP to be enabled and work well
(see patch 3 for more details).

  1. GUEST_TSO{4|6} must be off
  2. MTU must be less than PAGE_SIZE
  3. queues must be available to dedicate to XDP
  4. num_bufs received in mergeable buffers must be 1
  5. big_packet mode must have all data on single page

To test this I used pktgen in the hypervisor and ran the XDP sample
programs xdp1 and xdp2 from ./samples/bpf in the host. The default
mode that is used with these patches with Linux guest and QEMU/Linux
hypervisor is the mergeable buffers mode. I tested this mode for 2+
days running xdp2 without issues. Additionally I did a series of
driver unload/load tests to check the allocate/release paths.

To test the big_packets path I applied the following simple patch against
the virtio driver forcing big_packets mode,

--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2242,7 +2242,7 @@ static int virtnet_probe(struct virtio_device *vdev)
                vi->big_packets = true;
 
        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF))
-               vi->mergeable_rx_bufs = true;
+               vi->mergeable_rx_bufs = false;
 
        if (virtio_has_feature(vdev, VIRTIO_NET_F_MRG_RXBUF) ||
            virtio_has_feature(vdev, VIRTIO_F_VERSION_1))

I then repeated the tests with xdp1 and xdp2. After letting them run
for a few hours I called it good enough.

Testing the unexpected case where virtio receives a packet across
multiple buffers required patching the hypervisor vhost driver to
convince it to send these unexpected packets. Then I used ping with
the -s option to trigger the case with multiple buffers. This mode
is not expected to be used but as MST pointed out per spec it is
not strictly speaking illegal to generate multi-buffer packets so we
need some way to handle these. The following patch can be used to
generate multiple buffers,


--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1777,7 +1777,8 @@ static int translate_desc(struct vhost_virtqueue *vq, u64

                _iov = iov + ret;
                size = node->size - addr + node->start;
-               _iov->iov_len = min((u64)len - s, size);
+               printk("%s: build 100 length headers!\n", __func__);
+               _iov->iov_len = min((u64)len - s, (u64)100);//size);
                _iov->iov_base = (void __user *)(unsigned long)
                        (node->userspace_addr + addr - node->start);
                s += size;

The qemu command I most frequently used for testing (although I did test
various other combinations of devices) is the following,

 ./x86_64-softmmu/qemu-system-x86_64              \
    -hda /var/lib/libvirt/images/Fedora-test0.img \
    -m 4096  -enable-kvm -smp 2                   \
    -netdev tap,id=hn0,queues=4,vhost=on          \
    -device virtio-net-pci,netdev=hn0,mq=on,vectors=9,guest_tso4=off,guest_tso6=off \
    -serial stdio

The options 'guest_tso4=off,guest_tso6=off' are required because we
do not support LRO with XDP at the moment.

Please review; any comments/feedback are welcome as always.

Thanks,
John

---

John Fastabend (5):
      net: xdp: add invalid buffer warning
      virtio_net: Add XDP support
      virtio_net: add dedicated XDP transmit queues
      virtio_net: add XDP_TX support
      virtio_net: xdp, add slowpath case for non contiguous buffers


 drivers/net/virtio_net.c |  365 +++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/filter.h   |    1 
 net/core/filter.c        |    6 +
 3 files changed, 365 insertions(+), 7 deletions(-)

^ permalink raw reply

* Re: wl1251 & mac address & calibration data
From: Arend Van Spriel @ 2016-12-15 20:12 UTC (permalink / raw)
  To: Pali Rohár, Kalle Valo
  Cc: Sebastian Reichel, Pavel Machek, Michal Kazior, Ivaylo Dimitrov,
	Aaro Koskinen, Tony Lindgren, linux-wireless, Network Development,
	linux-kernel, Luis R. Rodriguez
In-Reply-To: <1481816017.2090.2.camel@Pali-Nokia-N900>

On 15-12-2016 16:33, Pali Rohár wrote:
> On Thu Dec 15 09:18:44 2016 Kalle Valo <kvalo@codeaurora.org> wrote:
>> (Adding Luis because he has been working on request_firmware() lately)
>>
>> Pali Rohár <pali.rohar@gmail.com> writes:
>>
>>>>> So no, there is no argument against... request_firmware() in
>>>>> fallback mode with userspace helper is by design blocking and
>>>>> waiting for userspace. But waiting for some change in DTS in
>>>>> kernel is just nonsense.
>>>>
>>>> I would just mark the wlan device with status = "disabled" and
>>>> enable it in the overlay together with adding the NVS & MAC info.
>>>
>>> So if you think that this solution make sense, we can wait what net 
>>> wireless maintainers say about it...
>>>
>>> For me it looks like that solution can be:
>>>
>>> extending request_firmware() to use only userspace helper
>>
>> I haven't followed the discussion very closely but this is my preference
>> what drivers should do:
>>
>> 1) First the driver should do try to get the calibration data and mac
>>        address from the device tree.
>>
> 
> Ok, but there is no (dynamic, device specific) data in DTS for N900. So 1) is noop.

Uhm. What do you mean? You can propose a patch to the DT bindings [1] to
get it in there and create your N900 DTB, or am I missing something here?
Are there hardware restrictions that do not allow you to boot with your
own DTB?

>> 2) If they are not in DT the driver should retrieve the calibration data
>>        with request_firmware(). BUT with an option for user space to
>>        implement that with a helper script so that the data can be created
>>        dynamically, which I believe openwrt does with ath10k calibration
>>        data right now.
> 
> Currently there is flag for request_firmware() that it should fallback to user helper if direct VFS access not find needed firmware.
> 
> But this flag is not suitable as /lib/firmware already provides default (not device specific) calibration data.
> 
> So I would suggest to add another flag/function which will primary use user helper.

I recall Luis saying that user-mode helper (fallback) should be
discouraged, because there is no assurance that there is a user-mode
helper so you might just be pissing in the wind. The idea was to have a
dedicated API call that explicitly does the request towards user-space.

By the way, are we talking here about wl1251 device or driver as you
also mentioned wl12xx? I did not read the entire thread.

Regards,
Arend

^ permalink raw reply

* [PATCH iproute2 4/4] ip netns: Reset vrf to default VRF on namespace switch
From: David Ahern @ 2016-12-15 20:07 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern
In-Reply-To: <1481832422-10267-1-git-send-email-dsa@cumulusnetworks.com>

A vrf is local to a namespace. Drop any VRF association before trying
to exec a command in the new namespace.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 ip/ip_common.h |  1 +
 ip/ipnetns.c   |  5 +++++
 ip/ipvrf.c     | 14 ++++++++++++++
 3 files changed, 20 insertions(+)

diff --git a/ip/ip_common.h b/ip/ip_common.h
index 28763e81e4a4..ab6a83431fd6 100644
--- a/ip/ip_common.h
+++ b/ip/ip_common.h
@@ -58,6 +58,7 @@ int do_tcp_metrics(int argc, char **argv);
 int do_ipnetconf(int argc, char **argv);
 int do_iptoken(int argc, char **argv);
 int do_ipvrf(int argc, char **argv);
+void vrf_reset(void);
 
 int iplink_get(unsigned int flags, char *name, __u32 filt_mask);
 
diff --git a/ip/ipnetns.c b/ip/ipnetns.c
index db9a541769f1..8201b94a1620 100644
--- a/ip/ipnetns.c
+++ b/ip/ipnetns.c
@@ -387,6 +387,11 @@ static int netns_exec(int argc, char **argv)
 	if (netns_switch(argv[0]))
 		return -1;
 
+	/* we just changed namespaces. clear any vrf association
+	 * with prior namespace before exec'ing command
+	 */
+	vrf_reset();
+
 	/* ip must return the status of the child,
 	 * but do_cmd() will add a minus to this,
 	 * so let's add another one here to cancel it.
diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index de2ec5c120cb..dc8364a43a57 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -277,6 +277,20 @@ static int ipvrf_exec(int argc, char **argv)
 	return -cmd_exec(argv[1], argv + 1, !!batch_mode);
 }
 
+/* reset VRF association of current process to default VRF;
+ * used by netns_exec
+ */
+void vrf_reset(void)
+{
+	char vrf[32];
+
+	if (vrf_identify(getpid(), vrf, sizeof(vrf)) ||
+	    (vrf[0] == '\0'))
+		return;
+
+	vrf_switch("default");
+}
+
 int do_ipvrf(int argc, char **argv)
 {
 	if (argc == 0) {
-- 
2.1.4

^ permalink raw reply related

* [PATCH iproute2 3/4] ip vrf: Fix reset to default VRF
From: David Ahern @ 2016-12-15 20:07 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern
In-Reply-To: <1481832422-10267-1-git-send-email-dsa@cumulusnetworks.com>

Path in vrf_switch for "default" VRF is supposed to be MNT/vrf not
MNT/default. Also, default_vrf flag is redundant with ifindex. Remove
the flag in favor of ifindex != 0.
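
As a rough standalone sketch of the corrected path construction (not the
iproute2 code itself): vrf_cgroup_path_sketch is a hypothetical name, the
string comparison against "default" stands in for the ifindex lookup, and
the mount point is passed in rather than discovered.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Build "<mnt>/vrf/<name>", or "<mnt>/vrf/" for the default VRF.
 * Returns 0 on success, -1 if the result did not fit in the buffer.
 */
int vrf_cgroup_path_sketch(char *path, size_t size, const char *mnt,
			   const char *name)
{
	/* Assumption: "default" is the only name mapping to no VRF device. */
	int is_default = (strcmp(name, "default") == 0);
	int len;

	len = snprintf(path, size, "%s/vrf/%s", mnt, is_default ? "" : name);
	if (len < 0 || (size_t)len >= size)
		return -1;
	return 0;
}
```

For example, with a mount point of /sys/fs/cgroup (an assumed value), the
name "red" gives /sys/fs/cgroup/vrf/red and "default" gives
/sys/fs/cgroup/vrf/ rather than /sys/fs/cgroup/default.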

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 ip/ipvrf.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index a2669f339691..de2ec5c120cb 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -202,16 +202,15 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 static int vrf_switch(const char *name)
 {
 	char path[PATH_MAX], *mnt, pid[16];
-	int ifindex = name_is_vrf(name);
-	bool default_vrf = false;
+	int ifindex = 0;
 	int rc = -1, len, fd = -1;
 
-	if (!ifindex) {
-		if (strcmp(name, "default")) {
+	if (strcmp(name, "default")) {
+		ifindex = name_is_vrf(name);
+		if (!ifindex) {
 			fprintf(stderr, "Invalid VRF name\n");
 			return -1;
 		}
-		default_vrf = true;
 	}
 
 	mnt = find_cgroup2_mount();
@@ -221,8 +220,8 @@ static int vrf_switch(const char *name)
 	/* path to cgroup; make sure buffer has room to cat "/cgroup.procs"
 	 * to the end of the path
 	 */
-	len = snprintf(path, sizeof(path) - sizeof(CGRP_PROC_FILE), "%s%s/%s",
-		       mnt, default_vrf ? "" : "/vrf", name);
+	len = snprintf(path, sizeof(path) - sizeof(CGRP_PROC_FILE), "%s/vrf/%s",
+		       mnt, ifindex ? name : "");
 	if (len > sizeof(path) - sizeof(CGRP_PROC_FILE)) {
 		fprintf(stderr, "Invalid path to cgroup2 mount\n");
 		goto out;
@@ -233,7 +232,7 @@ static int vrf_switch(const char *name)
 		goto out;
 	}
 
-	if (!default_vrf && vrf_configure_cgroup(path, ifindex))
+	if (ifindex && vrf_configure_cgroup(path, ifindex))
 		goto out;
 
 	/*
-- 
2.1.4

^ permalink raw reply related

* [PATCH iproute2 2/4] ip vrf: Refactor ipvrf_identify
From: David Ahern @ 2016-12-15 20:07 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern
In-Reply-To: <1481832422-10267-1-git-send-email-dsa@cumulusnetworks.com>

Split ipvrf_identify into arg processing and a function that does the
actual cgroup file parsing. The latter function is used in a follow
on patch.

In the process, convert the reading of the cgroups file to use fopen
and fgets just in case the file ever grows beyond 4k. Move printing
of any error message and the vrf name to the caller of the new
vrf_identify.
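
The per-line parsing can be sketched on its own as follows. This is an
illustration rather than the patch code: parse_vrf_line_sketch is a
hypothetical name, and it takes one already-read line (as fgets would
return it) instead of opening /proc/<pid>/cgroup.

```c
#include <stddef.h>
#include <string.h>

/* Extract the VRF name from one cgroup line such as "0::/vrf/red\n".
 * Returns 1 and fills name (always NUL-terminated, truncated to fit)
 * if the line carries a "::/vrf/" entry, 0 otherwise.
 */
int parse_vrf_line_sketch(const char *line, char *name, size_t len)
{
	const char *vrf = strstr(line, "::/vrf/");
	size_t n;

	if (!vrf || len == 0)
		return 0;

	vrf += strlen("::/vrf/");	/* skip past "::/vrf/" */
	n = strcspn(vrf, "\n");		/* name ends at newline or end of string */
	if (n >= len)
		n = len - 1;

	memcpy(name, vrf, n);
	name[n] = '\0';
	return 1;
}
```

Calling this once per fgets() line and stopping at the first match gives the
same shape as the loop in the new vrf_identify, and leaves error reporting
to the caller.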

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 ip/ipvrf.c | 69 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 44ad7e07024a..a2669f339691 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -40,14 +40,43 @@ static void usage(void)
 	exit(-1);
 }
 
-static int ipvrf_identify(int argc, char **argv)
+static int vrf_identify(pid_t pid, char *name, size_t len)
 {
 	char path[PATH_MAX];
 	char buf[4096];
 	char *vrf, *end;
-	int fd, rc = -1;
+	FILE *fp;
+
+	snprintf(path, sizeof(path), "/proc/%d/cgroup", pid);
+	fp = fopen(path, "r");
+	if (!fp)
+		return -1;
+
+	memset(name, 0, len);
+
+	while (fgets(buf, sizeof(buf), fp)) {
+		vrf = strstr(buf, "::/vrf/");
+		if (vrf) {
+			vrf += 7;  /* skip past "::/vrf/" */
+			end = strchr(vrf, '\n');
+			if (end)
+				*end = '\0';
+
+			strncpy(name, vrf, len - 1);
+			break;
+		}
+	}
+
+	fclose(fp);
+
+	return 0;
+}
+
+static int ipvrf_identify(int argc, char **argv)
+{
+	char vrf[32];
+	int rc;
 	unsigned int pid;
-	ssize_t n;
 
 	if (argc < 1)
 		pid = getpid();
@@ -56,35 +85,15 @@ static int ipvrf_identify(int argc, char **argv)
 	else if (get_unsigned(&pid, argv[0], 10))
 		invarg("Invalid pid\n", argv[0]);
 
-	snprintf(path, sizeof(path), "/proc/%d/cgroup", pid);
-	fd = open(path, O_RDONLY);
-	if (fd < 0) {
-		fprintf(stderr,
-			"Failed to open cgroups file: %s\n", strerror(errno));
-		return -1;
-	}
-
-	n = read(fd, buf, sizeof(buf) - 1);
-	if (n < 0) {
-		fprintf(stderr,
-			"Failed to read cgroups file: %s\n", strerror(errno));
-		goto out;
-	}
-	buf[n] = '\0';
-	vrf = strstr(buf, "::/vrf/");
-	if (vrf) {
-		vrf += 7;  /* skip past "::/vrf/" */
-		end = strchr(vrf, '\n');
-		if (end)
-			*end = '\0';
-
-		printf("%s\n", vrf);
+	rc = vrf_identify(pid, vrf, sizeof(vrf));
+	if (!rc) {
+		if (vrf[0] != '\0')
+			printf("%s\n", vrf);
+	} else {
+		fprintf(stderr, "Failed to lookup vrf association: %s\n",
+			strerror(errno));
 	}
 
-	rc = 0;
-out:
-	close(fd);
-
 	return rc;
 }
 
-- 
2.1.4

^ permalink raw reply related

* [PATCH iproute2 1/4] ip vrf: Move kernel config hint to prog_load failure
From: David Ahern @ 2016-12-15 20:06 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern
In-Reply-To: <1481832422-10267-1-git-send-email-dsa@cumulusnetworks.com>

Move the hint about CGROUP_BPF enabled to prog_load failure since
it fails before the attach. Update the existing error message to
print to stderr.

Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 ip/ipvrf.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/ip/ipvrf.c b/ip/ipvrf.c
index 4d59845416cd..44ad7e07024a 100644
--- a/ip/ipvrf.c
+++ b/ip/ipvrf.c
@@ -170,14 +170,15 @@ static int vrf_configure_cgroup(const char *path, int ifindex)
 	 */
 	prog_fd = prog_load(ifindex);
 	if (prog_fd < 0) {
-		printf("Failed to load BPF prog: '%s'\n", strerror(errno));
+		fprintf(stderr, "Failed to load BPF prog: '%s'\n",
+			strerror(errno));
+		fprintf(stderr, "Kernel compiled with CGROUP_BPF enabled?\n");
 		goto out;
 	}
 
 	if (bpf_prog_attach_fd(prog_fd, cg_fd, BPF_CGROUP_INET_SOCK_CREATE)) {
 		fprintf(stderr, "Failed to attach prog to cgroup: '%s'\n",
 			strerror(errno));
-		fprintf(stderr, "Kernel compiled with CGROUP_BPF enabled?\n");
 		goto out;
 	}
 
-- 
2.1.4

^ permalink raw reply related

* [PATCH iproute2 0/4] ip vrf fixups
From: David Ahern @ 2016-12-15 20:06 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern

Some minor cleanups to the 'ip vrf' command.

Patch 1 moves the CGROUP_BPF hint to the failure of prog_load since it
fails first.

Patch 2 refactors ipvrf_identify. The action part is moved to a function
that can be used standalone and in the process flipped to fopen/fgets for
robustness should the cgroups file grow larger than 4k.

Patch 3 fixes the path switching to "default" VRF.

Patch 4 moves a task to default VRF when switching namespaces.

David Ahern (4):
  ip vrf: Move kernel config hint to prog_load failure
  ip vrf: Refactor ipvrf_identify
  ip vrf: Fix reset to default VRF
  ip netns: Reset vrf to default VRF on namespace switch

 ip/ip_common.h |   1 +
 ip/ipnetns.c   |   5 +++
 ip/ipvrf.c     | 103 +++++++++++++++++++++++++++++++++++----------------------
 3 files changed, 69 insertions(+), 40 deletions(-)

-- 
2.1.4

^ permalink raw reply

* Re: [PATCH v2 net] rebased to master
From: Florian Fainelli @ 2016-12-15 19:59 UTC (permalink / raw)
  To: Manuel Bessler, netdev
In-Reply-To: <1481826099-12840-1-git-send-email-manuel.bessler@sensus.com>

On 12/15/2016 10:21 AM, Manuel Bessler wrote:
> 'ifconfig eth0 down' makes r6040_close() trigger:
>  INFO: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
> 
> Fixed by moving calls to phy_stop(), napi_disable(), netif_stop_queue()
> to outside of the module's private spin_lock_irq block.
> 
> Found on a Versalogic Tomcat SBC with a Vortex86 SoC
> 
> s1660e_5150:~# sudo ifconfig eth0 down
> [   61.306415] ======================================================
> [   61.306415] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
> [   61.306415] 4.9.0-gb898d2d-manuel #1 Not tainted
> [   61.306415] ------------------------------------------------------
> [   61.306415] ifconfig/449 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
> [   61.306415]  (&dev->lock){+.+...}, at: [<c1336276>] phy_stop+0x16/0x80
> 
> [   61.306415] and this task is already holding:
> [   61.306415]  (&(&lp->lock)->rlock){+.-...}, at: [<d0934c84>] r6040_close+0x24/0x230 [r6040]
> which would create a new lock dependency:
> [   61.306415]  (&(&lp->lock)->rlock){+.-...} -> (&dev->lock){+.+...}
> 
> [   61.306415] but this new dependency connects a SOFTIRQ-irq-safe lock:
> [   61.306415]  (&(&lp->lock)->rlock){+.-...}
> [   61.306415] ... which became SOFTIRQ-irq-safe at:
> [   61.306415]   [   61.306415] [<c1075bc5>] __lock_acquire+0x555/0x1770
> [   61.306415]   [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]   [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
> [   61.306415]   [   61.306415] [<d0934ac0>] r6040_start_xmit+0x30/0x1d0 [r6040]
> [   61.306415]   [   61.306415] [<c13a7d4d>] dev_hard_start_xmit+0x9d/0x2d0
> [   61.306415]   [   61.306415] [<c13c8a38>] sch_direct_xmit+0xa8/0x140
> [   61.306415]   [   61.306415] [<c13a8436>] __dev_queue_xmit+0x416/0x780
> [   61.306415]   [   61.306415] [<c13a87aa>] dev_queue_xmit+0xa/0x10
> [   61.306415]   [   61.306415] [<c13b4837>] neigh_resolve_output+0x147/0x220
> [   61.306415]   [   61.306415] [<c144541b>] ip6_finish_output2+0x2fb/0x910
> [   61.306415]   [   61.306415] [<c14494e6>] ip6_finish_output+0xa6/0x1a0
> [   61.306415]   [   61.306415] [<c1449635>] ip6_output+0x55/0x320
> [   61.306415]   [   61.306415] [<c146f4d2>] mld_sendpack+0x352/0x560
> [   61.306415]   [   61.306415] [<c146fe55>] mld_ifc_timer_expire+0x155/0x280
> [   61.306415]   [   61.306415] [<c108b081>] call_timer_fn+0x81/0x270
> [   61.306415]   [   61.306415] [<c108b331>] expire_timers+0xc1/0x180
> [   61.306415]   [   61.306415] [<c108b4f7>] run_timer_softirq+0x77/0x150
> [   61.306415]   [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
> [   61.306415]   [   61.306415] [<c101a15c>] do_softirq_own_stack+0x1c/0x30
> [   61.306415]   [   61.306415] [<c104416e>] irq_exit+0x8e/0xa0
> [   61.306415]   [   61.306415] [<c1019d31>] do_IRQ+0x51/0x100
> [   61.306415]   [   61.306415] [<c14bc176>] common_interrupt+0x36/0x40
> [   61.306415]   [   61.306415] [<c1134928>] set_root+0x68/0xf0
> [   61.306415]   [   61.306415] [<c1136120>] path_init+0x400/0x640
> [   61.306415]   [   61.306415] [<c11386bf>] path_lookupat+0xf/0xe0
> [   61.306415]   [   61.306415] [<c1139ebc>] filename_lookup+0x6c/0x100
> [   61.306415]   [   61.306415] [<c1139fd5>] user_path_at_empty+0x25/0x30
> [   61.306415]   [   61.306415] [<c11298c6>] SyS_faccessat+0x86/0x1e0
> [   61.306415]   [   61.306415] [<c1129a30>] SyS_access+0x10/0x20
> [   61.306415]   [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]   [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]
> [   61.306415] to a SOFTIRQ-irq-unsafe lock:
> [   61.306415]  (&dev->lock){+.+...}
> [   61.306415] ... which became SOFTIRQ-irq-unsafe at:
> [   61.306415] ...[   61.306415]
> [   61.306415] [<c1075c0c>] __lock_acquire+0x59c/0x1770
> [   61.306415]   [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]   [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]   [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
> [   61.306415]   [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
> [   61.306415]   [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
> [   61.306415]   [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
> [   61.306415]   [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
> [   61.306415]   [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
> [   61.306415]   [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
> [   61.306415]   [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
> [   61.306415]   [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
> [   61.306415]   [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
> [   61.306415]   [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
> [   61.306415]   [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
> [   61.306415]   [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
> [   61.306415]   [   61.306415] [<d0938017>] 0xd0938017
> [   61.306415]   [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
> [   61.306415]   [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
> [   61.306415]   [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
> [   61.306415]   [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
> [   61.306415]   [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]   [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]
> [   61.306415] other info that might help us debug this:
> [   61.306415]
> [   61.306415]  Possible interrupt unsafe locking scenario:
> [   61.306415]
> [   61.306415]        CPU0                    CPU1
> [   61.306415]        ----                    ----
> [   61.306415]   lock(&dev->lock);
> [   61.306415]                                local_irq_disable();
> [   61.306415]                                lock(&(&lp->lock)->rlock);
> [   61.306415]                                lock(&dev->lock);
> [   61.306415]   <Interrupt>
> [   61.306415]     lock(&(&lp->lock)->rlock);
> [   61.306415]
> [   61.306415]  *** DEADLOCK ***
> [   61.306415]
> [   61.306415] 2 locks held by ifconfig/449:
> [   61.306415]  #0:  (rtnl_mutex){+.+.+.}, at: [<c13b68ef>] rtnl_lock+0xf/0x20
> [   61.306415]  #1:  (&(&lp->lock)->rlock){+.-...}, at: [<d0934c84>] r6040_close+0x24/0x230 [r6040]
> [   61.306415]
> [   61.306415] the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
> [   61.306415] -> (&(&lp->lock)->rlock){+.-...} ops: 3049 {
> [   61.306415]    HARDIRQ-ON-W at:
> [   61.306415]                     [   61.306415] [<c1075be7>] __lock_acquire+0x577/0x1770
> [   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                     [   61.306415] [<c14bb21b>] _raw_spin_lock+0x1b/0x30
> [   61.306415]                     [   61.306415] [<d09343cc>] r6040_poll+0x2c/0x330 [r6040]
> [   61.306415]                     [   61.306415] [<c13a5577>] net_rx_action+0x197/0x340
> [   61.306415]                     [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
> [   61.306415]                     [   61.306415] [<c1044037>] run_ksoftirqd+0x17/0x40
> [   61.306415]                     [   61.306415] [<c105fe91>] smpboot_thread_fn+0x141/0x180
> [   61.306415]                     [   61.306415] [<c105c84e>] kthread+0xde/0x110
> [   61.306415]                     [   61.306415] [<c14bb949>] ret_from_fork+0x19/0x30
> [   61.306415]    IN-SOFTIRQ-W at:
> [   61.306415]                     [   61.306415] [<c1075bc5>] __lock_acquire+0x555/0x1770
> [   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                     [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
> [   61.306415]                     [   61.306415] [<d0934ac0>] r6040_start_xmit+0x30/0x1d0 [r6040]
> [   61.306415]                     [   61.306415] [<c13a7d4d>] dev_hard_start_xmit+0x9d/0x2d0
> [   61.306415]                     [   61.306415] [<c13c8a38>] sch_direct_xmit+0xa8/0x140
> [   61.306415]                     [   61.306415] [<c13a8436>] __dev_queue_xmit+0x416/0x780
> [   61.306415]                     [   61.306415] [<c13a87aa>] dev_queue_xmit+0xa/0x10
> [   61.306415]                     [   61.306415] [<c13b4837>] neigh_resolve_output+0x147/0x220
> [   61.306415]                     [   61.306415] [<c144541b>] ip6_finish_output2+0x2fb/0x910
> [   61.306415]                     [   61.306415] [<c14494e6>] ip6_finish_output+0xa6/0x1a0
> [   61.306415]                     [   61.306415] [<c1449635>] ip6_output+0x55/0x320
> [   61.306415]                     [   61.306415] [<c146f4d2>] mld_sendpack+0x352/0x560
> [   61.306415]                     [   61.306415] [<c146fe55>] mld_ifc_timer_expire+0x155/0x280
> [   61.306415]                     [   61.306415] [<c108b081>] call_timer_fn+0x81/0x270
> [   61.306415]                     [   61.306415] [<c108b331>] expire_timers+0xc1/0x180
> [   61.306415]                     [   61.306415] [<c108b4f7>] run_timer_softirq+0x77/0x150
> [   61.306415]                     [   61.306415] [<c1043d04>] __do_softirq+0xb4/0x3d0
> [   61.306415]                     [   61.306415] [<c101a15c>] do_softirq_own_stack+0x1c/0x30
> [   61.306415]                     [   61.306415] [<c104416e>] irq_exit+0x8e/0xa0
> [   61.306415]                     [   61.306415] [<c1019d31>] do_IRQ+0x51/0x100
> [   61.306415]                     [   61.306415] [<c14bc176>] common_interrupt+0x36/0x40
> [   61.306415]                     [   61.306415] [<c1134928>] set_root+0x68/0xf0
> [   61.306415]                     [   61.306415] [<c1136120>] path_init+0x400/0x640
> [   61.306415]                     [   61.306415] [<c11386bf>] path_lookupat+0xf/0xe0
> [   61.306415]                     [   61.306415] [<c1139ebc>] filename_lookup+0x6c/0x100
> [   61.306415]                     [   61.306415] [<c1139fd5>] user_path_at_empty+0x25/0x30
> [   61.306415]                     [   61.306415] [<c11298c6>] SyS_faccessat+0x86/0x1e0
> [   61.306415]                     [   61.306415] [<c1129a30>] SyS_access+0x10/0x20
> [   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]    INITIAL USE at:
> [   61.306415]                    [   61.306415] [<c107586e>] __lock_acquire+0x1fe/0x1770
> [   61.306415]                    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                    [   61.306415] [<c14bb334>] _raw_spin_lock_irqsave+0x24/0x40
> [   61.306415]                    [   61.306415] [<d093474e>] r6040_get_stats+0x1e/0x60 [r6040]
> [   61.306415]                    [   61.306415] [<c139fb16>] dev_get_stats+0x96/0xc0
> [   61.306415]                    [   61.306415] [<c14b416e>] rtnl_fill_stats+0x36/0xfd
> [   61.306415]                    [   61.306415] [<c13b7b3c>] rtnl_fill_ifinfo+0x47c/0xce0
> [   61.306415]                    [   61.306415] [<c13bc08e>] rtmsg_ifinfo_build_skb+0x4e/0xd0
> [   61.306415]                    [   61.306415] [<c13bc120>] rtmsg_ifinfo.part.20+0x10/0x40
> [   61.306415]                    [   61.306415] [<c13bc16b>] rtmsg_ifinfo+0x1b/0x20
> [   61.306415]                    [   61.306415] [<c13a9d19>] register_netdevice+0x409/0x550
> [   61.306415]                    [   61.306415] [<c13a9e72>] register_netdev+0x12/0x20
> [   61.306415]                    [   61.306415] [<d09357e8>] r6040_init_one+0x3e8/0x500 [r6040]
> [   61.306415]                    [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
> [   61.306415]                    [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
> [   61.306415]                    [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
> [   61.306415]                    [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
> [   61.306415]                    [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
> [   61.306415]                    [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
> [   61.306415]                    [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
> [   61.306415]                    [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
> [   61.306415]                    [   61.306415] [<d0938017>] 0xd0938017
> [   61.306415]                    [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
> [   61.306415]                    [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
> [   61.306415]                    [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
> [   61.306415]                    [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
> [   61.306415]                    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]                    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]  }
> [   61.306415]  ... key      at: [<d0936280>] __key.45893+0x0/0xfffff739 [r6040]
> [   61.306415]  ... acquired at:
> [   61.306415]    [   61.306415] [<c1074a32>] check_irq_usage+0x42/0xb0
> [   61.306415]    [   61.306415] [<c107677c>] __lock_acquire+0x110c/0x1770
> [   61.306415]    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]    [   61.306415] [<c1336276>] phy_stop+0x16/0x80
> [   61.306415]    [   61.306415] [<d0934ce9>] r6040_close+0x89/0x230 [r6040]
> [   61.306415]    [   61.306415] [<c13a0a91>] __dev_close_many+0x61/0xa0
> [   61.306415]    [   61.306415] [<c13a0bbf>] __dev_close+0x1f/0x30
> [   61.306415]    [   61.306415] [<c13a9127>] __dev_change_flags+0x87/0x150
> [   61.306415]    [   61.306415] [<c13a9213>] dev_change_flags+0x23/0x60
> [   61.306415]    [   61.306415] [<c1416238>] devinet_ioctl+0x5f8/0x6f0
> [   61.306415]    [   61.306415] [<c1417f75>] inet_ioctl+0x65/0x90
> [   61.306415]    [   61.306415] [<c1389b54>] sock_ioctl+0x124/0x2b0
> [   61.306415]    [   61.306415] [<c113cf7c>] do_vfs_ioctl+0x7c/0x790
> [   61.306415]    [   61.306415] [<c113d6b8>] SyS_ioctl+0x28/0x50
> [   61.306415]    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]
> [   61.306415]
> [   61.306415] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
> [   61.306415] -> (&dev->lock){+.+...} ops: 56 {
> [   61.306415]    HARDIRQ-ON-W at:
> [   61.306415]                     [   61.306415] [<c1075be7>] __lock_acquire+0x577/0x1770
> [   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                     [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]                     [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
> [   61.306415]                     [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
> [   61.306415]                     [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
> [   61.306415]                     [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
> [   61.306415]                     [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
> [   61.306415]                     [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
> [   61.306415]                     [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
> [   61.306415]                     [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
> [   61.306415]                     [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
> [   61.306415]                     [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
> [   61.306415]                     [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
> [   61.306415]                     [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
> [   61.306415]                     [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
> [   61.306415]                     [   61.306415] [<d0938017>] 0xd0938017
> [   61.306415]                     [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
> [   61.306415]                     [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
> [   61.306415]                     [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
> [   61.306415]                     [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
> [   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]    SOFTIRQ-ON-W at:
> [   61.306415]                     [   61.306415] [<c1075c0c>] __lock_acquire+0x59c/0x1770
> [   61.306415]                     [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                     [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]                     [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
> [   61.306415]                     [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
> [   61.306415]                     [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
> [   61.306415]                     [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
> [   61.306415]                     [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
> [   61.306415]                     [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
> [   61.306415]                     [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
> [   61.306415]                     [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
> [   61.306415]                     [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
> [   61.306415]                     [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
> [   61.306415]                     [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
> [   61.306415]                     [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
> [   61.306415]                     [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
> [   61.306415]                     [   61.306415] [<d0938017>] 0xd0938017
> [   61.306415]                     [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
> [   61.306415]                     [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
> [   61.306415]                     [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
> [   61.306415]                     [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
> [   61.306415]                     [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]                     [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]    INITIAL USE at:
> [   61.306415]                    [   61.306415] [<c107586e>] __lock_acquire+0x1fe/0x1770
> [   61.306415]                    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]                    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]                    [   61.306415] [<c133747d>] phy_probe+0x4d/0xc0
> [   61.306415]                    [   61.306415] [<c1338afe>] phy_attach_direct+0xbe/0x190
> [   61.306415]                    [   61.306415] [<c1338ca7>] phy_connect_direct+0x17/0x60
> [   61.306415]                    [   61.306415] [<c1338d23>] phy_connect+0x33/0x70
> [   61.306415]                    [   61.306415] [<d09357a0>] r6040_init_one+0x3a0/0x500 [r6040]
> [   61.306415]                    [   61.306415] [<c12a78c7>] pci_device_probe+0x77/0xd0
> [   61.306415]                    [   61.306415] [<c12f5e15>] driver_probe_device+0x145/0x280
> [   61.306415]                    [   61.306415] [<c12f5fd9>] __driver_attach+0x89/0x90
> [   61.306415]                    [   61.306415] [<c12f43ef>] bus_for_each_dev+0x4f/0x80
> [   61.306415]                    [   61.306415] [<c12f5954>] driver_attach+0x14/0x20
> [   61.306415]                    [   61.306415] [<c12f55b7>] bus_add_driver+0x197/0x210
> [   61.306415]                    [   61.306415] [<c12f6a21>] driver_register+0x51/0xd0
> [   61.306415]                    [   61.306415] [<c12a6955>] __pci_register_driver+0x45/0x50
> [   61.306415]                    [   61.306415] [<d0938017>] 0xd0938017
> [   61.306415]                    [   61.306415] [<c100043f>] do_one_initcall+0x2f/0x140
> [   61.306415]                    [   61.306415] [<c10e48c0>] do_init_module+0x4a/0x19b
> [   61.306415]                    [   61.306415] [<c10a680e>] load_module+0x1b2e/0x2070
> [   61.306415]                    [   61.306415] [<c10a6eb9>] SyS_finit_module+0x69/0x80
> [   61.306415]                    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]                    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]  }
> [   61.306415]  ... key      at: [<c1f28f39>] __key.43998+0x0/0x8
> [   61.306415]  ... acquired at:
> [   61.306415]    [   61.306415] [<c1074a32>] check_irq_usage+0x42/0xb0
> [   61.306415]    [   61.306415] [<c107677c>] __lock_acquire+0x110c/0x1770
> [   61.306415]    [   61.306415] [<c107717c>] lock_acquire+0x7c/0x150
> [   61.306415]    [   61.306415] [<c14b7add>] mutex_lock_nested+0x2d/0x4a0
> [   61.306415]    [   61.306415] [<c1336276>] phy_stop+0x16/0x80
> [   61.306415]    [   61.306415] [<d0934ce9>] r6040_close+0x89/0x230 [r6040]
> [   61.306415]    [   61.306415] [<c13a0a91>] __dev_close_many+0x61/0xa0
> [   61.306415]    [   61.306415] [<c13a0bbf>] __dev_close+0x1f/0x30
> [   61.306415]    [   61.306415] [<c13a9127>] __dev_change_flags+0x87/0x150
> [   61.306415]    [   61.306415] [<c13a9213>] dev_change_flags+0x23/0x60
> [   61.306415]    [   61.306415] [<c1416238>] devinet_ioctl+0x5f8/0x6f0
> [   61.306415]    [   61.306415] [<c1417f75>] inet_ioctl+0x65/0x90
> [   61.306415]    [   61.306415] [<c1389b54>] sock_ioctl+0x124/0x2b0
> [   61.306415]    [   61.306415] [<c113cf7c>] do_vfs_ioctl+0x7c/0x790
> [   61.306415]    [   61.306415] [<c113d6b8>] SyS_ioctl+0x28/0x50
> [   61.306415]    [   61.306415] [<c100179f>] do_int80_syscall_32+0x3f/0x110
> [   61.306415]    [   61.306415] [<c14bba3f>] restore_all+0x0/0x61
> [   61.306415]
> [   61.306415]
> [   61.306415] stack backtrace:
> [   61.306415] CPU: 0 PID: 449 Comm: ifconfig Not tainted 4.9.0-gb898d2d-manuel #1
> [   61.306415] Call Trace:
> [   61.306415]  dump_stack+0x16/0x19
> [   61.306415]  check_usage+0x3f6/0x550
> [   61.306415]  ? check_usage+0x4d/0x550
> [   61.306415]  check_irq_usage+0x42/0xb0
> [   61.306415]  __lock_acquire+0x110c/0x1770
> [   61.306415]  lock_acquire+0x7c/0x150
> [   61.306415]  ? phy_stop+0x16/0x80
> [   61.306415]  mutex_lock_nested+0x2d/0x4a0
> [   61.306415]  ? phy_stop+0x16/0x80
> [   61.306415]  ? r6040_close+0x24/0x230 [r6040]
> [   61.306415]  ? __delay+0x9/0x10
> [   61.306415]  phy_stop+0x16/0x80
> [   61.306415]  r6040_close+0x89/0x230 [r6040]
> [   61.306415]  __dev_close_many+0x61/0xa0
> [   61.306415]  __dev_close+0x1f/0x30
> [   61.306415]  __dev_change_flags+0x87/0x150
> [   61.306415]  dev_change_flags+0x23/0x60
> [   61.306415]  devinet_ioctl+0x5f8/0x6f0
> [   61.306415]  inet_ioctl+0x65/0x90
> [   61.306415]  sock_ioctl+0x124/0x2b0
> [   61.306415]  ? dlci_ioctl_set+0x30/0x30
> [   61.306415]  do_vfs_ioctl+0x7c/0x790
> [   61.306415]  ? trace_hardirqs_on+0xb/0x10
> [   61.306415]  ? call_rcu_sched+0xd/0x10
> [   61.306415]  ? __put_cred+0x32/0x50
> [   61.306415]  ? SyS_faccessat+0x178/0x1e0
> [   61.306415]  SyS_ioctl+0x28/0x50
> [   61.306415]  do_int80_syscall_32+0x3f/0x110
> [   61.306415]  entry_INT80_32+0x2f/0x2f
> [   61.306415] EIP: 0xb764d364
> [   61.306415] EFLAGS: 00000286 CPU: 0
> [   61.306415] EAX: ffffffda EBX: 00000004 ECX: 00008914 EDX: bfa99d7c
> [   61.306415] ESI: bfa99e4c EDI: fffffffe EBP: 00000004 ESP: bfa99d58
> [   61.306415]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
> [   63.836607] r6040 0000:00:08.0 eth0: Link is Down
> 
> Signed-off-by: Manuel Bessler <manuel.bessler@sensus.com>

It would have been nice to CC the maintainer of the driver. Also, your
patch subject is no longer correct: you should use the same subject you
used for the first submission.

Other than that:

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

> ---
>  drivers/net/ethernet/rdc/r6040.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/ethernet/rdc/r6040.c b/drivers/net/ethernet/rdc/r6040.c
> index 4ff4e04..aa11b70 100644
> --- a/drivers/net/ethernet/rdc/r6040.c
> +++ b/drivers/net/ethernet/rdc/r6040.c
> @@ -472,8 +472,6 @@ static void r6040_down(struct net_device *dev)
>  	iowrite16(adrp[0], ioaddr + MID_0L);
>  	iowrite16(adrp[1], ioaddr + MID_0M);
>  	iowrite16(adrp[2], ioaddr + MID_0H);
> -
> -	phy_stop(dev->phydev);
>  }
>  
>  static int r6040_close(struct net_device *dev)
> @@ -481,12 +479,12 @@ static int r6040_close(struct net_device *dev)
>  	struct r6040_private *lp = netdev_priv(dev);
>  	struct pci_dev *pdev = lp->pdev;
>  
> -	spin_lock_irq(&lp->lock);
> +	phy_stop(dev->phydev);
>  	napi_disable(&lp->napi);
>  	netif_stop_queue(dev);
> -	r6040_down(dev);
>  
> -	free_irq(dev->irq, dev);
> +	spin_lock_irq(&lp->lock);
> +	r6040_down(dev);
>  
>  	/* Free RX buffer */
>  	r6040_free_rxbufs(dev);
> @@ -496,6 +494,8 @@ static int r6040_close(struct net_device *dev)
>  
>  	spin_unlock_irq(&lp->lock);
>  
> +	free_irq(dev->irq, dev);
> +
>  	/* Free Descriptor memory */
>  	if (lp->rx_ring) {
>  		pci_free_consistent(pdev,
> 
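
For anyone reading along, the reordering in the hunks above can be sketched
as a small user-space model. This is not driver code: struct demo_dev and
all demo_* helpers are hypothetical, and the model simply counts calls made
while the "spinlock" is held. Treating napi_disable() and free_irq() the
same way as phy_stop() (which takes a mutex, per the trace) is my reading
of the diff, not something stated in the changelog.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: tracks whether lp->lock would be held. */
struct demo_dev {
	bool spin_held;		/* models spin_lock_irq(&lp->lock) held */
	int violations;		/* calls made that should run unlocked */
};

static void demo_spin_lock_irq(struct demo_dev *d)
{
	assert(d);
	d->spin_held = true;
}

static void demo_spin_unlock_irq(struct demo_dev *d)
{
	assert(d);
	d->spin_held = false;
}

/* Models a call the patch moves out of the locked region. */
static void demo_unlocked_only_call(struct demo_dev *d)
{
	assert(d);
	if (d->spin_held)
		d->violations++;
}

/* Ordering before the patch: everything runs under the spinlock. */
static int demo_close_before(void)
{
	struct demo_dev d = { false, 0 };

	demo_spin_lock_irq(&d);
	demo_unlocked_only_call(&d);	/* napi_disable() */
	demo_unlocked_only_call(&d);	/* phy_stop() via r6040_down() */
	demo_unlocked_only_call(&d);	/* free_irq() */
	demo_spin_unlock_irq(&d);
	return d.violations;
}

/* Ordering after the patch: only register and ring work stays locked. */
static int demo_close_after(void)
{
	struct demo_dev d = { false, 0 };

	demo_unlocked_only_call(&d);	/* phy_stop() */
	demo_unlocked_only_call(&d);	/* napi_disable() */
	demo_spin_lock_irq(&d);
	/* r6040_down() and buffer freeing would run here */
	demo_spin_unlock_irq(&d);
	demo_unlocked_only_call(&d);	/* free_irq() */
	return d.violations;
}
```

In this model demo_close_before() counts three calls made under the lock and
demo_close_after() counts none, which mirrors what the diff is doing.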


-- 
Florian

^ permalink raw reply

* Re: [PATCH] net: sfc: use new api ethtool_{get|set}_link_ksettings
From: Jarod Wilson @ 2016-12-15 19:59 UTC (permalink / raw)
  To: Philippe Reynes, linux-net-drivers, ecree, bkenward; +Cc: netdev, linux-kernel
In-Reply-To: <1481757173-16000-1-git-send-email-tremyfr@gmail.com>

On 2016-12-14 6:12 PM, Philippe Reynes wrote:
> The ethtool api {get|set}_settings is deprecated.
> We move this driver to new api {get|set}_link_ksettings.
>
> Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
> ---
>  drivers/net/ethernet/sfc/ethtool.c    |   35 ++++++++++++-------
>  drivers/net/ethernet/sfc/mcdi_port.c  |   60 ++++++++++++++++++++------------
>  drivers/net/ethernet/sfc/net_driver.h |   12 +++---
>  3 files changed, 65 insertions(+), 42 deletions(-)

What about drivers/net/ethernet/sfc/falcon/ethtool.c? Is that coming in a
separate patch?

-- 
Jarod Wilson
jarod@redhat.com

^ permalink raw reply

* Re: [PATCH 5/8] linux: drop __bitwise__ everywhere
From: Lee Duncan @ 2016-12-15 19:44 UTC (permalink / raw)
  To: Michael S. Tsirkin, linux-kernel
  Cc: Kukjin Kim, Krzysztof Kozlowski, Javier Martinez Canillas,
	Russell King, Alasdair Kergon, Mike Snitzer, dm-devel, Shaohua Li,
	Johannes Berg, Emmanuel Grumbach, Luca Coelho,
	Intel Linux Wireless, Kalle Valo, Greg Kroah-Hartman, Jiri Slaby,
	Chris Leech, James E.J. Bottomley, Martin K. Petersen,
	Nicholas A. Bellinger, Jason Wang, Alexander Aring,
	Stefan Schmidt, "David S. Miller" <d
In-Reply-To: <1481778865-27667-6-git-send-email-mst@redhat.com>

On 12/14/2016 09:15 PM, Michael S. Tsirkin wrote:
> __bitwise__ used to mean "yes, please enable sparse checks
> unconditionally", but now that we dropped __CHECK_ENDIAN__
> __bitwise is exactly the same.
> There aren't many users, replace it by __bitwise everywhere.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> ---
>  arch/arm/plat-samsung/include/plat/gpio-cfg.h    | 2 +-
>  drivers/md/dm-cache-block-types.h                | 6 +++---
>  drivers/net/ethernet/sun/sunhme.h                | 2 +-
>  drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h | 4 ++--
>  include/linux/mmzone.h                           | 2 +-
>  include/linux/serial_core.h                      | 4 ++--
>  include/linux/types.h                            | 4 ++--
>  include/scsi/iscsi_proto.h                       | 2 +-
>  include/target/target_core_base.h                | 2 +-
>  include/uapi/linux/virtio_types.h                | 6 +++---
>  net/ieee802154/6lowpan/6lowpan_i.h               | 2 +-
>  net/mac80211/ieee80211_i.h                       | 4 ++--
>  12 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/arch/arm/plat-samsung/include/plat/gpio-cfg.h b/arch/arm/plat-samsung/include/plat/gpio-cfg.h
> index 21391fa..e55d1f5 100644
> --- a/arch/arm/plat-samsung/include/plat/gpio-cfg.h
> +++ b/arch/arm/plat-samsung/include/plat/gpio-cfg.h
> @@ -26,7 +26,7 @@
>  
>  #include <linux/types.h>
>  
> -typedef unsigned int __bitwise__ samsung_gpio_pull_t;
> +typedef unsigned int __bitwise samsung_gpio_pull_t;
>  
>  /* forward declaration if gpio-core.h hasn't been included */
>  struct samsung_gpio_chip;
> diff --git a/drivers/md/dm-cache-block-types.h b/drivers/md/dm-cache-block-types.h
> index bed4ad4..389c9e8 100644
> --- a/drivers/md/dm-cache-block-types.h
> +++ b/drivers/md/dm-cache-block-types.h
> @@ -17,9 +17,9 @@
>   * discard bitset.
>   */
>  
> -typedef dm_block_t __bitwise__ dm_oblock_t;
> -typedef uint32_t __bitwise__ dm_cblock_t;
> -typedef dm_block_t __bitwise__ dm_dblock_t;
> +typedef dm_block_t __bitwise dm_oblock_t;
> +typedef uint32_t __bitwise dm_cblock_t;
> +typedef dm_block_t __bitwise dm_dblock_t;
>  
>  static inline dm_oblock_t to_oblock(dm_block_t b)
>  {
> diff --git a/drivers/net/ethernet/sun/sunhme.h b/drivers/net/ethernet/sun/sunhme.h
> index f430765..4a8d5b1 100644
> --- a/drivers/net/ethernet/sun/sunhme.h
> +++ b/drivers/net/ethernet/sun/sunhme.h
> @@ -302,7 +302,7 @@
>   * Always write the address first before setting the ownership
>   * bits to avoid races with the hardware scanning the ring.
>   */
> -typedef u32 __bitwise__ hme32;
> +typedef u32 __bitwise hme32;
>  
>  struct happy_meal_rxd {
>  	hme32 rx_flags;
> diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
> index 1ad0ec1..84813b5 100644
> --- a/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
> +++ b/drivers/net/wireless/intel/iwlwifi/iwl-fw-file.h
> @@ -228,7 +228,7 @@ enum iwl_ucode_tlv_flag {
>  	IWL_UCODE_TLV_FLAGS_BCAST_FILTERING	= BIT(29),
>  };
>  
> -typedef unsigned int __bitwise__ iwl_ucode_tlv_api_t;
> +typedef unsigned int __bitwise iwl_ucode_tlv_api_t;
>  
>  /**
>   * enum iwl_ucode_tlv_api - ucode api
> @@ -258,7 +258,7 @@ enum iwl_ucode_tlv_api {
>  #endif
>  };
>  
> -typedef unsigned int __bitwise__ iwl_ucode_tlv_capa_t;
> +typedef unsigned int __bitwise iwl_ucode_tlv_capa_t;
>  
>  /**
>   * enum iwl_ucode_tlv_capa - ucode capabilities
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0f088f3..36d9896 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -246,7 +246,7 @@ struct lruvec {
>  #define ISOLATE_UNEVICTABLE	((__force isolate_mode_t)0x8)
>  
>  /* LRU Isolation modes. */
> -typedef unsigned __bitwise__ isolate_mode_t;
> +typedef unsigned __bitwise isolate_mode_t;
>  
>  enum zone_watermarks {
>  	WMARK_MIN,
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
> index 5d49488..5def8e8 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -111,8 +111,8 @@ struct uart_icount {
>  	__u32	buf_overrun;
>  };
>  
> -typedef unsigned int __bitwise__ upf_t;
> -typedef unsigned int __bitwise__ upstat_t;
> +typedef unsigned int __bitwise upf_t;
> +typedef unsigned int __bitwise upstat_t;
>  
>  struct uart_port {
>  	spinlock_t		lock;			/* port lock */
> diff --git a/include/linux/types.h b/include/linux/types.h
> index baf7183..d501ad3 100644
> --- a/include/linux/types.h
> +++ b/include/linux/types.h
> @@ -154,8 +154,8 @@ typedef u64 dma_addr_t;
>  typedef u32 dma_addr_t;
>  #endif
>  
> -typedef unsigned __bitwise__ gfp_t;
> -typedef unsigned __bitwise__ fmode_t;
> +typedef unsigned __bitwise gfp_t;
> +typedef unsigned __bitwise fmode_t;
>  
>  #ifdef CONFIG_PHYS_ADDR_T_64BIT
>  typedef u64 phys_addr_t;
> diff --git a/include/scsi/iscsi_proto.h b/include/scsi/iscsi_proto.h
> index c1260d8..df156f1 100644
> --- a/include/scsi/iscsi_proto.h
> +++ b/include/scsi/iscsi_proto.h
> @@ -74,7 +74,7 @@ static inline int iscsi_sna_gte(u32 n1, u32 n2)
>  #define zero_data(p) {p[0]=0;p[1]=0;p[2]=0;}
>  
>  /* initiator tags; opaque for target */
> -typedef uint32_t __bitwise__ itt_t;
> +typedef uint32_t __bitwise itt_t;
>  /* below makes sense only for initiator that created this tag */
>  #define build_itt(itt, age) ((__force itt_t)\
>  	((itt) | ((age) << ISCSI_AGE_SHIFT)))
> diff --git a/include/target/target_core_base.h b/include/target/target_core_base.h
> index c211900..0055828 100644
> --- a/include/target/target_core_base.h
> +++ b/include/target/target_core_base.h
> @@ -149,7 +149,7 @@ enum se_cmd_flags_table {
>   * Used by transport_send_check_condition_and_sense()
>   * to signal which ASC/ASCQ sense payload should be built.
>   */
> -typedef unsigned __bitwise__ sense_reason_t;
> +typedef unsigned __bitwise sense_reason_t;
>  
>  enum tcm_sense_reason_table {
>  #define R(x)	(__force sense_reason_t )(x)
> diff --git a/include/uapi/linux/virtio_types.h b/include/uapi/linux/virtio_types.h
> index e845e8c..55c3b73 100644
> --- a/include/uapi/linux/virtio_types.h
> +++ b/include/uapi/linux/virtio_types.h
> @@ -39,8 +39,8 @@
>   * - __le{16,32,64} for standard-compliant virtio devices
>   */
>  
> -typedef __u16 __bitwise__ __virtio16;
> -typedef __u32 __bitwise__ __virtio32;
> -typedef __u64 __bitwise__ __virtio64;
> +typedef __u16 __bitwise __virtio16;
> +typedef __u32 __bitwise __virtio32;
> +typedef __u64 __bitwise __virtio64;
>  
>  #endif /* _UAPI_LINUX_VIRTIO_TYPES_H */
> diff --git a/net/ieee802154/6lowpan/6lowpan_i.h b/net/ieee802154/6lowpan/6lowpan_i.h
> index 5ac7789..ac7c96b 100644
> --- a/net/ieee802154/6lowpan/6lowpan_i.h
> +++ b/net/ieee802154/6lowpan/6lowpan_i.h
> @@ -7,7 +7,7 @@
>  #include <net/inet_frag.h>
>  #include <net/6lowpan.h>
>  
> -typedef unsigned __bitwise__ lowpan_rx_result;
> +typedef unsigned __bitwise lowpan_rx_result;
>  #define RX_CONTINUE		((__force lowpan_rx_result) 0u)
>  #define RX_DROP_UNUSABLE	((__force lowpan_rx_result) 1u)
>  #define RX_DROP			((__force lowpan_rx_result) 2u)
> diff --git a/net/mac80211/ieee80211_i.h b/net/mac80211/ieee80211_i.h
> index d37a577..b2069fb 100644
> --- a/net/mac80211/ieee80211_i.h
> +++ b/net/mac80211/ieee80211_i.h
> @@ -159,7 +159,7 @@ enum ieee80211_bss_valid_data_flags {
>  	IEEE80211_BSS_VALID_ERP			= BIT(3)
>  };
>  
> -typedef unsigned __bitwise__ ieee80211_tx_result;
> +typedef unsigned __bitwise ieee80211_tx_result;
>  #define TX_CONTINUE	((__force ieee80211_tx_result) 0u)
>  #define TX_DROP		((__force ieee80211_tx_result) 1u)
>  #define TX_QUEUED	((__force ieee80211_tx_result) 2u)
> @@ -180,7 +180,7 @@ struct ieee80211_tx_data {
>  };
>  
>  
> -typedef unsigned __bitwise__ ieee80211_rx_result;
> +typedef unsigned __bitwise ieee80211_rx_result;
>  #define RX_CONTINUE		((__force ieee80211_rx_result) 0u)
>  #define RX_DROP_UNUSABLE	((__force ieee80211_rx_result) 1u)
>  #define RX_DROP_MONITOR		((__force ieee80211_rx_result) 2u)
>

For iscsi initiator, looks good.

Acked-by: Lee Duncan <lduncan@suse.com>

-- 
Lee Duncan

^ permalink raw reply

* Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-15 19:42 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: ks.giri, linux-kernel, davem, bh74.an, peppe.cavallaro,
	alexandre.torgue, pavel, romieu, netdev, vipul.pandya
In-Reply-To: <CAD5ja63O5jFtt=RyXTBxRxKSaJVfyL+kFeJwwen-pt0ifQARYQ@mail.gmail.com>

Hi,

On 15.12.2016 19:52, Niklas Cassel wrote:
> Since v1 of this patch has already been merged to net-next, I think that
> you should create a new patch on top of that, rather than submitting a v2.
> 
> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/drivers/net/ethernet/stmicro/stmmac?id=739c8e149ae40a1eb044edb92a133b93b59369d8
> 

It is v2 that has been merged, not v1. 
Both versions only differed in the commit message.

Regards,
Lino

^ permalink raw reply

* Re: [PATCH v2 2/2] net: ethernet: stmmac: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-15 19:37 UTC (permalink / raw)
  To: Pavel Machek
  Cc: bh74.an, ks.giri, vipul.pandya, peppe.cavallaro, alexandre.torgue,
	romieu, davem, linux-kernel, netdev
In-Reply-To: <20161215094517.GA406@amd>

Hi,

On 15.12.2016 10:45, Pavel Machek wrote:
> Hi!
> 
>> The driver uses a private lock for synchronization of the xmit function and
>> the xmit completion handler, but since the NETIF_F_LLTX flag is not set,
>> the xmit function is also called with the xmit_lock held.
>> 
>> On the other hand the completion handler uses the reverse locking order by
>> first taking the private lock and (in case that the tx queue had been
>> stopped) then the xmit_lock.
>> 
>> Improve the locking by removing the private lock and using only the
>> xmit_lock for synchronization instead.
> 
> Do you have stmmac hardware to test on?
> 

Unfortunately not (I mentioned in the first version that the patch I sent was
only compile-tested, but I think I forgot to do so in the last version).

> I believe something is very wrong with the locking there. In
> particular... scheduling the stmmac_tx_timer() function to run often
> should not do anything bad if locking is correct... but it breaks the
> driver rather quickly. [Example patch below, needs applying to two
> places in net-next.]
> 

Do you get this result only after the private lock is removed? Or has this problem
been there before? And what exactly does the failure look like?

Regards,
Lino  
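
For readers following the locking argument quoted above, here is a minimal
userspace sketch of the single-lock scheme the patch description proposes.
It is not driver code: struct txq_model, txq_init(), txq_xmit() and
txq_complete() are hypothetical names, and a pthread mutex merely stands in
for the netdev xmit_lock. With the private lock, the xmit path and the
completion path took two locks in opposite orders; with one lock there is
no ordering left to get wrong.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one tx queue; not taken from the stmmac driver. */
struct txq_model {
	pthread_mutex_t xmit_lock;	/* stands in for the netdev xmit_lock */
	unsigned int in_flight;		/* descriptors handed to "hardware" */
	unsigned int ring_size;		/* capacity of the tx ring */
	bool stopped;			/* queue stopped because ring is full */
};

void txq_init(struct txq_model *q, unsigned int ring_size)
{
	pthread_mutex_init(&q->xmit_lock, NULL);
	q->in_flight = 0;
	q->ring_size = ring_size;
	q->stopped = (ring_size == 0);
}

/* Transmit path: queue one frame, stop the queue once the ring is full. */
bool txq_xmit(struct txq_model *q)
{
	bool queued = false;

	pthread_mutex_lock(&q->xmit_lock);
	if (!q->stopped) {
		q->in_flight++;
		if (q->in_flight == q->ring_size)
			q->stopped = true;
		queued = true;
	}
	pthread_mutex_unlock(&q->xmit_lock);
	return queued;
}

/* Completion path: reclaim descriptors and wake the queue, same lock. */
void txq_complete(struct txq_model *q, unsigned int done)
{
	pthread_mutex_lock(&q->xmit_lock);
	if (done > q->in_flight)
		done = q->in_flight;
	q->in_flight -= done;
	if (q->stopped && q->in_flight < q->ring_size)
		q->stopped = false;
	pthread_mutex_unlock(&q->xmit_lock);
}
```

In this model both paths take the lock themselves. In the kernel the core
already holds the xmit_lock when it calls the driver's xmit function (as the
commit message notes, because NETIF_F_LLTX is not set), so only the
completion side would need to acquire it explicitly.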

^ permalink raw reply

* Re: [PATCH 1/2] net: ethernet: sxgbe: remove private tx queue lock
From: Lino Sanfilippo @ 2016-12-15 19:27 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Francois Romieu, bh74.an, ks.giri, vipul.pandya, peppe.cavallaro,
	alexandre.torgue, davem, linux-kernel, netdev
In-Reply-To: <20161211201104.GB20574@amd>

Hi Pavel,

sorry for the late reply.

On 11.12.2016 21:11, Pavel Machek wrote:
> 
> Do you understand what stmmac_tx_err(priv); is supposed to do? In
> particular, if it is called while the driver is working ok -- should
> the driver survive that?

As far as I understood, it is supposed to fix up an erroneous tx path, e.g. a
missing tx completion for transmitted frames.

Some drivers do this by restarting only the HW parts responsible for tx, some
others by restarting the complete hardware. 
But IMO it should also be ok to be called if the HW is still working fine.

> Because it does not currently, and I don't know how to test that
> code. Unplugging the cable does not provoke that.
> 
> I tried
> 
>         } else if (unlikely(status == tx_hard_error))
>                 stmmac_tx_err(priv);
> +
> +       {
> +               static int i;
> +               i++;
> +               if (i==1000) {
> +                       i = 0;
> +                       printk("Simulated error\n");
> +                       stmmac_tx_err(priv);
> +               }
> +       }
>  }
> 

Ok, there is this race that Francois mentioned so it is not surprising that
the driver does not survive the call of stmmac_tx_err() as it is called now.
That's why I suggested doing a proper shutdown and restart of the tx path to
avoid the race.

Regards,
Lino

^ permalink raw reply

* Re: [PATCH perf/core REBASE 2/5] samples/bpf: Switch over to libbpf
From: Arnaldo Carvalho de Melo @ 2016-12-15 19:04 UTC (permalink / raw)
  To: Joe Stringer
  Cc: LKML, netdev, Wang Nan, ast, Daniel Borkmann,
	Arnaldo Carvalho de Melo
In-Reply-To: <CAPWQB7HMzpPOgA1rdrKHmbxU29gkaUS1UQF69pHLrfsNhjDvMQ@mail.gmail.com>

Em Thu, Dec 15, 2016 at 10:29:19AM -0800, Joe Stringer escreveu:
> On 15 December 2016 at 07:50, Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Em Wed, Dec 14, 2016 at 02:43:39PM -0800, Joe Stringer escreveu:
> >> Now that libbpf under tools/lib/bpf/* is synced with the version from
> >> samples/bpf, we can get rid most of the libbpf library here.
> >>
> >> Signed-off-by: Joe Stringer <joe@ovn.org>
> >> Cc: Alexei Starovoitov <ast@fb.com>
> >> Cc: Daniel Borkmann <daniel@iogearbox.net>
> >> Cc: Wang Nan <wangnan0@huawei.com>
> >> Link: http://lkml.kernel.org/r/20161209024620.31660-6-joe@ovn.org
> >> [ Use -I$(srctree)/tools/lib/ to support out of source code tree builds, as noticed by Wang Nan ]
> >> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> >
> > So, right before this patch building samples/bpf works, then, after, it fails,
> > investigating:
> >
> > [root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ headers_install
> > make[1]: Entering directory '/tmp/build/linux'
> >   CHK     include/generated/uapi/linux/version.h
> > make[1]: Leaving directory '/tmp/build/linux'
> > [root@1e797fdfbf4f linux]# make -j4 O=/tmp/build/linux/ samples/bpf/
> > make[1]: Entering directory '/tmp/build/linux'
> >   CHK     include/config/kernel.release
> >   GEN     ./Makefile
> >   CHK     include/generated/uapi/linux/version.h
> >   Using /git/linux as source for kernel
> >   CHK     include/generated/utsrelease.h
> >   CHK     include/generated/timeconst.h
> >   CHK     include/generated/bounds.h
> >   CHK     include/generated/asm-offsets.h
> >   CALL    /git/linux/scripts/checksyscalls.sh
> >   HOSTCC  samples/bpf/test_lru_dist.o
> >   HOSTCC  samples/bpf/libbpf.o
> >   HOSTCC  samples/bpf/sock_example.o
> >   HOSTCC  samples/bpf/bpf_load.o
> > In file included from /git/linux/samples/bpf/libbpf.c:12:0:
> > /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
> >  #include <bpf/bpf.h>
> >                      ^
> > compilation terminated.
> > In file included from /git/linux/samples/bpf/test_lru_dist.c:24:0:
> > /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
> >  #include <bpf/bpf.h>
> >                      ^
> > compilation terminated.
> > make[2]: *** [scripts/Makefile.host:124: samples/bpf/test_lru_dist.o] Error 1
> > make[2]: *** Waiting for unfinished jobs....
> > make[2]: *** [scripts/Makefile.host:124: samples/bpf/libbpf.o] Error 1
> > In file included from /git/linux/samples/bpf/bpf_load.c:24:0:
> > /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
> >  #include <bpf/bpf.h>
> >                      ^
> > compilation terminated.
> > make[2]: *** [scripts/Makefile.host:124: samples/bpf/bpf_load.o] Error 1
> > In file included from /git/linux/samples/bpf/sock_example.c:29:0:
> > /git/linux/samples/bpf/libbpf.h:5:21: fatal error: bpf/bpf.h: No such file or directory
> >  #include <bpf/bpf.h>
> >                      ^
> > compilation terminated.
> > make[2]: *** [scripts/Makefile.host:124: samples/bpf/sock_example.o] Error 1
> > make[1]: *** [/git/linux/Makefile:1659: samples/bpf/] Error 2
> > make[1]: Leaving directory '/tmp/build/linux'
> > make: *** [Makefile:150: sub-make] Error 2
> > [root@1e797fdfbf4f linux]#
> 
> Sorry about that.
> 
> It looks like this fragment which ended up in "samples/bpf: Remove
> perf_event_open() declaration" patch should be here instead:

I figured that out, but there is another problem, see my other messages,

- Arnaldo
 
> diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
> index add514e2984a..9718f664fedf 100644
> --- a/samples/bpf/Makefile
> +++ b/samples/bpf/Makefile
> @@ -108,6 +108,8 @@ always += xdp_tx_iptunnel_kern.o
> 
> HOSTCFLAGS += -I$(objtree)/usr/include
> HOSTCFLAGS += -I$(srctree)/tools/testing/selftests/bpf/
> +HOSTCFLAGS += -I$(srctree)/tools/lib/ -I$(srctree)/tools/include
> +HOSTCFLAGS += -I$(srctree)/tools/perf
> 
> HOSTCFLAGS_bpf_load.o += -I$(objtree)/usr/include -Wno-unused-variable
> HOSTLOADLIBES_fds_example += -lelf

^ permalink raw reply

* Re: Soft lockup in inet_put_port on 4.6
From: Josef Bacik @ 2016-12-15 18:53 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Craig Gallek, Hannes Frederic Sowa, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <CALx6S369T_hvoTgyHbmeSiR2p3d68h+0tMKqMmcGYLrKiN3JMA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3507 bytes --]

On Tue, Dec 13, 2016 at 6:32 PM, Tom Herbert <tom@herbertland.com> 
wrote:
> On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek <kraigatgoog@gmail.com> 
> wrote:
>>  On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert <tom@herbertland.com> 
>> wrote:
>>>  I think there may be some suspicious code in inet_csk_get_port. At
>>>  tb_found there is:
>>> 
>>>                  if (((tb->fastreuse > 0 && reuse) ||
>>>                       (tb->fastreuseport > 0 &&
>>>                        !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>>                        sk->sk_reuseport && uid_eq(tb->fastuid, 
>>> uid))) &&
>>>                      smallest_size == -1)
>>>                          goto success;
>>>                  if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, 
>>> tb, true)) {
>>>                          if ((reuse ||
>>>                               (tb->fastreuseport > 0 &&
>>>                                sk->sk_reuseport &&
>>>                                
>>> !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>>                                uid_eq(tb->fastuid, uid))) &&
>>>                              smallest_size != -1 && --attempts >= 
>>> 0) {
>>>                                  spin_unlock_bh(&head->lock);
>>>                                  goto again;
>>>                          }
>>>                          goto fail_unlock;
>>>                  }
>>> 
>>>  AFAICT there is redundancy in these two conditionals.  The same 
>>> clause
>>>  is being checked in both: (tb->fastreuseport > 0 &&
>>>  !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>>>  uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is true 
>>> the
>>>  first conditional should be hit, goto done,  and the second will 
>>> never
>>>  evaluate that part to true-- unless the sk is changed (do we need
>>>  READ_ONCE for sk->sk_reuseport_cb?).
>>  That's an interesting point... It looks like this function also
>>  changed in 4.6 from using a single local_bh_disable() at the 
>> beginning
>>  with several spin_lock(&head->lock) to exclusively
>>  spin_lock_bh(&head->lock) at each locking point.  Perhaps the full 
>> bh
>>  disable variant was preventing the timers in your stack trace from
>>  running interleaved with this function before?
> 
> Could be, although dropping the lock shouldn't be able to affect the
> search state. TBH, I'm a little lost in reading function, the
> SO_REUSEPORT handling is pretty complicated. For instance,
> rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in that
> function and also in every call to inet_csk_bind_conflict. I wonder if
> we can simply this under the assumption that SO_REUSEPORT is only
> allowed if the port number (snum) is explicitly specified.

OK, first, I have data for you, Hannes: here are the time distributions
before, during, and after the lockup (with all the debugging in place the
box eventually recovers).  I've attached it as a text file since it is
long.

Second, I was thinking about why we would spend so much time walking
the ->owners list, and obviously it's because of the massive number of
timewait sockets on the owners list.  I wrote the following dumb patch
and tested it, and the problem has disappeared completely.  Now I don't
know if this is right at all, but I thought it was weird that we weren't
copying the soreuseport option from the original socket onto the twsk.
Is there a reason we aren't doing this currently?  Does this help
explain what is happening?  Thanks,

Josef
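
As a reading aid for the attached distributions: each row such as
"2048 -> 4095" appears to be a power-of-two bucket, with 0 and 1 sharing
the first row. The sketch below shows how a sample would map to such a row.
The helper names (log2_bucket, bucket_low, bucket_high) are made up for
illustration, and the unit of the samples is not stated in the message, so
none is assumed here.

```c
#include <stdint.h>

/* Bucket index is floor(log2(v)); the values 0 and 1 both land in bucket 0. */
unsigned int log2_bucket(uint64_t v)
{
	unsigned int idx = 0;

	while (v > 1) {
		v >>= 1;
		idx++;
	}
	return idx;
}

/* Lowest value printed for a bucket, e.g. 2048 for index 11. */
uint64_t bucket_low(unsigned int idx)
{
	if (idx == 0)
		return 0;
	if (idx > 63)
		idx = 63;	/* clamp: a 64-bit sample cannot exceed bucket 63 */
	return (uint64_t)1 << idx;
}

/* Highest value printed for a bucket, e.g. 4095 for index 11. */
uint64_t bucket_high(unsigned int idx)
{
	if (idx >= 63)
		return UINT64_MAX;	/* avoid shifting by 64 bits */
	return ((uint64_t)1 << (idx + 1)) - 1;
}
```

Under this reading, a sample of 3000 falls in the "2048 -> 4095" row, and
the rows that fill up once the lockup starts ("2097152 -> 4194303" and
"33554432 -> 67108863") sit roughly three to four orders of magnitude above
the rows that hold the samples before it.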

[-- Attachment #2.1: timing-dist.txt --]
[-- Type: text/plain, Size: 30972 bytes --]

     inet_csk_get_port   : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 4        |*                                       |
      2048 -> 4095       : 100      |****************************************|
      4096 -> 8191       : 64       |*************************               |
      8192 -> 16383      : 35       |**************                          |
     16384 -> 32767      : 2        |                                        |
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 1        |*                                       |
      2048 -> 4095       : 38       |****************************************|
      4096 -> 8191       : 9        |*********                               |
      8192 -> 16383      : 2        |**                                      |
     16384 -> 32767      : 1        |*                                       |
<restart happens>
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 9        |**                                      |
      2048 -> 4095       : 54       |****************                        |
      4096 -> 8191       : 15       |****                                    |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 1        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 0        |                                        |
   2097152 -> 4194303    : 130      |****************************************|
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 0        |                                        |
  33554432 -> 67108863   : 92       |****************************            |
     inet_csk_get_port   : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 11       |                                        |
      2048 -> 4095       : 132      |*********                               |
      4096 -> 8191       : 91       |******                                  |
      8192 -> 16383      : 13       |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 0        |                                        |
   2097152 -> 4194303    : 401      |****************************            |
   4194304 -> 8388607    : 274      |*******************                     |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 16       |*                                       |
  33554432 -> 67108863   : 561      |****************************************|
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 6        |                                        |
      2048 -> 4095       : 68       |****                                    |
      4096 -> 8191       : 9        |                                        |
      8192 -> 16383      : 2        |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 0        |                                        |
   2097152 -> 4194303    : 650      |****************************************|
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 15       |                                        |
  33554432 -> 67108863   : 583      |***********************************     |
     inet_csk_get_port   : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 18       |*                                       |
      2048 -> 4095       : 263      |********************                    |
      4096 -> 8191       : 188      |**************                          |
      8192 -> 16383      : 186      |**************                          |
     16384 -> 32767      : 7        |                                        |
     32768 -> 65535      : 1        |                                        |
     65536 -> 131071     : 1        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 0        |                                        |
   2097152 -> 4194303    : 37       |**                                      |
   4194304 -> 8388607    : 454      |**********************************      |
   8388608 -> 16777215   : 9        |                                        |
  16777216 -> 33554431   : 24       |*                                       |
  33554432 -> 67108863   : 526      |****************************************|
<soft lockup messages start happening>
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 20       |*                                       |
      2048 -> 4095       : 130      |**********                              |
      4096 -> 8191       : 40       |***                                     |
      8192 -> 16383      : 2        |                                        |
     16384 -> 32767      : 1        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 0        |                                        |
   2097152 -> 4194303    : 506      |*************************************** |
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 23       |*                                       |
  33554432 -> 67108863   : 511      |****************************************|
               inet_csk_get_port             : count     distribution
                   0 -> 1                    : 0        |                    |
                   2 -> 3                    : 0        |                    |
                   4 -> 7                    : 0        |                    |
                   8 -> 15                   : 0        |                    |
                  16 -> 31                   : 0        |                    |
                  32 -> 63                   : 0        |                    |
                  64 -> 127                  : 0        |                    |
                 128 -> 255                  : 0        |                    |
                 256 -> 511                  : 0        |                    |
                 512 -> 1023                 : 0        |                    |
                1024 -> 2047                 : 9        |                    |
                2048 -> 4095                 : 356      |********************|
                4096 -> 8191                 : 230      |************        |
                8192 -> 16383                : 342      |******************* |
               16384 -> 32767                : 12       |                    |
               32768 -> 65535                : 1        |                    |
               65536 -> 131071               : 0        |                    |
              131072 -> 262143               : 0        |                    |
              262144 -> 524287               : 1        |                    |
              524288 -> 1048575              : 0        |                    |
             1048576 -> 2097151              : 0        |                    |
             2097152 -> 4194303              : 311      |*****************   |
             4194304 -> 8388607              : 163      |*********           |
             8388608 -> 16777215             : 1        |                    |
            16777216 -> 33554431             : 3        |                    |
            33554432 -> 67108863             : 338      |******************  |
            67108864 -> 134217727            : 55       |***                 |
           134217728 -> 268435455            : 65       |***                 |
           268435456 -> 536870911            : 36       |**                  |
           536870912 -> 1073741823           : 22       |*                   |
          1073741824 -> 2147483647           : 16       |                    |
          2147483648 -> 4294967295           : 7        |                    |
          4294967296 -> 8589934591           : 1        |                    |
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 2        |                                        |
      2048 -> 4095       : 86       |***                                     |
      4096 -> 8191       : 16       |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 187      |*******                                 |
   2097152 -> 4194303    : 975      |****************************************|
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 337      |*************                           |
  33554432 -> 67108863   : 442      |******************                      |
               inet_csk_get_port             : count     distribution
                   0 -> 1                    : 0        |                    |
                   2 -> 3                    : 0        |                    |
                   4 -> 7                    : 0        |                    |
                   8 -> 15                   : 0        |                    |
                  16 -> 31                   : 0        |                    |
                  32 -> 63                   : 0        |                    |
                  64 -> 127                  : 0        |                    |
                 128 -> 255                  : 0        |                    |
                 256 -> 511                  : 0        |                    |
                 512 -> 1023                 : 0        |                    |
                1024 -> 2047                 : 162      |****                |
                2048 -> 4095                 : 495      |**************      |
                4096 -> 8191                 : 66       |*                   |
                8192 -> 16383                : 6        |                    |
               16384 -> 32767                : 2        |                    |
               32768 -> 65535                : 0        |                    |
               65536 -> 131071               : 0        |                    |
              131072 -> 262143               : 0        |                    |
              262144 -> 524287               : 0        |                    |
              524288 -> 1048575              : 0        |                    |
             1048576 -> 2097151              : 0        |                    |
             2097152 -> 4194303              : 680      |********************|
             4194304 -> 8388607              : 166      |****                |
             8388608 -> 16777215             : 10       |                    |
            16777216 -> 33554431             : 6        |                    |
            33554432 -> 67108863             : 150      |****                |
            67108864 -> 134217727            : 275      |********            |
           134217728 -> 268435455            : 205      |******              |
           268435456 -> 536870911            : 151      |****                |
           536870912 -> 1073741823           : 137      |****                |
          1073741824 -> 2147483647           : 76       |**                  |
          2147483648 -> 4294967295           : 48       |*                   |
          4294967296 -> 8589934591           : 6        |                    |
          8589934592 -> 17179869183          : 2        |                    |
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 7        |                                        |
      2048 -> 4095       : 40       |***                                     |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 0        |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 33       |**                                      |
   2097152 -> 4194303    : 159      |************                            |
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 311      |*************************               |
  33554432 -> 67108863   : 493      |****************************************|
               inet_csk_get_port             : count     distribution
                   0 -> 1                    : 0        |                    |
                   2 -> 3                    : 0        |                    |
                   4 -> 7                    : 0        |                    |
                   8 -> 15                   : 0        |                    |
                  16 -> 31                   : 0        |                    |
                  32 -> 63                   : 0        |                    |
                  64 -> 127                  : 0        |                    |
                 128 -> 255                  : 0        |                    |
                 256 -> 511                  : 0        |                    |
                 512 -> 1023                 : 0        |                    |
                1024 -> 2047                 : 129      |******************* |
                2048 -> 4095                 : 55       |********            |
                4096 -> 8191                 : 47       |*******             |
                8192 -> 16383                : 17       |**                  |
               16384 -> 32767                : 2        |                    |
               32768 -> 65535                : 0        |                    |
               65536 -> 131071               : 0        |                    |
              131072 -> 262143               : 0        |                    |
              262144 -> 524287               : 0        |                    |
              524288 -> 1048575              : 0        |                    |
             1048576 -> 2097151              : 30       |****                |
             2097152 -> 4194303              : 130      |********************|
             4194304 -> 8388607              : 24       |***                 |
             8388608 -> 16777215             : 0        |                    |
            16777216 -> 33554431             : 13       |**                  |
            33554432 -> 67108863             : 118      |******************  |
            67108864 -> 134217727            : 58       |********            |
           134217728 -> 268435455            : 17       |**                  |
           268435456 -> 536870911            : 7        |*                   |
           536870912 -> 1073741823           : 0        |                    |
          1073741824 -> 2147483647           : 1        |                    |
          2147483648 -> 4294967295           : 0        |                    |
          4294967296 -> 8589934591           : 1        |                    |
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 6        |*                                       |
      2048 -> 4095       : 14       |**                                      |
      4096 -> 8191       : 0        |                                        |
      8192 -> 16383      : 1        |                                        |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 0        |                                        |
     65536 -> 131071     : 0        |                                        |
    131072 -> 262143     : 0        |                                        |
    262144 -> 524287     : 0        |                                        |
    524288 -> 1048575    : 0        |                                        |
   1048576 -> 2097151    : 158      |********************************        |
   2097152 -> 4194303    : 22       |****                                    |
   4194304 -> 8388607    : 0        |                                        |
   8388608 -> 16777215   : 0        |                                        |
  16777216 -> 33554431   : 192      |****************************************|
  33554432 -> 67108863   : 9        |*                                       |
<recovers>
     inet_csk_get_port   : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 10       |****************                        |
      2048 -> 4095       : 25       |****************************************|
      4096 -> 8191       : 16       |*************************               |
      8192 -> 16383      : 1        |*                                       |
     16384 -> 32767      : 0        |                                        |
     32768 -> 65535      : 1        |*                                       |
     inet_csk_bind_conflict : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 10       |*********************************       |
      2048 -> 4095       : 12       |****************************************|
     inet_csk_get_port   : count     distribution
         0 -> 1          : 0        |                                        |
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 0        |                                        |
       256 -> 511        : 0        |                                        |
       512 -> 1023       : 0        |                                        |
      1024 -> 2047       : 0        |                                        |
      2048 -> 4095       : 0        |                                        |
      4096 -> 8191       : 4        |****************************************|
      8192 -> 16383      : 1        |**********                              |

[-- Attachment #2.2: tw-reuseport.patch --]
[-- Type: text/x-patch, Size: 1229 bytes --]

commit ea66f43c5b4d94625ad7322e4097acd9a06d7fdd
Author: Josef Bacik <jbacik@fb.com>
Date:   Wed Dec 14 11:54:49 2016 -0800

    do reuseport too

diff --git a/include/net/inet_timewait_sock.h b/include/net/inet_timewait_sock.h
index c9b3eb7..567017b 100644
--- a/include/net/inet_timewait_sock.h
+++ b/include/net/inet_timewait_sock.h
@@ -55,6 +55,7 @@ struct inet_timewait_sock {
 #define tw_family		__tw_common.skc_family
 #define tw_state		__tw_common.skc_state
 #define tw_reuse		__tw_common.skc_reuse
+#define tw_reuseport		__tw_common.skc_reuseport
 #define tw_ipv6only		__tw_common.skc_ipv6only
 #define tw_bound_dev_if		__tw_common.skc_bound_dev_if
 #define tw_node			__tw_common.skc_nulls_node
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index a1b1057..04c560e 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -183,6 +183,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk,
 		tw->tw_dport	    = inet->inet_dport;
 		tw->tw_family	    = sk->sk_family;
 		tw->tw_reuse	    = sk->sk_reuse;
+		tw->tw_reuseport    = sk->sk_reuseport;
 		tw->tw_hash	    = sk->sk_hash;
 		tw->tw_ipv6only	    = 0;
 		tw->tw_transparent  = inet->transparent;


* Re: [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long
From: Jason A. Donenfeld @ 2016-12-15 18:51 UTC (permalink / raw)
  To: David Laight
  Cc: Netdev, kernel-hardening@lists.openwall.com, LKML,
	linux-crypto@vger.kernel.org, Jean-Philippe Aumasson, Ted Tso
In-Reply-To: <063D6719AE5E284EB5DD2968C1650D6DB02401A1@AcuExch.aculab.com>

Hi David,

On Thu, Dec 15, 2016 at 11:14 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Behalf Of Jason A. Donenfeld
>> Sent: 14 December 2016 18:46
> ...
>> +     ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), end),
>
> If you make the first argument 'const void *' you won't need the cast
> on every call.
>
> I'd also suggest making the key u64[2].

I'll do both. Thanks for the suggestion.
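
For reference, a minimal sketch of what the two changes could look like
from the caller's side. Everything here is an assumption for
illustration only: example_hash() is a stand-in mixer (not SipHash, not
secure), and struct combined with its 'end' marker is a hypothetical
layout, not the one from the patch. The point is just that a
'const void *' first argument accepts any object pointer without a
cast, and that the key can be passed as u64[2].

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint8_t u8;

/*
 * Stand-in keyed mixer. This is NOT SipHash and is not secure; it only
 * exists so the prototype below can be compiled and called.
 */
static u64 example_hash(const void *data, size_t len, const u64 key[2])
{
	const u8 *p = data;	/* void * converts implicitly in C */
	u64 h = key[0];
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;	/* FNV-1a 64-bit prime */
		h ^= key[1];
	}
	return h;
}

/* Hypothetical input layout; only the fields before 'end' are hashed. */
struct combined {
	u64 chaining;
	u64 ts;
	u64 entropy;
	char end;	/* marker: its offset is the hashed length */
	u64 not_hashed;
};

static u64 hash_combined(const struct combined *c, const u64 key[2])
{
	size_t len = offsetof(struct combined, end);

	assert(len <= sizeof(*c));
	return example_hash(c, len, key);	/* no (u8 *) cast needed */
}
```

With the old 'const u8 *' prototype the call in hash_combined() would
need '(const u8 *)c'; with 'const void *' the pointer is passed as is.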

Jason

