Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 00/27] Remove VLAN CFI bit abuse
From: Michał Mirosław @ 2016-12-14  2:00 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20161213171626.76e7dced@xeon-e3>

On Tue, Dec 13, 2016 at 05:16:26PM -0800, Stephen Hemminger wrote:
> On Tue, 13 Dec 2016 01:12:32 +0100 (CET)
> Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:
> > This series removes an abuse of VLAN CFI bit in Linux networking stack.
> > Currently Linux always clears the bit on outgoing traffic and presents
> > it cleared to userspace (even via AF_PACKET/tcpdump when hw-accelerated).
> > 
> > This uses a new vlan_present bit in struct skbuff, and removes an assumption
> > that vlan_proto != 0 when VLAN tag is present.
> > 
> > As I can't test most of the driver changes, please look at them carefully.
> > 
> > The series is supposed to be bisect-friendly and that requires temporary
> > insertion of #define VLAN_TAG_PRESENT in BPF code to be able to split
> > JIT changes per architecture.
> 
> I wonder if CFI can every validly be non-zero in the modern world, on Hyper-V.
> There are no token ring devices and that seems to be the only use case where CFI would
> be non-zero. Unless someone is planning to reuse it a a protocol bit which seems
> like a really bad idea.
> 
> Maybe the right thing is to keep hard coded as zero and not start adding
> more untestable code conditions.
> 
> My recommendation would be get rid of VLAN_TAG_PRESENT, but don't preserve
> CFI bit.

According to Wikipedia page [1] on 802.1Q, CFI bit got already changed
to DEI (Drop eligible indicator) in 2011 revision of the IEEE standard.

I can't verify this, though.

Best Regards,
Michał Mirosław

[1] https://en.wikipedia.org/wiki/IEEE_802.1Q#Frame_format

^ permalink raw reply

* Re: [PATCH net-next] net: remove abuse of VLAN DEI/CFI bit
From: Stephen Hemminger @ 2016-12-14  1:21 UTC (permalink / raw)
  To: Michał Mirosław
  Cc: netdev, moderated list:ETHERNET BRIDGE, open list:OPENVSWITCH
In-Reply-To: <cfa1f2efae2217b50cbefccbf9ba7f0d24a23c63.1480755768.git.mirq-linux@rere.qmqm.pl>

On Sat,  3 Dec 2016 10:22:28 +0100 (CET)
Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:

> This All-in-one patch removes abuse of VLAN CFI bit, so it can be passed
> intact through linux networking stack.
> 
> Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl>
> ---
> 
> Dear NetDevs
> 
> I guess this needs to be split to the prep..convert[]..finish sequence,
> but if you like it as is, then it's ready.
> 
> The biggest question is if the modified interface and vlan_present
> is the way to go. This can be changed to use vlan_proto != 0 instead
> of an extra flag bit.
> 
> As I can't test most of the driver changes, please look at them carefully.
> OVS and bridge eyes are especially welcome.
> 
> Best Regards,
> Michał Mirosław

Is the motivation to support 802.1ad Drop Eligability Indicator (DEI)?

If so then you need to be more verbose in the commit log, and lots more
work is needed. You need to rename fields and validate every place a
driver is using DEI bit to make sure it really does the right thing
on that hardware. It is not just a mechanical change.

^ permalink raw reply

* Re: [PATCH net-next 00/27] Remove VLAN CFI bit abuse
From: Stephen Hemminger @ 2016-12-14  1:16 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: netdev
In-Reply-To: <cover.1481586602.git.mirq-linux@rere.qmqm.pl>

On Tue, 13 Dec 2016 01:12:32 +0100 (CET)
Michał Mirosław <mirq-linux@rere.qmqm.pl> wrote:

> Dear NetDevs
> 
> This series removes an abuse of VLAN CFI bit in Linux networking stack.
> Currently Linux always clears the bit on outgoing traffic and presents
> it cleared to userspace (even via AF_PACKET/tcpdump when hw-accelerated).
> 
> This uses a new vlan_present bit in struct skbuff, and removes an assumption
> that vlan_proto != 0 when VLAN tag is present.
> 
> As I can't test most of the driver changes, please look at them carefully.
> 
> The series is supposed to be bisect-friendly and that requires temporary
> insertion of #define VLAN_TAG_PRESENT in BPF code to be able to split
> JIT changes per architecture.
> 
> Best Regards,
> Michał Mirosław

I wonder if CFI can every validly be non-zero in the modern world, on Hyper-V.
There are no token ring devices and that seems to be the only use case where CFI would
be non-zero. Unless someone is planning to reuse it a a protocol bit which seems
like a really bad idea.

Maybe the right thing is to keep hard coded as zero and not start adding
more untestable code conditions.

My recommendation would be get rid of VLAN_TAG_PRESENT, but don't preserve
CFI bit.

^ permalink raw reply

* Re: [PATCH net-next 13/27] bridge: use __vlan_hwaccel helpers
From: Toshiaki Makita @ 2016-12-14  0:40 UTC (permalink / raw)
  To: Michał Mirosław, Sergei Shtylyov
  Cc: netdev, moderated list:ETHERNET BRIDGE,
	open list:NETFILTER ({IP, IP6, ARP, EB , NF}TABLES),
	Jozsef Kadlecsik (supporter:NETFILTER ({IP, IP6, ARP, EB, NF}TABLES)),
	Patrick McHardy (supporter:NETFILTER ({IP, IP6, ARP , EB, NF}TABLES)),
	Pablo Neira Ayuso (supporter:NETFILTER ({IP , IP6, ARP, EB, NF}TABLES))
In-Reply-To: <20161213151110.yapgecrospdd23oj@rere.qmqm.pl>

On 2016/12/14 0:11, Michał Mirosław wrote:
> On Tue, Dec 13, 2016 at 03:59:46PM +0300, Sergei Shtylyov wrote:
>> Hello!
>>
>> On 12/13/2016 3:12 AM, Michał Mirosław wrote:
>>
>>> This removes assumption than vlan_tci != 0 when tag is present.
>>>
>>> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
>>> ---
>>>  net/bridge/br_netfilter_hooks.c | 14 ++++++++------
>>>  net/bridge/br_private.h         |  2 +-
>>>  net/bridge/br_vlan.c            |  6 +++---
>>>  3 files changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
>>> index b12501a..2cc0747 100644
>>> --- a/net/bridge/br_netfilter_hooks.c
>>> +++ b/net/bridge/br_netfilter_hooks.c
>> [...]
>>> @@ -749,8 +747,12 @@ static int br_nf_dev_queue_xmit(struct net *net, struct sock *sk, struct sk_buff
>>>
>>>  		data = this_cpu_ptr(&brnf_frag_data_storage);
>>>
>>> -		data->vlan_tci = skb->vlan_tci;
>>> -		data->vlan_proto = skb->vlan_proto;
>>> +		if (skb_vlan_tag_present(skb)) {
>>> +			data->vlan_tci = skb->vlan_tci;
>>> +			data->vlan_proto = skb->vlan_proto;
>>> +		} else
>>> +			data->vlan_proto = 0;
>>
>>    CodingStyle: should use {} in all branches of *if* if at least one branch
>> has {}.
>>
>> [...]
>>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>>> index b6de4f4..ef94664 100644
>>> --- a/net/bridge/br_vlan.c
>>> +++ b/net/bridge/br_vlan.c
>>
>>> @@ -444,8 +444,8 @@ static bool __allowed_ingress(const struct net_bridge *br,
>>>  			__vlan_hwaccel_put_tag(skb, br->vlan_proto, pvid);
>>>  		else
>>>  			/* Priority-tagged Frame.
>>> -			 * At this point, We know that skb->vlan_tci had
>>> -			 * VLAN_TAG_PRESENT bit and its VID field was 0x000.
>>> +			 * At this point, We know that skb->vlan_tci VID
>>
>>    s/We/we/.
>>
>>> +			 * field was 0x000.
>>
>>    Simply 0, maybe?

I originally wrote it like this because we are playing with bit field here.
I meant that all of 12 bits are 0 thus we can safely perform bitwise-OR
to update the VID field.

Thanks,
Toshiaki Makita

^ permalink raw reply

* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock
From: Cong Wang @ 2016-12-14  0:19 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Linux Kernel Network Developers, LKML, linux-audit, Dmitry Vyukov,
	Eric Dumazet, Eric Paris, Paul Moore, sgrubb
In-Reply-To: <61c37ca790bc11bc023aea8f9b70ab3098aa30f5.1481626466.git.rgb@redhat.com>

On Tue, Dec 13, 2016 at 7:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> @@ -1283,8 +1299,10 @@ static void __net_exit audit_net_exit(struct net *net)
>  {
>         struct audit_net *aunet = net_generic(net, audit_net_id);
>         struct sock *sock = aunet->nlsk;
> +       mutex_lock(&audit_cmd_mutex);
>         if (sock == audit_sock)
>                 auditd_reset();
> +       mutex_unlock(&audit_cmd_mutex);

This still doesn't look correct to me, b/c here we release the audit_sock
refcnt twice:

1) inside audit_reset()
2) netlink_kernel_release()

^ permalink raw reply

* Re: netlink: GPF in sock_sndtimeo
From: Cong Wang @ 2016-12-14  0:17 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: Herbert Xu, Johannes Berg, netdev, Florian Westphal, LKML,
	Eric Dumazet, linux-audit, syzkaller, David Miller, Dmitry Vyukov
In-Reply-To: <20161213105233.GG1305@madcap2.tricolour.ca>

On Tue, Dec 13, 2016 at 2:52 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> It is actually the audit_pid and audit_nlk_portid that I care about
> more.  The audit daemon could vanish or close the socket while the
> kernel sock to which it was attached is still quite valid.  Accessing
> the set of three atomically is the urge.  I wonder if it makes more
> sense to test for the presence of auditd using audit_sock rather than
> audit_pid, but still keep audit_pid for our reporting and replacement
> strategy.  Another idea would be to put the three in one struct.

Note, the process has audit_pid should hold a refcnt to the netns too,
so the netns can't be gone until that process is gone.

>
> Can someone explain how they think the original test was able to trigger
> this GPF?  Network namespace shutdown while something pretended to set
> up a new auditd?  That's impressive for a fuzzer if that's the case...
> Is there an strace?  I guess it is all in test().
>

I am surprised you still don't get the race condition even when you
are now working on v2...

The race happens in this scenarios :

1) Create a new netns

2) In the new netns, communicate with kauditd to set audit_sock

3) Generate some audit messages, so kauditd will keep sending them
via audit_sock

4) exit the netns

5) the previous audit_sock is now going away, but kaudit_sock could still
access it in this small window.

^ permalink raw reply

* [PATCH 3/3] secure_seq: use fast&secure siphash instead of slow&insecure md5
From: Jason A. Donenfeld @ 2016-12-14  0:16 UTC (permalink / raw)
  To: Netdev, David Miller, Linus Torvalds,
	kernel-hardening@lists.openwall.com, LKML, George Spelvin,
	Scott Bauer, Andi Kleen, Andy Lutomirski, Greg KH, Eric Biggers
  Cc: Jason A. Donenfeld
In-Reply-To: <20161214001656.19388-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. Rather than manually
filling MD5 buffers, we simply create a layout by a simple anonymous
struct, for which gcc generates rather efficient code.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 net/core/secure_seq.c | 153 ++++++++++++++++++++++++--------------------------
 1 file changed, 73 insertions(+), 80 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index fd3ce461fbe6..dcee974f2fb1 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,13 +10,12 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
+#include <linux/in6.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
-
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static u8 net_secret[SIPHASH24_KEY_LEN];
 
 static __always_inline void net_secret_init(void)
 {
@@ -43,43 +44,36 @@ static u32 seq_scale(u32 seq)
 __u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				   __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return seq_scale(hash[0]);
+	return seq_scale(siphash24((const u8 *)&combined, sizeof(combined), net_secret));
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash24((const u8 *)&combined, sizeof(combined), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -89,32 +83,34 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 __u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 				 __be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 sport;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return seq_scale(hash[0]);
+	return seq_scale(siphash24((const u8 *)&combined, sizeof(combined), net_secret));
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return seq_scale(siphash24((const u8 *)&combined, sizeof(combined), net_secret));
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -123,21 +119,22 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 sport;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash24((const u8 *)&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -146,26 +143,22 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+	} __packed combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash24((const u8 *)&combined, sizeof(combined), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0

^ permalink raw reply related

* [PATCH 2/3] siphash: add convenience functions for jhash converts
From: Jason A. Donenfeld @ 2016-12-14  0:16 UTC (permalink / raw)
  To: Netdev, David Miller, Linus Torvalds,
	kernel-hardening@lists.openwall.com, LKML, George Spelvin,
	Scott Bauer, Andi Kleen, Andy Lutomirski, Greg KH, Eric Biggers
  Cc: Jason A. Donenfeld
In-Reply-To: <20161214001656.19388-1-Jason@zx2c4.com>

Many jhash users currently rely on the Nwords functions. In order to
make transitions to siphash fit something people already know about, we
provide analog functions here. This also winds up being nice for the
networking stack, where hashing 32-bit fields is common.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 include/linux/siphash.h | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
index 6623b3090645..1391054c4c29 100644
--- a/include/linux/siphash.h
+++ b/include/linux/siphash.h
@@ -17,4 +17,37 @@ enum siphash_lengths {
 
 u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
 
+static inline u64 siphash24_1word(const u32 a, const u8 key[SIPHASH24_KEY_LEN])
+{
+	return siphash24((u8 *)&a, sizeof(a), key);
+}
+
+static inline u64 siphash24_2words(const u32 a, const u32 b, const u8 key[SIPHASH24_KEY_LEN])
+{
+	const struct {
+		u32 a;
+		u32 b;
+	} __packed combined = {
+		.a = a,
+		.b = b
+	};
+
+	return siphash24((const u8 *)&combined, sizeof(combined), key);
+}
+
+static inline u64 siphash24_3words(const u32 a, const u32 b, const u32 c, const u8 key[SIPHASH24_KEY_LEN])
+{
+	const struct {
+		u32 a;
+		u32 b;
+		u32 c;
+	} __packed combined = {
+		.a = a,
+		.b = b,
+		.c = c
+	};
+
+	return siphash24((const u8 *)&combined, sizeof(combined), key);
+}
+
 #endif /* _LINUX_SIPHASH_H */
-- 
2.11.0

^ permalink raw reply related

* [PATCH 1/3] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-14  0:16 UTC (permalink / raw)
  To: Netdev, David Miller, Linus Torvalds,
	kernel-hardening@lists.openwall.com, LKML, George Spelvin,
	Scott Bauer, Andi Kleen, Andy Lutomirski, Greg KH, Eric Biggers
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein

SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function.

SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?

There are a variety of attacks known as "hashtable poisoning" in which an
attacker forms some data such that the hash of that data will be the
same, and then preceeds to fill up all entries of a hashbucket. This is
a realistic and well-known denial-of-service vector.

Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.

(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist the modern cryptanalysis.)

There are a modicum of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch-up.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Daniel J. Bernstein <djb@cr.yp.to>
---
 include/linux/siphash.h | 20 +++++++++++++
 lib/Kconfig.debug       |  6 ++--
 lib/Makefile            |  5 ++--
 lib/siphash.c           | 76 +++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      | 74 +++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 176 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..6623b3090645
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,20 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+
+enum siphash_lengths {
+	SIPHASH24_KEY_LEN = 16
+};
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index a6c8db1d62f6..2a1797704b41 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1823,9 +1823,9 @@ config TEST_HASH
 	tristate "Perform selftest on hash functions"
 	default n
 	help
-	  Enable this option to test the kernel's integer (<linux/hash,h>)
-	  and string (<linux/stringhash.h>) hash functions on boot
-	  (or module load).
+	  Enable this option to test the kernel's integer (<linux/hash.h>),
+	  string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+	  hash functions on boot (or module load).
 
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
-	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+	 earlycpio.o seq_buf.o siphash.o \
+	 nmi_backtrace.o nodemask.o win_minmax.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..7b55ad3a7fe9
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,76 @@
+/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
+ * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+	do { \
+	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+	} while(0)
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN])
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 k0 = get_unaligned_le64(key);
+	u64 k1 = get_unaligned_le64(key + sizeof(u64));
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= k1;
+	v2 ^= k0;
+	v1 ^= k1;
+	v0 ^= k0;
+	for (; data != end; data += sizeof(u64)) {
+		m = get_unaligned_le64(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu(load_unaligned_zeropad(data) & bytemask_from_count(left));
+#else
+	switch (left) {
+	case 7: b |= ((u64)data[6]) << 48;
+	case 6: b |= ((u64)data[5]) << 40;
+	case 5: b |= ((u64)data[4]) << 32;
+	case 4: b |= get_unaligned_le32(data); break;
+	case 3: b |= ((u64)data[2]) << 16;
+	case 2: b |= get_unaligned_le16(data); break;
+	case 1: b |= data[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash24);
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 000000000000..336298aaa33b
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,74 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from official reference source available at:
+ *     https://131002.net/siphash/siphash24.c
+ */
+static const u64 test_vectors[64] = {
+	0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+	0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+	0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+	0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+	0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+	0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+	0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+	0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+	0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+	0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+	0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+	0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+	0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+	0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+	0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+	0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+	0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+	0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+	0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+	0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+	0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+	0x958a324ceb064572ULL
+};
+
+static int __init siphash_test_init(void)
+{
+	u8 in[64], k[16], i;
+	int ret = 0;
+
+	for (i = 0; i < 16; ++i)
+		k[i] = i;
+	for (i = 0; i < 64; ++i) {
+		in[i] = i;
+		if (siphash24(in, i, k) != test_vectors[i]) {
+			pr_info("self-test %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+	}
+	if (!ret)
+		pr_info("self-tests: pass\n");
+	return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
-- 
2.11.0

^ permalink raw reply related

* [PATCH net] ibmveth: calculate gso_segs for large packets
From: Thomas Falcon @ 2016-12-14  0:15 UTC (permalink / raw)
  To: netdev; +Cc: brking, marcelo.leitner, pradeeps, jmaxwell37, zdai, eric.dumazet

Include calculations to compute the number of segments
that comprise an aggregated large packet.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmveth.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmveth.c b/drivers/net/ethernet/ibm/ibmveth.c
index fbece63..a831f94 100644
--- a/drivers/net/ethernet/ibm/ibmveth.c
+++ b/drivers/net/ethernet/ibm/ibmveth.c
@@ -1181,7 +1181,9 @@ static netdev_tx_t ibmveth_start_xmit(struct sk_buff *skb,
 
 static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 {
+	struct tcphdr *tcph;
 	int offset = 0;
+	int hdr_len;
 
 	/* only TCP packets will be aggregated */
 	if (skb->protocol == htons(ETH_P_IP)) {
@@ -1208,14 +1210,20 @@ static void ibmveth_rx_mss_helper(struct sk_buff *skb, u16 mss, int lrg_pkt)
 	/* if mss is not set through Large Packet bit/mss in rx buffer,
 	 * expect that the mss will be written to the tcp header checksum.
 	 */
+	tcph = (struct tcphdr *)(skb->data + offset);
 	if (lrg_pkt) {
 		skb_shinfo(skb)->gso_size = mss;
 	} else if (offset) {
-		struct tcphdr *tcph = (struct tcphdr *)(skb->data + offset);
-
 		skb_shinfo(skb)->gso_size = ntohs(tcph->check);
 		tcph->check = 0;
 	}
+
+	if (skb_shinfo(skb)->gso_size) {
+		hdr_len = offset + tcph->doff * 4;
+		skb_shinfo(skb)->gso_segs =
+				DIV_ROUND_UP(skb->len - hdr_len,
+					     skb_shinfo(skb)->gso_size);
+	}
 }
 
 static int ibmveth_poll(struct napi_struct *napi, int budget)
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH] [v2] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Florian Fainelli @ 2016-12-13 23:54 UTC (permalink / raw)
  To: Timur Tabi, David Miller, netdev, Christopher Covington, alokc
In-Reply-To: <1481672942-20024-1-git-send-email-timur@codeaurora.org>

On 12/13/2016 03:49 PM, Timur Tabi wrote:
> On ACPI systems, clocks are not available to drivers directly.  They are
> handled exclusively by ACPI and/or firmware, so there is no clock driver.
> Calls to clk_get() always fail, so we should not even attempt to claim
> any clocks on ACPI systems.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian

^ permalink raw reply

* [PATCH] [v2] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Timur Tabi @ 2016-12-13 23:49 UTC (permalink / raw)
  To: David Miller, Florian Fainelli, netdev, Christopher Covington,
	alokc

On ACPI systems, clocks are not available to drivers directly.  They are
handled exclusively by ACPI and/or firmware, so there is no clock driver.
Calls to clk_get() always fail, so we should not even attempt to claim
any clocks on ACPI systems.

Signed-off-by: Timur Tabi <timur@codeaurora.org>
---

Notes:
    v2: move check into functions

 drivers/net/ethernet/qualcomm/emac/emac.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c
index ae32f85..422289c 100644
--- a/drivers/net/ethernet/qualcomm/emac/emac.c
+++ b/drivers/net/ethernet/qualcomm/emac/emac.c
@@ -460,6 +460,12 @@ static int emac_clks_phase1_init(struct platform_device *pdev,
 {
 	int ret;
 
+	/* On ACPI platforms, clocks are controlled by firmware and/or
+	 * ACPI, not by drivers.
+	 */
+	if (has_acpi_companion(&pdev->dev))
+		return 0;
+
 	ret = emac_clks_get(pdev, adpt);
 	if (ret)
 		return ret;
@@ -485,6 +491,9 @@ static int emac_clks_phase2_init(struct platform_device *pdev,
 {
 	int ret;
 
+	if (has_acpi_companion(&pdev->dev))
+		return 0;
+
 	ret = clk_set_rate(adpt->clk[EMAC_CLK_TX], 125000000);
 	if (ret)
 		return ret;
-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply related

* Re: Soft lockup in inet_put_port on 4.6
From: Tom Herbert @ 2016-12-13 23:32 UTC (permalink / raw)
  To: Craig Gallek
  Cc: Josef Bacik, Hannes Frederic Sowa, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <CAEfhGiyFW1YHmmUYpFv+q4J4r=wSv3ZBZowG69BqkT3p-G4gNw@mail.gmail.com>

On Tue, Dec 13, 2016 at 3:03 PM, Craig Gallek <kraigatgoog@gmail.com> wrote:
> On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert <tom@herbertland.com> wrote:
>> I think there may be some suspicious code in inet_csk_get_port. At
>> tb_found there is:
>>
>>                 if (((tb->fastreuse > 0 && reuse) ||
>>                      (tb->fastreuseport > 0 &&
>>                       !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>                       sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
>>                     smallest_size == -1)
>>                         goto success;
>>                 if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
>>                         if ((reuse ||
>>                              (tb->fastreuseport > 0 &&
>>                               sk->sk_reuseport &&
>>                               !rcu_access_pointer(sk->sk_reuseport_cb) &&
>>                               uid_eq(tb->fastuid, uid))) &&
>>                             smallest_size != -1 && --attempts >= 0) {
>>                                 spin_unlock_bh(&head->lock);
>>                                 goto again;
>>                         }
>>                         goto fail_unlock;
>>                 }
>>
>> AFAICT there is redundancy in these two conditionals.  The same clause
>> is being checked in both: (tb->fastreuseport > 0 &&
>> !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
>> uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is true the
>> first conditional should be hit, goto done,  and the second will never
>> evaluate that part to true-- unless the sk is changed (do we need
>> READ_ONCE for sk->sk_reuseport_cb?).
> That's an interesting point... It looks like this function also
> changed in 4.6 from using a single local_bh_disable() at the beginning
> with several spin_lock(&head->lock) to exclusively
> spin_lock_bh(&head->lock) at each locking point.  Perhaps the full bh
> disable variant was preventing the timers in your stack trace from
> running interleaved with this function before?

Could be, although dropping the lock shouldn't be able to affect the
search state. TBH, I'm a little lost in reading function, the
SO_REUSEPORT handling is pretty complicated. For instance,
rcu_access_pointer(sk->sk_reuseport_cb) is checked three times in that
function and also in every call to inet_csk_bind_conflict. I wonder if
we can simply this under the assumption that SO_REUSEPORT is only
allowed if the port number (snum) is explicitly specified.

Tom

^ permalink raw reply

* [PATCH iproute2] Fix compile warning in get_addr_1
From: David Ahern @ 2016-12-13 23:34 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern

A recent cleanup causes a compile warning on Debian jessie:

    CC       utils.o
utils.c: In function ‘get_addr_1’:
utils.c:486:21: warning: passing argument 1 of ‘ll_addr_a2n’ from incompatible pointer type
   len = ll_addr_a2n(&addr->data, sizeof(addr->data), name);
                     ^
In file included from utils.c:34:0:
../include/rt_names.h:27:5: note: expected ‘char *’ but argument is of type ‘__u32 (*)[8]’
 int ll_addr_a2n(char *lladdr, int len, const char *arg);
     ^

Revert the removal of the typecast

Fixes: e1933b928125 ("utils: cleanup style")
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
---
 lib/utils.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/utils.c b/lib/utils.c
index 316b048abcfc..83c9d097c608 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -483,7 +483,8 @@ int get_addr_1(inet_prefix *addr, const char *name, int family)
 	if (family == AF_PACKET) {
 		int len;
 
-		len = ll_addr_a2n(&addr->data, sizeof(addr->data), name);
+		len = ll_addr_a2n((char *) &addr->data, sizeof(addr->data),
+				  name);
 		if (len < 0)
 			return -1;
 
-- 
2.1.4

^ permalink raw reply related

* Re: Soft lockup in inet_put_port on 4.6
From: Craig Gallek @ 2016-12-13 23:03 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Josef Bacik, Hannes Frederic Sowa, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <CALx6S34S82kFVqYW1M0ZuHx_Mxut4QaLFVerLX4aEsksrFirXg@mail.gmail.com>

On Tue, Dec 13, 2016 at 3:51 PM, Tom Herbert <tom@herbertland.com> wrote:
> I think there may be some suspicious code in inet_csk_get_port. At
> tb_found there is:
>
>                 if (((tb->fastreuse > 0 && reuse) ||
>                      (tb->fastreuseport > 0 &&
>                       !rcu_access_pointer(sk->sk_reuseport_cb) &&
>                       sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
>                     smallest_size == -1)
>                         goto success;
>                 if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
>                         if ((reuse ||
>                              (tb->fastreuseport > 0 &&
>                               sk->sk_reuseport &&
>                               !rcu_access_pointer(sk->sk_reuseport_cb) &&
>                               uid_eq(tb->fastuid, uid))) &&
>                             smallest_size != -1 && --attempts >= 0) {
>                                 spin_unlock_bh(&head->lock);
>                                 goto again;
>                         }
>                         goto fail_unlock;
>                 }
>
> AFAICT there is redundancy in these two conditionals.  The same clause
> is being checked in both: (tb->fastreuseport > 0 &&
> !rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
> uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is true the
> first conditional should be hit, goto done,  and the second will never
> evaluate that part to true-- unless the sk is changed (do we need
> READ_ONCE for sk->sk_reuseport_cb?).
That's an interesting point... It looks like this function also
changed in 4.6 from using a single local_bh_disable() at the beginning
with several spin_lock(&head->lock) to exclusively
spin_lock_bh(&head->lock) at each locking point.  Perhaps the full bh
disable variant was preventing the timers in your stack trace from
running interleaved with this function before?

^ permalink raw reply

* Re: [PATCH] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Timur Tabi @ 2016-12-13 22:05 UTC (permalink / raw)
  To: Florian Fainelli, David Miller, netdev, Christopher Covington,
	alokc
In-Reply-To: <e4666d2c-1690-0513-b2e2-57733f04a69c@gmail.com>

On 12/13/2016 04:02 PM, Florian Fainelli wrote:
> No strong feelings either, it just seems easier and safer to move the
> check down in the function and make it return success rather than
> potentially affecting the error path within the caller of
> emac_clks_phase{1,2}_init here.

I suppose that makes sense.  I'll post a V2.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* Re: [PATCH] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Florian Fainelli @ 2016-12-13 22:02 UTC (permalink / raw)
  To: Timur Tabi, David Miller, netdev, Christopher Covington, alokc
In-Reply-To: <58506E00.9040801@codeaurora.org>

On 12/13/2016 01:54 PM, Timur Tabi wrote:
> On 12/13/2016 03:46 PM, Florian Fainelli wrote:
>> Is there a reason why the check is not moved down inwo
>> emac_clks_phase{1,2}_init functions? Do you anticipate other
>> ACPI-related changes in the future that would warrant having this check
>> moved at a higher level?
> 
> No, this is the last ACPI-related change that I expect.  I could move
> the check into those functions, but I don't see how that's any different
> than what I'm doing now.  My way avoids calling a function altogether,
> your way calls into a function only to have it return immediately.
> 
> But I don't have any strong feelings either way.  I will change it if
> you want me to.

No strong feelings either, it just seems easier and safer to move the
check down in the function and make it return success rather than
potentially affecting the error path within the caller of
emac_clks_phase{1,2}_init here.
-- 
Florian

^ permalink raw reply

* Re: [PATCH] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Timur Tabi @ 2016-12-13 21:54 UTC (permalink / raw)
  To: Florian Fainelli, David Miller, netdev, Christopher Covington,
	alokc
In-Reply-To: <d6683e1c-803d-3859-f125-93fecaa0df02@gmail.com>

On 12/13/2016 03:46 PM, Florian Fainelli wrote:
> Is there a reason why the check is not moved down inwo
> emac_clks_phase{1,2}_init functions? Do you anticipate other
> ACPI-related changes in the future that would warrant having this check
> moved at a higher level?

No, this is the last ACPI-related change that I expect.  I could move 
the check into those functions, but I don't see how that's any different 
than what I'm doing now.  My way avoids calling a function altogether, 
your way calls into a function only to have it return immediately.

But I don't have any strong feelings either way.  I will change it if 
you want me to.

-- 
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies, Inc.  Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* Re: [PATCH] net: qcom/emac: don't try to claim clocks on ACPI systems
From: Florian Fainelli @ 2016-12-13 21:46 UTC (permalink / raw)
  To: Timur Tabi, David Miller, netdev, Christopher Covington, alokc
In-Reply-To: <1481658930-565-1-git-send-email-timur@codeaurora.org>

On 12/13/2016 11:55 AM, Timur Tabi wrote:
> On ACPI systems, clocks are not available to drivers directly.  They are
> handled exclusively by ACPI and/or firmware, so there is no clock driver.
> Calls to clk_get() always fail, so we should not even attempt to claim
> any clocks on ACPI systems.
> 
> Signed-off-by: Timur Tabi <timur@codeaurora.org>
> ---
>  drivers/net/ethernet/qualcomm/emac/emac.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qualcomm/emac/emac.c b/drivers/net/ethernet/qualcomm/emac/emac.c
> index ae32f85..b1c1cdc 100644
> --- a/drivers/net/ethernet/qualcomm/emac/emac.c
> +++ b/drivers/net/ethernet/qualcomm/emac/emac.c
> @@ -627,11 +627,12 @@ static int emac_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto err_undo_netdev;
>  
> -	/* initialize clocks */
> -	ret = emac_clks_phase1_init(pdev, adpt);
> -	if (ret) {
> -		dev_err(&pdev->dev, "could not initialize clocks\n");
> -		goto err_undo_netdev;
> +	if (!has_acpi_companion(&pdev->dev)) {

Is there a reason why the check is not moved down inwo
emac_clks_phase{1,2}_init functions? Do you anticipate other
ACPI-related changes in the future that would warrant having this check
moved at a higher level?
-- 
Florian

^ permalink raw reply

* Re: sctp: suspicious rcu_dereference_check() usage in sctp_epaddr_lookup_transport
From: Marcelo Ricardo Leitner @ 2016-12-13 21:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Vladislav Yasevich, Neil Horman, David Miller, linux-sctp, netdev,
	LKML, Eric Dumazet, syzkaller, Xin Long
In-Reply-To: <CACT4Y+bEi6aXTTrk4P37hFnMdyte0voxUVdz8t0XQP95PgH9+w@mail.gmail.com>

On Tue, Dec 13, 2016 at 07:07:01PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am getting the following reports while running syzkaller fuzzer:
> 
> [ INFO: suspicious RCU usage. ]
> 4.9.0+ #85 Not tainted
> -------------------------------
> ./include/linux/rhashtable.h:572 suspicious rcu_dereference_check() usage!
> 
> other info that might help us debug this:
> 
> rcu_scheduler_active = 1, debug_locks = 0
> 1 lock held by syz-executor1/18023:
>  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<     inline     >] lock_sock
> include/net/sock.h:1454
>  #0:  (sk_lock-AF_INET){+.+.+.}, at: [<ffffffff87bb3ccf>]
> sctp_getsockopt+0x45f/0x6800 net/sctp/socket.c:6432
> 
> stack backtrace:
> CPU: 2 PID: 18023 Comm: syz-executor1 Not tainted 4.9.0+ #85
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Call Trace:
> [<     inline     >] __dump_stack lib/dump_stack.c:15
> [<        none        >] dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
> [<        none        >] lockdep_rcu_suspicious+0x139/0x180
> kernel/locking/lockdep.c:4448
> [<     inline     >] __rhashtable_lookup ./include/linux/rhashtable.h:572
> [<     inline     >] rhltable_lookup ./include/linux/rhashtable.h:660
> [<        none        >] sctp_epaddr_lookup_transport+0x641/0x930
> net/sctp/input.c:946

I think this was introduced in the rhlist converstion. We had removed
some rcu_read_lock() calls on sctp stack because rhashtable was already
calling it, but then we didn't add them back when moving to rhlist.

This code:
+/* return a transport without holding it, as it's only used under sock lock */
 struct sctp_transport *sctp_epaddr_lookup_transport(
                                const struct sctp_endpoint *ep,
                                const union sctp_addr *paddr)
 {
        struct net *net = sock_net(ep->base.sk);
+       struct rhlist_head *tmp, *list;
+       struct sctp_transport *t;
        struct sctp_hash_cmp_arg arg = {
-               .ep    = ep,
                .paddr = paddr,
                .net   = net,
+               .lport = htons(ep->base.bind_addr.port),
        };
 
-       return rhashtable_lookup_fast(&sctp_transport_hashtable, &arg,
-                                     sctp_hash_params);
+       list = rhltable_lookup(&sctp_transport_hashtable, &arg,
+                              sctp_hash_params);

Had an implicit rcu_read_lock() on rhashtable_lookup_fast, but it
doesn't on rhltable_lookup and rhltable_lookup uses _rcu calls which
assumes we have rcu read protection.

  Marcelo

^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH net v2] igb: re-assign hw address pointer        on reset after PCI error
From: Brown, Aaron F @ 2016-12-13 20:51 UTC (permalink / raw)
  To: Guilherme G. Piccoli, intel-wired-lan@lists.osuosl.org
  Cc: netdev@vger.kernel.org
In-Reply-To: <1478803603-30306-1-git-send-email-gpiccoli@linux.vnet.ibm.com>

> From: Intel-wired-lan [mailto:intel-wired-lan-bounces@lists.osuosl.org] On
> Behalf Of Guilherme G. Piccoli
> Sent: Thursday, November 10, 2016 10:47 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: netdev@vger.kernel.org; gpiccoli@linux.vnet.ibm.com
> Subject: [Intel-wired-lan] [PATCH net v2] igb: re-assign hw address pointer on
> reset after PCI error
> 
> Whenever the igb driver detects the result of a read operation returns
> a value composed only by F's (like 0xFFFFFFFF), it will detach the
> net_device, clear the hw_addr pointer and warn to the user that adapter's
> link is lost - those steps happen on igb_rd32().
> 
> In case a PCI error happens on Power architecture, there's a recovery
> mechanism called EEH, that will reset the PCI slot and call driver's
> handlers to reset the adapter and network functionality as well.
> 
> We observed that once hw_addr is NULL after the error is detected on
> igb_rd32(), it's never assigned back, so in the process of resetting
> the network functionality we got a NULL pointer dereference in both
> igb_configure_tx_ring() and igb_configure_rx_ring(). In order to avoid
> such bug, this patch re-assigns the hw_addr value in the slot_reset
> handler.
> 
> Reported-by: Anthony H. Thai <ahthai@us.ibm.com>
> Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
> ---
>  drivers/net/ethernet/intel/igb/igb_main.c | 5 +++++
>  1 file changed, 5 insertions(+)

Tested-by: Aaron Brown <aaron.f.brown@intel.com>

^ permalink raw reply

* Re: Soft lockup in inet_put_port on 4.6
From: Tom Herbert @ 2016-12-13 20:51 UTC (permalink / raw)
  To: Josef Bacik
  Cc: Hannes Frederic Sowa, Eric Dumazet,
	Linux Kernel Network Developers
In-Reply-To: <1481581466.24490.2@smtp.office365.com>

I think there may be some suspicious code in inet_csk_get_port. At
tb_found there is:

                if (((tb->fastreuse > 0 && reuse) ||
                     (tb->fastreuseport > 0 &&
                      !rcu_access_pointer(sk->sk_reuseport_cb) &&
                      sk->sk_reuseport && uid_eq(tb->fastuid, uid))) &&
                    smallest_size == -1)
                        goto success;
                if (inet_csk(sk)->icsk_af_ops->bind_conflict(sk, tb, true)) {
                        if ((reuse ||
                             (tb->fastreuseport > 0 &&
                              sk->sk_reuseport &&
                              !rcu_access_pointer(sk->sk_reuseport_cb) &&
                              uid_eq(tb->fastuid, uid))) &&
                            smallest_size != -1 && --attempts >= 0) {
                                spin_unlock_bh(&head->lock);
                                goto again;
                        }
                        goto fail_unlock;
                }

AFAICT there is redundancy in these two conditionals.  The same clause
is being checked in both: (tb->fastreuseport > 0 &&
!rcu_access_pointer(sk->sk_reuseport_cb) && sk->sk_reuseport &&
uid_eq(tb->fastuid, uid))) && smallest_size == -1. If this is true the
first conditional should be hit, goto done,  and the second will never
evaluate that part to true-- unless the sk is changed (do we need
READ_ONCE for sk->sk_reuseport_cb?).

Another potential issue is the that the goto again goes back to doing
the port scan, but if snum had been set originally that doesn't seem
like what we want.

Thanks,
Tom




On Mon, Dec 12, 2016 at 2:24 PM, Josef Bacik <jbacik@fb.com> wrote:
>
> On Mon, Dec 12, 2016 at 1:44 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>>
>> On 12.12.2016 19:05, Josef Bacik wrote:
>>>
>>>  On Fri, Dec 9, 2016 at 11:14 PM, Eric Dumazet <eric.dumazet@gmail.com>
>>>  wrote:
>>>>
>>>>  On Fri, 2016-12-09 at 19:47 -0800, Eric Dumazet wrote:
>>>>
>>>>>
>>>>>   Hmm... Is your ephemeral port range includes the port your load
>>>>>   balancing app is using ?
>>>>
>>>>
>>>>  I suspect that you might have processes doing bind( port = 0) that are
>>>>  trapped into the bind_conflict() scan ?
>>>>
>>>>  With 100,000 + timewaits there, this possibly hurts.
>>>>
>>>>  Can you try the following loop breaker ?
>>>
>>>
>>>  It doesn't appear that the app is doing bind(port = 0) during normal
>>>  operation.  I tested this patch and it made no difference.  I'm going to
>>>  test simply restarting the app without changing to the SO_REUSEPORT
>>>  option.  Thanks,
>>
>>
>> Would it be possible to trace the time the function uses with trace? If
>> we don't see the number growing considerably over time we probably can
>> rule out that we loop somewhere in there (I would instrument
>> inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port).
>>
>> __inet_hash_connect -> __inet_check_established also takes a lock
>> (inet_ehash_lockp) which can be locked from inet_diag code path during
>> socket diag info dumping.
>>
>> Unfortunately we couldn't reproduce it so far. :/
>
>
> So I had a bcc script running to time how long we spent in
> inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port, but of
> course I'm an idiot and didn't actually separate out the stats so I could
> tell _which_ one was taking forever.  But anyway here's a normal
> distribution on the box
>
>     Some shit           : count     distribution
>         0 -> 1          : 0        |                                       |
>         2 -> 3          : 0        |                                       |
>         4 -> 7          : 0        |                                       |
>         8 -> 15         : 0        |                                       |
>        16 -> 31         : 0        |                                       |
>        32 -> 63         : 0        |                                       |
>        64 -> 127        : 0        |                                       |
>       128 -> 255        : 0        |                                       |
>       256 -> 511        : 0        |                                       |
>       512 -> 1023       : 0        |                                       |
>      1024 -> 2047       : 74       |                                       |
>      2048 -> 4095       : 10537
> |****************************************|
>      4096 -> 8191       : 8497     |********************************       |
>      8192 -> 16383      : 3745     |**************                         |
>     16384 -> 32767      : 300      |*                                      |
>     32768 -> 65535      : 250      |                                       |
>     65536 -> 131071     : 180      |                                       |
>    131072 -> 262143     : 71       |                                       |
>    262144 -> 524287     : 18       |                                       |
>    524288 -> 1048575    : 5        |                                       |
>
> With the times in nanoseconds, and here's the distribution during the
> problem
>
>     Some shit           : count     distribution
>         0 -> 1          : 0        |                                       |
>         2 -> 3          : 0        |                                       |
>         4 -> 7          : 0        |                                       |
>         8 -> 15         : 0        |                                       |
>        16 -> 31         : 0        |                                       |
>        32 -> 63         : 0        |                                       |
>        64 -> 127        : 0        |                                       |
>       128 -> 255        : 0        |                                       |
>       256 -> 511        : 0        |                                       |
>       512 -> 1023       : 0        |                                       |
>      1024 -> 2047       : 21       |                                       |
>      2048 -> 4095       : 21820
> |****************************************|
>      4096 -> 8191       : 11598    |*********************                  |
>      8192 -> 16383      : 4337     |*******                                |
>     16384 -> 32767      : 290      |                                       |
>     32768 -> 65535      : 59       |                                       |
>     65536 -> 131071     : 23       |                                       |
>    131072 -> 262143     : 12       |                                       |
>    262144 -> 524287     : 6        |                                       |
>    524288 -> 1048575    : 19       |                                       |
>   1048576 -> 2097151    : 1079     |*                                      |
>   2097152 -> 4194303    : 0        |                                       |
>   4194304 -> 8388607    : 1        |                                       |
>   8388608 -> 16777215   : 0        |                                       |
>  16777216 -> 33554431   : 0        |                                       |
>  33554432 -> 67108863   : 1192     |**                                     |
>               Some shit                     : count     distribution
>                   0 -> 1                    : 0        |                   |
>                   2 -> 3                    : 0        |                   |
>                   4 -> 7                    : 0        |                   |
>                   8 -> 15                   : 0        |                   |
>                  16 -> 31                   : 0        |                   |
>                  32 -> 63                   : 0        |                   |
>                  64 -> 127                  : 0        |                   |
>                 128 -> 255                  : 0        |                   |
>                 256 -> 511                  : 0        |                   |
>                 512 -> 1023                 : 0        |                   |
>                1024 -> 2047                 : 48       |                   |
>                2048 -> 4095                 : 14714
> |********************|
>                4096 -> 8191                 : 6769     |*********          |
>                8192 -> 16383                : 2234     |***                |
>               16384 -> 32767                : 422      |                   |
>               32768 -> 65535                : 208      |                   |
>               65536 -> 131071               : 61       |                   |
>              131072 -> 262143               : 10       |                   |
>              262144 -> 524287               : 416      |                   |
>              524288 -> 1048575              : 826      |*                  |
>             1048576 -> 2097151              : 598      |                   |
>             2097152 -> 4194303              : 10       |                   |
>             4194304 -> 8388607              : 0        |                   |
>             8388608 -> 16777215             : 1        |                   |
>            16777216 -> 33554431             : 289      |                   |
>            33554432 -> 67108863             : 921      |*                  |
>            67108864 -> 134217727            : 74       |                   |
>           134217728 -> 268435455            : 75       |                   |
>           268435456 -> 536870911            : 48       |                   |
>           536870912 -> 1073741823           : 25       |                   |
>          1073741824 -> 2147483647           : 3        |                   |
>          2147483648 -> 4294967295           : 2        |                   |
>          4294967296 -> 8589934591           : 1        |                   |
>
> As you can see we start getting tail latencies of up to 4-8 seconds.
> Tomorrow I'll separate out the stats so we can know which function is the
> problem child.  Sorry about not doing that first.  Thanks,
>
> Josef
>

^ permalink raw reply

* Re: [RFC PATCH v3] audit: use proper refcount locking on audit_sock
From: Paul Moore @ 2016-12-13 20:50 UTC (permalink / raw)
  To: Richard Guy Briggs
  Cc: netdev, linux-kernel, edumazet, linux-audit, xiyou.wangcong,
	dvyukov
In-Reply-To: <61c37ca790bc11bc023aea8f9b70ab3098aa30f5.1481626466.git.rgb@redhat.com>

On Tue, Dec 13, 2016 at 10:03 AM, Richard Guy Briggs <rgb@redhat.com> wrote:
> Resetting audit_sock appears to be racy.
>
> audit_sock was being copied and dereferenced without using a refcount on
> the source sock.
>
> Bump the refcount on the underlying sock when we store a refrence in
> audit_sock and release it when we reset audit_sock.  audit_sock
> modification needs the audit_cmd_mutex.
>
> See: https://lkml.org/lkml/2016/11/26/232
>
> Thanks to Eric Dumazet <edumazet@google.com> and Cong Wang
> <xiyou.wangcong@gmail.com> on ideas how to fix it.
>
> Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
> ---
> There has been a lot of change in the audit code that is about to go
> upstream to address audit queue issues.  This patch is based on the
> source tree: git://git.infradead.org/users/pcmoore/audit#next
> ---
>  kernel/audit.c |   28 +++++++++++++++++++++++-----
>  1 files changed, 23 insertions(+), 5 deletions(-)

This looks more reasonable.  I still wonder about synchronization
between threads changing the audit_* connection variables and the
kauditd_thread, but I guess we can treat that as another issue; this
patch fixes a bug and is worth merging now.

I'm building a test kernel right now, assuming nothing blows up I'll
push this patch with the rest of the audit patches tomorrow; if
something bad happens, this is going to miss the first audit pull
request.

> diff --git a/kernel/audit.c b/kernel/audit.c
> index f20eee0..3bb4126 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -446,14 +446,19 @@ static void kauditd_retry_skb(struct sk_buff *skb)
>   * Description:
>   * Break the auditd/kauditd connection and move all the records in the retry
>   * queue into the hold queue in case auditd reconnects.
> + * The audit_cmd_mutex must be held when calling this function.
>   */

Don't resend, but in the future please start comments like this on the
previous line.

^ permalink raw reply

* Re: bpf debug info
From: Daniel Borkmann @ 2016-12-13 19:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: netdev@vger.kernel.org, Brenden Blanco, Thomas Graf, Wangnan,
	He Kuang, Kernel Team
In-Reply-To: <CAADnVQKKjYDCqRGrB2UVKf=-KZBpt+5+M4nXXVuVefEGRv5MYQ@mail.gmail.com>

On 12/13/2016 08:38 PM, Alexei Starovoitov wrote:
> On Tue, Nov 29, 2016 at 9:01 AM, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>>> If I try to run samples/bpf/test_cls_bpf.sh the verifier will complain:
>>>> R0=imm0,min_value=0,max_value=0 R1=pkt(id=0,off=0,r=42) R2=pkt_end
>>>> 112: (0f) r4 += r3
>>>> 113: (0f) r1 += r4
>>>> 114: (b7) r0 = 2
>>>> 115: (69) r2 = *(u16 *)(r1 +2)
>>>> invalid access to packet, off=2 size=2, R1(id=3,off=0,r=0)
>>>>
>>>> Now multiply 115 * 8 and convert to hex. This is address 0x398 in llvm-objdump:
>>>> ; struct udphdr *udp = data + tp_off;
>>>>       388:       r1 += r4
>>>>       390:       r0 = 2
>>>> ; if (udp->dest == htons(DEFAULT_PKTGEN_UDP_PORT) ||
>>>>       398:       r2 = *(u16 *)(r1 + 2)
>>>>       3a0:       if r2 == 2304 goto 16
>>>>
>>>> Now it's clear which line of C code is causing the verifier to reject.
>>> [...]
>>>
>>> Could llvm-objdump switch line numbering for bpf same way as verifier
>>> output, so mapping step is not really needed?
>>
>> you mean that llvm-objdump to print 113,114,115 ?
>> I guess it's doable. Will give it a try.
>
> Hi Daniel,
>
> your feature request turned out to be pretty straightforward
> to implement. Please pull the latest llvm and rebuild llvm-objdump.
> It will be printing instruction numbers instead of absolute addresses.
> No "multiply 115 * 8 and convert to hex" steps necessary anymore.

That's great to hear, thanks for following up on this. Sounds about
right to upgrade.

Thanks,
Daniel

^ permalink raw reply

* [PATCH iproute2 1/1] tc: pass correct conversion specifier to print 'unsigned int' action index.
From: Roman Mashak @ 2016-12-13 20:31 UTC (permalink / raw)
  To: stephen; +Cc: netdev, jhs, daniel, xiyou.wangcong, Roman Mashak

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
---
 tc/m_bpf.c      | 2 +-
 tc/m_connmark.c | 2 +-
 tc/m_csum.c     | 3 ++-
 tc/m_gact.c     | 3 ++-
 tc/m_ife.c      | 2 +-
 tc/m_ipt.c      | 2 +-
 tc/m_mirred.c   | 3 ++-
 tc/m_pedit.c    | 2 +-
 tc/m_simple.c   | 2 +-
 tc/m_skbedit.c  | 2 +-
 tc/m_skbmod.c   | 2 +-
 tc/m_vlan.c     | 2 +-
 tc/m_xt.c       | 2 +-
 tc/m_xt_old.c   | 2 +-
 14 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/tc/m_bpf.c b/tc/m_bpf.c
index 9bf2a85..6400724 100644
--- a/tc/m_bpf.c
+++ b/tc/m_bpf.c
@@ -161,7 +161,7 @@ static int bpf_print_opt(struct action_util *au, FILE *f, struct rtattr *arg)
 	}
 
 	fprintf(f, "default-action %s\n", action_n2a(parm->action));
-	fprintf(f, "\tindex %d ref %d bind %d", parm->index, parm->refcnt,
+	fprintf(f, "\tindex %u ref %d bind %d", parm->index, parm->refcnt,
 		parm->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_connmark.c b/tc/m_connmark.c
index 20f98e4..295f90d 100644
--- a/tc/m_connmark.c
+++ b/tc/m_connmark.c
@@ -123,7 +123,7 @@ static int print_connmark(struct action_util *au, FILE *f, struct rtattr *arg)
 	ci = RTA_DATA(tb[TCA_CONNMARK_PARMS]);
 
 	fprintf(f, " connmark zone %d\n", ci->zone);
-	fprintf(f, "\t index %d ref %d bind %d", ci->index,
+	fprintf(f, "\t index %u ref %d bind %d", ci->index,
 		ci->refcnt, ci->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_csum.c b/tc/m_csum.c
index a6e4c1e..d5b1af6 100644
--- a/tc/m_csum.c
+++ b/tc/m_csum.c
@@ -199,7 +199,8 @@ print_csum(struct action_util *au, FILE *f, struct rtattr *arg)
 		uflag_1, uflag_2, uflag_3,
 		uflag_4, uflag_5, uflag_6,
 		action_n2a(sel->action));
-	fprintf(f, "\tindex %d ref %d bind %d", sel->index, sel->refcnt, sel->bindcnt);
+	fprintf(f, "\tindex %u ref %d bind %d", sel->index, sel->refcnt,
+		sel->bindcnt);
 
 	if (show_stats) {
 		if (tb[TCA_CSUM_TM]) {
diff --git a/tc/m_gact.c b/tc/m_gact.c
index dc04b9f..755a3be 100644
--- a/tc/m_gact.c
+++ b/tc/m_gact.c
@@ -224,7 +224,8 @@ print_gact(struct action_util *au, FILE * f, struct rtattr *arg)
 	fprintf(f, "\n\t random type %s %s val %d",
 		prob_n2a(pp->ptype), action_n2a(pp->paction), pp->pval);
 #endif
-	fprintf(f, "\n\t index %d ref %d bind %d", p->index, p->refcnt, p->bindcnt);
+	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
+		p->bindcnt);
 	if (show_stats) {
 		if (tb[TCA_GACT_TM]) {
 			struct tcf_t *tm = RTA_DATA(tb[TCA_GACT_TM]);
diff --git a/tc/m_ife.c b/tc/m_ife.c
index e6f6153..f6131b1 100644
--- a/tc/m_ife.c
+++ b/tc/m_ife.c
@@ -312,7 +312,7 @@ static int print_ife(struct action_util *au, FILE *f, struct rtattr *arg)
 				    sizeof(b2)));
 	}
 
-	fprintf(f, "\n\t index %d ref %d bind %d", p->index, p->refcnt,
+	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
 		p->bindcnt);
 	if (show_stats) {
 		if (tb[TCA_IFE_TM]) {
diff --git a/tc/m_ipt.c b/tc/m_ipt.c
index d6f62bd..1b935ec 100644
--- a/tc/m_ipt.c
+++ b/tc/m_ipt.c
@@ -489,7 +489,7 @@ print_ipt(struct action_util *au, FILE * f, struct rtattr *arg)
 			__u32 index;
 
 			index = rta_getattr_u32(tb[TCA_IPT_INDEX]);
-			fprintf(f, "\n\tindex %d", index);
+			fprintf(f, "\n\tindex %u", index);
 		}
 
 		if (tb[TCA_IPT_CNT]) {
diff --git a/tc/m_mirred.c b/tc/m_mirred.c
index 11f4c9b..35ae21f 100644
--- a/tc/m_mirred.c
+++ b/tc/m_mirred.c
@@ -260,7 +260,8 @@ print_mirred(struct action_util *au, FILE * f, struct rtattr *arg)
 		mirred_n2a(p->eaction), dev, action_n2a(p->action));
 
 	fprintf(f, "\n ");
-	fprintf(f, "\tindex %d ref %d bind %d", p->index, p->refcnt, p->bindcnt);
+	fprintf(f, "\tindex %u ref %d bind %d", p->index, p->refcnt,
+		p->bindcnt);
 
 	if (show_stats) {
 		if (tb[TCA_MIRRED_TM]) {
diff --git a/tc/m_pedit.c b/tc/m_pedit.c
index 891c2ec..8e9bf07 100644
--- a/tc/m_pedit.c
+++ b/tc/m_pedit.c
@@ -527,7 +527,7 @@ int print_pedit(struct action_util *au, FILE *f, struct rtattr *arg)
 
 	fprintf(f, " pedit action %s keys %d\n ",
 		action_n2a(sel->action), sel->nkeys);
-	fprintf(f, "\t index %d ref %d bind %d", sel->index, sel->refcnt,
+	fprintf(f, "\t index %u ref %d bind %d", sel->index, sel->refcnt,
 		sel->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_simple.c b/tc/m_simple.c
index 732eaf1..3a8bd91 100644
--- a/tc/m_simple.c
+++ b/tc/m_simple.c
@@ -187,7 +187,7 @@ static int print_simple(struct action_util *au, FILE *f, struct rtattr *arg)
 	simpdata = RTA_DATA(tb[TCA_DEF_DATA]);
 
 	fprintf(f, "Simple <%s>\n", simpdata);
-	fprintf(f, "\t index %d ref %d bind %d", sel->index,
+	fprintf(f, "\t index %u ref %d bind %d", sel->index,
 		sel->refcnt, sel->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_skbedit.c b/tc/m_skbedit.c
index 368debc..8660d60 100644
--- a/tc/m_skbedit.c
+++ b/tc/m_skbedit.c
@@ -214,7 +214,7 @@ static int print_skbedit(struct action_util *au, FILE *f, struct rtattr *arg)
 			fprintf(f, " ptype %d", *ptype);
 	}
 
-	fprintf(f, "\n\t index %d ref %d bind %d",
+	fprintf(f, "\n\t index %u ref %d bind %d",
 		p->index, p->refcnt, p->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_skbmod.c b/tc/m_skbmod.c
index 0c293fc..acb7771 100644
--- a/tc/m_skbmod.c
+++ b/tc/m_skbmod.c
@@ -237,7 +237,7 @@ static int print_skbmod(struct action_util *au, FILE *f, struct rtattr *arg)
 	if (p->flags & SKBMOD_F_SWAPMAC)
 		fprintf(f, "swap mac ");
 
-	fprintf(f, "\n\t index %d ref %d bind %d", p->index, p->refcnt,
+	fprintf(f, "\n\t index %u ref %d bind %d", p->index, p->refcnt,
 		p->bindcnt);
 	if (show_stats) {
 		if (tb[TCA_SKBMOD_TM]) {
diff --git a/tc/m_vlan.c b/tc/m_vlan.c
index b32f746..44b9375 100644
--- a/tc/m_vlan.c
+++ b/tc/m_vlan.c
@@ -226,7 +226,7 @@ static int print_vlan(struct action_util *au, FILE *f, struct rtattr *arg)
 	}
 	fprintf(f, " %s", action_n2a(parm->action));
 
-	fprintf(f, "\n\t index %d ref %d bind %d", parm->index, parm->refcnt,
+	fprintf(f, "\n\t index %u ref %d bind %d", parm->index, parm->refcnt,
 		parm->bindcnt);
 
 	if (show_stats) {
diff --git a/tc/m_xt.c b/tc/m_xt.c
index 028bad6..dbb5498 100644
--- a/tc/m_xt.c
+++ b/tc/m_xt.c
@@ -372,7 +372,7 @@ print_ipt(struct action_util *au, FILE *f, struct rtattr *arg)
 		__u32 index;
 
 		index = rta_getattr_u32(tb[TCA_IPT_INDEX]);
-		fprintf(f, "\n\tindex %d", index);
+		fprintf(f, "\n\tindex %u", index);
 	}
 
 	if (tb[TCA_IPT_CNT]) {
diff --git a/tc/m_xt_old.c b/tc/m_xt_old.c
index 20a6342..e9cc624 100644
--- a/tc/m_xt_old.c
+++ b/tc/m_xt_old.c
@@ -412,7 +412,7 @@ print_ipt(struct action_util *au, FILE * f, struct rtattr *arg)
 			__u32 index;
 
 			index = rta_getattr_u32(tb[TCA_IPT_INDEX]);
-			fprintf(f, "\n\tindex %d", index);
+			fprintf(f, "\n\tindex %u", index);
 		}
 
 		if (tb[TCA_IPT_CNT]) {
-- 
1.9.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox