Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH RFC] net: introduce support for lazy initialization of secret keys
From: Hannes Frederic Sowa @ 2013-09-26  3:03 UTC (permalink / raw)
  To: Eric Dumazet, netdev, fw, edumazet, davem, ycheng
In-Reply-To: <20130926025024.GC30920@order.stressinduktion.org>

On Thu, Sep 26, 2013 at 04:50:24AM +0200, Hannes Frederic Sowa wrote:
>  static inline unsigned int inet6_ehashfn(struct net *net,
>  				const struct in6_addr *laddr, const u16 lport,
>  				const struct in6_addr *faddr, const __be16 fport)
>  {
> -	u32 ports = (((u32)lport) << 16) | (__force u32)fport;
> +	u32 ports;
> +
> +	net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));

I was too quick with sending, this is broken. I cannot call this in a
static inline. Sorry for the noise.

^ permalink raw reply

* Re: [PATCH net-next v5 2/6] bonding: remove the no effect lock for bond_3ad_lacpdu_recv()
From: Ding Tianhong @ 2013-09-26  2:51 UTC (permalink / raw)
  To: Veaceslav Falico
  Cc: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Netdev
In-Reply-To: <20130925103300.GB23575@redhat.com>

On 2013/9/25 18:33, Veaceslav Falico wrote:
> On Wed, Sep 25, 2013 at 05:52:19PM +0800, Ding Tianhong wrote:
>> There is no pointer needed read lock protection, remove the unnecessary lock
>> and improve performance for the 3ad recv path.
> 
> I don't really understand it. Here's the code path:
> 
> rx_handler (holding rcu_read_lock()) -> bond_handle_frame() ->
> bond->recv_probe -> bond_3ad_lacpdu_recv(). So we're holding only the
> rcu_read_lock() there. What stops us from racing with
> bond_3ad_unbind_slave(), for example?
> 
> As in:
> 
> CPU0                CPU1
> --------            -----------
> ...                bond_3ad_unbind_slave()
> bond_3ad_rx_indication()    ...
> if (!port->slave) {        ...            //slave is ok
>                 port->slave = NULL;
> ad_marker_info_received()    ...
> ad_marker_send()        ...
> slave = port->slave;        ...
> skb->dev = slave->dev;        ...
>        ^^^ NULL pointer dereference.
> 
> 
> I'm not saying that this approach is wrong, maybe I'm missing something,
> but when removing locks it's usually a good thing to do - to comment it in
> depth in the commit message why it's not already needed.
> 

no, it will not happend, pls review the cold:
	netdev_rx_handler_unregister(slave_dev);
	write_lock_bh(&bond->lock);

	/* Inform AD package of unbinding of slave. */
	if (bond->params.mode == BOND_MODE_8023AD) {
		/* must be called before the slave is
		 * detached from the list
		 */
		bond_3ad_unbind_slave(slave);
	}
netdev_rx_handler_unregiste() will remvoe the rx_handle before the bond_3ad_unbind_slave(),
it was safe to run bond_3ad_rx_indication(). 

Best regards
Ding Tianhong


>>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> Cc: Nikolay Aleksandrov <nikolay@redhat.com>
>> ---
>> drivers/net/bonding/bond_3ad.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index 7a3860f..c134f43 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -2494,9 +2494,7 @@ int bond_3ad_lacpdu_recv(const struct sk_buff *skb, struct bonding *bond,
>>     if (!lacpdu)
>>         return ret;
>>
>> -    read_lock(&bond->lock);
>>     ret = bond_3ad_rx_indication(lacpdu, slave, skb->len);
>> -    read_unlock(&bond->lock);
>>     return ret;
>> }
>>
>> -- 
>> 1.8.2.1
>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> .
> 

^ permalink raw reply

* Re: [PATCH RFC] net: introduce support for lazy initialization of secret keys
From: Hannes Frederic Sowa @ 2013-09-26  2:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, fw, edumazet, davem, ycheng
In-Reply-To: <1380148946.3165.166.camel@edumazet-glaptop>

On Wed, Sep 25, 2013 at 03:42:26PM -0700, Eric Dumazet wrote:
> BTW, build_ehash_secret() is called like that :
> 
> if (unlikely(!inet_ehash_secret))
>     if (sock->type != SOCK_RAW && sock->type != SOCK_DGRAM)
>         build_ehash_secret();
> 
> So it would be better to make sure inet_ehash_secret is not 0 by
> accident.

This is the most idiomatic version which does not have this problem. It
does add a very few instructions on every hashing call and splits
inet6_ehash_secret and inet_ehash_secret. I just did this in a hurry
and did not test it well, so this is only a preview if this way would
be acceptable.

I would finish a final version by tomorrow based on the feedback. Thanks! ;)

[PATCH RFC] net: introduce support for lazy initialization of secret keys

net_get_random_once is a new macro which handles the initialization
of secret keys. It is possible to call it in the fast path. Only the
initialization depends on the spinlock and is rather slow. Otherwise
it should get used just before the key is used to delay the entropy
extration as late as possible to get better randomness. It returns true
if the key got initialized.

This patch splits the secret key for syncookies between ipv4 and ipv6
and initializes them with net_get_random_once.

Changed key initialization of tcp_fastopen cookies to net_get_random_once.

Also initialize the secret keys for tcp connection hashing via
net_get_random_once and split inet_ehash_secret and inet6_ehash_secret.
Thus inet_hash_secret need not be exported any more.

Cc: Florian Westphal <fw@strlen.de>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/linux/net.h            | 14 ++++++++++++++
 include/net/inet6_hashtables.h |  9 +++++++--
 include/net/inet_sock.h        |  4 ++--
 include/net/ipv6.h             |  7 ++++++-
 include/net/tcp.h              |  3 +--
 net/core/secure_seq.c          | 14 ++------------
 net/core/utils.c               | 21 +++++++++++++++++++++
 net/ipv4/af_inet.c             | 25 -------------------------
 net/ipv4/syncookies.c          | 16 ++++++----------
 net/ipv4/sysctl_net_ipv4.c     |  5 +++++
 net/ipv4/tcp_fastopen.c        | 21 ++++++++++-----------
 net/ipv6/af_inet6.c            |  5 -----
 net/ipv6/inet6_hashtables.c    |  3 +++
 net/ipv6/syncookies.c          | 10 ++++++++--
 14 files changed, 85 insertions(+), 72 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index 4f27575..d14fad5 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -243,6 +243,20 @@ do {								\
 #define net_random()		prandom_u32()
 #define net_srandom(seed)	prandom_seed((__force u32)(seed))
 
+bool __net_get_random_once(void *buf, int nbytes, bool *done);
+
+/* BE CAREFUL: this function is not interrupt safe */
+#define net_get_random_once(buf, nbytes)				\
+	({								\
+		static bool ___done = false;				\
+		bool ___ret = false;					\
+		if (unlikely(!___done))					\
+			___ret = __net_get_random_once(buf,		\
+						       nbytes,		\
+						       &___done);	\
+		___ret;							\
+	})
+
 extern int   	     kernel_sendmsg(struct socket *sock, struct msghdr *msg,
 				    struct kvec *vec, size_t num, size_t len);
 extern int   	     kernel_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h
index f52fa88..8b66189 100644
--- a/include/net/inet6_hashtables.h
+++ b/include/net/inet6_hashtables.h
@@ -28,16 +28,21 @@
 
 struct inet_hashinfo;
 
+extern u32 inet6_ehash_secret;
+
 static inline unsigned int inet6_ehashfn(struct net *net,
 				const struct in6_addr *laddr, const u16 lport,
 				const struct in6_addr *faddr, const __be16 fport)
 {
-	u32 ports = (((u32)lport) << 16) | (__force u32)fport;
+	u32 ports;
+
+	net_get_random_once(&inet6_ehash_secret, sizeof(inet6_ehash_secret));
 
+	ports = (((u32)lport) << 16) | (__force u32)fport;
 	return jhash_3words((__force u32)laddr->s6_addr32[3],
 			    ipv6_addr_jhash(faddr),
 			    ports,
-			    inet_ehash_secret + net_hash_mix(net));
+			    inet6_ehash_secret + net_hash_mix(net));
 }
 
 static inline int inet6_sk_ehashfn(const struct sock *sk)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 636d203..d5a5353 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -202,13 +202,13 @@ static inline void inet_sk_copy_descendant(struct sock *sk_to,
 int inet_sk_rebuild_header(struct sock *sk);
 
 extern u32 inet_ehash_secret;
-extern u32 ipv6_hash_secret;
-void build_ehash_secret(void);
 
 static inline unsigned int inet_ehashfn(struct net *net,
 					const __be32 laddr, const __u16 lport,
 					const __be32 faddr, const __be16 fport)
 {
+	net_get_random_once(&inet_ehash_secret, sizeof(inet_ehash_secret));
+
 	return jhash_3words((__force __u32) laddr,
 			    (__force __u32) faddr,
 			    ((__u32) lport) << 16 | (__force __u32)fport,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index fe1c7f6..c683020 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -538,11 +538,16 @@ static inline u32 ipv6_addr_hash(const struct in6_addr *a)
 #endif
 }
 
+extern u32 ipv6_hash_secret;
+
 /* more secured version of ipv6_addr_hash() */
 static inline u32 ipv6_addr_jhash(const struct in6_addr *a)
 {
-	u32 v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
+	u32 v;
+
+	net_get_random_once(&ipv6_hash_secret, sizeof(ipv6_hash_secret));
 
+	v = (__force u32)a->s6_addr32[0] ^ (__force u32)a->s6_addr32[1];
 	return jhash_3words(v,
 			    (__force u32)a->s6_addr32[2],
 			    (__force u32)a->s6_addr32[3],
diff --git a/include/net/tcp.h b/include/net/tcp.h
index de870ee..2a26100 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -475,7 +475,6 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size);
 void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 
 /* From syncookies.c */
-extern __u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
 int __cookie_v4_check(const struct iphdr *iph, const struct tcphdr *th,
 		      u32 cookie);
 struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb,
@@ -1323,7 +1322,7 @@ extern struct tcp_fastopen_context __rcu *tcp_fastopen_ctx;
 int tcp_fastopen_reset_cipher(void *key, unsigned int len);
 void tcp_fastopen_cookie_gen(__be32 src, __be32 dst,
 			     struct tcp_fastopen_cookie *foc);
-
+void tcp_fastopen_init_key_once(bool publish);
 #define TCP_FASTOPEN_KEY_LENGTH 16
 
 /* Fastopen key context */
diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 3f1ec15..b02fd16 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -7,6 +7,7 @@
 #include <linux/hrtimer.h>
 #include <linux/ktime.h>
 #include <linux/string.h>
+#include <linux/net.h>
 
 #include <net/secure_seq.h>
 
@@ -16,18 +17,7 @@ static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
 
 static void net_secret_init(void)
 {
-	u32 tmp;
-	int i;
-
-	if (likely(net_secret[0]))
-		return;
-
-	for (i = NET_SECRET_SIZE; i > 0;) {
-		do {
-			get_random_bytes(&tmp, sizeof(tmp));
-		} while (!tmp);
-		cmpxchg(&net_secret[--i], 0, tmp);
-	}
+	net_get_random_once(net_secret, sizeof(net_secret));
 }
 
 #ifdef CONFIG_INET
diff --git a/net/core/utils.c b/net/core/utils.c
index aa88e23..b420547 100644
--- a/net/core/utils.c
+++ b/net/core/utils.c
@@ -338,3 +338,24 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
 				  csum_unfold(*sum)));
 }
 EXPORT_SYMBOL(inet_proto_csum_replace16);
+
+bool __net_get_random_once(void *buf, int nbytes, bool *done)
+{
+	static DEFINE_SPINLOCK(lock);
+
+	spin_lock_bh(&lock);
+	if (*done) {
+		spin_unlock_bh(&lock);
+		return false;
+	}
+
+	get_random_bytes(buf, nbytes);
+	/* Make sure random data is published before toggeling done.
+	 * There is no corresponding rmb.
+	 */
+	smp_wmb();
+	*done = true;
+	spin_unlock_bh(&lock);
+	return true;
+}
+EXPORT_SYMBOL(__net_get_random_once);
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index cfeb85c..49a5c48 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -246,27 +246,6 @@ out:
 EXPORT_SYMBOL(inet_listen);
 
 u32 inet_ehash_secret __read_mostly;
-EXPORT_SYMBOL(inet_ehash_secret);
-
-u32 ipv6_hash_secret __read_mostly;
-EXPORT_SYMBOL(ipv6_hash_secret);
-
-/*
- * inet_ehash_secret must be set exactly once, and to a non nul value
- * ipv6_hash_secret must be set exactly once.
- */
-void build_ehash_secret(void)
-{
-	u32 rnd;
-
-	do {
-		get_random_bytes(&rnd, sizeof(rnd));
-	} while (rnd == 0);
-
-	if (cmpxchg(&inet_ehash_secret, 0, rnd) == 0)
-		get_random_bytes(&ipv6_hash_secret, sizeof(ipv6_hash_secret));
-}
-EXPORT_SYMBOL(build_ehash_secret);
 
 /*
  *	Create an inet socket.
@@ -284,10 +263,6 @@ static int inet_create(struct net *net, struct socket *sock, int protocol,
 	int try_loading_module = 0;
 	int err;
 
-	if (unlikely(!inet_ehash_secret))
-		if (sock->type != SOCK_RAW && sock->type != SOCK_DGRAM)
-			build_ehash_secret();
-
 	sock->state = SS_UNCONNECTED;
 
 	/* Look for the requested type/protocol pair. */
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 15e0241..af79b84 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -25,15 +25,7 @@
 
 extern int sysctl_tcp_syncookies;
 
-__u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
-EXPORT_SYMBOL(syncookie_secret);
-
-static __init int init_syncookies(void)
-{
-	get_random_bytes(syncookie_secret, sizeof(syncookie_secret));
-	return 0;
-}
-__initcall(init_syncookies);
+static u32 syncookie_secret[2][16-4+SHA_DIGEST_WORDS];
 
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
@@ -44,7 +36,11 @@ static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS],
 static u32 cookie_hash(__be32 saddr, __be32 daddr, __be16 sport, __be16 dport,
 		       u32 count, int c)
 {
-	__u32 *tmp = __get_cpu_var(ipv4_cookie_scratch);
+	__u32 *tmp;
+
+	net_get_random_once(syncookie_secret, sizeof(syncookie_secret));
+
+	tmp = __get_cpu_var(ipv4_cookie_scratch);
 
 	memcpy(tmp + 4, syncookie_secret[c], sizeof(syncookie_secret[c]));
 	tmp[0] = (__force u32)saddr;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 540279f..d2b5140 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -267,6 +267,11 @@ static int proc_tcp_fastopen_key(struct ctl_table *ctl, int write,
 			ret = -EINVAL;
 			goto bad_key;
 		}
+		/* Generate a dummy secret but don't publish it. This
+		 * is needed so we don't regenerate a new key on the
+		 * first invocation of tcp_fastopen_cookie_gen
+		 */
+		tcp_fastopen_init_key_once(false);
 		tcp_fastopen_reset_cipher(user_key, TCP_FASTOPEN_KEY_LENGTH);
 	}
 
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index ab7bd35..316bfdc 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -14,6 +14,14 @@ struct tcp_fastopen_context __rcu *tcp_fastopen_ctx;
 
 static DEFINE_SPINLOCK(tcp_fastopen_ctx_lock);
 
+void tcp_fastopen_init_key_once(bool publish)
+{
+	static u8 key[TCP_FASTOPEN_KEY_LENGTH];
+
+	if (net_get_random_once(key, sizeof(key)) && publish)
+		tcp_fastopen_reset_cipher(key, sizeof(key));
+}
+
 static void tcp_fastopen_ctx_free(struct rcu_head *head)
 {
 	struct tcp_fastopen_context *ctx =
@@ -70,6 +78,8 @@ void tcp_fastopen_cookie_gen(__be32 src, __be32 dst,
 	__be32 path[4] = { src, dst, 0, 0 };
 	struct tcp_fastopen_context *ctx;
 
+	tcp_fastopen_init_key_once(true);
+
 	rcu_read_lock();
 	ctx = rcu_dereference(tcp_fastopen_ctx);
 	if (ctx) {
@@ -78,14 +88,3 @@ void tcp_fastopen_cookie_gen(__be32 src, __be32 dst,
 	}
 	rcu_read_unlock();
 }
-
-static int __init tcp_fastopen_init(void)
-{
-	__u8 key[TCP_FASTOPEN_KEY_LENGTH];
-
-	get_random_bytes(key, sizeof(key));
-	tcp_fastopen_reset_cipher(key, sizeof(key));
-	return 0;
-}
-
-late_initcall(tcp_fastopen_init);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 4966b12..5bd9b25 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -110,11 +110,6 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 	int try_loading_module = 0;
 	int err;
 
-	if (sock->type != SOCK_RAW &&
-	    sock->type != SOCK_DGRAM &&
-	    !inet_ehash_secret)
-		build_ehash_secret();
-
 	/* Look for the requested type/protocol pair. */
 lookup_protocol:
 	err = -ESOCKTNOSUPPORT;
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index 32b4a16..4015720 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -23,6 +23,9 @@
 #include <net/secure_seq.h>
 #include <net/ip.h>
 
+u32 inet6_ehash_secret __read_mostly;
+u32 ipv6_hash_secret __read_mostly;
+
 int __inet6_hash(struct sock *sk, struct inet_timewait_sock *tw)
 {
 	struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo;
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index d703218..a454cef 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -21,6 +21,8 @@
 #include <net/ipv6.h>
 #include <net/tcp.h>
 
+static u32 syncookie6_secret[2][16-4+SHA_DIGEST_WORDS];
+
 #define COOKIEBITS 24	/* Upper bits store count */
 #define COOKIEMASK (((__u32)1 << COOKIEBITS) - 1)
 
@@ -61,14 +63,18 @@ static DEFINE_PER_CPU(__u32 [16 + 5 + SHA_WORKSPACE_WORDS],
 static u32 cookie_hash(const struct in6_addr *saddr, const struct in6_addr *daddr,
 		       __be16 sport, __be16 dport, u32 count, int c)
 {
-	__u32 *tmp = __get_cpu_var(ipv6_cookie_scratch);
+	__u32 *tmp;
+
+	net_get_random_once(syncookie6_secret, sizeof(syncookie6_secret));
+
+	tmp = __get_cpu_var(ipv6_cookie_scratch);
 
 	/*
 	 * we have 320 bits of information to hash, copy in the remaining
 	 * 192 bits required for sha_transform, from the syncookie_secret
 	 * and overwrite the digest with the secret
 	 */
-	memcpy(tmp + 10, syncookie_secret[c], 44);
+	memcpy(tmp + 10, syncookie6_secret[c], 44);
 	memcpy(tmp, saddr, 16);
 	memcpy(tmp + 4, daddr, 16);
 	tmp[8] = ((__force u32)sport << 16) + (__force u32)dport;
-- 
1.8.3.1

^ permalink raw reply related

* Re: Kernel Panic Sending Frames Using dev_queue_xmit()
From: Merlin Davis @ 2013-09-26  2:47 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Ben Hutchings, netdev
In-Reply-To: <20130925220351.GA4653@electric-eye.fr.zoreil.com>

On Wed, Sep 25, 2013 at 3:03 PM, Francois Romieu <romieu@fr.zoreil.com> wrote:
> Macro expansion bug.

Yes!  Excellent.  If I change the code to:

   err = dev_queue_xmit(skb);
   err = net_xmit_eval(err);

then the panic seems to disappear (I've run for at least 30 minutes
now without a crash anyway).  Would the best fix for this in the
kernel code be to:

A.) document the net_xmit_eval() macro with a warning,
B.) add local scope the the macro with ({ ... }) to remove the double
evaluation, or
C.) do nothing in the kernel code and stop being such a dumbass when calling it?

Thank you,
Merlin

^ permalink raw reply

* Re: [PATCH] moxa: drop free_irq of devm_request_irq allocated irq
From: Jingoo Han @ 2013-09-26  2:36 UTC (permalink / raw)
  To: 'Wei Yongjun'
  Cc: 'Jonas Jensen', 'David S. Miller',
	'Grant Likely', rob.herring, yongjun_wei, netdev,
	'Sachin Kamat', 'Jingoo Han'
In-Reply-To: <CAPgLHd9VXaPx18ujDKbUik6XjOsnARcK-C-orCOynyJaWFREjw@mail.gmail.com>

On Thursday, September 26, 2013 11:13 AM, Wei Yongjun wrote:
> On 09/26/2013 08:47 AM, Jingoo Han wrote:
> > On Wednesday, September 25, 2013 4:33 PM, Wei Yongjun wrote:
> >> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> >>
> >> irq allocated with devm_request_irq should not be freed using
> >> free_irq, because doing so causes a dangling pointer, and a
> >> subsequent double free.
> >>
> >> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> >> ---
> >>  drivers/net/ethernet/moxa/moxart_ether.c | 1 -
> >>  1 file changed, 1 deletion(-)
> >>
> >> diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
> >> index 83c2091..9a7fcb5 100644
> >> --- a/drivers/net/ethernet/moxa/moxart_ether.c
> >> +++ b/drivers/net/ethernet/moxa/moxart_ether.c
> >> @@ -531,7 +531,6 @@ static int moxart_remove(struct platform_device *pdev)
> >>  	struct net_device *ndev = platform_get_drvdata(pdev);
> >>
> >>  	unregister_netdev(ndev);
> >> -	free_irq(ndev->irq, ndev);
> >>  	moxart_mac_free_memory(ndev);
> >>  	free_netdev(ndev);
> > CC'ed Sachin Kamat,
> >
> > In this case, the free_irq() will be called, after calling
> > free_netdev(). 'ndev' is freed by free_netdev(). Then, 'ndev->irq'
> > is used by free_irq(). Is it right?
> >
> > In my humble opinion, it seems to make the problem.
> >
> 
> devm_request_irq() has recorded the irq and dev_id, so free_irq() by devm_*
> will not touch 'ndev' which has been freed by free_netdev().
> So, if we not need to call free_irq() before free_netdev(), there will be
> no problem.

However, 'dev_id' is a pointer, not a value.
It seems to make the problem that references the invalid pointer.

Best regards,
Jingoo Han

^ permalink raw reply

* Re: [PATCH net-next v5 1/6] bonding: simplify and use RCU protection for 3ad xmit path
From: Ding Tianhong @ 2013-09-26  2:33 UTC (permalink / raw)
  To: Veaceslav Falico
  Cc: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Netdev
In-Reply-To: <20130925105000.GC23575@redhat.com>

On 2013/9/25 18:50, Veaceslav Falico wrote:
> On Wed, Sep 25, 2013 at 05:52:15PM +0800, Ding Tianhong wrote:
>> The commit 278b20837511776dc9d5f6ee1c7fabd5479838bb
>> (bonding: initial RCU conversion) has convert the roundrobin, active-backup,
>> broadcast and xor xmit path to rcu protection, the performance will be better
>> for these mode, so this time, convert xmit path for 3ad mode.
>>
>> Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
>> Suggested-by: Veaceslav Falico <vfalico@redhat.com>
>> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
>> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
>> Cc: Nikolay Aleksandrov <nikolay@redhat.com>
>> Cc: Veaceslav Falico <vfalico@redhat.com>
>> ---
>> drivers/net/bonding/bond_3ad.c | 32 ++++++++++++++------------------
>> drivers/net/bonding/bonding.h  | 30 +++++++++++++++++++++++++++++-
>> 2 files changed, 43 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
>> index 0d8f427..13f1deb 100644
>> --- a/drivers/net/bonding/bond_3ad.c
>> +++ b/drivers/net/bonding/bond_3ad.c
>> @@ -143,7 +143,7 @@ static inline struct bonding *__get_bond_by_port(struct port *port)
>>  */
>> static inline struct port *__get_first_port(struct bonding *bond)
>> {
>> -    struct slave *first_slave = bond_first_slave(bond);
>> +    struct slave *first_slave = bond_first_slave_rcu(bond);
>>
>>     return first_slave ? &(SLAVE_AD_INFO(first_slave).port) : NULL;
>> }
>> @@ -163,7 +163,7 @@ static inline struct port *__get_next_port(struct port *port)
>>     // If there's no bond for this port, or this is the last slave
>>     if (bond == NULL)
>>         return NULL;
>> -    slave_next = bond_next_slave(bond, slave);
>> +    slave_next = bond_next_slave_rcu(bond, slave);
>>     if (!slave_next || bond_is_first_slave(bond, slave_next))
>>         return NULL;
>>
>> @@ -2417,16 +2417,14 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
>>
>> int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>> {
>> -    struct slave *slave, *start_at;
>>     struct bonding *bond = netdev_priv(dev);
>> +    struct slave *slave;
>>     int slave_agg_no;
>>     int slaves_in_agg;
>>     int agg_id;
>> -    int i;
>>     struct ad_info ad_info;
>>     int res = 1;
>>
>> -    read_lock(&bond->lock);
>>     if (__bond_3ad_get_active_agg_info(bond, &ad_info)) {
>>         pr_debug("%s: Error: __bond_3ad_get_active_agg_info failed\n",
>>              dev->name);
>> @@ -2444,13 +2442,17 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>>
>>     slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
>>
>> -    bond_for_each_slave(bond, slave) {
>> +    bond_for_each_slave_rcu(bond, slave) {
>>         struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
>>
>>         if (agg && (agg->aggregator_identifier == agg_id)) {
>> -            slave_agg_no--;
>> -            if (slave_agg_no < 0)
>> -                break;
>> +            if (--slave_agg_no < 0) {
>> +                if (SLAVE_IS_OK(slave)) {
>> +                    res = bond_dev_queue_xmit(bond,
>> +                        skb, slave->dev);
>> +                    goto out;
>> +                }
>> +            }
>>         }
>>     }
>>
>> @@ -2460,23 +2462,17 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>>         goto out;
>>     }
>>
>> -    start_at = slave;
>> -
>> -    bond_for_each_slave_from(bond, slave, i, start_at) {
>> -        int slave_agg_id = 0;
>> +    bond_for_each_slave_rcu(bond, slave) {
>>         struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
>>
>> -        if (agg)
>> -            slave_agg_id = agg->aggregator_identifier;
>> -
>> -        if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
>> +        if (SLAVE_IS_OK(slave) && agg &&
>> +            agg->aggregator_identifier == agg_id) {
>>             res = bond_dev_queue_xmit(bond, skb, slave->dev);
>>             break;
>>         }
>>     }
>>
>> out:
>> -    read_unlock(&bond->lock);
>>     if (res) {
>>         /* no suitable interface, frame not sent */
>>         kfree_skb(skb);
>> diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
>> index 03cf3fd..eb36f57 100644
>> --- a/drivers/net/bonding/bonding.h
>> +++ b/drivers/net/bonding/bonding.h
>> @@ -74,13 +74,31 @@
>> /* slave list primitives */
>> #define bond_to_slave(ptr) list_entry(ptr, struct slave, list)
>>
>> +/* slave list primitives, Caller must hold rcu_read_lock */
>> +#define bond_to_slave_rcu(ptr) list_entry_rcu(ptr, struct slave, list)
>> +
>> +/* bond_is_empty return NULL if slave list is empty*/
>> +#define bond_is_empty(bond) \
>> +    (list_empty(&(bond)->slave_list))
>> +
>> +/* bond_is_empty_rcu return NULL if slave list is empty*/
>> +#define bond_is_empty_rcu(bond) \
>> +    (!list_first_or_null_rcu(&(bond)->slave_list, struct slave, list))
> 
> Useless/dangerous function. It's not used in this patch and there can be no
> constructs like:
> 
> 1 if (!bond_is_empty_rcu(bond)) {
> 2     slave = bond_first_slave_rcu(bond);
> 3     do_something(slave);
> 

agree, forgot to remove it.

> because between 1) and 2) the slave can go away, and we'll end up with
> slave == NULL.
> 
>> +
>> /* IMPORTANT: bond_first/last_slave can return NULL in case of an empty list */
>> #define bond_first_slave(bond) \
>>     list_first_entry_or_null(&(bond)->slave_list, struct slave, list)
>> #define bond_last_slave(bond) \
>> -    (list_empty(&(bond)->slave_list) ? NULL : \
>> +    (bond_is_empty(bond) ? NULL : \
>>                        bond_to_slave((bond)->slave_list.prev))
>>
>> +/**
>> + * IMPORTANT: bond_first/last_slave_rcu can return NULL in case of an empty list
>> + * Caller must hold rcu_read_lock
>> + */
>> +#define bond_first_slave_rcu(bond) \
>> +    list_first_or_null_rcu(&(bond)->slave_list, struct slave, list);
> 
> bond_first_slave_or_null_rcu() ?
> 

both is suitable.

>> +
>> #define bond_is_first_slave(bond, pos) ((pos)->list.prev == &(bond)->slave_list)
>> #define bond_is_last_slave(bond, pos) ((pos)->list.next == &(bond)->slave_list)
>>
>> @@ -93,6 +111,16 @@
>>     (bond_is_first_slave(bond, pos) ? bond_last_slave(bond) : \
>>                       bond_to_slave((pos)->list.prev))
>>
>> +/* Since bond_first/last_slave_rcu can return NULL, these can return NULL too */
>> +#define bond_next_slave_rcu(bond, pos) \
>> +    ({struct list_head *__slave_list = &(bond)->slave_list; \
>> +     struct list_head __rcu *__next = list_next_rcu(__slave_list); \
>> +     struct list_head __rcu *__pos_next = list_next_rcu(&(pos)->list); \
>> +     likely(__pos_next != __slave_list) ? \
>> +     container_of(__pos_next, struct slave, list) : \
>> +     container_of(__next, struct slave, list); \
>> +     })
> 
> And if bond has no slaves, whilst pos being the slave that has just
> detached? Then bond->slave_list->next == pos->list->next ==
> &bond->slave_list, and we'll return the container_of(bond->slave_list),
> which is garbage, but not NULL.
> 
>> +
>> /**
>>  * bond_for_each_slave_from - iterate the slaves list from a starting point
>>  * @bond:    the bond holding this list.
>> -- 
>> 1.8.0
>>
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> .
> 

^ permalink raw reply

* Re: [PATCH v5 net-next 11/27] bonding: rework bond_3ad_xmit_xor() to use bond_for_each_slave() only
From: Ding Tianhong @ 2013-09-26  2:31 UTC (permalink / raw)
  To: Veaceslav Falico; +Cc: netdev, jiri, Jay Vosburgh, Andy Gospodarek
In-Reply-To: <1380093632-1842-12-git-send-email-vfalico@redhat.com>

On 2013/9/25 15:20, Veaceslav Falico wrote:
> Currently, there are two loops - first we find the first slave in an
> aggregator after the xmit_hash_policy() returned number, and after that we
> loop from that slave, over bonding head, and till that slave to find any
> suitable slave to send the packet through.
> 
> Replace it by just one bond_for_each_slave() loop, which first loops
> through the requested number of slaves, saving the first suitable one, and
> after that we've hit the requested number of slaves to skip - search for
> any up slave to send the packet through. If we don't find such kind of
> slave - then just send the packet through the first suitable slave found.
> 
> Logic remains unchainged, and we skip two loops. Also, refactor it a bit
> for readability.
> 
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Veaceslav Falico <vfalico@redhat.com>
> ---
> 
> Notes:
>     v4  -> v5
>     No change.
>     
>     v3  -> v4:
>     No change.
>     
>     v2  -> v3:
>     No change.
>     
>     v1  -> v2:
>     No changes.
>     
>     RFC -> v1:
>     New patch.
> 
>  drivers/net/bonding/bond_3ad.c | 46 ++++++++++++++++++++----------------------
>  1 file changed, 22 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
> index 3847aee..c861ee7 100644
> --- a/drivers/net/bonding/bond_3ad.c
> +++ b/drivers/net/bonding/bond_3ad.c
> @@ -2417,15 +2417,15 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
>  
>  int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>  {
> -	struct slave *slave, *start_at;
>  	struct bonding *bond = netdev_priv(dev);
> +	struct slave *slave, *first_ok_slave;
> +	struct aggregator *agg;
> +	struct ad_info ad_info;
>  	struct list_head *iter;
> -	int slave_agg_no;
>  	int slaves_in_agg;
> -	int agg_id;
> -	int i;
> -	struct ad_info ad_info;
> +	int slave_agg_no;
>  	int res = 1;
> +	int agg_id;
>  
>  	read_lock(&bond->lock);
>  	if (__bond_3ad_get_active_agg_info(bond, &ad_info)) {
> @@ -2438,20 +2438,28 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>  	agg_id = ad_info.aggregator_id;
>  
>  	if (slaves_in_agg == 0) {
> -		/*the aggregator is empty*/
>  		pr_debug("%s: Error: active aggregator is empty\n", dev->name);
>  		goto out;
>  	}
>  
>  	slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
> +	first_ok_slave = NULL;
>  
>  	bond_for_each_slave(bond, slave, iter) {
> -		struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
> +		agg = SLAVE_AD_INFO(slave).port.aggregator;
> +		if (!agg || agg->aggregator_identifier != agg_id)
> +			continue;
>  
> -		if (agg && (agg->aggregator_identifier == agg_id)) {
> +		if (slave_agg_no >= 0) {
> +			if (!first_ok_slave && SLAVE_IS_OK(slave))
> +				first_ok_slave = slave;
>  			slave_agg_no--;
> -			if (slave_agg_no < 0)
> -				break;
> +			continue;
> +		}
> +
> +		if (SLAVE_IS_OK(slave)) {
> +			res = bond_dev_queue_xmit(bond, skb, slave->dev);
> +			goto out;
>  		}

I think you miss something, you will send skb always by the first suitable port,
it could not support load balance.
pls consult my function.
 		if (agg && (agg->aggregator_identifier == agg_id)) {
-			slave_agg_no--;
-			if (slave_agg_no < 0)
-				break;
+			if (--slave_agg_no < 0) {
+				if (SLAVE_IS_OK(slave)) {
+					res = bond_dev_queue_xmit(bond,
+						skb, slave->dev);
+					goto out;
+				}
+			}
 		}
 	}

>  	}
>  
> @@ -2461,20 +2469,10 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
>  		goto out;
>  	}
>  
> -	start_at = slave;
> -
> -	bond_for_each_slave_from(bond, slave, i, start_at) {
> -		int slave_agg_id = 0;
> -		struct aggregator *agg = SLAVE_AD_INFO(slave).port.aggregator;
> -
> -		if (agg)
> -			slave_agg_id = agg->aggregator_identifier;
> -
> -		if (SLAVE_IS_OK(slave) && agg && (slave_agg_id == agg_id)) {
> -			res = bond_dev_queue_xmit(bond, skb, slave->dev);
> -			break;
> -		}
> -	}
> +	/* we couldn't find any suitable slave after the agg_no, so use the
> +	 * first suitable found, if found. */
> +	if (first_ok_slave)
> +		res = bond_dev_queue_xmit(bond, skb, first_ok_slave->dev);
>  
>  out:
>  	read_unlock(&bond->lock);
> 

^ permalink raw reply

* Re: [PATCH] moxa: drop free_irq of devm_request_irq allocated irq
From: Wei Yongjun @ 2013-09-26  2:12 UTC (permalink / raw)
  To: jg1.han
  Cc: jonas.jensen, davem, grant.likely, rob.herring, yongjun_wei,
	netdev, sachin.kamat
In-Reply-To: <000001ceba51$f8b45c10$ea1d1430$%han@samsung.com>

On 09/26/2013 08:47 AM, Jingoo Han wrote:
> On Wednesday, September 25, 2013 4:33 PM, Wei Yongjun wrote:
>> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>>
>> irq allocated with devm_request_irq should not be freed using
>> free_irq, because doing so causes a dangling pointer, and a
>> subsequent double free.
>>
>> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>> ---
>>  drivers/net/ethernet/moxa/moxart_ether.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
>> index 83c2091..9a7fcb5 100644
>> --- a/drivers/net/ethernet/moxa/moxart_ether.c
>> +++ b/drivers/net/ethernet/moxa/moxart_ether.c
>> @@ -531,7 +531,6 @@ static int moxart_remove(struct platform_device *pdev)
>>  	struct net_device *ndev = platform_get_drvdata(pdev);
>>
>>  	unregister_netdev(ndev);
>> -	free_irq(ndev->irq, ndev);
>>  	moxart_mac_free_memory(ndev);
>>  	free_netdev(ndev);
> CC'ed Sachin Kamat,
>
> In this case, the free_irq() will be called, after calling
> free_netdev(). 'ndev' is freed by free_netdev(). Then, 'ndev->irq'
> is used by free_irq(). Is it right?
>
> In my humble opinion, it seems to make the problem.
>

devm_request_irq() has recorded the irq and dev_id, so free_irq() by devm_*
will not touch 'ndev' which has been freed by free_netdev().
So, if we not need to call free_irq() before free_netdev(), there will be
no problem.

^ permalink raw reply

* Re: BUG: MARK in OUTPUT + ip_tunnel causes kernel panic
From: Konstantin Kuzov @ 2013-09-26  2:06 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: netdev
In-Reply-To: <20130925085947.GY7660@secunet.com>

On Wed, 25 Sep 2013 10:59:47 +0200
Steffen Klassert wrote:

 > > > When trying to tunnel traffic originating from the same machine as the
 > > > tunnel endpoint, I am experiencing kernel panics for some types of
 > > > traffic (ICMP and UDP). TCP seems not to be affected by this, at least
 > > > I have not been able to trigger the panic.
 > > > 
 > > > I have one tunnel (without an IP address) and use policy routing to
 > > > steer some traffic through the tunnels.  
 > > [...]  
 > > > An interesting thing is that I have seen different kernel panics being
 > > > triggered. The other one I have seen has RIP pointing to
 > > > e1000_xmit_frame() and the message "protocol 0800 is buggy". However,
 > > > the one I have posted is by far the most common.  
 > > I'm experiencing the same issue on two different machines. It happens on any 
 > > kernel starting from 3.10 when ip_tunnel/ip_tunnel_core were introduced.
 > >   
 > Can you please try the patch below?
 > I've posted the same patch already to netdev in the morning.
 > 
 > Subject: [PATCH net 1/2] ip_tunnel: Fix a memory corruption in ip_tunnel_xmit  
 
Thank you very much. All works fine with that patch applied.

^ permalink raw reply

* Re: +157.8% netperf throughput by "ipv4: raise IP_MAX_MTU to theoretical limit"
From: Fengguang Wu @ 2013-09-26  1:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexey Kuznetsov, Willem de Bruijn, lkp, netdev@vger.kernel.org,
	LKML
In-Reply-To: <20130926012144.GA9931@localhost>

On Thu, Sep 26, 2013 at 09:21:44AM +0800, Fengguang Wu wrote:
> Hi Eric,
> 
> We are glad to find that your below commit brings large increase in
> lo netperf throughput:
> 
>     35596b2796713c6a9dc0      734d2725db879f3f6fcd  
> ------------------------  ------------------------  
>                   761.80      +534.6%      4834.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
>                   168.10     +1317.4%      2382.70  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
>                   169.60      +979.4%      1830.70  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
>                  2154.20      +135.7%      5077.50  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
>                  3559.00        -3.5%      3435.20  lkp-t410/micro/netperf/120s-200%-TCP_STREAM
>                  6812.70      +157.8%     17560.70  TOTAL netperf.Throughput_Mbps
> 
> The side effects are some increased/decreased lock contentions:

This direct view may be more clear. Before patch:

                              class name    con-bounces    contentions   waittime-min   waittime-max waittime-total    acq-bounces   acquis
itions   holdtime-min   holdtime-max holdtime-total
-------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------

                 &(&nf->lru_lock)->rlock:      19017744       19034681           0.15        5884.35  5772892473.69       20428335       20
475976           0.10        1109.59    77448429.38
                 -----------------------
                 &(&nf->lru_lock)->rlock        4905538          [<ffffffff819227b9>] ip_defrag+0xa4f/0xbd3
                 &(&nf->lru_lock)->rlock        5695105          [<ffffffff8195bd6f>] inet_frag_find+0x2c7/0x30d
                 &(&nf->lru_lock)->rlock        5629414          [<ffffffff8195be74>] inet_frag_kill+0xbf/0x117
                 &(&nf->lru_lock)->rlock        2804624          [<ffffffff8195bf29>] inet_frag_evictor+0x5d/0x103
                 -----------------------
                 &(&nf->lru_lock)->rlock        6172104          [<ffffffff8195bd6f>] inet_frag_find+0x2c7/0x30d
                 &(&nf->lru_lock)->rlock        5348696          [<ffffffff8195be74>] inet_frag_kill+0xbf/0x117
                 &(&nf->lru_lock)->rlock        4421308          [<ffffffff819227b9>] ip_defrag+0xa4f/0xbd3
                 &(&nf->lru_lock)->rlock        3092573          [<ffffffff8195bf29>] inet_frag_evictor+0x5d/0x103

...............................................................................................................................................................................................

                      &(&q->lock)->rlock:       2322575        2323896           0.22        5802.58   934469091.38        3041941       13000848           0.10        5902.91  2811690638.18
                      ------------------
                      &(&q->lock)->rlock        2163449          [<ffffffff8195bf7e>] inet_frag_evictor+0xb2/0x103
                      &(&q->lock)->rlock         160447          [<ffffffff81921e91>] ip_defrag+0x127/0xbd3
                      ------------------
                      &(&q->lock)->rlock        2165896          [<ffffffff81921e91>] ip_defrag+0x127/0xbd3
                      &(&q->lock)->rlock         158000          [<ffffffff8195bf7e>] inet_frag_evictor+0xb2/0x103

...............................................................................................................................................................................................

                   &(&zone->lock)->rlock:       1845042        1851805           0.18        4917.52    19475590.18        9003807       10134386           0.13        3747.70     8347088.06
                   ---------------------
                   &(&zone->lock)->rlock         866751          [<ffffffff8116fbbc>] __free_pages_ok.part.47+0x94/0x2a1
                   &(&zone->lock)->rlock         984597          [<ffffffff8116f856>] get_page_from_freelist+0x4a3/0x6e8
                   &(&zone->lock)->rlock            112          [<ffffffff8116fe3b>] free_pcppages_bulk+0x35/0x31a
                   &(&zone->lock)->rlock            116          [<ffffffff8116f72c>] get_page_from_freelist+0x379/0x6e8
                   ---------------------
                   &(&zone->lock)->rlock         918190          [<ffffffff8116fbbc>] __free_pages_ok.part.47+0x94/0x2a1
                   &(&zone->lock)->rlock            722          [<ffffffff8116f72c>] get_page_from_freelist+0x379/0x6e8
                   &(&zone->lock)->rlock            861          [<ffffffff8116fe3b>] free_pcppages_bulk+0x35/0x31a
                   &(&zone->lock)->rlock         922607          [<ffffffff8116f856>] get_page_from_freelist+0x4a3/0x6e8


After patch, top contented locks become:

                   &(&zone->lock)->rlock:      58469530       58470181           0.16        4838.84   238618042.87      107374530      107408478           0.13        3610.05    73617127.93
                   ---------------------
                   &(&zone->lock)->rlock       29783268          [<ffffffff8116f856>] get_page_from_freelist+0x4a3/0x6e8
                   &(&zone->lock)->rlock            837          [<ffffffff8116f72c>] get_page_from_freelist+0x379/0x6e8
                   &(&zone->lock)->rlock           1105          [<ffffffff8116fe3b>] free_pcppages_bulk+0x35/0x31a
                   &(&zone->lock)->rlock       28684627          [<ffffffff8116fbbc>] __free_pages_ok.part.47+0x94/0x2a1
                   ---------------------
                   &(&zone->lock)->rlock          11356          [<ffffffff8116fe3b>] free_pcppages_bulk+0x35/0x31a
                   &(&zone->lock)->rlock           6741          [<ffffffff8116f72c>] get_page_from_freelist+0x379/0x6e8
                   &(&zone->lock)->rlock       28880589          [<ffffffff8116f856>] get_page_from_freelist+0x4a3/0x6e8
                   &(&zone->lock)->rlock       29558251          [<ffffffff8116fbbc>] __free_pages_ok.part.47+0x94/0x2a1

...............................................................................................................................................................................................

                           slock-AF_INET:        507780         508036           0.20        1167.78     2564695.48       11115246      106594271           0.12        1196.01   989718694.82
                           -------------
                           slock-AF_INET         434691          [<ffffffff818ed738>] lock_sock_fast+0x2f/0x84
                           slock-AF_INET          73294          [<ffffffff819482dc>] udp_queue_rcv_skb+0x1ba/0x3aa
                           slock-AF_INET             51          [<ffffffff818ed6b5>] lock_sock_nested+0x34/0x88
                           -------------
                           slock-AF_INET         434615          [<ffffffff819482dc>] udp_queue_rcv_skb+0x1ba/0x3aa
                           slock-AF_INET          73370          [<ffffffff818ed738>] lock_sock_fast+0x2f/0x84
                           slock-AF_INET             51          [<ffffffff8193fae1>] tcp_v4_rcv+0x390/0x978

..............................................................................................................................................................................................

                               &rq->lock:        286309         286456           0.21         294.85     1768779.90        5887506      244517912           0.09        1080.71   315600465.71
                               ---------
                               &rq->lock          92057          [<ffffffff81a0aa65>] __schedule+0x103/0x852
                               &rq->lock          18386          [<ffffffff810ecc02>] try_to_wake_up+0x95/0x26c
                               &rq->lock            730          [<ffffffff810f13fb>] update_blocked_averages+0x30/0x47f
                               &rq->lock            304          [<ffffffff81a0af43>] __schedule+0x5e1/0x852
                               ---------
                               &rq->lock         107807          [<ffffffff810ecd7b>] try_to_wake_up+0x20e/0x26c
                               &rq->lock         144391          [<ffffffff81a0aa65>] __schedule+0x103/0x852
                               &rq->lock            924          [<ffffffff810ecc02>] try_to_wake_up+0x95/0x26c
                               &rq->lock             29          [<ffffffff810e9090>] task_rq_lock+0x4b/0x85

Thanks,
Fengguang

^ permalink raw reply

* Re: Device tree node for Freescale Gianfar PTP reference clock source selection
From: Scott Wood @ 2013-09-26  1:30 UTC (permalink / raw)
  To: Aida Mynzhasova
  Cc: netdev, Richard Cochran, linuxppc-dev, Claudiu Manoil, devicetree
In-Reply-To: <523FEFCD.70409@skitlab.ru>

On Mon, 2013-09-23 at 11:37 +0400, Aida Mynzhasova wrote:
> Hi,
> 
> Currently, Freescale Gianfar PTP reference clock source is determined 
> through hard-coded value in gianfar_ptp driver. I don't think that 
> recompilation of the entire module (or even worse - the kernel) is a god 
> idea when we want to change one clock source to another. So, I want to 
> add new device tree binding, which can be used as:

Is this describing the hardware or how you're using it?  If the latter,
it should be a module parameter or some sort of runtime knob instead.

> 	ptp_clock@24E00 {
> 		compatible = "fsl,etsec-ptp";
> 		reg = <0x24E00 0xB0>;
> 		interrupts = <12 0x8 13 0x8>;
> 		interrupt-parent = < &ipic >;
> 		fsl,cksel = <0>; /* <-- New entry */
> 		fsl,tclk-period = <10>;
> 		fsl,tmr-prsc    = <100>;
> 		fsl,tmr-add     = <0x999999A4>;
> 		fsl,tmr-fiper1  = <0x3B9AC9F6>;
> 		fsl,tmr-fiper2  = <0x00018696>;
> 		fsl,max-adj     = <659999998>;
> 	};
> 
> fsl,cksel acceptable values:
> 
> <0> for external clock;
> <1> for eTSEC system clock;
> <2> for eTSEC1 transmit clock;
> <3> for RTC clock input.
> 
> I am new in this mailing list, and as far as I know, I have to discuss 
> all updates for device tree files here before sending patch, which uses 
> new attributes.
> 
> Also, should I define new bindings in some special way? I want to add 
> description of cksel attribute in 
> /Documentation/devicetree/bindings/net/fsl-tsec-phy.txt. Is it enough or 
> not?

Assuming this is actually describing how the hardware is wired up, yes,
that's how you'd document it (making sure that device trees without that
property are interpreted the same as today).

-Scott

^ permalink raw reply

* +157.8% netperf throughput by "ipv4: raise IP_MAX_MTU to theoretical limit"
From: Fengguang Wu @ 2013-09-26  1:21 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Alexey Kuznetsov, Willem de Bruijn, lkp, netdev@vger.kernel.org,
	LKML

Hi Eric,

We are glad to find that your below commit brings large increase in
lo netperf throughput:

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                  761.80      +534.6%      4834.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  168.10     +1317.4%      2382.70  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  169.60      +979.4%      1830.70  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                 2154.20      +135.7%      5077.50  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                 3559.00        -3.5%      3435.20  lkp-t410/micro/netperf/120s-200%-TCP_STREAM
                 6812.70      +157.8%     17560.70  TOTAL netperf.Throughput_Mbps

The side effects are some increased/decreased lock contentions:

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                 7695.93    +16534.7%   1280196.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                 1917.74   +159631.5%   3063227.91  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                 2293.73   +118436.8%   2718918.57  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                50823.71     +1881.3%   1006978.18  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                62731.11    +12763.3%   8069321.26  TOTAL lock_stat.slock-AF_INET.waittime-total

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                 7007.20    +17377.5%   1224681.80  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  960.80    +74382.1%    715623.90  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  951.40    +55033.2%    524536.90  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                49582.00     +2141.1%   1111185.60  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                58501.40     +6012.7%   3576028.20  TOTAL lock_stat.slock-AF_INET.contentions

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                 7289.40    +16702.1%   1224771.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  922.40    +77764.1%    718218.60  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  910.10    +58394.1%    532355.20  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                51325.70     +2062.2%   1109739.10  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                60447.60     +5830.9%   3585084.50  TOTAL lock_stat.slock-AF_INET.contentions.udp_queue_rcv_skb

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                 4266.20    +13915.1%    597912.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  560.50    +57667.0%    323783.90  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  518.80    +46369.9%    241086.00  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                23391.50     +1891.7%    465887.80  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                28737.00     +5567.5%   1628670.30  TOTAL lock_stat.&(&list->lock)->rlock#2.contentions.__skb_recv_datagram

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                 4245.80    +13836.7%    591726.20  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  563.30    +57100.8%    322211.90  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  520.40    +45650.2%    238084.00  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                23157.10     +1898.6%    462812.40  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                28486.60     +5568.8%   1614834.50  TOTAL lock_stat.&(&list->lock)->rlock#2.contentions.sock_queue_rcv_skb

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                  466.24    +20666.8%     96823.64  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  227.96    +39775.4%     90899.22  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  165.76    +63787.9%    105903.10  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                 3031.79     +2325.1%     73523.25  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                 3891.75     +9334.0%    367149.20  TOTAL lock_stat.&wq->wait.waittime-total

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
                  891.80    +21266.3%    190545.00  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
                  213.30    +50036.7%    106941.50  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
                  152.60    +58410.9%     89287.60  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
                 6143.00     +2623.3%    167291.40  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
                 7400.70     +7386.7%    554065.50  TOTAL lock_stat.&wq->wait.contentions.__wake_up_sync_key

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
              4174266.80     +3137.3% 135134002.00  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
              1541864.80     +3808.9%  60270231.50  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
              1754844.30     +3034.9%  55013405.80  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
             22053438.30      +263.9%  80244072.30  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
             29524414.20     +1020.0% 330661711.60  TOTAL lock_stat.&(&zone->lock)->rlock.contentions.__free_pages_ok

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
              4303082.60     +2809.1% 125179298.60  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
              1638845.10     +3613.5%  60859041.30  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
              1883848.20     +3012.7%  58638288.40  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
             22293114.60      +216.3%  70505512.80  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
             30118890.50      +946.5% 315182141.10  TOTAL lock_stat.&(&zone->lock)->rlock.contentions.get_page_from_freelist

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
               291821.83       -96.6%      9957.12  lkp-nex04/micro/netperf/120s-200%-UDP_STREAM
               400941.64       -94.0%     24083.56  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
               692763.47       -95.1%     34040.68  TOTAL lock_stat.&mm->mmap_sem/1.holdtime-max

    35596b2796713c6a9dc0      734d2725db879f3f6fcd  
------------------------  ------------------------  
             38718739.60       -95.8%   1637024.20  lkp-ib03/micro/netperf/120s-200%-UDP_STREAM
             13064940.40       -84.5%   2031181.40  lkp-nex05/micro/netperf/120s-200%-UDP_STREAM
             64201496.70       -98.2%   1166711.50  lkp-sb03/micro/netperf/120s-200%-UDP_STREAM
              2302513.00       -97.9%     48685.00  lkp-t410/micro/netperf/120s-200%-UDP_STREAM
            118287689.70       -95.9%   4883602.10  TOTAL lock_stat.&(&base->lock)->rlock.acquisitions
......

commit 734d2725db879f3f6fcdc2b1d2a5deae105f5e95
Author: Eric Dumazet <edumazet@google.com>
Date:   Sun Aug 18 19:08:07 2013 -0700

    ipv4: raise IP_MAX_MTU to theoretical limit
    
    As discussed last year [1], there is no compelling reason
    to limit IPv4 MTU to 0xFFF0, while real limit is 0xFFFF
    
    [1] : http://marc.info/?l=linux-netdev&m=135607247609434&w=2
    

:040000 040000 f2085347b7781a6a42020dc1bc5ca090f7077361 fcb2c411b5073c8ac5009a41f7a1908a352c23d5 M	net
bisect run success

# bad: [272b98c6455f00884f0350f775c5342358ebb73f] Linux 3.12-rc1
# good: [6e4664525b1db28f8c4e1130957f70a94c19213e] Linux 3.11
git bisect start '272b98c6455f00884f0350f775c5342358ebb73f' '6e4664525b1db28f8c4e1130957f70a94c19213e' '--'
# good: [57d730924d5cc2c3e280af16a9306587c3a511db] Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 57d730924d5cc2c3e280af16a9306587c3a511db
# bad: [27c7651a6a5f143eccd66db38c7a3035e1f8bcfb] Merge tag 'gpio-v3.12-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
git bisect bad 27c7651a6a5f143eccd66db38c7a3035e1f8bcfb
# bad: [06c54055bebf919249aa1eb68312887c3cfe77b4] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
git bisect bad 06c54055bebf919249aa1eb68312887c3cfe77b4
# bad: [d75ea942b360690a380da8012a51eaf6a6ebb1b1] can: flexcan: use platform_set_drvdata()
git bisect bad d75ea942b360690a380da8012a51eaf6a6ebb1b1
# good: [926489be1d2b030c17d38fa10b5921bf3409d91d] drivers: net: cpsw: Add support for new CPSW IP version present in AM43xx SoC
git bisect good 926489be1d2b030c17d38fa10b5921bf3409d91d
# good: [a5354ccaaf54ac61c6d1b350e8d3e4234dd28849] ath9k: Enable WLAN/BT Ant Diversity for WB225/WB195
git bisect good a5354ccaaf54ac61c6d1b350e8d3e4234dd28849
# good: [84ce1ddfefc3d5a8af5ede6fe16546c143117616] 6lowpan: init ipv6hdr buffer to zero
git bisect good 84ce1ddfefc3d5a8af5ede6fe16546c143117616
# bad: [a0e186003be7892fd75613a23aaafaf09f3611e6] net: fsl_pq_mdio: use platform_{get,set}_drvdata()
git bisect bad a0e186003be7892fd75613a23aaafaf09f3611e6
# good: [89d5e23210f53ab53b7ff64843bce62a106d454f] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next
git bisect good 89d5e23210f53ab53b7ff64843bce62a106d454f
# good: [0aa857f83fe2674f973ad89ec7c0202f5c5d9554] moxa: fix missing unlock on error in moxart_mac_start_xmit()
git bisect good 0aa857f83fe2674f973ad89ec7c0202f5c5d9554
# bad: [0dde80268ee0a5a1511935bdb9c547191d616aa9] myri10ge: Add support for ndo_busy_poll
git bisect bad 0dde80268ee0a5a1511935bdb9c547191d616aa9
# good: [397b41746333ad386d91d23ea0f79481320dcdcc] tcp: trivial: Remove nocache argument from tcp_v4_send_synack
git bisect good 397b41746333ad386d91d23ea0f79481320dcdcc
# bad: [734d2725db879f3f6fcdc2b1d2a5deae105f5e95] ipv4: raise IP_MAX_MTU to theoretical limit
git bisect bad 734d2725db879f3f6fcdc2b1d2a5deae105f5e95
# good: [35596b2796713c6a9dc05759837fa9f0e156a200] vhost: Include linux/uio.h instead of linux/socket.h
git bisect good 35596b2796713c6a9dc05759837fa9f0e156a200
# first bad commit: [734d2725db879f3f6fcdc2b1d2a5deae105f5e95] ipv4: raise IP_MAX_MTU to theoretical limit


Compare of all good/bad commits during the bisect.

                                  iostat.cpu.user

   2.2 ++-------------------------------------------------------------------+
     2 ++               O   O              O                                |
       |                  O  O O O OO O O O  O OO O O OO   OO O O O         |
   1.8 ++                                                O                  |
   1.6 ++                                                                   |
       |                                                                    |
   1.4 O+OO O O OO O O O                                                    |
   1.2 ++                                                                   |
     1 ++                                                                   |
       |                                                                    |
   0.8 ++                                                                   |
   0.6 ++                                                                   |
       |                                                                    |
   0.4 ++               *.*.           .*.**.    .*.                       .*
   0.2 *+**-*-*-**-*-*-*----**-*-*-**-*------*-**---*-**-*-**-*-*-**-*-*-**-+


                               netperf.Throughput_Mbps

   2000 ++------------------------------------------------------------------+
   1800 ++               O OO O O OO O OO O OO O O OO O OO O OO O O         |
        |                                                                   |
   1600 ++                                                                  |
   1400 ++                                                                  |
        O OO O O OO O OO                                                    |
   1200 ++                                                                  |
   1000 ++                                                                  |
    800 ++                                                                  |
        |                                                                   |
    600 ++                                                                  |
    400 ++                                                                  |
        |                                                                   |
    200 *+**.*.*.**.*.**.*.**.*.*.**.*.**.*.**.*.*.**.*.**.*.**.*.*.**.*.**.*
      0 ++------------------------------------------------------------------+


                                  vmstat.system.in

   34000 ++---------------------O-----------O-------------------------------+
         |                     O  O OO O O O  O    O  O O OO O OO O         |
   32000 ++                                     OO   O                      |
   30000 ++                                                                 |
         |                                                                  |
   28000 ++                                                                 |
   26000 O+   O OO O O  O                                                   |
         | OO         O                                                     |
   24000 ++                                                                 |
   22000 ++                                                                 |
         |                                                                  |
   20000 ++                O                                                |
   18000 ++                                    .*                           |
         *.**.*.**.*.**.*.  .*.**.*.**.*.*.**.*  *.*.**.*.**.*.**.*.**.*.**.*
   16000 ++---------------**------------------------------------------------+


                                   vmstat.system.cs

   900000 ++---------------O------------------------------------------------+
          |                   O                                             |
   800000 ++                O  O O O    OO O OO O OO O OO   OO OO O         |
   700000 ++                        O O                   O                 |
          |                                                                 |
   600000 ++                                                                |
   500000 O+OO O OO O OO O                                                  |
          |                                                                 |
   400000 ++                                                                |
   300000 ++                                                                |
          |                                                                 |
   200000 ++                                                                |
   100000 ++                                                                |
          *.**.*.**.*.**.*.**.**.*.**.*.**.*.**.*.**.*.**.*.**.**.*.**.*.**.*
        0 ++----------------------------------------------------------------+


                     lock_stat.&(&zone->lock)->rlock.contentions

   7e+07 ++---------------O-------------------------------------------------+
         |                 O O                                              |
   6e+07 ++                    OO O      O O  O OO O OO O OO O OO O         |
         |                          OO O    O                               |
   5e+07 O+OO O OO O OO O                                                   |
         |                                                                  |
   4e+07 ++                                                                 |
         |                                                                  |
   3e+07 ++                                                                 |
         |                                                                  |
   2e+07 ++                                                                 |
         |                                                                  |
   1e+07 ++                                                                 |
         |               .**.*.                                             |
       0 *+**-*-**-*-**-*------**-*-**-*-*-**-*-**-*-**-*-**-*-**-*-**-*-**-*


             lock_stat.&(&zone->lock)->rlock.contentions.__free_pages_ok

   7e+07 ++-----------------------------------------------------------------+
         |                O  O                                              |
   6e+07 ++                O   OO O           O OO O O  O  O O  O O         |
         |                          OO O O OO         O   O                 |
   5e+07 O+OO O OO O OO O                                                   |
         |                                                                  |
   4e+07 ++                                                                 |
         |                                                                  |
   3e+07 ++                                                    O            |
         |                                                                  |
   2e+07 ++                                                                 |
         |                                                                  |
   1e+07 ++                                                                 |
         |                                                                  |
       0 *+**-*-**-*-**-*-**-*-**-*-**-*-*-**-*-**-*-**-*-**-*-**-*-**-*-**-*


         lock_stat.&(&zone->lock)->rlock.contentions.get_page_from_freelist

   7e+07 ++---------------O-------------------------------------------------+
         |                 O O                                              |
   6e+07 ++                    OO O OO O O OO O OO O OO O OO O OO O         |
         |                                                                  |
   5e+07 O+OO O OO O OO O                                                   |
         |                                                                  |
   4e+07 ++                                                                 |
         |                                                                  |
   3e+07 ++                                                                 |
         |                                                                  |
   2e+07 ++                                                                 |
         |                                                                  |
   1e+07 ++                                                                 |
         |               .**.*.                                             |
       0 *+**-*-**-*-**-*------**-*-**-*-*-**-*-**-*-**-*-**-*-**-*-**-*-**-*


                            lock_stat.&rq->lock.contentions

   300000 ++---------------------O------OO-O-O---------------O--------------+
          |                    O   OO O       O O OO O OO O O  OO O         |
   250000 ++                                                                |
          |                                                                 |
          |            O                                                    |
   200000 O+OO O OO O O  O                                                  |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          *. *.                      .*.* .*.* .*.*   .* .*.    *. .**.    .*
          | *  *.**.*.**.*     *.*.**    *    *    *.*  *   **.*  *    *.** |
    50000 ++              :    :                                            |
          |               : O O                                             |
        0 ++---------------**-*---------------------------------------------+


               lock_stat.clockevents_lock.contentions.clockevents_notify

   120000 ++----------------------------------------------------------------+
          |                                     *                           |
   100000 ++                                    :                           |
          |                                     :                           |
          |                                    : :                          |
    80000 ++                                   : :                          |
          |           *                        : :                          |
    60000 ++         ::                        : :                          |
          |       *. : :                    .* : :                          |
    40000 ++      : *  *.       .*         * ::   :                .*       *
          *. *.  :       *.**.**  + *.*.  +   :   :*.   *. .**.* .*  *. .* +|
          | *  *.*                 *    **    *   *  *.*  *     *      *  * |
    20000 ++          O    OO OO   O  O OO O    O    O OO O  O O            |
          O OO O OO O  O O       O  O        OO   OO        O   O O         |
        0 ++----------------------------------------------------------------+


             lock_stat.&(&base->lock)->rlock.contentions.lock_timer_base

   12000 ++-----------------------------------------------------------------+
         |              *                                                   |
   10000 *+* .*.**.    + :     * .*.          *.  .*.            .*. *.     |
         |  *      *.**  :    : *   *        +  **   **.*.**.*. *   *  *.**.*
         |                :   :      *.*.*.**                  *            |
    8000 ++               **.*                                              |
         |                                                                  |
    6000 ++                                                                 |
         |                                                                  |
    4000 ++                                                                 |
         |                                                                  |
         |                                                                  |
    2000 ++                                                                 |
         |                                                                  |
       0 O+OO-O-OO-O-OO-O-OO-O-OO-O-OO-O-O-OO-O-OO-O-OO-O-OO-O-OO-O---------+


            lock_stat.&(&base->lock)->rlock.contentions.run_timer_softirq

   7000 ++------------------------------------------------------------------+
        |                        .*                                         |
   6000 ++*            *      *.* :          *.   .*             .*. *.     |
        *  :.*.*.* .*. :      :    :         : *.*  *.*.**.    .*   *  *.**.*
   5000 ++ *      *   * :    :     :         :             *.**             |
        |               :    :     *.*.*    :                               |
   4000 ++               :  :           *.*.*                               |
        |                *.**                                               |
   3000 ++                                                                  |
        |                                                                   |
   2000 ++                                                                  |
        |                                                                   |
   1000 ++                                                                  |
        |                                                                   |
      0 O+OO-O-O-OO-O-OO-O-OO-O-O-OO-O-OO-O-OO-O-O-OO-O-OO-O-OO-O-O---------+


                           lock_stat.rcu_node_1.contentions

   120000 ++----------------------------------------------------------------+
          |                                                                 |
   100000 ++                O OO O      OO                     OO O         |
          |                O       OO      O O  O O  O OO O O               |
          | OO O O       O            O       O    O         O              |
    80000 O+      O O OO                                                    |
          |                                                                 |
    60000 ++                                                                |
          |                                                                 |
    40000 ++                                                                |
          |                                                                 |
          |                                                                 |
    20000 ++                                                                |
          |                                                                 |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                lock_stat.rcu_node_1.contentions.rcu_process_callbacks

   200000 ++------------------O---------------------------------------------+
   180000 ++                O  O O      OO O O       O O  O    OO O         |
          |                O       OO           O O     O   OO              |
   160000 ++OO O O       O            O       O    O                        |
   140000 O+      O O OO                                                    |
          |                                                                 |
   120000 ++                                                                |
   100000 ++                                                                |
    80000 ++                                                                |
          |                                                                 |
    60000 ++                                                                |
    40000 ++                                                                |
          |                                                                 |
    20000 ++                                                                |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                    lock_stat.rcu_node_1.contentions.force_qs_rnp

   20000 ++-----------------------------------------------------------------+
   18000 ++               O  O                                              |
         |                 O                                                |
   16000 ++                                                                 |
   14000 ++                                                                 |
         |                               O                                  |
   12000 ++                    OO O O  O   OO   OO   OO O OO O OO O         |
   10000 O+OO O OO O    O            O        O    O                        |
    8000 ++          OO                                                     |
         |                                                                  |
    6000 ++                                                                 |
    4000 ++                                                                 |
         |                                                                  |
    2000 ++                                                                 |
       0 *+**-*-**-*-**-*-**-*-**-*-**-*-*-**-*-**-*-**-*-**-*-**-*-**-*-**-*


                          lock_stat.slock-AF_INET.contentions

   600000 ++----------------------------------------------------------------+
          |                    O O  O   O          O   OO O  O              |
   500000 ++                       O  O  O O OO   O         O  OO O         |
          |                                     O    O                      |
          |                                                                 |
   400000 ++                O                                               |
          |                                                                 |
   300000 O+OO O OO O OO O                                                  |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
          |                                                                 |
   100000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                  lock_stat.slock-AF_INET.contentions.lock_sock_fast

   600000 ++----------------------------------------------------------------+
          |                    O O                        O                 |
   500000 ++                       OO O OO O OO   OO   OO   OO OO O         |
          |                                     O    O                      |
          |                                                                 |
   400000 ++                                                                |
          |                                                                 |
   300000 O+OO O OO O OO O  O                                               |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
          |                                                                 |
   100000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                 lock_stat.slock-AF_INET.contentions.udp_queue_rcv_skb

   600000 ++----------------------------------------------O-----------------+
          |                    O O  O   O    O     O   OO    O              |
   500000 ++                       O  O  O O  O   O         O  OO O         |
          |                 O                   O    O                      |
          |                                                                 |
   400000 ++                                                                |
          |                                                                 |
   300000 O+OO O OO O OO O                                                  |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
          |                                                                 |
   100000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                     lock_stat.&(&list->lock)->rlock#2.contentions

   300000 ++----------------------------------------------------------------+
          |                           O O                                   |
   250000 ++                        O    O O OO   O    OO                   |
          |                    O O O            O  O      O OO OO O         |
          |                                          O                      |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          O OO O OO O OO O                                                  |
   100000 ++                                                                |
          |                                                                 |
          |                 O                                               |
    50000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


           lock_stat.&(&list->lock)->rlock#2.contentions.__skb_recv_datagram

   300000 ++----------------------------------------------------------------+
          |                           O O                                   |
   250000 ++                        O    O O OO   O    OO                   |
          |                    O O O            O  O      O OO OO O         |
          |                                          O                      |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          O OO O OO O OO O                                                  |
   100000 ++                                                                |
          |                                                                 |
          |                 O                                               |
    50000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


           lock_stat.&(&list->lock)->rlock#2.contentions.sock_queue_rcv_skb

   300000 ++----------------------------------------------------------------+
          |                           O O                                   |
   250000 ++                        O    O O OO   O    OO                   |
          |                    O O O            O  O      O OO OO O         |
          |                                          O                      |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          O OO O OO O OO O                                                  |
   100000 ++                                                                |
          |                                                                 |
          |                 O                                               |
    50000 ++                                                                |
          |                   O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                            lock_stat.&wq->wait.contentions

   100000 ++-------------------O--------------------------------------------+
    90000 ++                     O  O    O O  O O OO O OO O    O            |
          |                        O  O O    O              OO  O O         |
    80000 ++                                                                |
    70000 ++                                                                |
          |                                                                 |
    60000 ++     O                                                          |
    50000 O+OO O  O O OO O                                                  |
    40000 ++                                                                |
          |                                                                 |
    30000 ++                                                                |
    20000 ++                                                                |
          |                 O                                               |
    10000 ++                  O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                      lock_stat.&wq->wait.contentions.finish_wait

   100000 ++----------------------------------------------------------------+
    90000 ++                   O         O O  O O O    OO O    O            |
          |                      O OO O O    O     O O       O  O O         |
    80000 ++                                                O               |
    70000 ++                                                                |
          |                                                                 |
    60000 ++                                                                |
    50000 O+OO O OO O OO O                                                  |
    40000 ++                                                                |
          |                                                                 |
    30000 ++                                                                |
    20000 ++                                                                |
          |                                                                 |
    10000 ++                O                                               |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*


                  lock_stat.&wq->wait.contentions.__wake_up_sync_key

   100000 ++----------------------------------------------------------------+
    90000 ++                   O         O O  O O OO   OO O    O            |
          |                      O OO O O    O       O      OO  O O         |
    80000 ++                                                                |
    70000 ++                                                                |
          |                                                                 |
    60000 ++     O                                                          |
    50000 O+OO O  O O OO O                                                  |
    40000 ++                                                                |
          |                                                                 |
    30000 ++                                                                |
    20000 ++                                                                |
          |                 O                                               |
    10000 ++                  O                                             |
        0 *+**-*-**-*-**-*-**-**-*-**-*-**-*-**-*-**-*-**-*-**-**-*-**-*-**-*

^ permalink raw reply

* Re: [PATCH] moxa: drop free_irq of devm_request_irq allocated irq
From: Jingoo Han @ 2013-09-26  0:47 UTC (permalink / raw)
  To: 'Wei Yongjun', 'Jonas Jensen'
  Cc: davem, grant.likely, rob.herring, yongjun_wei, netdev,
	'Jingoo Han', 'Sachin Kamat'
In-Reply-To: <CAPgLHd8YRP7kL_jNHi3J9jFn86f=Fv-AWWUY_aRGZQOgVTNPgg@mail.gmail.com>

On Wednesday, September 25, 2013 4:33 PM, Wei Yongjun wrote:
> 
> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> 
> irq allocated with devm_request_irq should not be freed using
> free_irq, because doing so causes a dangling pointer, and a
> subsequent double free.
> 
> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> ---
>  drivers/net/ethernet/moxa/moxart_ether.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/moxa/moxart_ether.c b/drivers/net/ethernet/moxa/moxart_ether.c
> index 83c2091..9a7fcb5 100644
> --- a/drivers/net/ethernet/moxa/moxart_ether.c
> +++ b/drivers/net/ethernet/moxa/moxart_ether.c
> @@ -531,7 +531,6 @@ static int moxart_remove(struct platform_device *pdev)
>  	struct net_device *ndev = platform_get_drvdata(pdev);
> 
>  	unregister_netdev(ndev);
> -	free_irq(ndev->irq, ndev);
>  	moxart_mac_free_memory(ndev);
>  	free_netdev(ndev);

CC'ed Sachin Kamat,

In this case, the free_irq() will be called, after calling
free_netdev(). 'ndev' is freed by free_netdev(). Then, 'ndev->irq'
is used by free_irq(). Is it right?

In my humble opinion, it seems to make the problem.

Best regards,
Jingoo Han

^ permalink raw reply

* Re: [PATCH net-next v2 0/3] bonding: modify the current and add new hash functions
From: Nikolay Aleksandrov @ 2013-09-25 23:56 UTC (permalink / raw)
  To: netdev; +Cc: davem, andy, fubar, eric.dumazet, vfalico
In-Reply-To: <1380133912-1240-1-git-send-email-nikolay@redhat.com>

Self-nack on this patchset as the flow_dissector patch will make nhoff
different from before, that is the port offset will not be added to it, and
thus will make flow_keys' th_off different in skb_flow_dissect.
Again, I will wait until tomorrow to see if there're any other comments and
will re-post a v3 with this bug fixed.

Nik

^ permalink raw reply

* Re: [PATCH net-next v2 1/3] flow_dissector: factor out the ports extraction in skb_flow_get_ports
From: Nikolay Aleksandrov @ 2013-09-25 23:50 UTC (permalink / raw)
  To: netdev; +Cc: davem, andy, fubar, eric.dumazet, vfalico
In-Reply-To: <52437454.6020407@redhat.com>

On 09/26/2013 01:40 AM, Nikolay Aleksandrov wrote:
> On 09/25/2013 08:31 PM, Nikolay Aleksandrov wrote:
>> Factor out the code that extracts the ports from skb_flow_dissect and
>> add a new function skb_flow_get_ports which can be re-used.
>>
>> Suggested-by: Veaceslav Falico <vfalico@redhat.com>
>> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
>> ---
>> v2: new patch
>> This seems like a good idea because there're other users that can re-use
>> it later as well.
>>
>>  include/net/flow_keys.h   |  1 +
>>  net/core/flow_dissector.c | 38 +++++++++++++++++++++++++++-----------
>>  2 files changed, 28 insertions(+), 11 deletions(-)
>>
>> diff --git a/include/net/flow_keys.h b/include/net/flow_keys.h
>> index ac2439d..621bf11 100644
>> --- a/include/net/flow_keys.h
>> +++ b/include/net/flow_keys.h
>> @@ -14,4 +14,5 @@ struct flow_keys {
>>  };
>>  
>>  bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys *flow);
>> +__be32 skb_flow_get_ports(const struct sk_buff *skb, int nhoff, u8 ip_proto);
>>  #endif
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 1929af8..c609faf 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -25,9 +25,34 @@ static void iph_to_flow_copy_addrs(struct flow_keys *flow, const struct iphdr *i
>>  	memcpy(&flow->src, &iph->saddr, sizeof(flow->src) + sizeof(flow->dst));
>>  }
>>  
>> +/**
>> + * skb_flow_get_ports - extract the upper layer ports and return them
>> + * @skb: buffer to extract the ports from
>> + * @nhoff: network header offset
> I hurried too much, this should've said "offset to the end of the network
> header" i.e. nhoff + sizeof(network header), I'll wait until tomorrow to
> see if there aren't any other comments and will edit this and re-post a v3.
> 
> Nik
> 
Ugh, nevermind this email, I'm going crazy about semantics today :-)
Everything is fine.

^ permalink raw reply

* Re: [PATCH net-next v2 1/3] flow_dissector: factor out the ports extraction in skb_flow_get_ports
From: Nikolay Aleksandrov @ 2013-09-25 23:40 UTC (permalink / raw)
  To: netdev; +Cc: davem, andy, fubar, eric.dumazet, vfalico
In-Reply-To: <1380133912-1240-2-git-send-email-nikolay@redhat.com>

On 09/25/2013 08:31 PM, Nikolay Aleksandrov wrote:
> Factor out the code that extracts the ports from skb_flow_dissect and
> add a new function skb_flow_get_ports which can be re-used.
> 
> Suggested-by: Veaceslav Falico <vfalico@redhat.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
> ---
> v2: new patch
> This seems like a good idea because there're other users that can re-use
> it later as well.
> 
>  include/net/flow_keys.h   |  1 +
>  net/core/flow_dissector.c | 38 +++++++++++++++++++++++++++-----------
>  2 files changed, 28 insertions(+), 11 deletions(-)
> 
> diff --git a/include/net/flow_keys.h b/include/net/flow_keys.h
> index ac2439d..621bf11 100644
> --- a/include/net/flow_keys.h
> +++ b/include/net/flow_keys.h
> @@ -14,4 +14,5 @@ struct flow_keys {
>  };
>  
>  bool skb_flow_dissect(const struct sk_buff *skb, struct flow_keys *flow);
> +__be32 skb_flow_get_ports(const struct sk_buff *skb, int nhoff, u8 ip_proto);
>  #endif
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 1929af8..c609faf 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -25,9 +25,34 @@ static void iph_to_flow_copy_addrs(struct flow_keys *flow, const struct iphdr *i
>  	memcpy(&flow->src, &iph->saddr, sizeof(flow->src) + sizeof(flow->dst));
>  }
>  
> +/**
> + * skb_flow_get_ports - extract the upper layer ports and return them
> + * @skb: buffer to extract the ports from
> + * @nhoff: network header offset
I hurried too much, this should've said "offset to the end of the network
header" i.e. nhoff + sizeof(network header), I'll wait until tomorrow to
see if there aren't any other comments and will edit this and re-post a v3.

Nik

^ permalink raw reply

* RE: [PATCH 09/11] mwifiex: Remove extern from function prototypes
From: Bing Zhao @ 2013-09-25 23:13 UTC (permalink / raw)
  To: Joe Perches, netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
  Cc: David S. Miller, John W. Linville,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <b3a7cd194857b3dc099e8f8bb0ed97bb4a1b5fde.1380137610.git.joe-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>

Hi Joe,

Thanks for your patch.

> There are a mix of function prototypes with and without extern
> in the kernel sources.  Standardize on not using extern for
> function prototypes.
> 
> Function prototypes don't need to be written with extern.
> extern is assumed by the compiler.  Its use is as unnecessary as
> using auto to declare automatic/local variables in a block.
> 
> Signed-off-by: Joe Perches <joe-6d6DIl74uiNBDgjK7y7TUQ@public.gmane.org>

Acked-by: Bing Zhao <bzhao-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>

Thanks,
Bing

> ---
>  drivers/net/wireless/mwifiex/wmm.h | 24 ++++++++++--------------
>  1 file changed, 10 insertions(+), 14 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 11/11] rtlwifi: Remove extern from function prototypes
From: Larry Finger @ 2013-09-25 23:01 UTC (permalink / raw)
  To: Joe Perches
  Cc: netdev, David S. Miller, Chaoming Li, John W. Linville,
	linux-wireless, linux-kernel
In-Reply-To: <a958df5dc3065ce517726dc54087639e13ffee8f.1380137610.git.joe@perches.com>

On 09/25/2013 02:37 PM, Joe Perches wrote:
> There are a mix of function prototypes with and without extern
> in the kernel sources.  Standardize on not using extern for
> function prototypes.
>
> Function prototypes don't need to be written with extern.
> extern is assumed by the compiler.  Its use is as unnecessary as
> using auto to declare automatic/local variables in a block.
>
> Signed-off-by: Joe Perches <joe@perches.com>

Acked-by: Larry Finger <Larry.Finger@lwfinger.net>

Larry

^ permalink raw reply

* Re: [PATCH RFC] net: introduce support for lazy initialization of secret keys
From: Hannes Frederic Sowa @ 2013-09-25 22:58 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, fw, edumazet, davem, ycheng
In-Reply-To: <1380148946.3165.166.camel@edumazet-glaptop>

On Wed, Sep 25, 2013 at 03:42:26PM -0700, Eric Dumazet wrote:
> On Wed, 2013-09-25 at 23:34 +0200, Hannes Frederic Sowa wrote:
> > net_get_random_once is a new macro which handles the initialization of
> > secret keys at use-site. It is possible to call it in the fast path. Only
> > the initialization depends on the spinlock and is rather slow. Otherwise
> > it should get used just before the key is used to delay the entropy
> > extration as late as possible to get better randomness. It returns true
> > if the key got initialized.
> 
> So you don't like cmpxchg() ;)

Actually, my first thought was to swap in a pointer via cmpxchg. But I didn't
know what to do when kmalloc returns NULL. After that I went with the spinlock
approach.

Also, I don't know with what size I get called. Consider this code:

| u8 byte;
| net_get_random_once(&byte, sizeof(byte));

Not allowing 0 in here would actually reduce the amount of randomness
considerable.

> 
> > +/* BE CAREFUL: this function is not interrupt safe */
> > +#define net_get_random_once(buf, nbytes)				\
> > +	({								\
> > +		static DEFINE_SPINLOCK(__lock);				\
> > +		static bool __done = false;				\
> > +		bool __ret = false;					\
> > +		if (unlikely(!__done))					\
> > +			__ret = __net_get_random_once(buf,		\
> > +						    nbytes,		\
> > +						    &__done,		\
> > +						    &__lock);		\
> > +		__ret;							\
> > +	})
> > +
> >  
> 
> No idea why its needed to have one spinlock per call point.
> 
> A single lock should be more than enough.
> 
> The spinlock could be private to __net_get_random_once()

Ack! Very good. Don't know why I build this so complex.

> +bool __net_get_random_once(void *buf, int nbytes, bool *done,
> +                           spinlock_t *lock)
> +{
> +       spin_lock_bh(lock);
> +       if (*done) {
> +               spin_unlock_bh(lock);
> +               return false;
> +       }
> +
> +       get_random_bytes(buf, nbytes);
> 
> I think you might need a memory barrier here.
> 
> (smp_wmb();)

I actually did some research on this and came to the conclusion that the
call to a function in another compilation unit should be barrier enough
(I really forgot to consider other architectures).

But a barrier should not hurt and it will be executed very infrequently,
so I'll add one.

> 
> +       *done = true;
> +       spin_unlock_bh(lock);
> 
> 
> BTW, build_ehash_secret() is called like that :
> 
> if (unlikely(!inet_ehash_secret))
>     if (sock->type != SOCK_RAW && sock->type != SOCK_DGRAM)
>         build_ehash_secret();
> 
> So it would be better to make sure inet_ehash_secret is not 0 by
> accident.

Urks. I'll think about that.

Thanks a lot for the review, Eric!

^ permalink raw reply

* Re: [PATCH RFC] net: introduce support for lazy initialization of secret keys
From: Eric Dumazet @ 2013-09-25 22:42 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, fw, edumazet, davem, ycheng
In-Reply-To: <20130925213403.GF4904@order.stressinduktion.org>

On Wed, 2013-09-25 at 23:34 +0200, Hannes Frederic Sowa wrote:
> net_get_random_once is a new macro which handles the initialization of
> secret keys at use-site. It is possible to call it in the fast path. Only
> the initialization depends on the spinlock and is rather slow. Otherwise
> it should get used just before the key is used to delay the entropy
> extration as late as possible to get better randomness. It returns true
> if the key got initialized.

So you don't like cmpxchg() ;)

> +/* BE CAREFUL: this function is not interrupt safe */
> +#define net_get_random_once(buf, nbytes)				\
> +	({								\
> +		static DEFINE_SPINLOCK(__lock);				\
> +		static bool __done = false;				\
> +		bool __ret = false;					\
> +		if (unlikely(!__done))					\
> +			__ret = __net_get_random_once(buf,		\
> +						    nbytes,		\
> +						    &__done,		\
> +						    &__lock);		\
> +		__ret;							\
> +	})
> +
>  

No idea why its needed to have one spinlock per call point.

A single lock should be more than enough.

The spinlock could be private to __net_get_random_once()

+bool __net_get_random_once(void *buf, int nbytes, bool *done,
+                           spinlock_t *lock)
+{
+       spin_lock_bh(lock);
+       if (*done) {
+               spin_unlock_bh(lock);
+               return false;
+       }
+
+       get_random_bytes(buf, nbytes);

I think you might need a memory barrier here.

(smp_wmb();)

+       *done = true;
+       spin_unlock_bh(lock);


BTW, build_ehash_secret() is called like that :

if (unlikely(!inet_ehash_secret))
    if (sock->type != SOCK_RAW && sock->type != SOCK_DGRAM)
        build_ehash_secret();

So it would be better to make sure inet_ehash_secret is not 0 by
accident.

^ permalink raw reply

* Re: [PATCH 06/11] ipw2x00: Remove extern from function prototypes
From: Stanislav Yakovlev @ 2013-09-25 22:24 UTC (permalink / raw)
  To: Joe Perches
  Cc: netdev, David S. Miller, John W. Linville, wireless, linux-kernel
In-Reply-To: <89eb744f1a77b32d6ec718618f356df3188403e7.1380137609.git.joe@perches.com>

Hi Joe,

On 25 September 2013 23:37, Joe Perches <joe@perches.com> wrote:
> There are a mix of function prototypes with and without extern
> in the kernel sources.  Standardize on not using extern for
> function prototypes.
>
> Function prototypes don't need to be written with extern.
> extern is assumed by the compiler.  Its use is as unnecessary as
> using auto to declare automatic/local variables in a block.
>
> Signed-off-by: Joe Perches <joe@perches.com>
> ---

Looks fine, thanks.

Stanislav.

^ permalink raw reply

* Re: [PATCH RFC] net: introduce support for lazy initialization of secret keys
From: Florian Westphal @ 2013-09-25 22:14 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20130925213403.GF4904@order.stressinduktion.org>

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> This patch splits the secret key for syncookies between ipv4 and ipv6
> and initializes them with net_get_random_once. This change was the reason
> I did this patch. I think the initialization of the syncookie_secret is
> way to early.

Agreed, thanks for working on this.

^ permalink raw reply

* [PATCH] ipv6: Fix preferred_lft not updating in some cases
From: Paul Marks @ 2013-09-25 22:12 UTC (permalink / raw)
  To: netdev; +Cc: davem, yoshfuji, lorenzo, Paul Marks

Consider the scenario where an IPv6 router is advertising a fixed
preferred_lft of 1800 seconds, while the valid_lft begins at 3600
seconds and counts down in realtime.

A client should reset its preferred_lft to 1800 every time the RA is
received, but a bug is causing Linux to ignore the update.

The core problem is here:
  if (prefered_lft != ifp->prefered_lft) {

Note that ifp->prefered_lft is an offset, so it doesn't decrease over
time.  Thus, the comparison is always (1800 != 1800), which fails to
trigger an update.

The most direct solution would be to compute a "stored_prefered_lft",
and use that value in the comparison.  But I think that trying to filter
out unnecessary updates here is a premature optimization.  In order for
the filter to apply, both of these would need to hold:

  - The advertised valid_lft and preferred_lft are both declining in
    real time.
  - No clock skew exists between the router & client.

So in this patch, I've set "update_lft = 1" unconditionally, which
allows the surrounding code to be greatly simplified.

Signed-off-by: Paul Marks <pmarks@google.com>
---
 net/ipv6/addrconf.c | 52 +++++++++++++++-------------------------------------
 1 file changed, 15 insertions(+), 37 deletions(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index d6ff126..9a5052c 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2193,43 +2193,21 @@ ok:
 			else
 				stored_lft = 0;
 			if (!update_lft && !create && stored_lft) {
-				if (valid_lft > MIN_VALID_LIFETIME ||
-				    valid_lft > stored_lft)
-					update_lft = 1;
-				else if (stored_lft <= MIN_VALID_LIFETIME) {
-					/* valid_lft <= stored_lft is always true */
-					/*
-					 * RFC 4862 Section 5.5.3e:
-					 * "Note that the preferred lifetime of
-					 *  the corresponding address is always
-					 *  reset to the Preferred Lifetime in
-					 *  the received Prefix Information
-					 *  option, regardless of whether the
-					 *  valid lifetime is also reset or
-					 *  ignored."
-					 *
-					 *  So if the preferred lifetime in
-					 *  this advertisement is different
-					 *  than what we have stored, but the
-					 *  valid lifetime is invalid, just
-					 *  reset prefered_lft.
-					 *
-					 *  We must set the valid lifetime
-					 *  to the stored lifetime since we'll
-					 *  be updating the timestamp below,
-					 *  else we'll set it back to the
-					 *  minimum.
-					 */
-					if (prefered_lft != ifp->prefered_lft) {
-						valid_lft = stored_lft;
-						update_lft = 1;
-					}
-				} else {
-					valid_lft = MIN_VALID_LIFETIME;
-					if (valid_lft < prefered_lft)
-						prefered_lft = valid_lft;
-					update_lft = 1;
-				}
+				const u32 minimum_lft = min(
+					stored_lft, (u32)MIN_VALID_LIFETIME);
+				valid_lft = max(valid_lft, minimum_lft);
+
+				/* RFC4862 Section 5.5.3e:
+				 * "Note that the preferred lifetime of the
+				 *  corresponding address is always reset to
+				 *  the Preferred Lifetime in the received
+				 *  Prefix Information option, regardless of
+				 *  whether the valid lifetime is also reset or
+				 *  ignored."
+				 *
+				 * So we should always update prefered_lft here.
+				 */
+				update_lft = 1;
 			}
 
 			if (update_lft) {
-- 
1.8.4

^ permalink raw reply related

* Re: Kernel Panic Sending Frames Using dev_queue_xmit()
From: Francois Romieu @ 2013-09-25 22:03 UTC (permalink / raw)
  To: Merlin Davis; +Cc: Ben Hutchings, netdev
In-Reply-To: <CAFu7c2XDYR-SMiuqMxtqGhfM00nw8-h0Tr2uewN8+D1BptsayA@mail.gmail.com>

Merlin Davis <me.the.wizard@gmail.com> :
[...]
>       err = net_xmit_eval(dev_queue_xmit(skb));

Macro expansion bug.

-- 
Ueimor

^ permalink raw reply

* Re: Result Notification
From: Marie Siggelin @ 2013-09-25 21:48 UTC (permalink / raw)
  To: info

 
Reply back to uk.larrywilliams@gmail.com as soon as you receive  this notification massage for your claims. Ticket Number: 3134231835667

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox