Netdev List
 help / color / mirror / Atom feed
* Re: Crash due to mutex genl_lock called from RCU context
From: subashab @ 2016-11-26  5:59 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: tgraf, netdev, netdev-owner
In-Reply-To: <1480136078.8455.589.camel@edumazet-glaptop3.roam.corp.google.com>

> Oh well, this wont work, since sk->sk_destruct will be called from RCU
> callback.
> 
> Grabbing the mutex should not be done from netlink_sock_destruct() but
> from netlink_release()
> 
> Maybe this patch would be better :
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 62bea4591054..cce10e3c9b68 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -324,16 +324,6 @@ static void netlink_skb_set_owner_r(struct
> sk_buff *skb, struct sock *sk)
> 
>  static void netlink_sock_destruct(struct sock *sk)
>  {
> -	struct netlink_sock *nlk = nlk_sk(sk);
> -
> -	if (nlk->cb_running) {
> -		if (nlk->cb.done)
> -			nlk->cb.done(&nlk->cb);
> -
> -		module_put(nlk->cb.module);
> -		kfree_skb(nlk->cb.skb);
> -	}
> -
>  	skb_queue_purge(&sk->sk_receive_queue);
> 
>  	if (!sock_flag(sk, SOCK_DEAD)) {
> @@ -456,8 +446,9 @@ static struct sock *netlink_lookup(struct net
> *net, int protocol, u32 portid)
> 
>  	rcu_read_lock();
>  	sk = __netlink_lookup(table, portid, net);
> -	if (sk)
> -		sock_hold(sk);
> +	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
> +		sk = NULL;
> +
>  	rcu_read_unlock();
> 
>  	return sk;
> @@ -581,6 +572,7 @@ static int __netlink_create(struct net *net,
> struct socket *sock,
>  	}
>  	init_waitqueue_head(&nlk->wait);
> 
> +	sock_set_flag(sk, SOCK_RCU_FREE);
>  	sk->sk_destruct = netlink_sock_destruct;
>  	sk->sk_protocol = protocol;
>  	return 0;
> @@ -645,13 +637,6 @@ static int netlink_create(struct net *net, struct
> socket *sock, int protocol,
>  	goto out;
>  }
> 
> -static void deferred_put_nlk_sk(struct rcu_head *head)
> -{
> -	struct netlink_sock *nlk = container_of(head, struct netlink_sock, 
> rcu);
> -
> -	sock_put(&nlk->sk);
> -}
> -
>  static int netlink_release(struct socket *sock)
>  {
>  	struct sock *sk = sock->sk;
> @@ -724,7 +709,19 @@ static int netlink_release(struct socket *sock)
>  	local_bh_disable();
>  	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
>  	local_bh_enable();
> -	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
> +	if (nlk->cb_running) {
> +		mutex_lock(nlk->cb_mutex);
> +		if (nlk->cb_running) {
> +			if (nlk->cb.done)
> +				nlk->cb.done(&nlk->cb);
> +
> +			module_put(nlk->cb.module);
> +			kfree_skb(nlk->cb.skb);
> +			nlk->cb_running = false;
> +		}
> +		mutex_unlock(nlk->cb_mutex);
> +	}
> +	sock_put(sk);
>  	return 0;
>  }
> 
> diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
> index 3cfd6cc60504..5dc08a7b0a2b 100644
> --- a/net/netlink/af_netlink.h
> +++ b/net/netlink/af_netlink.h
> @@ -32,7 +32,6 @@ struct netlink_sock {
>  	struct module		*module;
> 
>  	struct rhash_head	node;
> -	struct rcu_head		rcu;
>  };
> 
>  static inline struct netlink_sock *nlk_sk(struct sock *sk)

Thanks Eric! I'll try this and get back with results over this weekend.

^ permalink raw reply

* Re: Crash due to mutex genl_lock called from RCU context
From: Eric Dumazet @ 2016-11-26  4:54 UTC (permalink / raw)
  To: subashab; +Cc: tgraf, netdev
In-Reply-To: <1480133493.8455.584.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, 2016-11-25 at 20:11 -0800, Eric Dumazet wrote:
> On Fri, 2016-11-25 at 19:15 -0700, subashab@codeaurora.org wrote:
> > We are seeing a crash due to gen_lock mutex being acquired in RCU 
> > context.
> > Crash is seen on a 4.4 based kernel ARM64 device. This occurred in a
> > regression rack, so unfortunately I don't have steps for a reproducer.
> > 
> > It looks like freeing socket in RCU was brought in through commit
> > 21e4902aea80ef35afc00ee8d2abdea4f519b7f7 ("netlink: Lockless lookup with
> > RCU grace period in socket release").
> > I am not very familiar with generic netlink sockets so I am not sure
> > if there is any other way to fix this apart from reverting this patch.
> > 
> > Any pointers to debug this would be appreciated.
> > 
> > Here is the call stack -
> > 
> > BUG: sleeping function called from invalid context 
> > kernel/locking/mutex.c:98
> > in_atomic(): 1, irqs_disabled(): 0, pid: 16400, name: busybox
> > [<ffffff80080cad20>] ___might_sleep+0x134/0x144
> > [<ffffff80080cadac>] __might_sleep+0x7c/0x8c
> > [<ffffff8008ef09a8>] mutex_lock+0x2c/0x4c
> > [<ffffff8008d307f0>] genl_lock+0x1c/0x24
> > [<ffffff8008d30848>] genl_lock_done+0x2c/0x50
> > [<ffffff8008d2ccac>] netlink_sock_destruct+0x30/0x94
> > [<ffffff8008cdef44>] sk_destruct+0x2c/0x150
> > [<ffffff8008cdf104>] __sk_free+0x9c/0xc4
> > [<ffffff8008cdf16c>] sk_free+0x40/0x4c
> > [<ffffff8008d2c7fc>] deferred_put_nlk_sk+0x40/0x4c
> > [<ffffff800810b104>] rcu_process_callbacks+0x4d4/0x644
> > [<ffffff80080a6598>] __do_softirq+0x1b8/0x3c4
> > [<ffffff80080a6a60>] irq_exit+0x80/0xd4
> > [<ffffff800808e554>] handle_IPI+0x1c0/0x364
> > [<ffffff80080817f8>] gic_handle_irq+0x154/0x1a4
> 
> Right, Thomas commit looks buggy.
> 
> Unfortunately the proper infra was added later in commit 
> a4298e4522d687a79af8 ("net: add SOCK_RCU_FREE socket flag")
> 
> I guess we should backport it, then apply following (untested) fix
> 
> Could you test this solution ? Thanks !
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 62bea4591054..fe0d43314198 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -456,8 +456,9 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
>  
>  	rcu_read_lock();
>  	sk = __netlink_lookup(table, portid, net);
> -	if (sk)
> -		sock_hold(sk);
> +	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
> +		sk = NULL;
> +
>  	rcu_read_unlock();
>  
>  	return sk;
> @@ -581,6 +582,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
>  	}
>  	init_waitqueue_head(&nlk->wait);
>  
> +	sock_set_flag(sk, SOCK_RCU_FREE);
>  	sk->sk_destruct = netlink_sock_destruct;
>  	sk->sk_protocol = protocol;

Oh well, this wont work, since sk->sk_destruct will be called from RCU
callback.

Grabbing the mutex should not be done from netlink_sock_destruct() but
from netlink_release()

Maybe this patch would be better :

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 62bea4591054..cce10e3c9b68 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -324,16 +324,6 @@ static void netlink_skb_set_owner_r(struct sk_buff *skb, struct sock *sk)
 
 static void netlink_sock_destruct(struct sock *sk)
 {
-	struct netlink_sock *nlk = nlk_sk(sk);
-
-	if (nlk->cb_running) {
-		if (nlk->cb.done)
-			nlk->cb.done(&nlk->cb);
-
-		module_put(nlk->cb.module);
-		kfree_skb(nlk->cb.skb);
-	}
-
 	skb_queue_purge(&sk->sk_receive_queue);
 
 	if (!sock_flag(sk, SOCK_DEAD)) {
@@ -456,8 +446,9 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
 
 	rcu_read_lock();
 	sk = __netlink_lookup(table, portid, net);
-	if (sk)
-		sock_hold(sk);
+	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+		sk = NULL;
+
 	rcu_read_unlock();
 
 	return sk;
@@ -581,6 +572,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
 	}
 	init_waitqueue_head(&nlk->wait);
 
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_destruct = netlink_sock_destruct;
 	sk->sk_protocol = protocol;
 	return 0;
@@ -645,13 +637,6 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
-{
-	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
-
-	sock_put(&nlk->sk);
-}
-
 static int netlink_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
@@ -724,7 +709,19 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+	if (nlk->cb_running) {
+		mutex_lock(nlk->cb_mutex);
+		if (nlk->cb_running) {
+			if (nlk->cb.done)
+				nlk->cb.done(&nlk->cb);
+	
+			module_put(nlk->cb.module);
+			kfree_skb(nlk->cb.skb);
+			nlk->cb_running = false;
+		}
+		mutex_unlock(nlk->cb_mutex);
+	}
+	sock_put(sk);
 	return 0;
 }
 
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index 3cfd6cc60504..5dc08a7b0a2b 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -32,7 +32,6 @@ struct netlink_sock {
 	struct module		*module;
 
 	struct rhash_head	node;
-	struct rcu_head		rcu;
 };
 
 static inline struct netlink_sock *nlk_sk(struct sock *sk)

^ permalink raw reply related

* Re: Crash due to mutex genl_lock called from RCU context
From: Eric Dumazet @ 2016-11-26  4:11 UTC (permalink / raw)
  To: subashab; +Cc: tgraf, netdev
In-Reply-To: <c669cd06c3e50febd95b5ed4aaa78532@codeaurora.org>

On Fri, 2016-11-25 at 19:15 -0700, subashab@codeaurora.org wrote:
> We are seeing a crash due to gen_lock mutex being acquired in RCU 
> context.
> Crash is seen on a 4.4 based kernel ARM64 device. This occurred in a
> regression rack, so unfortunately I don't have steps for a reproducer.
> 
> It looks like freeing socket in RCU was brought in through commit
> 21e4902aea80ef35afc00ee8d2abdea4f519b7f7 ("netlink: Lockless lookup with
> RCU grace period in socket release").
> I am not very familiar with generic netlink sockets so I am not sure
> if there is any other way to fix this apart from reverting this patch.
> 
> Any pointers to debug this would be appreciated.
> 
> Here is the call stack -
> 
> BUG: sleeping function called from invalid context 
> kernel/locking/mutex.c:98
> in_atomic(): 1, irqs_disabled(): 0, pid: 16400, name: busybox
> [<ffffff80080cad20>] ___might_sleep+0x134/0x144
> [<ffffff80080cadac>] __might_sleep+0x7c/0x8c
> [<ffffff8008ef09a8>] mutex_lock+0x2c/0x4c
> [<ffffff8008d307f0>] genl_lock+0x1c/0x24
> [<ffffff8008d30848>] genl_lock_done+0x2c/0x50
> [<ffffff8008d2ccac>] netlink_sock_destruct+0x30/0x94
> [<ffffff8008cdef44>] sk_destruct+0x2c/0x150
> [<ffffff8008cdf104>] __sk_free+0x9c/0xc4
> [<ffffff8008cdf16c>] sk_free+0x40/0x4c
> [<ffffff8008d2c7fc>] deferred_put_nlk_sk+0x40/0x4c
> [<ffffff800810b104>] rcu_process_callbacks+0x4d4/0x644
> [<ffffff80080a6598>] __do_softirq+0x1b8/0x3c4
> [<ffffff80080a6a60>] irq_exit+0x80/0xd4
> [<ffffff800808e554>] handle_IPI+0x1c0/0x364
> [<ffffff80080817f8>] gic_handle_irq+0x154/0x1a4

Right, Thomas commit looks buggy.

Unfortunately the proper infra was added later in commit 
a4298e4522d687a79af8 ("net: add SOCK_RCU_FREE socket flag")

I guess we should backport it, then apply following (untested) fix

Could you test this solution ? Thanks !

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 62bea4591054..fe0d43314198 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -456,8 +456,9 @@ static struct sock *netlink_lookup(struct net *net, int protocol, u32 portid)
 
 	rcu_read_lock();
 	sk = __netlink_lookup(table, portid, net);
-	if (sk)
-		sock_hold(sk);
+	if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+		sk = NULL;
+
 	rcu_read_unlock();
 
 	return sk;
@@ -581,6 +582,7 @@ static int __netlink_create(struct net *net, struct socket *sock,
 	}
 	init_waitqueue_head(&nlk->wait);
 
+	sock_set_flag(sk, SOCK_RCU_FREE);
 	sk->sk_destruct = netlink_sock_destruct;
 	sk->sk_protocol = protocol;
 	return 0;
@@ -645,13 +647,6 @@ static int netlink_create(struct net *net, struct socket *sock, int protocol,
 	goto out;
 }
 
-static void deferred_put_nlk_sk(struct rcu_head *head)
-{
-	struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
-
-	sock_put(&nlk->sk);
-}
-
 static int netlink_release(struct socket *sock)
 {
 	struct sock *sk = sock->sk;
@@ -724,7 +719,7 @@ static int netlink_release(struct socket *sock)
 	local_bh_disable();
 	sock_prot_inuse_add(sock_net(sk), &netlink_proto, -1);
 	local_bh_enable();
-	call_rcu(&nlk->rcu, deferred_put_nlk_sk);
+	sock_put(sk);
 	return 0;
 }
 
diff --git a/net/netlink/af_netlink.h b/net/netlink/af_netlink.h
index 3cfd6cc60504..5dc08a7b0a2b 100644
--- a/net/netlink/af_netlink.h
+++ b/net/netlink/af_netlink.h
@@ -32,7 +32,6 @@ struct netlink_sock {
 	struct module		*module;
 
 	struct rhash_head	node;
-	struct rcu_head		rcu;
 };
 
 static inline struct netlink_sock *nlk_sk(struct sock *sk)

^ permalink raw reply related

* Xmas Offer
From: Mrs Julie Leach @ 2016-11-26  3:05 UTC (permalink / raw)
  To: Recipients

You are a recipient to Mrs Julie Leach Donation of $3 million USD. Contact ( julieleach93@gmail.com ) for claims.

^ permalink raw reply

* Re: [PATCH net 1/1] tipc: resolve connection flow control compatibility problem
From: David Miller @ 2016-11-26  2:38 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, parthasarathy.bhuvaragan, ying.xue, maloy,
	tipc-discussion
In-Reply-To: <1480031227-14836-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Thu, 24 Nov 2016 18:47:07 -0500

> In commit 10724cc7bb78 ("tipc: redesign connection-level flow control")
> we replaced the previous message based flow control with one based on
> 1k blocks. In order to ensure backwards compatibility the mechanism
> falls back to using message as base unit when it senses that the peer
> doesn't support the new algorithm. The default flow control window,
> i.e., how many units can be sent before the sender blocks and waits
> for an acknowledge (aka advertisement) is 512. This was tested against
> the previous version, which uses an acknowledge frequency of on ack per
> 256 received message, and found to work fine.
> 
> However, we missed the fact that versions older than Linux 3.15 use an
> acknowledge frequency of 512, which is exactly the limit where a 4.6+
> sender will stop and wait for acknowledge. This would also work fine if
> it weren't for the fact that if the first sent message on a 4.6+ server
> side is an empty SYNACK, this one is also is counted as a sent message,
> while it is not counted as a received message on a legacy 3.15-receiver.
> This leads to the sender always being one step ahead of the receiver, a
> scenario causing the sender to block after 512 sent messages, while the
> receiver only has registered 511 read messages. Hence, the legacy
> receiver is not trigged to send an acknowledge, with a permanently
> blocked sender as result.
> 
> We solve this deadlock by simply allowing the sender to send one more
> message before it blocks, i.e., by a making minimal change to the
> condition used for determining connection congestion.
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied, thanks Jon.

^ permalink raw reply

* Re: [patch net-next 00/19] mlxsw: traps, trap groups and policers
From: David Miller @ 2016-11-26  2:22 UTC (permalink / raw)
  To: jiri; +Cc: netdev, nogahf, idosch, eladr, yotamg, arkadis, ogerlitz
In-Reply-To: <1480066427-3961-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 25 Nov 2016 10:33:28 +0100

> From: Jiri Pirko <jiri@mellanox.com>
> 
> Nogah says:
> 
> For a packet to be sent from the HW to the cpu, it needs to be trapped.
> For a trap to be activate it should be assigned to a trap group.
> Those trap groups can have policers, to limit the packet rate (the max
> number of packets that can be sent to the cpu in a time slot, the rest
> will be discarded) or the data rate (the same, but the count is not by the
> number of packets but by their total length in bytes).
> 
> This patchset rearrange the trap setting API, re-write the traps and the
> trap groups list in spectrum and assign them policers.

Series applied, thanks.

^ permalink raw reply

* Crash due to mutex genl_lock called from RCU context
From: subashab @ 2016-11-26  2:15 UTC (permalink / raw)
  To: tgraf, netdev; +Cc: eric.dumazet

We are seeing a crash due to gen_lock mutex being acquired in RCU 
context.
Crash is seen on a 4.4 based kernel ARM64 device. This occurred in a
regression rack, so unfortunately I don't have steps for a reproducer.

It looks like freeing socket in RCU was brought in through commit
21e4902aea80ef35afc00ee8d2abdea4f519b7f7 ("netlink: Lockless lookup with
RCU grace period in socket release").
I am not very familiar with generic netlink sockets so I am not sure
if there is any other way to fix this apart from reverting this patch.

Any pointers to debug this would be appreciated.

Here is the call stack -

BUG: sleeping function called from invalid context 
kernel/locking/mutex.c:98
in_atomic(): 1, irqs_disabled(): 0, pid: 16400, name: busybox
[<ffffff80080cad20>] ___might_sleep+0x134/0x144
[<ffffff80080cadac>] __might_sleep+0x7c/0x8c
[<ffffff8008ef09a8>] mutex_lock+0x2c/0x4c
[<ffffff8008d307f0>] genl_lock+0x1c/0x24
[<ffffff8008d30848>] genl_lock_done+0x2c/0x50
[<ffffff8008d2ccac>] netlink_sock_destruct+0x30/0x94
[<ffffff8008cdef44>] sk_destruct+0x2c/0x150
[<ffffff8008cdf104>] __sk_free+0x9c/0xc4
[<ffffff8008cdf16c>] sk_free+0x40/0x4c
[<ffffff8008d2c7fc>] deferred_put_nlk_sk+0x40/0x4c
[<ffffff800810b104>] rcu_process_callbacks+0x4d4/0x644
[<ffffff80080a6598>] __do_softirq+0x1b8/0x3c4
[<ffffff80080a6a60>] irq_exit+0x80/0xd4
[<ffffff800808e554>] handle_IPI+0x1c0/0x364
[<ffffff80080817f8>] gic_handle_irq+0x154/0x1a4

^ permalink raw reply

* Re: [PATCH net] mvpp2: use correct size for memset
From: David Miller @ 2016-11-26  1:57 UTC (permalink / raw)
  To: arnd; +Cc: jszhang, mw, tremyfr, netdev, linux-kernel
In-Reply-To: <20161124162843.3849988-1-arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Thu, 24 Nov 2016 17:28:12 +0100

> gcc-7 detects a short memset in mvpp2, introduced in the original
> merge of the driver:
> 
> drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init':
> drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size]
> 
> The result seems to be that we write uninitialized data into the
> flow table registers, although we did not get any warning about
> that uninitialized data usage.
> 
> Using sizeof() lets us initialize then entire array instead.
> 
> Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit")
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] irda: fix overly long udelay()
From: David Miller @ 2016-11-26  1:56 UTC (permalink / raw)
  To: arnd; +Cc: samuel, netdev, linux-kernel
In-Reply-To: <20161124162630.3802535-1-arnd@arndb.de>

From: Arnd Bergmann <arnd@arndb.de>
Date: Thu, 24 Nov 2016 17:26:22 +0100

> irda_get_mtt() returns a hardcoded '10000' in some cases,
> and with gcc-7, we get a build error because this triggers a
> compile-time check in udelay():
> 
> drivers/net/irda/w83977af_ir.o: In function `w83977af_hard_xmit':
> w83977af_ir.c:(.text.w83977af_hard_xmit+0x14c): undefined reference to `__bad_udelay'
> 
> Older compilers did not run into this because they either did not
> completely inline the irda_get_mtt() or did not consider the
> 10000 value a constant expression.
> 
> The code has been wrong since the start of git history.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 ...
> @@ -518,7 +518,9 @@ static netdev_tx_t w83977af_hard_xmit(struct sk_buff *skb,
>  		
>  		mtt = irda_get_mtt(skb);
>  		pr_debug("%s(%ld), mtt=%d\n", __func__ , jiffies, mtt);
> -			if (mtt)
> +			if (mtt > 1000)
> +				mdelay(mtt/1000);
> +			else if (mtt)
>  				udelay(mtt);

I know this isn't caused by you, but wow what is going on with the
indentation here?!?!?

^ permalink raw reply

* Re: [PATCH V3 net-next 05/15] smc: CLC handshake (incl. preparation steps)
From: David Miller @ 2016-11-26  1:53 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, utz.bacher
In-Reply-To: <20161124150645.90881-6-ubraun@linux.vnet.ibm.com>

From: Ursula Braun <ubraun@linux.vnet.ibm.com>
Date: Thu, 24 Nov 2016 16:06:35 +0100

> +struct smc_clc_msg_hdr {	/* header1 of clc messages */
> +	u8 eyecatcher[4];	/* eye catcher */
> +	u8 type;		/* proposal / accept / confirm / decline */
> +	__be16 length;
> +#if defined(__BIG_ENDIAN_BITFIELD)
> +	u8 version : 4,
> +	   flag    : 1,
> +	   rsvd	   : 3;
> +#elif defined(__LITTLE_ENDIAN_BITFIELD)
> +	u8 rsvd    : 3,
> +	   flag    : 1,
> +	   version : 4;
> +#endif
> +} __packed;

Please get rid of all of these __packed attributes.  They are likely
completely unnecessary and the code generated for this construct on
certain architectures is amazingly inefficient.

Describe the on-wire datastructures properly with fixed sized types
and any padding or overlapping, as needed.

^ permalink raw reply

* Re: [PATCH V3 net-next 03/15] smc: establish pnet table management
From: David Miller @ 2016-11-26  1:51 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, utz.bacher
In-Reply-To: <20161124150645.90881-4-ubraun@linux.vnet.ibm.com>

From: Ursula Braun <ubraun@linux.vnet.ibm.com>
Date: Thu, 24 Nov 2016 16:06:33 +0100

> Connection creation with SMC-R starts through an internal
> TCP-connection. The Ethernet interface for this TCP-connection is not
> restricted to the Ethernet interface of a RoCE device. Any existing
> Ethernet interface belonging to the same physical net can be used, as
> long as there is a defined relation between the Ethernet interface and
> some RoCE devices. This relation is defined with the help of an
> identification string called "Physical Net ID" or short "pnet ID".
> Information about defined pnet IDs and their related Ethernet
> interfaces and RoCE devices is stored in the SMC-R pnet table.
> 
> This patch adds pnet table configuration support as a set of
> sysfs files listed under /sys/kernel/smc. Attribute files
> exist to add and delete pnet IDs and to map RoCE devices and
> ethernet interfaces to an individual pnet ID.
> 
> There is no cross check if ethernet interfaces or infiniband
> devices really exist in the system. This enables the configuration of
> the pnet table after module load even if interfaces or devices might
> not yet be available.
> 
> Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>

Please do not use sysfs to configure your subsystems IB and eth
mappings.

Instead use a properly formed genetlink interface that will properly
integrate with other subsystems, and also allow for proper event
monitoring.

I also fundamentally disagree with allowing references to devices that
do not even exist yet.  It implies a string --> device conversion in
places where that cost should be avoided.

^ permalink raw reply

* Re: [PATCH V3 net-next 02/15] smc: establish new socket family
From: David Miller @ 2016-11-26  1:47 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, utz.bacher
In-Reply-To: <20161124150645.90881-3-ubraun@linux.vnet.ibm.com>

From: Ursula Braun <ubraun@linux.vnet.ibm.com>
Date: Thu, 24 Nov 2016 16:06:32 +0100

> +static struct sock *smc_sock_alloc(struct net *net, struct socket *sock)
> +{
> +	struct smc_sock *smc;
> +	struct sock *sk;
> +
> +	sk = sk_alloc(net, PF_SMC, GFP_KERNEL, &smc_proto, 0);
> +	if (!sk)
> +		return NULL;
> +
> +	sock_init_data(sock, sk); /* sets sk_refcnt to 1 */
> +	sk->sk_state = SMC_INIT;
> +	sk->sk_destruct = smc_destruct;
> +	sk->sk_protocol = SMCPROTO_SMC;
> +	sk_refcnt_debug_inc(sk);
> +
> +	smc = smc_sk(sk);
> +	smc->clcsock = NULL;
> +	smc->use_fallback = 0;

This is unnecessary, sk_alloc() clears out the memory for you.

> +static int smc_bind(struct socket *sock, struct sockaddr *uaddr,
> +		    int addr_len)
> +{
 ...
> +	smc->clcsock->sk->sk_reuse = sk->sk_reuse;
> +	rc = kernel_bind(smc->clcsock, uaddr, addr_len);

Is it valid to assume smc->clcsock is not NULL right here?

> +struct smc_sock {				/* smc sock container */
> +	struct sock		sk;
> +	struct socket		*clcsock;	/* internal tcp socket */
> +	u8			use_fallback : 1; /* fallback to tcp */
> +};

Please use 'bool' and true/false for 'use_fallback'.

^ permalink raw reply

* Re: [PATCH V3 net-next 01/15] net: introduce keepalive function in struct proto
From: David Miller @ 2016-11-26  1:41 UTC (permalink / raw)
  To: ubraun; +Cc: netdev, linux-s390, schwidefsky, heiko.carstens, utz.bacher
In-Reply-To: <20161124150645.90881-2-ubraun@linux.vnet.ibm.com>

From: Ursula Braun <ubraun@linux.vnet.ibm.com>
Date: Thu, 24 Nov 2016 16:06:31 +0100

> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 3ea1cf8..9b1602a 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -617,6 +617,7 @@ void tcp_set_keepalive(struct sock *sk, int val)
>  	else if (!val)
>  		inet_csk_delete_keepalive_timer(sk);
>  }
> +EXPORT_SYMBOL(tcp_set_keepalive);

Please use EXPORT_SYMBOL_GPL().

^ permalink raw reply

* Re: [PATCH] net/mlx5: drop duplicate header delay.h
From: David Miller @ 2016-11-26  1:33 UTC (permalink / raw)
  To: geliangtang; +Cc: saeedm, matanb, leonro, netdev, linux-rdma, linux-kernel
In-Reply-To: <03d5a2a0f03458cdb4f2b139cfabc11b5c334f95.1479990943.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Thu, 24 Nov 2016 21:58:33 +0800

> Drop duplicate header delay.h from mlx5/core/main.c.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: ieee802154: drop duplicate header delay.h
From: David Miller @ 2016-11-26  1:33 UTC (permalink / raw)
  To: geliangtang; +Cc: michael.hennerich, aar, linux-wpan, netdev, linux-kernel
In-Reply-To: <f64943b9c1da12b6199ba745ba04cb477a05a5a3.1479991128.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Thu, 24 Nov 2016 21:58:32 +0800

> Drop duplicate header delay.h from adf7242.c.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH] ibmvnic: drop duplicate header seq_file.h
From: David Miller @ 2016-11-26  1:32 UTC (permalink / raw)
  To: geliangtang
  Cc: tlfalcon, jallen, benh, paulus, mpe, netdev, linuxppc-dev,
	linux-kernel
In-Reply-To: <9f2baee4e0a81ba8b5089a2b5a42f67678de4b06.1479989017.git.geliangtang@gmail.com>

From: Geliang Tang <geliangtang@gmail.com>
Date: Thu, 24 Nov 2016 21:58:29 +0800

> Drop duplicate header seq_file.h from ibmvnic.c.
> 
> Signed-off-by: Geliang Tang <geliangtang@gmail.com>

Applied.

^ permalink raw reply

* Re: [patch] fsl/fman: fix a leak in tgec_free()
From: David Miller @ 2016-11-26  1:30 UTC (permalink / raw)
  To: dan.carpenter; +Cc: madalin.bucur, igal.liberman, netdev, kernel-janitors
In-Reply-To: <20161124111931.GK17225@mwanda>

From: Dan Carpenter <dan.carpenter@oracle.com>
Date: Thu, 24 Nov 2016 14:20:43 +0300

> We set "tgec->cfg" to NULL before passing it to kfree().  There is no
> need to set it to NULL at all.  Let's just delete it.
> 
> Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>

Applied.

^ permalink raw reply

* Re: [patch] net/mlx5: remove a duplicate condition
From: David Miller @ 2016-11-26  1:30 UTC (permalink / raw)
  To: dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA
  Cc: saeedm-VPRAkNaXOzVWk0Htik3J/w, matanb-VPRAkNaXOzVWk0Htik3J/w,
	leonro-VPRAkNaXOzVWk0Htik3J/w, netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20161124110345.GC17225@mwanda>

From: Dan Carpenter <dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Date: Thu, 24 Nov 2016 14:03:45 +0300

> We verified that MLX5_FLOW_CONTEXT_ACTION_COUNT was set on the first
> line of the function so we don't need to check again here.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS
From: David Miller @ 2016-11-26  1:23 UTC (permalink / raw)
  To: mlichvar; +Cc: netdev, decot
In-Reply-To: <20161124095506.25791-1-mlichvar@redhat.com>

From: Miroslav Lichvar <mlichvar@redhat.com>
Date: Thu, 24 Nov 2016 10:55:06 +0100

> The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET
> command and likewise it shouldn't require the CAP_NET_ADMIN capability.
> 
> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>

Good catch, applied, thanks.

^ permalink raw reply

* Re: [PATCH 0/4] net: thunderx: Support for 80xx, RED, PFC e.t.c
From: David Miller @ 2016-11-26  1:21 UTC (permalink / raw)
  To: sunil.kovvuri; +Cc: netdev, linux-kernel, linux-arm-kernel, sgoutham
In-Reply-To: <1479979083-15963-1-git-send-email-sunil.kovvuri@gmail.com>

From: sunil.kovvuri@gmail.com
Date: Thu, 24 Nov 2016 14:47:59 +0530

> This patch series adds support for SLM modules present on 80xx
> silicon, enables ramdom early discard, backpressure generation,
> PFC and some ethtool changes to display supported link modes e.t.c.

Series applied to net-next.

^ permalink raw reply

* Re: [net-next] neigh: fix the loop index error in neigh dump
From: David Miller @ 2016-11-26  1:17 UTC (permalink / raw)
  To: zhangshengju; +Cc: netdev, dsa
In-Reply-To: <1479965109-14607-1-git-send-email-zhangshengju@cmss.chinamobile.com>

From: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Date: Thu, 24 Nov 2016 13:25:09 +0800

> Loop index in neigh dump function is not updated correctly under some
> circumstances, this patch will fix it.
> 
> Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>

If this bug is also in the 'net' tree you should target this patch
there.

Also, you must provide an appropriate "Fixes: " tag identifying
the commit which introduced this bug.

^ permalink raw reply

* Re: [PATCH net 1/1] tipc: improve sanity check for received domain records
From: David Miller @ 2016-11-26  1:07 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, parthasarathy.bhuvaragan, ying.xue, maloy,
	tipc-discussion
In-Reply-To: <1479962769-24239-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Wed, 23 Nov 2016 23:46:09 -0500

> In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we
> added a data area to the link monitor STATE messages under the
> assumption that previous versions did not use any such data area.
> 
> For versions older than Linux 4.3 this assumption is not correct. In
> those version, all STATE messages sent out from a node inadvertently
> contain a 16 byte data area containing a string; -a leftover from
> previous RESET messages which were using this during the setup phase.
> This string serves no purpose in STATE messages, and should no be there.
> 
> Unfortunately, this data area is delivered to the link monitor
> framework, where a sanity check catches that it is not a correct domain
> record, and drops it. It also issues a rate limited warning about the
> event.
> 
> Since such events occur much more frequently than anticipated, we now
> choose to remove the warning in order to not fill the kernel log with
> useless contents. We also make the sanity check stricter, to further
> reduce the risk that such data is inavertently admitted.
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied.

^ permalink raw reply

* Re: [PATCH net 1/1] tipc: fix compatibility bug in link monitoring
From: David Miller @ 2016-11-26  1:06 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, amar.nv005, parthasarathy.bhuvaragan, ying.xue, maloy,
	tipc-discussion
In-Reply-To: <1479953126-3552-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Wed, 23 Nov 2016 21:05:26 -0500

> commit 817298102b0b ("tipc: fix link priority propagation") introduced a
> compatibility problem between TIPC versions newer than Linux 4.6 and
> those older than Linux 4.4. In versions later than 4.4, link STATE
> messages only contain a non-zero link priority value when the sender
> wants the receiver to change its priority. This has the effect that the
> receiver resets itself in order to apply the new priority. This works
> well, and is consistent with the said commit.
> 
> However, in versions older than 4.4 a valid link priority is present in
> all sent link STATE messages, leading to cyclic link establishment and
> reset on the 4.6+ node.
> 
> We fix this by adding a test that the received value should not only
> be valid, but also differ from the current value in order to cause the
> receiving link endpoint to reset.
> 
> Reported-by: Amar Nv <amar.nv005@gmail.com>
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented
From: David Miller @ 2016-11-26  0:59 UTC (permalink / raw)
  To: andrew; +Cc: thomas.petazzoni, netdev
In-Reply-To: <1479942493-19784-1-git-send-email-andrew@lunn.ch>

From: Andrew Lunn <andrew@lunn.ch>
Date: Thu, 24 Nov 2016 00:08:13 +0100

> The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it
> actually does not. The hardware probably does support it, but there is
> no code to configure the filter. As a quick and simple fix, remove the
> flag. This will cause the core to fall back to promiscuous mode.
> 
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> Fixes: b50b72de2f2f ("net: mvneta: enable features before registering the driver")

Applied.

^ permalink raw reply

* Re: [PATCH v2 net] phy: fix error case of phy_led_triggers_(un)register
From: David Miller @ 2016-11-26  0:59 UTC (permalink / raw)
  To: Woojung.Huh; +Cc: zach.brown, netdev, f.fainelli, andrew
In-Reply-To: <9235D6609DB808459E95D78E17F2E43D40969DE2@CHN-SV-EXMX02.mchp-main.com>

From: <Woojung.Huh@microchip.com>
Date: Wed, 23 Nov 2016 23:10:33 +0000

> From: Woojung Huh <woojung.huh@microchip.com>
> 
> When phy_init_hw() fails at phy_attach_direct();
> - phy_detach() calls phy_led_triggers_unregister() without
>   previous call of phy_led_triggers_register().
> - still call phy_led_triggers_register() and cause memory leak.
> 
> Fixes: 2e0bc452f472 ("net: phy: leds: add support for led triggers on phy link state change")
> Signed-off-by: Woojung Huh <woojung.huh@microchip.com>

Applied to net-next.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox