Netdev List


* [PATCH net] MAINTAINERS: net: update sfc maintainers
From: Bert Kenward @ 2016-04-25 16:42 UTC
  To: David Miller, linux-kernel, netdev
  Cc: Edward Cree, Shradha Shah, Solarflare linux maintainers

Add myself and Edward Cree as maintainers.
Remove Shradha Shah, who is on extended leave.

Cc: David S. Miller <davem@davemloft.net>
Cc: Edward Cree <ecree@solarflare.com>
Cc: Shradha Shah <sshah@solarflare.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
---
 MAINTAINERS | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8491336..17ad615 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10014,7 +10014,8 @@ F:	drivers/infiniband/hw/ocrdma/
 
 SFC NETWORK DRIVER
 M:	Solarflare linux maintainers <linux-net-drivers@solarflare.com>
-M:	Shradha Shah <sshah@solarflare.com>
+M:	Edward Cree <ecree@solarflare.com>
+M:	Bert Kenward <bkenward@solarflare.com>
 L:	netdev@vger.kernel.org
 S:	Supported
 F:	drivers/net/ethernet/sfc/
-- 
2.5.5


* Re: [PATCH net] MAINTAINERS: net: update sfc maintainers
From: Edward Cree @ 2016-04-25 17:02 UTC
  To: Bert Kenward, David Miller, linux-kernel, netdev
  Cc: Shradha Shah, Solarflare linux maintainers
In-Reply-To: <571E48E4.4050106@solarflare.com>

On 25/04/16 17:42, Bert Kenward wrote:
> Add myself and Edward Cree as maintainers.
> Remove Shradha Shah, who is on extended leave.
>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Edward Cree <ecree@solarflare.com>
> Cc: Shradha Shah <sshah@solarflare.com>
> Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Acked-by: Edward Cree <ecree@solarflare.com>
> ---
>  MAINTAINERS | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 8491336..17ad615 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -10014,7 +10014,8 @@ F:	drivers/infiniband/hw/ocrdma/
>  
>  SFC NETWORK DRIVER
>  M:	Solarflare linux maintainers <linux-net-drivers@solarflare.com>
> -M:	Shradha Shah <sshah@solarflare.com>
> +M:	Edward Cree <ecree@solarflare.com>
> +M:	Bert Kenward <bkenward@solarflare.com>
>  L:	netdev@vger.kernel.org
>  S:	Supported
>  F:	drivers/net/ethernet/sfc/


* Re: [PATCH net-next] soreuseport: Resolve merge conflict for v4/v6 ordering fix
From: David Miller @ 2016-04-25 17:28 UTC
  To: kraigatgoog; +Cc: netdev
In-Reply-To: <1461595332-16994-1-git-send-email-kraigatgoog@gmail.com>

From: Craig Gallek <kraigatgoog@gmail.com>
Date: Mon, 25 Apr 2016 10:42:12 -0400

> From: Craig Gallek <kraig@google.com>
> 
> d894ba18d4e4 ("soreuseport: fix ordering for mixed v4/v6 sockets")
> was merged as a bug fix to the net tree.  Two conflicting changes
> were committed to net-next before the above fix was merged back to
> net-next:
> ca065d0cf80f ("udp: no longer use SLAB_DESTROY_BY_RCU")
> 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood")
> 
> These changes switched the datastructure used for TCP and UDP sockets
> from hlist_nulls to hlist.  This patch applies the necessary parts
> of the net tree fix to net-next which were not automatic as part of the
> merge.
> 
> Fixes: 1602f49b58ab ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> Signed-off-by: Craig Gallek <kraig@google.com>

Applied, thanks for taking care of this Craig.


* Re: qdisc spin lock
From: Michael Ma @ 2016-04-25 17:29 UTC
  To: Eric Dumazet; +Cc: Cong Wang, Linux Kernel Network Developers
In-Reply-To: <CAAmHdhx_Db3GMCmwn3UJajP7_se6tRHPGk_fQUDgDWDq5hN34A@mail.gmail.com>

2016-04-21 15:12 GMT-07:00 Michael Ma <make0818@gmail.com>:
> 2016-04-21 5:41 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>> On Wed, 2016-04-20 at 22:51 -0700, Michael Ma wrote:
>>> 2016-04-20 15:34 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>>> > On Wed, 2016-04-20 at 14:24 -0700, Michael Ma wrote:
>>> >> 2016-04-08 7:19 GMT-07:00 Eric Dumazet <eric.dumazet@gmail.com>:
>>> >> > On Thu, 2016-03-31 at 16:48 -0700, Michael Ma wrote:
>>> >> >> I didn't really know that multiple qdiscs can be isolated using MQ so
>>> >> >> that each txq can be associated with a particular qdisc. Also we don't
>>> >> >> really have multiple interfaces...
>>> >> >>
>>> >> >> With this MQ solution we'll still need to assign transmit queues to
>>> >> >> different classes by doing some math on the bandwidth limit if I
>>> >> >> understand correctly, which seems to be less convenient compared with
>>> >> >> a solution purely within HTB.
>>> >> >>
>>> >> >> I assume that with this solution I can still share qdisc among
>>> >> >> multiple transmit queues - please let me know if this is not the case.
>>> >> >
>>> >> > Note that this MQ + HTB thing works well, unless you use a bonding
>>> >> > device. (Or you need the MQ+HTB on the slaves, with no way of sharing
>>> >> > tokens between the slaves)
>>> >>
>>> >> Actually MQ+HTB works well for small packets - like flow of 512 byte
>>> >> packets can be throttled by HTB using one txq without being affected
>>> >> by other flows with small packets. However I found using this solution
>>> >> large packets (10k for example) will only achieve very limited
>>> >> bandwidth. In my test I used MQ to assign one txq to a HTB which sets
>>> >> rate at 1Gbit/s, 512 byte packets can achieve the ceiling rate by
>>> >> using 30 threads. But sending 10k packets using 10 threads has only 10
>>> >> Mbit/s with the same TC configuration. If I increase burst and cburst
>>> >> of HTB to some extreme large value (like 50MB) the ceiling rate can be
>>> >> hit.
>>> >>
>>> >> The strange thing is that I don't see this problem when using HTB as
>>> >> the root. So txq number seems to be a factor here - however it's
>>> >> really hard to understand why would it only affect larger packets. Is
>>> >> this a known issue? Any suggestion on how to investigate the issue
>>> >> further? Profiling shows that the cpu utilization is pretty low.
>>> >
>>> > You could try
>>> >
>>> > perf record -a -g -e skb:kfree_skb sleep 5
>>> > perf report
>>> >
>>> > So that you see where the packets are dropped.
>>> >
>>> > Chances are that your UDP sockets SO_SNDBUF is too big, and packets are
>>> > dropped at qdisc enqueue time, instead of having backpressure.
>>> >
>>>
>>> Thanks for the hint - how should I read the perf report? Also we're
>>> using TCP socket in this testing - TCP window size is set to 70kB.
>>
>> But how are you telling TCP to send 10k packets ?
>>
> We just write to the socket with 10k buffer and wait for a response
> from the server (using read()) before the next write. Using tcpdump I
> can see the 10k write is actually sent through 3 packets
> (7.3k/1.5k/1.3k).
>
>> AFAIK you can not : TCP happily aggregates packets in write queue
>> (see current MSG_EOR discussion)
>>
>> I suspect a bug in your tc settings.
>>
>>
>
> Could you help to check my tc setting?
>
> sudo tc qdisc add dev eth0 root mqprio num_tc 6 map 0 1 2 3 4 5 0 0
> queues 19@0 1@19 1@20 1@21 1@22 1@23 hw 0
> sudo tc qdisc add dev eth0 parent 805a:1a handle 8001:0 htb default 10
> sudo tc class add dev eth0 parent 8001: classid 8001:10 htb rate 1000Mbit
>
> I didn't set r2q/burst/cburst/mtu/mpu so the default value should be used.

Just to circle back on this - it seems there is sometimes a 200 ms
delay during data push, which stalls the sending:

01:34:44.046232 IP (tos 0x0, ttl  64, id 2863, offset 0, flags [DF],
proto: TCP (6), length: 8740) 10.101.197.75.59126 >
10.101.197.105.redwood-broker: . 250025:258713(8688) ack 1901 win 58
<nop,nop,timestamp 507571833 196626529>
01:34:44.046304 IP (tos 0x0, ttl  64, id 15420, offset 0, flags [DF],
proto: TCP (6), length: 52) 10.101.197.105.redwood-broker >
10.101.197.75.59126: ., cksum 0x187d (correct), 1901:1901(0) ack
258713 win 232 <nop,nop,timestamp 196626529 507571833>
01:34:44.247184 IP (tos 0x0, ttl  64, id 2869, offset 0, flags [DF],
proto: TCP (6), length: 1364) 10.101.197.75.59126 >
10.101.197.105.redwood-broker: P 258713:260025(1312) ack 1901 win 58
<nop,nop,timestamp 507571833 196626529>
01:34:44.247186 IP (tos 0x0, ttl  64, id 2870, offset 0, flags [DF],
proto: TCP (6), length: 1364) 10.101.197.75.59126 >
10.101.197.105.redwood-broker: P 258713:260025(1312) ack 1901 win 58
<nop,nop,timestamp 507572034 196626529>

At 44.046s there was an ACK from the iperf server (10.101.197.105) for
a previously sent packet of size 8740; then, exactly 200 ms later
(44.247s above), two identical packets were pushed from the client
(10.101.197.75). It looks like some TCP timer was triggered - however,
disabling Nagle or delayed ACK doesn't help. So maybe TC delayed the
first packet and, for some reason, it was only sent after 200 ms,
together with the retransmitted one.

As I mentioned before, setting burst/cburst to 50MB eliminates this
problem. Setting the TCP receive window on the server side to some
value from 4k to 12k also solves the issue - but judging from tcpdump,
this might just cause the packets to be segmented further, so that
they are not gated significantly by HTB. Using TCP window auto-scaling
doesn't help.

It all looks like HTB delayed a packet when the rate limit was hit (I
did a manual computation and found that the timing matches), and
instead of being sent by a TC timer (which should fire after much less
than 200 ms - 100 ms?), the packet was sent when TCP decided to
retransmit it.
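
A back-of-envelope check of the timing discussed above, as a minimal
sketch: it compares the time needed to serialize one 8740-byte packet at
the configured 1000Mbit rate (about 70 us) with the gap between the ACK
at 01:34:44.046304 and the next data packet at 01:34:44.247184
(200880 us). The helper names wire_time_us() and gap_us() are
hypothetical and exist only for this illustration; the sketch says
nothing about which timer is responsible.

```c
#include <stdint.h>

/* Microseconds needed to serialize `bytes` at `rate_bps` bits per second. */
static double wire_time_us(uint64_t bytes, double rate_bps)
{
	return (double)bytes * 8.0 / rate_bps * 1e6;
}

/* Gap in microseconds between two capture timestamps, each given as
 * whole seconds plus a microsecond part (hypothetical helper).
 */
static long gap_us(long s1, long us1, long s2, long us2)
{
	return (s2 - s1) * 1000000L + (us2 - us1);
}
```

Calling wire_time_us(8740, 1e9) and gap_us(44, 46304, 44, 247184) shows
that the observed stall is roughly three orders of magnitude longer than
the serialization time of the packet at the configured rate.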


* Re: [PATCH] [RFC] net: dsa: mv88e6xxx: Pre-initialize err in mv88e6xxx_port_bridge_join()
From: Vivien Didelot @ 2016-04-25 17:31 UTC
  To: Geert Uytterhoeven
  Cc: David S. Miller, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAMuHMdVzeR9iMAJPwZj3HV56XwYcc_+eAhx+YiCN33Do3tsJSg@mail.gmail.com>

Hi Geert,

Geert Uytterhoeven <geert@linux-m68k.org> writes:

> On Mon, Apr 25, 2016 at 5:03 PM, Vivien Didelot
> <vivien.didelot@savoirfairelinux.com> wrote:
>> Geert Uytterhoeven <geert@linux-m68k.org> writes:
>>> drivers/net/dsa/mv88e6xxx.c: In function ‘mv88e6xxx_port_bridge_join’:
>>> drivers/net/dsa/mv88e6xxx.c:2184: warning: ‘err’ may be used uninitialized in this function
>>
>> Interesting, I don't have those warnings on 207afda1b5036009...
>
> It depends on the compiler version (still using 4.1.2) and options.
>
>>> If netdev_notifier_changeupper_info.upper_dev is ever NULL, the bridge
>>> parameter will be NULL too, and the function will return an
>>> uninitialized value.
>>>
>>> Pre-initialize err to zero to fix this.
>>>
>>> Fixes: 207afda1b5036009 ("net: dsa: mv88e6xxx: share the same default FDB")
>>> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>>> ---
>>> Can this actually happen?
>>
>> bridge cannot be NULL here. Also ps->ports[port].bridge_dev is assigned
>> to it before entering the for loop, so _mv88e6xxx_port_based_vlan_map
>> will be called at least for this port.
>
> But there's no way the compiler can know that...

Or maybe it can, in newer configurations. Anyway, this fix doesn't
hurt; with a relevant commit message, I'd ack it.

Thanks,

        Vivien
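
The pattern under discussion can be shown in a reduced userspace form.
This is only a sketch under assumed names (struct port, update_port(),
bridge_join(), NUM_PORTS are all hypothetical), not the driver code: the
return value is assigned only inside a conditional in a loop, so the
compiler cannot prove it is always set unless it starts out initialized.

```c
#include <stddef.h>

#define NUM_PORTS 4	/* hypothetical port count for this sketch */

struct port {
	const void *bridge_dev;	/* bridge this port belongs to, or NULL */
};

/* Stand-in for a per-port update that can fail; always succeeds here. */
static int update_port(struct port *p)
{
	(void)p;
	return 0;
}

/* Returns 0 on success or the first per-port error.
 * err starts at 0 so the return value is defined even when no port
 * matches `bridge` and the loop body never assigns it.
 */
static int bridge_join(struct port *ports, const void *bridge)
{
	int err = 0;
	size_t i;

	for (i = 0; i < NUM_PORTS; i++) {
		if (ports[i].bridge_dev == bridge) {
			err = update_port(&ports[i]);
			if (err)
				break;
		}
	}
	return err;
}
```

Even when the caller guarantees that at least one port matches, that
guarantee is not visible inside the function, which is why older
compilers may warn without the initializer.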


* Re: Warning triggered by lockdep checks for sock_owned_by_user on linux-next-20160420
From: Shi, Yang @ 2016-04-25 17:32 UTC
  To: Eric Dumazet; +Cc: David S. Miller, hannes, LKML, Network Development
In-Reply-To: <1461387028.7627.39.camel@edumazet-glaptop3.roam.corp.google.com>

On 4/22/2016 9:50 PM, Eric Dumazet wrote:
> On Fri, 2016-04-22 at 21:02 -0700, Shi, Yang wrote:
>> Hi David,
>>
>> When I ran some test on a nfs mounted rootfs, I got the below warning
>> with LOCKDEP enabled on linux-next-20160420:
>>
>> WARNING: CPU: 9 PID: 0 at include/net/sock.h:1408
>> udp_queue_rcv_skb+0x3d0/0x660
>> Modules linked in:
>> CPU: 9 PID: 0 Comm: swapper/9 Tainted: G      D
>> 4.6.0-rc4-next-20160420-WR7.0.0.0_standard+ #6
>> Hardware name: Intel Corporation S5520HC/S5520HC, BIOS
>> S5500.86B.01.10.0025.030220091519 03/02/2009
>>    0000000000000000 ffff88066fd03a70 ffffffff8155855f 0000000000000000
>>    0000000000000000 ffff88066fd03ab0 ffffffff81062803 0000058061318ec8
>>    ffff88065d1e39c0 ffff880661318e40 0000000000000000 ffff880661318ec8
>> Call Trace:
>>    <IRQ>  [<ffffffff8155855f>] dump_stack+0x67/0x98
>>    [<ffffffff81062803>] __warn+0xd3/0xf0
>>    [<ffffffff810628ed>] warn_slowpath_null+0x1d/0x20
>>    [<ffffffff81aa48f0>] udp_queue_rcv_skb+0x3d0/0x660
>>    [<ffffffff81aa505c>] __udp4_lib_rcv+0x4dc/0xc00
>>    [<ffffffff81aa5b5a>] udp_rcv+0x1a/0x20
>>    [<ffffffff81a728a1>] ip_local_deliver_finish+0xd1/0x2e0
>>    [<ffffffff81a7280f>] ? ip_local_deliver_finish+0x3f/0x2e0
>>    [<ffffffff81a73262>] ip_local_deliver+0xc2/0xd0
>>    [<ffffffff81a72c92>] ip_rcv_finish+0x1e2/0x5a0
>>    [<ffffffff81a7354c>] ip_rcv+0x2dc/0x410
>>    [<ffffffff81a20a32>] ? __pskb_pull_tail+0x82/0x400
>>    [<ffffffff81a2e188>] __netif_receive_skb_core+0x3a8/0xa80
>>    [<ffffffff81a30b9b>] ? netif_receive_skb_internal+0x1b/0xf0
>>    [<ffffffff81a30b3d>] __netif_receive_skb+0x1d/0x60
>>    [<ffffffff81a30bd5>] netif_receive_skb_internal+0x55/0xf0
>>    [<ffffffff81a30b9b>] ? netif_receive_skb_internal+0x1b/0xf0
>>    [<ffffffff81a31b52>] napi_gro_receive+0xc2/0x180
>>    [<ffffffff8187188a>] igb_poll+0x5ea/0xdf0
>>    [<ffffffff81a32b9c>] net_rx_action+0x15c/0x3d0
>>    [<ffffffff81c668c1>] __do_softirq+0x161/0x413
>>    [<ffffffff810683a1>] irq_exit+0xd1/0x110
>>    [<ffffffff81c664d2>] do_IRQ+0x62/0xf0
>>    [<ffffffff81c6474e>] common_interrupt+0x8e/0x8e
>>    <EOI>  [<ffffffff8198d9c6>] ? cpuidle_enter_state+0xc6/0x290
>>    [<ffffffff8198dbc7>] cpuidle_enter+0x17/0x20
>>    [<ffffffff810aa963>] call_cpuidle+0x33/0x50
>>    [<ffffffff810aace9>] cpu_startup_entry+0x229/0x3b0
>>    [<ffffffff810407e4>] start_secondary+0x144/0x150
>> ---[ end trace ba508c424f0d52bf ]---
>>
>>
>> The warning is triggered by commit
>> fafc4e1ea1a4c1eb13a30c9426fb799f5efacbc3 ("sock: tigthen lockdep checks
>> for sock_owned_by_user"), which checks if slock is held before locking
>> "owned".
>>
>> It looks good to lock_sock which is just called lock_sock_nested. But,
>> bh_lock_sock is different, which just calls spin_lock so it doesn't
>> touch dep_map then the check will fail even though it is locked.
>
> ?? spin_lock() definitely is lockdep friendly.

Yes, this is what I thought too. But I couldn't figure out why the
warning was still reported even though spin_lock is called.

>
>>
>> So, I'm wondering what a right fix for it should be:
>>
>> 1. Replace bh_lock_sock to bh_lock_sock_nested in the protocols
>> implementation, but there are a lot places calling it.
>>
>> 2. Just like lock_sock, just call bh_lock_sock_nested instead of spin_lock.
>>
>> Or the both approach is wrong or not ideal?
>
> I sent a patch yesterday, I am not sure what the status is.

Thanks for the patch. I just found your original patch and the
discussion with Valdis. I think I ran into the same problem. A kernel
BUG is triggered before the warning, but the "lockdep is off"
information is not printed out, although it really is off.

Just tried your patch, it works for me.

Thanks,
Yang

>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index d997ec13a643..db8301c76d50 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -1350,7 +1350,8 @@ static inline bool lockdep_sock_is_held(const struct sock *csk)
>   {
>   	struct sock *sk = (struct sock *)csk;
>
> -	return lockdep_is_held(&sk->sk_lock) ||
> +	return !debug_locks ||
> +	       lockdep_is_held(&sk->sk_lock) ||
>   	       lockdep_is_held(&sk->sk_lock.slock);
>   }
>   #endif
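
The idea behind the quoted change can be sketched in userspace C. All
names here (checker_enabled, struct tracked_lock, checker_is_held(),
lock_is_held_for_assert()) are hypothetical stand-ins; the sketch only
assumes that, once the lock checker has been switched off, its
bookkeeping is stale and must not drive assertions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's debug_locks flag, which is
 * assumed here to be cleared once the lock checker stops tracking.
 */
static bool checker_enabled = true;

struct tracked_lock {
	bool held;	/* what the checker believes; stale once disabled */
};

static bool checker_is_held(const struct tracked_lock *l)
{
	return l->held;
}

/* Report "held" whenever the checker is off, so assertions built on
 * this helper cannot fire on stale tracking data.
 */
static bool lock_is_held_for_assert(const struct tracked_lock *a,
				    const struct tracked_lock *b)
{
	return !checker_enabled ||
	       checker_is_held(a) ||
	       checker_is_held(b);
}
```

With the checker enabled the helper behaves as before; with it disabled
every assertion that uses the helper passes, which matches the behaviour
reported above (the warning disappears once the earlier BUG has turned
lock tracking off).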


* Re: [PATCH net-next] soreuseport: Resolve merge conflict for v4/v6 ordering fix
From: Eric Dumazet @ 2016-04-25 17:33 UTC
  To: Craig Gallek; +Cc: davem, netdev
In-Reply-To: <1461595332-16994-1-git-send-email-kraigatgoog@gmail.com>

On Mon, 2016-04-25 at 10:42 -0400, Craig Gallek wrote:
> From: Craig Gallek <kraig@google.com>
...
>  static inline void __sk_nulls_add_node_rcu(struct sock *sk, struct hlist_nulls_head *list)
> diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
> index fcadb670f50b..b76b0d7e59c1 100644
> --- a/net/ipv4/inet_hashtables.c
> +++ b/net/ipv4/inet_hashtables.c
> @@ -479,7 +479,11 @@ int __inet_hash(struct sock *sk, struct sock *osk,
>  		if (err)
>  			goto unlock;
>  	}
> -	hlist_add_head_rcu(&sk->sk_node, &ilb->head);
> +	if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport &&
> +		sk->sk_family == AF_INET6)

Nit: the alignment is wrong here.

cond1 and cond2 should be aligned as in:

if (cond1 &&
    cond2)


> +		hlist_add_tail_rcu(&sk->sk_node, &ilb->head);
> +	else
> +		hlist_add_head_rcu(&sk->sk_node, &ilb->head);
>  	sock_set_flag(sk, SOCK_RCU_FREE);
>  	sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1);
>  unlock:
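
For reference, the alignment rule from the nit above applied to a
condition of the same shape. This is a standalone sketch: struct
sock_view, FAMILY_V6 and add_at_tail() are hypothetical names, not
kernel definitions.

```c
#include <stdbool.h>

struct sock_view {	/* hypothetical, reduced view of a socket */
	bool reuseport;
	int family;
};

#define FAMILY_V6 10	/* placeholder value for this sketch */

/* True when the socket should be appended at the tail of the list. */
static bool add_at_tail(const struct sock_view *sk, bool ipv6_enabled)
{
	/* The continuation line starts in the column right after "if (". */
	if (ipv6_enabled && sk->reuseport &&
	    sk->family == FAMILY_V6)
		return true;
	return false;
}
```

The continuation is indented with one tab plus four spaces, so the
second line of the condition lines up with the first operand rather
than with the body of the if.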


* [PATCH net-next 1/2] net: SOCKWQ_ASYNC_NOSPACE optimizations
From: Eric Dumazet @ 2016-04-25 17:39 UTC
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
In-Reply-To: <1461605974-4242-1-git-send-email-edumazet@google.com>

SOCKWQ_ASYNC_NOSPACE is tested in sock_wake_async()
so that a SIGIO signal is sent when needed.

tcp_sendmsg() clears the bit.
tcp_poll() sets the bit when stream is not writeable.

We can avoid two atomic operations by first checking if socket
is actually interested in the FASYNC business (most sockets in
real applications do not use AIO, but select()/poll()/epoll())

This also removes one cache line miss to access sk->sk_wq->flags
in tcp_sendmsg()

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index d63b8494124e..0f48aad9f8e8 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1940,11 +1940,17 @@ static inline unsigned long sock_wspace(struct sock *sk)
  */
 static inline void sk_set_bit(int nr, struct sock *sk)
 {
+	if (nr == SOCKWQ_ASYNC_NOSPACE && !sock_flag(sk, SOCK_FASYNC))
+		return;
+
 	set_bit(nr, &sk->sk_wq_raw->flags);
 }
 
 static inline void sk_clear_bit(int nr, struct sock *sk)
 {
+	if (nr == SOCKWQ_ASYNC_NOSPACE && !sock_flag(sk, SOCK_FASYNC))
+		return;
+
 	clear_bit(nr, &sk->sk_wq_raw->flags);
 }
 
-- 
2.8.0.rc3.226.g39d4020
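
The optimization in the patch above can be illustrated in userspace with
C11 atomics. This is a minimal sketch under assumed names (struct
sk_sketch, WQ_ASYNC_NOSPACE, WQ_OTHER, sketch_set_bit(),
sketch_clear_bit() are all hypothetical): a cheap plain read of a
per-socket flag decides whether the atomic read-modify-write on the
shared flags word is needed at all.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum { WQ_ASYNC_NOSPACE = 0, WQ_OTHER = 1 };	/* hypothetical bit numbers */

struct sk_sketch {
	bool fasync;		/* plain flag, cheap to read */
	atomic_ulong wq_flags;	/* shared flags word, costly to modify */
};

static void sketch_set_bit(int nr, struct sk_sketch *sk)
{
	/* Nobody will look at this bit unless FASYNC was requested. */
	if (nr == WQ_ASYNC_NOSPACE && !sk->fasync)
		return;

	atomic_fetch_or(&sk->wq_flags, 1UL << nr);
}

static void sketch_clear_bit(int nr, struct sk_sketch *sk)
{
	if (nr == WQ_ASYNC_NOSPACE && !sk->fasync)
		return;

	atomic_fetch_and(&sk->wq_flags, ~(1UL << nr));
}
```

Other bits still go through the atomic path unconditionally; only the
bit whose sole consumer is the FASYNC signalling code is skipped.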


* [PATCH v2 net-next] tcp-tso: do not split TSO packets at retransmit time
From: Eric Dumazet @ 2016-04-25 17:39 UTC
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
In-Reply-To: <1461605974-4242-1-git-send-email-edumazet@google.com>

Linux TCP stack painfully segments all TSO/GSO packets before retransmits.

This was fine back in the days when TSO/GSO were emerging, with their
bugs, but we believe the dark age is over.

Keeping big packets in write queues, but also in stack traversal
has a lot of benefits.
 - Less memory overhead, because write queues have less skbs
 - Less cpu overhead at ACK processing.
 - Better SACK processing, as lot of studies mentioned how
   awful linux was at this ;)
 - Less cpu overhead to send the rtx packets
   (IP stack traversal, netfilter traversal, drivers...)
 - Better latencies in presence of losses.
 - Smaller spikes in fq like packet schedulers, as retransmits
   are not constrained by TCP Small Queues.

1 % packet losses are common today, and at 100Gbit speeds, this
translates to ~80,000 losses per second.
Losses are often correlated, and we see many retransmit events
leading to 1-MSS train of packets, at the time hosts are already
under stress.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
---
 include/net/tcp.h     |  4 ++--
 net/ipv4/tcp_input.c  |  2 +-
 net/ipv4/tcp_output.c | 64 +++++++++++++++++++++++----------------------------
 net/ipv4/tcp_timer.c  |  4 ++--
 4 files changed, 34 insertions(+), 40 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index fd40f8c64d5f..0dc272dcd772 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -538,8 +538,8 @@ __u32 cookie_v6_init_sequence(const struct sk_buff *skb, __u16 *mss);
 void __tcp_push_pending_frames(struct sock *sk, unsigned int cur_mss,
 			       int nonagle);
 bool tcp_may_send_now(struct sock *sk);
-int __tcp_retransmit_skb(struct sock *, struct sk_buff *);
-int tcp_retransmit_skb(struct sock *, struct sk_buff *);
+int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs);
+int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs);
 void tcp_retransmit_timer(struct sock *sk);
 void tcp_xmit_retransmit_queue(struct sock *);
 void tcp_simple_retransmit(struct sock *);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 90e0d9256b74..729e489b5608 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5543,7 +5543,7 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack,
 	if (data) { /* Retransmit unacked data in SYN */
 		tcp_for_write_queue_from(data, sk) {
 			if (data == tcp_send_head(sk) ||
-			    __tcp_retransmit_skb(sk, data))
+			    __tcp_retransmit_skb(sk, data, 1))
 				break;
 		}
 		tcp_rearm_rto(sk);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 6451b83d81e9..4876b256a70a 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2266,7 +2266,7 @@ void tcp_send_loss_probe(struct sock *sk)
 	if (WARN_ON(!skb || !tcp_skb_pcount(skb)))
 		goto rearm_timer;
 
-	if (__tcp_retransmit_skb(sk, skb))
+	if (__tcp_retransmit_skb(sk, skb, 1))
 		goto rearm_timer;
 
 	/* Record snd_nxt for loss detection. */
@@ -2551,17 +2551,17 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *to,
  * state updates are done by the caller.  Returns non-zero if an
  * error occurred which prevented the send.
  */
-int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
+int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_connection_sock *icsk = inet_csk(sk);
+	struct tcp_sock *tp = tcp_sk(sk);
 	unsigned int cur_mss;
-	int err;
+	int diff, len, err;
+
 
-	/* Inconslusive MTU probe */
-	if (icsk->icsk_mtup.probe_size) {
+	/* Inconclusive MTU probe */
+	if (icsk->icsk_mtup.probe_size)
 		icsk->icsk_mtup.probe_size = 0;
-	}
 
 	/* Do not sent more than we queued. 1/4 is reserved for possible
 	 * copying overhead: fragmentation, tunneling, mangling etc.
@@ -2594,30 +2594,27 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	    TCP_SKB_CB(skb)->seq != tp->snd_una)
 		return -EAGAIN;
 
-	if (skb->len > cur_mss) {
-		if (tcp_fragment(sk, skb, cur_mss, cur_mss, GFP_ATOMIC))
+	len = cur_mss * segs;
+	if (skb->len > len) {
+		if (tcp_fragment(sk, skb, len, cur_mss, GFP_ATOMIC))
 			return -ENOMEM; /* We'll try again later. */
 	} else {
-		int oldpcount = tcp_skb_pcount(skb);
+		if (skb_unclone(skb, GFP_ATOMIC))
+			return -ENOMEM;
 
-		if (unlikely(oldpcount > 1)) {
-			if (skb_unclone(skb, GFP_ATOMIC))
-				return -ENOMEM;
-			tcp_init_tso_segs(skb, cur_mss);
-			tcp_adjust_pcount(sk, skb, oldpcount - tcp_skb_pcount(skb));
-		}
+		diff = tcp_skb_pcount(skb);
+		tcp_set_skb_tso_segs(skb, cur_mss);
+		diff -= tcp_skb_pcount(skb);
+		if (diff)
+			tcp_adjust_pcount(sk, skb, diff);
+		if (skb->len < cur_mss)
+			tcp_retrans_try_collapse(sk, skb, cur_mss);
 	}
 
 	/* RFC3168, section 6.1.1.1. ECN fallback */
 	if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN_ECN) == TCPHDR_SYN_ECN)
 		tcp_ecn_clear_syn(sk, skb);
 
-	tcp_retrans_try_collapse(sk, skb, cur_mss);
-
-	/* Make a copy, if the first transmission SKB clone we made
-	 * is still in somebody's hands, else make a clone.
-	 */
-
 	/* make sure skb->data is aligned on arches that require it
 	 * and check if ack-trimming & collapsing extended the headroom
 	 * beyond what csum_start can cover.
@@ -2633,20 +2630,22 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
 	}
 
 	if (likely(!err)) {
+		segs = tcp_skb_pcount(skb);
+
 		TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
 		/* Update global TCP statistics. */
-		TCP_INC_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS);
+		TCP_ADD_STATS(sock_net(sk), TCP_MIB_RETRANSSEGS, segs);
 		if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN)
 			NET_INC_STATS_BH(sock_net(sk), LINUX_MIB_TCPSYNRETRANS);
-		tp->total_retrans++;
+		tp->total_retrans += segs;
 	}
 	return err;
 }
 
-int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb)
+int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
-	int err = __tcp_retransmit_skb(sk, skb);
+	int err = __tcp_retransmit_skb(sk, skb, segs);
 
 	if (err == 0) {
 #if FASTRETRANS_DEBUG > 0
@@ -2737,6 +2736,7 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 
 	tcp_for_write_queue_from(skb, sk) {
 		__u8 sacked = TCP_SKB_CB(skb)->sacked;
+		int segs;
 
 		if (skb == tcp_send_head(sk))
 			break;
@@ -2744,14 +2744,8 @@ void tcp_xmit_retransmit_queue(struct sock *sk)
 		if (!hole)
 			tp->retransmit_skb_hint = skb;
 
-		/* Assume this retransmit will generate
-		 * only one packet for congestion window
-		 * calculation purposes.  This works because
-		 * tcp_retransmit_skb() will chop up the
-		 * packet to be MSS sized and all the
-		 * packet counting works out.
-		 */
-		if (tcp_packets_in_flight(tp) >= tp->snd_cwnd)
+		segs = tp->snd_cwnd - tcp_packets_in_flight(tp);
+		if (segs <= 0)
 			return;
 
 		if (fwd_rexmitting) {
@@ -2788,7 +2782,7 @@ begin_fwd:
 		if (sacked & (TCPCB_SACKED_ACKED|TCPCB_SACKED_RETRANS))
 			continue;
 
-		if (tcp_retransmit_skb(sk, skb))
+		if (tcp_retransmit_skb(sk, skb, segs))
 			return;
 
 		NET_INC_STATS_BH(sock_net(sk), mib_idx);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 49bc474f8e35..373b03e78aaa 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -404,7 +404,7 @@ void tcp_retransmit_timer(struct sock *sk)
 			goto out;
 		}
 		tcp_enter_loss(sk);
-		tcp_retransmit_skb(sk, tcp_write_queue_head(sk));
+		tcp_retransmit_skb(sk, tcp_write_queue_head(sk), 1);
 		__sk_dst_reset(sk);
 		goto out_reset_timer;
 	}
@@ -436,7 +436,7 @@ void tcp_retransmit_timer(struct sock *sk)
 
 	tcp_enter_loss(sk);
 
-	if (tcp_retransmit_skb(sk, tcp_write_queue_head(sk)) > 0) {
+	if (tcp_retransmit_skb(sk, tcp_write_queue_head(sk), 1) > 0) {
 		/* Retransmission failed because of local congestion,
 		 * do not backoff.
 		 */
-- 
2.8.0.rc3.226.g39d4020
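
Two pieces of arithmetic from the patch above, as a hedged sketch. The
first reproduces the "~80,000 losses per second" estimate; the
1500-byte packet size is an assumption, since the commit message does
not state one. The second mirrors the two expressions the patch
introduces, segs = snd_cwnd - packets in flight and len = cur_mss * segs.
The function names are hypothetical.

```c
#include <stdint.h>

/* Packets per second on a link of `rate_bps`, assuming every packet
 * occupies `pkt_bytes` on the wire.
 */
static double packets_per_sec(double rate_bps, unsigned int pkt_bytes)
{
	return rate_bps / (8.0 * pkt_bytes);
}

/* Segments that may still be sent: congestion window minus packets in
 * flight. A result <= 0 means nothing may be retransmitted now.
 */
static int retrans_budget(uint32_t snd_cwnd, uint32_t in_flight)
{
	return (int)snd_cwnd - (int)in_flight;
}

/* Upper bound in bytes for one retransmitted skb: cur_mss * segs.
 * Only meaningful for segs > 0.
 */
static unsigned int retrans_len(unsigned int cur_mss, int segs)
{
	return cur_mss * (unsigned int)segs;
}
```

With 100 Gbit/s and 1500-byte packets, packets_per_sec() gives about
8.3 million packets per second, so a 1 % loss rate is about 83,000
losses per second. With a hypothetical MSS of 1448 and a budget of 6
segments, a single retransmitted skb may carry up to 8688 bytes instead
of being cut down to one MSS.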


* [PATCH net-next 0/2] net: avoid some atomic ops when FASYNC is not used
From: Eric Dumazet @ 2016-04-25 17:39 UTC
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet

We can avoid some atomic operations on sockets not using FASYNC

Eric Dumazet (2):
  net: SOCKWQ_ASYNC_NOSPACE optimizations
  net: SOCKWQ_ASYNC_WAITDATA optimizations

 include/net/sock.h | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.8.0.rc3.226.g39d4020


* [PATCH] net: ethernet: davinci_emac: Fix devioctl while in fixed link
From: Neil Armstrong @ 2016-04-25 17:41 UTC
  To: David S. Miller, Andrew Lunn, Tom Lendacky, Mugunthan V N, netdev,
	linux-kernel
  Cc: Neil Armstrong, Brian Hutchinson

When configured with a fixed link, the DaVinci EMAC driver sets
priv->phydev to NULL, and subsequent ioctl calls that reach
phy_mii_ioctl() crash the kernel.

Cc: Brian Hutchinson <b.hutchman@gmail.com>
Fixes: 1bb6aa56bb38 ("net: davinci_emac: Add support for fixed-link PHY")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
---
 drivers/net/ethernet/ti/davinci_emac.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/davinci_emac.c b/drivers/net/ethernet/ti/davinci_emac.c
index 58d58f0..f56d66e 100644
--- a/drivers/net/ethernet/ti/davinci_emac.c
+++ b/drivers/net/ethernet/ti/davinci_emac.c
@@ -1512,7 +1512,10 @@ static int emac_devioctl(struct net_device *ndev, struct ifreq *ifrq, int cmd)
 
 	/* TODO: Add phy read and write and private statistics get feature */
 
-	return phy_mii_ioctl(priv->phydev, ifrq, cmd);
+	if (priv->phydev)
+		return phy_mii_ioctl(priv->phydev, ifrq, cmd);
+	else
+		return -EOPNOTSUPP;
 }
 
 static int match_first_device(struct device *dev, void *data)
-- 
1.9.1
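
The guard added by the patch above, reduced to a userspace sketch. The
types and functions (struct phy_sketch, struct priv_sketch,
phy_ioctl_sketch(), dev_ioctl_sketch()) are hypothetical stand-ins; only
the shape of the NULL check and the -EOPNOTSUPP return follow the patch.

```c
#include <errno.h>
#include <stddef.h>

struct phy_sketch {		/* hypothetical PHY handle */
	int id;
};

struct priv_sketch {
	struct phy_sketch *phydev;	/* NULL in fixed-link mode */
};

/* Stand-in for the PHY ioctl handler; it dereferences its argument,
 * so it must never be called with NULL.
 */
static int phy_ioctl_sketch(struct phy_sketch *phy, int cmd)
{
	(void)cmd;
	return phy->id >= 0 ? 0 : -EINVAL;
}

static int dev_ioctl_sketch(struct priv_sketch *priv, int cmd)
{
	/* No PHY attached: report "not supported" instead of crashing. */
	if (!priv->phydev)
		return -EOPNOTSUPP;

	return phy_ioctl_sketch(priv->phydev, cmd);
}
```

A caller in fixed-link mode gets -EOPNOTSUPP back, while a device with a
PHY attached is forwarded to the handler as before.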


* [PATCH net-next 2/2] net: SOCKWQ_ASYNC_WAITDATA optimizations
From: Eric Dumazet @ 2016-04-25 17:39 UTC
  To: David S . Miller; +Cc: netdev, Eric Dumazet, Eric Dumazet
In-Reply-To: <1461605974-4242-1-git-send-email-edumazet@google.com>

SOCKWQ_ASYNC_WAITDATA is set/cleared in sk_wait_data()
and equivalent functions, so that sock_wake_async() can send
a SIGIO only when necessary.

Since these atomic operations are really not needed unless
socket expressed interest in FASYNC, we can omit them in most
cases.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 0f48aad9f8e8..3df778ccaa82 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1940,7 +1940,8 @@ static inline unsigned long sock_wspace(struct sock *sk)
  */
 static inline void sk_set_bit(int nr, struct sock *sk)
 {
-	if (nr == SOCKWQ_ASYNC_NOSPACE && !sock_flag(sk, SOCK_FASYNC))
+	if ((nr == SOCKWQ_ASYNC_NOSPACE || nr == SOCKWQ_ASYNC_WAITDATA) &&
+	    !sock_flag(sk, SOCK_FASYNC))
 		return;

 	set_bit(nr, &sk->sk_wq_raw->flags);
@@ -1948,7 +1949,8 @@ static inline void sk_set_bit(int nr, struct sock *sk)

 static inline void sk_clear_bit(int nr, struct sock *sk)
 {
-	if (nr == SOCKWQ_ASYNC_NOSPACE && !sock_flag(sk, SOCK_FASYNC))
+	if ((nr == SOCKWQ_ASYNC_NOSPACE || nr == SOCKWQ_ASYNC_WAITDATA) &&
+	    !sock_flag(sk, SOCK_FASYNC))
 		return;

 	clear_bit(nr, &sk->sk_wq_raw->flags);
-- 
2.8.0.rc3.226.g39d4020


* Re: [PATCH v2 net-next] tcp-tso: do not split TSO packets at retransmit time
From: Eric Dumazet @ 2016-04-25 17:43 UTC
  To: Eric Dumazet, David S . Miller; +Cc: netdev
In-Reply-To: <1461605974-4242-3-git-send-email-edumazet@google.com>

On Mon, 2016-04-25 at 10:39 -0700, Eric Dumazet wrote:
> Linux TCP stack painfully segments all TSO/GSO packets before retransmits.
> 

humpf.

> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> ---
>  include/net/tcp.h     |  4 ++--
>  net/ipv4/tcp_input.c  |  2 +-
>  net/ipv4/tcp_output.c | 64 +++++++++++++++++++++++----------------------------
>  net/ipv4/tcp_timer.c  |  4 ++--
>  4 files changed, 34 insertions(+), 40 deletions(-)

Please ignore, I forgot to clean this one, it was already merged.

Sorry for the noise.


* Re: [PATCH v2 net-next] net: ethernet: enc28j60: add device tree support
From: Michael Heimpold @ 2016-04-25 17:46 UTC
  To: Andrew F. Davis
  Cc: Jonathan Cameron, Mark Brown, netdev, devicetree, Rob Herring,
	Pawel Moll, Mark Rutland, Ian Campbell, Kumar Gala
In-Reply-To: <571E3A3D.5080602@ti.com>

Hi,

On Monday 25 April 2016, 10:39:41, you wrote:
> On 04/24/2016 04:28 PM, Michael Heimpold wrote:
> > The following patch adds the required match table for device tree support
> > (and while at, fix the indent). It's also possible to specify the
> > MAC address in the DT blob.
> > 
> > Also add the corresponding binding documentation file.
> > 
> > Signed-off-by: Michael Heimpold <mhei@heimpold.de>
> > ---
> > 
> > v2: * took care of Arnd Bergmann's review comments
> > 
> >       - allow to specify MAC address via DT
> >       - unconditionally define DT id table
> >     
> >     * increased the driver version minor number
> >     * driver author's email address bounces, removed from address list
> >  
> >  .../devicetree/bindings/net/microchip-enc28j60.txt | 50 ++++++++++++++++++++++
> >  drivers/net/ethernet/microchip/enc28j60.c          | 20 +++++++--
> >  2 files changed, 67 insertions(+), 3 deletions(-)
> >  create mode 100644 Documentation/devicetree/bindings/net/microchip-enc28j60.txt
> >
> > diff --git a/Documentation/devicetree/bindings/net/microchip-enc28j60.txt b/Documentation/devicetree/bindings/net/microchip-enc28j60.txt
> > new file mode 100644
> > index 0000000..847a97b
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/net/microchip-enc28j60.txt
> > @@ -0,0 +1,50 @@
> > +* Microchip ENC28J60
> > +
> > +This is a standalone 10 MBit ethernet controller with SPI interface.
> > +
> > +For each device connected to a SPI bus, define a child node within
> > +the SPI master node.
> > +
> > +Required properties:
> > +- compatible: Should be "microchip,enc28j60"
> > +- reg: Specify the SPI chip select the ENC28J60 is wired to
> > +- interrupts: Specify the interrupt and interrupt type (usually falling edge)
> > +
> > +Optional properties:
> > +- interrupt-parent: Specify the phandle of the source interrupt
> > +- spi-max-frequency: Maximum frequency of the SPI bus when accessing the ENC28J60.
> > +  According to the ENC28J60 datasheet, the chip allows a maximum of 20 MHz, however,
> > +  board designs may need to limit this value.
> > +- local-mac-address: See ethernet.txt in the same directory.
> > +
> > +
> > +Example (for NXP i.MX28 with pin control stuff for GPIO irq):
> > +
> > +        ssp2: ssp@80014000 {
> > +                compatible = "fsl,imx28-spi";
> > +                pinctrl-names = "default";
> > +                pinctrl-0 = <&spi2_pins_b &spi2_sck_cfg>;
> > +                status = "okay";
> > +
> > +                enc28j60: ethernet@0 {
> > +                        compatible = "microchip,enc28j60";
> > +                        pinctrl-names = "default";
> > +                        pinctrl-0 = <&enc28j60_pins>;
> > +                        reg = <0>;
> > +                        interrupt-parent = <&gpio3>;
> > +                        interrupts = <3 IRQ_TYPE_EDGE_FALLING>;
> > +                        spi-max-frequency = <12000000>;
> > +                };
> > +        };
> > +
> > +        pinctrl@80018000 {
> > +                enc28j60_pins: enc28j60_pins@0 {
> > +                        reg = <0>;
> > +                        fsl,pinmux-ids = <
> > +                                MX28_PAD_AUART0_RTS__GPIO_3_3    /* Interrupt */
> > +                        >;
> > +                        fsl,drive-strength = <MXS_DRIVE_4mA>;
> > +                        fsl,voltage = <MXS_VOLTAGE_HIGH>;
> > +                        fsl,pull-up = <MXS_PULL_DISABLE>;
> > +                };
> > +        };
> > diff --git a/drivers/net/ethernet/microchip/enc28j60.c b/drivers/net/ethernet/microchip/enc28j60.c
> > index b723622..7066954 100644
> > --- a/drivers/net/ethernet/microchip/enc28j60.c
> > +++ b/drivers/net/ethernet/microchip/enc28j60.c
> > @@ -28,11 +28,12 @@
> > 
> >  #include <linux/skbuff.h>
> >  #include <linux/delay.h>
> >  #include <linux/spi/spi.h>
> > 
> > +#include <linux/of_net.h>
> > 
> >  #include "enc28j60_hw.h"
> >  
> >  #define DRV_NAME	"enc28j60"
> > 
> > -#define DRV_VERSION	"1.01"
> > +#define DRV_VERSION	"1.02"
> > 
> >  #define SPI_OPLEN	1
> > 
> > @@ -1548,6 +1549,7 @@ static int enc28j60_probe(struct spi_device *spi)
> > 
> >  {
> >  
> >  	struct net_device *dev;
> >  	struct enc28j60_net *priv;
> > 
> > +	const void *macaddr;
> > 
> >  	int ret = 0;
> >  	
> >  	if (netif_msg_drv(&debug))
> > 
> > @@ -1579,7 +1581,12 @@ static int enc28j60_probe(struct spi_device *spi)
> > 
> >  		ret = -EIO;
> >  		goto error_irq;
> >  	
> >  	}
> > 
> > -	eth_hw_addr_random(dev);
> > +
> > +	macaddr = of_get_mac_address(spi->dev.of_node);
> > +	if (macaddr)
> 
> You should also check if it is a valid MAC for Ethernet, recommend:
> 
> if (macaddr && is_valid_ether_addr(macaddr))
> 

But of_get_mac_address already takes care of this, see
http://lxr.free-electrons.com/source/drivers/of/of_net.c#L45
Also it already checks whether spi->dev.of_node is populated at all.
It returns NULL in both error cases.
So I preferred to omit both tests here.
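
To illustrate the point, here is a minimal userspace sketch of that
contract. It is not the kernel code: mac_is_valid(), lookup_mac() and
pick_mac() are hypothetical stand-ins, and the validity rule (not all-zero,
not multicast) mirrors what is_valid_ether_addr() checks. Because the lookup
already returns NULL for a missing node or an invalid address, the caller
only needs a NULL test before falling back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Userspace stand-in for the validity rule: reject the all-zero
 * address and any multicast address (which includes broadcast). */
static bool mac_is_valid(const uint8_t *addr)
{
	static const uint8_t zero[ETH_ALEN];

	assert(addr != NULL);
	return memcmp(addr, zero, ETH_ALEN) != 0 && !(addr[0] & 0x01);
}

/* Hypothetical lookup modelled on the behaviour described above:
 * returns NULL when there is no node or no valid address. */
static const uint8_t *lookup_mac(const uint8_t *node_mac)
{
	if (!node_mac)
		return NULL;
	return mac_is_valid(node_mac) ? node_mac : NULL;
}

/* Caller side: a plain NULL test is enough; otherwise use the fallback. */
static void pick_mac(uint8_t *dst, const uint8_t *node_mac,
		     const uint8_t *fallback)
{
	const uint8_t *mac = lookup_mac(node_mac);

	memcpy(dst, mac ? mac : fallback, ETH_ALEN);
}
```

In the driver the fallback would be the random address; here it is simply
passed in by the caller.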

Regards,
Michael

> > +		ether_addr_copy(dev->dev_addr, macaddr);
> > +	else
> > +		eth_hw_addr_random(dev);
> > 
> >  	enc28j60_set_hw_macaddr(dev);
> >  	
> >  	/* Board setup must set the relevant edge trigger type;
> > 
> > @@ -1634,9 +1641,16 @@ static int enc28j60_remove(struct spi_device *spi)
> > 
> >  	return 0;
> >  
> >  }
> > 
> > +static const struct of_device_id enc28j60_dt_ids[] = {
> > +	{ .compatible = "microchip,enc28j60" },
> > +	{ /* sentinel */ }
> > +};
> > +MODULE_DEVICE_TABLE(of, enc28j60_dt_ids);
> > +
> > 
> >  static struct spi_driver enc28j60_driver = {
> >  
> >  	.driver = {
> > 
> > -		   .name = DRV_NAME,
> > +		.name = DRV_NAME,
> > +		.of_match_table = enc28j60_dt_ids,
> > 
> >  	 },
> >  	
> >  	.probe = enc28j60_probe,
> >  	.remove = enc28j60_remove,

^ permalink raw reply

* Re: [PATCH] [RFC] net: dsa: mv88e6xxx: Pre-initialize err in mv88e6xxx_port_bridge_join()
From: Geert Uytterhoeven @ 2016-04-25 17:53 UTC (permalink / raw)
  To: Vivien Didelot
  Cc: David S. Miller, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <87vb35a942.fsf@ketchup.mtl.sfl>

Hi Vivien,

On Mon, Apr 25, 2016 at 7:31 PM, Vivien Didelot
<vivien.didelot@savoirfairelinux.com> wrote:
> Geert Uytterhoeven <geert@linux-m68k.org> writes:
>> On Mon, Apr 25, 2016 at 5:03 PM, Vivien Didelot
>> <vivien.didelot@savoirfairelinux.com> wrote:
>>> Geert Uytterhoeven <geert@linux-m68k.org> writes:
>>>> drivers/net/dsa/mv88e6xxx.c: In function ‘mv88e6xxx_port_bridge_join’:
>>>> drivers/net/dsa/mv88e6xxx.c:2184: warning: ‘err’ may be used uninitialized in this function
>>>
>>> Interesting, I don't have those warnings on 207afda1b5036009...
>>
>> It depends on the compiler version (still using 4.1.2) and options.
>>
>>>> If netdev_notifier_changeupper_info.upper_dev is ever NULL, the bridge
>>>> parameter will be NULL too, and the function will return an
>>>> uninitialized value.
>>>>
>>>> Pre-initialize err to zero to fix this.
>>>>
>>>> Fixes: 207afda1b5036009 ("net: dsa: mv88e6xxx: share the same default FDB")
>>>> Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
>>>> ---
>>>> Can this actually happen?
>>>
>>> bridge cannot be NULL here. Also ps->ports[port].bridge_dev is assigned
>>> to it before entering the for loop, so _mv88e6xxx_port_based_vlan_map
>>> will be called at least for this port.
>>
>> But there's no way the compiler can know that...
>
> Or maybe it can in new configurations. Anyway, this fix doesn't hurt,
> with a relevant commit message, I'd ack it.

What would you consider a relevant commit message?
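
For reference, the pattern the warning is about can be sketched in plain C
as follows. This is a simplified illustration, not the mv88e6xxx code:
bridge_join() and update_port() are made-up names. The return value is only
assigned inside the loop, so a compiler that cannot prove the loop body runs
at least once sees a path where it is returned without ever being set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-port helper; always succeeds here. */
static int update_port(int port)
{
	(void)port;
	return 0;
}

/* err is assigned only inside the loop body. Without the initializer,
 * a call where no port matches would return an indeterminate value,
 * which is what the older compiler warns about. */
static int bridge_join(const int *port_bridge, size_t nports, int bridge)
{
	int err = 0;	/* pre-initialized, as the patch proposes */
	size_t i;

	assert(port_bridge != NULL || nports == 0);

	for (i = 0; i < nports; i++) {
		if (port_bridge[i] == bridge) {
			err = update_port((int)i);
			if (err)
				break;
		}
	}
	return err;
}
```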

Thanks!

Gr{oetje,eeting}s,

                        Geert

^ permalink raw reply

* Re: [PATCH v3 00/18] wcn36xx fixes
From: Kalle Valo @ 2016-04-25 17:56 UTC (permalink / raw)
  To: Bjorn Andersson
  Cc: Eugene Krasnikov, Pontus Fuchs, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1461042056-10607-1-git-send-email-bjorn.andersson@linaro.org>

Bjorn Andersson <bjorn.andersson@linaro.org> writes:

> The bulk of the following patches have been sitting in Eugene's Github tree for
> quite some time. They fix various issues existing in the mainline drivers, so
> they should be merged there too.
>
> Also included are two new fixes, of my own; the important one being the
> reordering of deletion of the bss, as this crashes the firmware on the
> Dragonboard 410c (apq8016 with pronto & wcn3620).
>
> Lastly is a patch that adds a bunch of new capabilities found in the downstream
> driver.
>
> Changes since v2:
> - Restore BEACON_TEMPLATE_SIZE to not break UPDATE_PROBE_RSP_TEMPLATE_REQ
> - Added patch to correct WCN36XX_HAL_RMV_BSSKEY_RSP decoder
> - Added patch with missing capabilities from downstream
>
> Changes since v1:
> - Reorder patch 6 and 7 to not break the build temporarily
> - Inline fix from Jason Mobarak in the TIM PVM padding
>
> Bjorn Andersson (3):
>   wcn36xx: Delete BSS before idling link
>   wcn36xx: Correct remove bss key response encoding
>   wcn36xx: Fill in capability list
>
> Pontus Fuchs (15):
>   wcn36xx: Clean up wcn36xx_smd_send_beacon
>   wcn36xx: Pad TIM PVM if needed
>   wcn36xx: Add helper macros to cast vif to private vif and vice versa
>   wcn36xx: Use consistent name for private vif
>   wcn36xx: Use define for invalid index and fix typo
>   wcn36xx: Add helper macros to cast sta to priv
>   wcn36xx: Fetch private sta data from sta entry instead of from vif
>   wcn36xx: Remove sta pointer in private vif struct
>   wcn36xx: Parse trigger_ba response properly
>   wcn36xx: Copy all members in config_sta v1 conversion
>   wcn36xx: Use allocated self sta index instead of hard coded
>   wcn36xx: Clear encrypt_type when deleting bss key
>   wcn36xx: Track association state
>   wcn36xx: Implement multicast filtering
>   wcn36xx: Use correct command struct for EXIT_BMPS_REQ

All applied, thanks.

-- 
Kalle Valo

^ permalink raw reply

* Re: [PATCH v2 net-next] net: ethernet: enc28j60: add device tree support
From: Andrew F. Davis @ 2016-04-25 18:04 UTC (permalink / raw)
  To: Michael Heimpold
  Cc: Jonathan Cameron, Mark Brown, netdev-u79uwXL29TY76Z2rM5mHXA,
	devicetree-u79uwXL29TY76Z2rM5mHXA, Rob Herring, Pawel Moll,
	Mark Rutland, Ian Campbell, Kumar Gala
In-Reply-To: <7030050.tME0DkDS0E@kerker>

On 04/25/2016 12:46 PM, Michael Heimpold wrote:
> Hi,
> 
> On Monday 25 April 2016, 10:39:41 you wrote:
>> On 04/24/2016 04:28 PM, Michael Heimpold wrote:
>>> -	eth_hw_addr_random(dev);
>>> +
>>> +	macaddr = of_get_mac_address(spi->dev.of_node);
>>> +	if (macaddr)
>>
>> You should also check if it is a valid MAC for Ethernet, recommend:
>>
>> if (macaddr && is_valid_ether_addr(macaddr))
>>
> 
> But of_get_mac_address already takes care of this, see
> http://lxr.free-electrons.com/source/drivers/of/of_net.c#L45
> Also it already checks whether spi->dev.of_node is populated at all.
> It returns NULL in both error cases.
> So I preferred to omit both tests here.
> 

Ah, missed that, no problem here then.

Hmm, I wonder how many other drivers do this check needlessly.
Time to fire up Coccinelle :)

Andrew

^ permalink raw reply

* Re: [PATCH net v2 3/3] gre: allow creation of gretap interfaces in metadata mode
From: pravin shelar @ 2016-04-25 18:00 UTC (permalink / raw)
  To: Jiri Benc
  Cc: Linux Kernel Network Developers, Pravin B Shelar, Thomas Graf,
	Simon Horman
In-Reply-To: <4917335385a75586c1f13b6a6d570d4cc5b0b132.1461495411.git.jbenc@redhat.com>

On Sun, Apr 24, 2016 at 4:00 AM, Jiri Benc <jbenc@redhat.com> wrote:
> The IFLA_GRE_REMOTE attribute does not make sense together with collect
> metadata and is ignored in such case. However, iproute2 always sets it; it
> will be zero if there's no remote address specified on the command line.
>
> Remove the check for non-zero IFLA_GRE_REMOTE when collect metadata flag is
> set.
>
Rather than covering this up in the ip_gre kernel module, why not just fix
iproute2 to set the attribute correctly?


> Before the patch, this command returns failure, after the patch, it works as
> expected:
>
> ip link add gre1 type gretap external
>
> Fixes: 2e15ea390e6f4 ("ip_gre: Add support to collect tunnel metadata.")
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
> New in v2.
> ---

>  net/ipv4/ip_gre.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

^ permalink raw reply

* Re: [PATCH V3] net: stmmac: socfpga: Remove re-registration of reset controller
From: Joachim Eastwood @ 2016-04-25 18:11 UTC (permalink / raw)
  To: Marek Vasut
  Cc: netdev, peppe.cavallaro, alexandre.torgue, Matthew Gerlach,
	Dinh Nguyen, David S . Miller
In-Reply-To: <1461240710-5381-1-git-send-email-marex@denx.de>

Hi Marek,

On 21 April 2016 at 14:11, Marek Vasut <marex@denx.de> wrote:
> Both socfpga_dwmac_parse_data() in dwmac-socfpga.c and stmmac_dvr_probe()
> in stmmac_main.c functions call devm_reset_control_get() to register a
> reset controller for the stmmac. This results in an attempt to register
> two reset controllers for the same non-shared reset line.
>
> The first attempt to register the reset controller works fine. The second
> attempt fails with warning from the reset controller core, see below.
> The warning is produced because the reset line is non-shared and thus
> it is allowed to have at most one reset controller associated with
> that reset line, not two or more.
>
> The solution has multiple parts. First, the original socfpga_dwmac_init()
> is tweaked to use reset controller pointer from the stmmac_priv (private
> data of the stmmac core) instead of the local instance, which was used
> before. The local re-registration of the reset controller is removed.
>
> Next, the socfpga_dwmac_init() is moved after stmmac_dvr_probe() in the
> probe function. This order is legal according to Altera and it makes the
> code much easier, since there is no need to temporarily register and
> unregister the reset controller ; the reset controller is already registered
> by the stmmac_dvr_probe().
>
> Finally, plat_dat->exit and socfpga_dwmac_exit() are no longer necessary,
> since the functionality is already performed by the stmmac core.

I am trying to rebase my changes on top of your two patches and
noticed a couple of things.

>  static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>  {
> -       struct socfpga_dwmac    *dwmac = priv;
> +       struct socfpga_dwmac *dwmac = priv;
>         struct net_device *ndev = platform_get_drvdata(pdev);
>         struct stmmac_priv *stpriv = NULL;
>         int ret = 0;
>
> -       if (ndev)
> -               stpriv = netdev_priv(ndev);
> +       if (!ndev)
> +               return -EINVAL;

ndev can never be NULL here. socfpga_dwmac_init() is only called if
stmmac_dvr_probe() succeeds or we are running the resume callback. So
I don't see how this could ever be NULL.


> +
> +       stpriv = netdev_priv(ndev);

It's not really nice to access 'stmmac_priv' as it should be private
to the core driver, but I don't see any other good solution right now.


> +       if (!stpriv)
> +               return -EINVAL;
>
>         /* Assert reset to the enet controller before changing the phy mode */
> -       if (dwmac->stmmac_rst)
> -               reset_control_assert(dwmac->stmmac_rst);
> +       if (stpriv->stmmac_rst)
> +               reset_control_assert(stpriv->stmmac_rst);
>
>         /* Setup the phy mode in the system manager registers according to
>          * devicetree configuration
> @@ -227,8 +210,8 @@ static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>         /* Deassert reset for the phy configuration to be sampled by
>          * the enet controller, and operation to start in requested mode
>          */
> -       if (dwmac->stmmac_rst)
> -               reset_control_deassert(dwmac->stmmac_rst);
> +       if (stpriv->stmmac_rst)
> +               reset_control_deassert(stpriv->stmmac_rst);
>
>         /* Before the enet controller is suspended, the phy is suspended.
>          * This causes the phy clock to be gated. The enet controller is
> @@ -245,7 +228,7 @@ static int socfpga_dwmac_init(struct platform_device *pdev, void *priv)
>          * control register 0, and can be modified by the phy driver
>          * framework.
>          */
> -       if (stpriv && stpriv->phydev)
> +       if (stpriv->phydev)
>                 phy_resume(stpriv->phydev);

Before this change phy_resume() was only called during driver resume,
but your patches cause phy_resume() to be called at probe time as
well. Is this okay?


regards,
Joachim Eastwood

^ permalink raw reply

* [PATCH] net: fix net_gso_ok for new GSO types.
From: Marcelo Ricardo Leitner @ 2016-04-25 18:13 UTC (permalink / raw)
  To: netdev

Fix casting in net_gso_ok. Otherwise the shift in
gso_type << NETIF_F_GSO_SHIFT may hit the 32nd bit and make the result look
like INT_MIN, which is then sign-extended to uint64 as
0xffffffff80000000, resulting in wrong behavior when it is and'ed with
the features.

This test app:
#include <stdio.h>
#include <stdint.h>

int main(int argc, char **argv)
{
	uint64_t feature1;
	uint64_t feature2;
	int gso_type = 1 << 15;

	feature1 = gso_type << 16;
	feature2 = (uint64_t)gso_type << 16;
	printf("%lx %lx\n", feature1, feature2);

	return 0;
}

Gives:
ffffffff80000000 80000000

So that this:
   return (features & feature) == feature;
actually checks more bits than expected, including invalid ones.

Fix is to promote it earlier.
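
As a rough sketch of how this affects the final comparison (illustration
only: GSO_SHIFT and the three helpers are hypothetical names, and
feature_buggy() models the old behavior with explicit casts so the example
stays well-defined, assuming the usual wrapping conversion to int32_t):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GSO_SHIFT 16	/* same shift as the test app above */

/* Models the old code with a 32-bit int: the shifted value is narrowed
 * to a signed 32-bit integer and then sign-extended to 64 bits. */
static uint64_t feature_buggy(int gso_type)
{
	int32_t narrow;

	assert(gso_type >= 0);
	narrow = (int32_t)((uint32_t)gso_type << GSO_SHIFT);
	return (uint64_t)(int64_t)narrow;
}

/* The fix: widen first, then shift. */
static uint64_t feature_fixed(int gso_type)
{
	assert(gso_type >= 0);
	return (uint64_t)gso_type << GSO_SHIFT;
}

/* Same comparison as in net_gso_ok(). */
static bool gso_ok(uint64_t features, uint64_t feature)
{
	return (features & feature) == feature;
}
```

With only bit 31 set in features, the fixed variant matches while the
sign-extended one does not, because it also demands bits 32 to 63.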

Issue noted while rebasing SCTP GSO patch but posting separately as
someone else may experience this in the meantime.

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---

Dave, I couldn't find a good way to indent this that wouldn't be uglier
than a bit long line, but let me know if you prefer otherwise.

 include/linux/netdevice.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1f6d5db471a24f7f23bbe425c359d64adb5dfd67..13ee05f71e3d74c9d2feef6c371547a4b2f82879 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3991,7 +3991,7 @@ netdev_features_t netif_skb_features(struct sk_buff *skb);

 static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 {
-	netdev_features_t feature = gso_type << NETIF_F_GSO_SHIFT;
+	netdev_features_t feature = (netdev_features_t)gso_type << NETIF_F_GSO_SHIFT;

 	/* check flags correspondence */
 	BUILD_BUG_ON(SKB_GSO_TCPV4   != (NETIF_F_TSO >> NETIF_F_GSO_SHIFT));
-- 
2.5.0

^ permalink raw reply related

* Re: [PATCH RFC] b43: stop hardcoding LED behavior
From: Lucas Stach @ 2016-04-25 18:21 UTC (permalink / raw)
  To: Michael Büsch; +Cc: Kalle Valo, netdev, linux-wireless, b43-dev
In-Reply-To: <20160425175326.407eba27@wiggum>

On Monday, 25.04.2016, at 17:53 +0200, Michael Büsch wrote:
> On Mon, 25 Apr 2016 09:40:51 +0200
> Lucas Stach <dev@lynxeye.de> wrote:
> 
> > 
> > On my system the SPROM correctly defines the only wired LED (radio)
> > but
> > skips all others, leading to the hardcode to register LEDs with RX
> > and TX
> > triggers.
> Hm ok. It probably is a good idea to change the condition from
> 
> if (sprom[led_index] == 0xFF)
> 
> to
> 
> if ((sprom[0] & sprom[1] & sprom[2] & sprom[3]) == 0xFF)
> 
> So the hardcoding only happens if there is no LED configured in the
> SPROM. (I think my card does this (see below), but I can check that
> later)
> 
> 
> > 
> > These triggers cause many unnecessary CPU wakeups to drive LEDs
> > that aren't even present in the system, reducing battery runtime.
> 
> Numbers please. Did you measure that it actually causes more
> _wakeups_?
> How many?
> The led work is placed in the mac80211 workqueue and LED updates only
> happen on behalf of mac80211 activities (by default). It only causes
> additional wakeups, if there's nothing else scheduled on the
> workqueue
> anyways (which might well be the case. So we need numbers. :)
> 
The blinking LEDs use a timer to enforce a constant blink rate at a
50ms on/off interval. While they are only triggered if there is some
RX/TX activity in the system, they cause up to 20 wakeups/second/led.
As the timers used for LED activity aren't deferrable, this hardcode is
causing 40 unnecessary CPU wakeups/s in my system.
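
The arithmetic behind those numbers, as a small sketch (the function names
are made up for illustration; two LEDs, one for RX and one for TX, are
assumed):

```c
#include <assert.h>

/* Timer expirations per second for a timer re-armed every interval_ms. */
static unsigned int wakeups_per_sec(unsigned int interval_ms)
{
	assert(interval_ms != 0);
	return 1000u / interval_ms;
}

/* Worst case across all blinking LEDs, each with its own timer. */
static unsigned int total_wakeups(unsigned int interval_ms, unsigned int nleds)
{
	return wakeups_per_sec(interval_ms) * nleds;
}
```

A 50ms interval gives 1000 / 50 = 20 expirations per second per LED, so two
LEDs add up to 40.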

> 
> > 
> > Remove the hardcode to stop it from doing any harm. If this code is
> > useful
> > for others it should probably be reworked as a quirk table
> > triggering only
> > for individual systems that need it.
> 
> There are cards that need it. I don't know how many that are, but I
> own
> an older 4306 PC-Card card that needs this.
> 
> So this effectively is a regression for this card.
> 
> So I don't think this is acceptable.
> You should at least make this configurable via module parameter or
> such.
> Or maybe the change from above already is enough. It should work for
> your case.
> 
There are some people that find those kinds of blinking LEDs
distracting, so a module parameter to disable them altogether might be
a good thing to have. Causing CPU wakeups in a system where those LEDs
aren't even physically populated is clearly undesired behavior.

If checking that the SPROM doesn't define any LED behavior is enough to
not regress your use case, I would be glad to rework the patch
accordingly.
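
A minimal sketch of that reworked condition, outside the driver (the helper
names are hypothetical, and the four-entry table is an assumption taken from
the sprom[0..3] expression quoted above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LEDS 4

/* True only when every LED entry is 0xFF, i.e. the SPROM configures
 * no LED at all. ANDing works because 0xFF has all bits set. */
static bool sprom_has_no_led_info(const uint8_t sprom[NUM_LEDS])
{
	assert(sprom != NULL);
	return (sprom[0] & sprom[1] & sprom[2] & sprom[3]) == 0xFF;
}

/* Apply the hardcoded default only if the whole table is empty. */
static bool use_hardcoded_default(const uint8_t sprom[NUM_LEDS], size_t idx)
{
	assert(idx < NUM_LEDS);
	return sprom[idx] == 0xFF && sprom_has_no_led_info(sprom);
}
```

With one LED defined in the SPROM, as on my system, none of the other
entries would get a hardcoded behaviour.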

Regards,
Lucas
> 
> > 
> > Signed-off-by: Lucas Stach <dev@lynxeye.de>
> > ---
> >  drivers/net/wireless/broadcom/b43/leds.c | 26 ++----------------
> > --------
> >  1 file changed, 2 insertions(+), 24 deletions(-)
> > 
> > diff --git a/drivers/net/wireless/broadcom/b43/leds.c
> > b/drivers/net/wireless/broadcom/b43/leds.c
> > index d79ab2a..77d2dad 100644
> > --- a/drivers/net/wireless/broadcom/b43/leds.c
> > +++ b/drivers/net/wireless/broadcom/b43/leds.c
> > @@ -224,31 +224,9 @@ static void b43_led_get_sprominfo(struct
> > b43_wldev *dev,
> >  
> >  	if (sprom[led_index] == 0xFF) {
> >  		/* There is no LED information in the SPROM
> > -		 * for this LED. Hardcode it here. */
> > +		 * for this LED. Keep it disabled. */
> >  		*activelow = false;
> > -		switch (led_index) {
> > -		case 0:
> > -			*behaviour = B43_LED_ACTIVITY;
> > -			*activelow = true;
> > -			if (dev->dev->board_vendor ==
> > PCI_VENDOR_ID_COMPAQ)
> > -				*behaviour = B43_LED_RADIO_ALL;
> > -			break;
> > -		case 1:
> > -			*behaviour = B43_LED_RADIO_B;
> > -			if (dev->dev->board_vendor ==
> > PCI_VENDOR_ID_ASUSTEK)
> > -				*behaviour = B43_LED_ASSOC;
> > -			break;
> > -		case 2:
> > -			*behaviour = B43_LED_RADIO_A;
> > -			break;
> > -		case 3:
> > -			*behaviour = B43_LED_OFF;
> > -			break;
> > -		default:
> > -			*behaviour = B43_LED_OFF;
> > -			B43_WARN_ON(1);
> > -			return;
> > -		}
> > +		*behaviour = B43_LED_OFF;
> >  	} else {
> >  		*behaviour = sprom[led_index] & B43_LED_BEHAVIOUR;
> >  		*activelow = !!(sprom[led_index] &
> > B43_LED_ACTIVELOW);
> 
> 
> 

^ permalink raw reply

* [net-next PATCH 0/8] Fix Tunnel features and enable GSO partial for Mellanox adapters
From: Alexander Duyck @ 2016-04-25 18:30 UTC (permalink / raw)
  To: talal, netdev, davem, galp, ogerlitz, eranbe

This patch series is meant to allow us to get the best performance possible
for Mellanox ConnectX-3/4 adapters in terms of VXLAN tunnels.

The first few patches address issues I found when just trying to collect
performance numbers.  Specifically I was unable to get rates of any more
than 1 or 2 Mb/s if I was using a tunnel that ran over IPv6.  In addition I
found a few other items related to GSO_PARTIAL and the TSO_MANGLEID that
needed to be addressed.

The last 4 patches go through and enable GSO_PARTIAL for tunnels that have
an outer checksum enabled, and then enable IPv6 support where we can.  In
my tests I found that the mlx4 doesn't support outer IPv6 but the mlx5 did
so the code is updated to reflect that in the patches that enable IPv6
support.

---

Alexander Duyck (8):
      net: Disable segmentation if checksumming is not supported
      gso: Only allow GSO_PARTIAL if we can checksum the inner protocol
      net: Fix netdev_fix_features so that TSO_MANGLEID is only available with TSO
      vxlan: Add checksum check to the features check function
      mlx4: Add support for UDP tunnel segmentation with outer checksum offload
      mlx4: Add support for inner IPv6 checksum offloads and TSO
      mlx5e: Add support for UDP tunnel segmentation with outer checksum offload
      mlx5e: Fix IPv6 tunnel checksum offload

 drivers/net/ethernet/mellanox/mlx4/en_netdev.c    |   38 +++++++++++++++++----
 drivers/net/ethernet/mellanox/mlx4/en_tx.c        |   15 +++++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |   10 ++++--
 include/linux/if_ether.h                          |    5 +++
 include/net/vxlan.h                               |    4 ++
 net/core/dev.c                                    |    6 +++
 net/core/skbuff.c                                 |    6 ++-
 7 files changed, 67 insertions(+), 17 deletions(-)

^ permalink raw reply

* [net-next PATCH 1/8] net: Disable segmentation if checksumming is not supported
From: Alexander Duyck @ 2016-04-25 18:31 UTC (permalink / raw)
  To: talal, netdev, davem, galp, ogerlitz, eranbe
In-Reply-To: <20160425182442.11331.88349.stgit@ahduyck-xeon-server>

The mlx4 and mlx5 drivers do not support IPv6 checksum offload for
tunnels.  With this being the case we should disable GSO in addition to
the checksum offload features when we find that a device cannot
perform a checksum on a given packet type.
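
As a rough standalone sketch of the intended behavior (the F_* bit values
and harmonize() are hypothetical and do not match the kernel's feature
layout):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define F_SG        (1u << 0)
#define F_CSUM_MASK (1u << 1)
#define F_GSO_MASK  (0xffu << 16)

static uint32_t harmonize(uint32_t features, bool needs_csum, bool can_csum)
{
	/* No usable checksum offload for this packet type: drop both the
	 * checksum and the segmentation offloads. */
	if (needs_csum && !can_csum)
		features &= ~(F_CSUM_MASK | F_GSO_MASK);

	/* Segmentation offload never survives without checksum support. */
	assert(!needs_csum || can_csum || !(features & F_GSO_MASK));
	return features;
}
```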

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 net/core/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 6324bc9267f7..d6d9f286c4e1 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2815,7 +2815,7 @@ static netdev_features_t harmonize_features(struct sk_buff *skb,
 
 	if (skb->ip_summed != CHECKSUM_NONE &&
 	    !can_checksum_protocol(features, type)) {
-		features &= ~NETIF_F_CSUM_MASK;
+		features &= ~(NETIF_F_CSUM_MASK | NETIF_F_GSO_MASK);
 	} else if (illegal_highdma(skb->dev, skb)) {
 		features &= ~NETIF_F_SG;
 	}

^ permalink raw reply related

* [net-next PATCH 2/8] gso: Only allow GSO_PARTIAL if we can checksum the inner protocol
From: Alexander Duyck @ 2016-04-25 18:31 UTC (permalink / raw)
  To: talal, netdev, davem, galp, ogerlitz, eranbe
In-Reply-To: <20160425182442.11331.88349.stgit@ahduyck-xeon-server>

This patch addresses a possible issue that can occur if we get into any odd
corner cases where we support TSO for a given protocol but not the checksum
or scatter-gather offload.  There are a few drivers floating around that
set up their tunnels this way and by enforcing the checksum piece we can
avoid mangling any frames.
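
A small sketch of the resulting decision, using made-up names (struct
seg_plan and plan_segments() are not kernel symbols) and the same
len / mss arithmetic as the hunk below:

```c
#include <assert.h>
#include <stdbool.h>

struct seg_plan {
	unsigned int partial_segs;	/* 0 when GSO partial is not used */
	unsigned int mss;		/* effective segment size */
};

static struct seg_plan plan_segments(unsigned int len, unsigned int mss,
				     bool sg, bool csum, bool gso_partial)
{
	struct seg_plan p = { 0, mss };

	assert(mss != 0);

	/* Only take the partial path when both scatter-gather and
	 * checksum offload are usable for this protocol. */
	if (sg && csum && gso_partial) {
		p.partial_segs = len / mss;
		p.mss = mss * p.partial_segs;
	}
	return p;
}
```

For example, len = 4000 and mss = 1448 give partial_segs = 2 and an
effective mss of 2896 when all three conditions hold; otherwise the mss is
left untouched.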

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 net/core/skbuff.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7ff7788b0151..d2871d081750 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3080,8 +3080,7 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 	unsigned int headroom;
 	unsigned int len = head_skb->len;
 	__be16 proto;
-	bool csum;
-	int sg = !!(features & NETIF_F_SG);
+	bool csum, sg;
 	int nfrags = skb_shinfo(head_skb)->nr_frags;
 	int err = -ENOMEM;
 	int i = 0;
@@ -3093,13 +3092,14 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
 	if (unlikely(!proto))
 		return ERR_PTR(-EINVAL);
 
+	sg = !!(features & NETIF_F_SG);
 	csum = !!can_checksum_protocol(features, proto);
 
 	/* GSO partial only requires that we trim off any excess that
 	 * doesn't fit into an MSS sized block, so take care of that
 	 * now.
 	 */
-	if (features & NETIF_F_GSO_PARTIAL) {
+	if (sg && csum && (features & NETIF_F_GSO_PARTIAL)) {
 		partial_segs = len / mss;
 		mss *= partial_segs;
 	}

^ permalink raw reply related

* [net-next PATCH 3/8] net: Fix netdev_fix_features so that TSO_MANGLEID is only available with TSO
From: Alexander Duyck @ 2016-04-25 18:31 UTC (permalink / raw)
  To: talal, netdev, davem, galp, ogerlitz, eranbe
In-Reply-To: <20160425182442.11331.88349.stgit@ahduyck-xeon-server>

This change makes it so that we will strip the TSO_MANGLEID bit if TSO is
not present.  This way we will also handle ECN correctly if TSO is not
present.
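
To show why the ordering matters, here is a standalone sketch. The F_* bits
and fix_features() are hypothetical, and it assumes that the "all TSO" mask
includes the MANGLEID bit, which is what makes the ECN test miss the
MANGLEID plus ECN combination:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define F_TSO          (1u << 0)
#define F_TSO_ECN      (1u << 1)
#define F_TSO6         (1u << 2)
#define F_TSO_MANGLEID (1u << 3)
#define F_ALL_TSO      (F_TSO | F_TSO_ECN | F_TSO6 | F_TSO_MANGLEID)

static uint32_t fix_features(uint32_t features)
{
	/* MANGLEID is meaningless without IPv4 TSO, so drop it first. */
	if ((features & F_TSO_MANGLEID) && !(features & F_TSO))
		features &= ~F_TSO_MANGLEID;

	/* With MANGLEID gone, a lone ECN bit is caught by this test. */
	if ((features & F_ALL_TSO) == F_TSO_ECN)
		features &= ~F_TSO_ECN;

	/* MANGLEID can only remain if TSO is still set. */
	assert(!(features & F_TSO_MANGLEID) || (features & F_TSO));
	return features;
}
```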

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
---
 net/core/dev.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index d6d9f286c4e1..6a5ef49ed1ab 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6720,6 +6720,10 @@ static netdev_features_t netdev_fix_features(struct net_device *dev,
 		features &= ~NETIF_F_TSO6;
 	}
 
+	/* TSO with IPv4 ID mangling requires IPv4 TSO be enabled */
+	if ((features & NETIF_F_TSO_MANGLEID) && !(features & NETIF_F_TSO))
+		features &= ~NETIF_F_TSO_MANGLEID;
+
 	/* TSO ECN requires that TSO is present as well. */
 	if ((features & NETIF_F_ALL_TSO) == NETIF_F_TSO_ECN)
 		features &= ~NETIF_F_TSO_ECN;

^ permalink raw reply related
