Netdev List
* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Alexei Starovoitov @ 2016-04-01  1:45 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, davem, netdev, sasha.levin, daniel,
	mkubecek
In-Reply-To: <1459473592.6473.243.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, Mar 31, 2016 at 06:19:52PM -0700, Eric Dumazet wrote:
> On Fri, 2016-04-01 at 02:21 +0200, Hannes Frederic Sowa wrote:
> 
> > 
> > [   31.064029] ===============================
> > [   31.064030] [ INFO: suspicious RCU usage. ]
> > [   31.064032] 4.5.0+ #13 Not tainted
> > [   31.064033] -------------------------------
> > [   31.064034] include/net/sock.h:1594 suspicious rcu_dereference_check() usage!
> > [   31.064035]
> >                 other info that might help us debug this:
> > 
> > [   31.064041]
> >                 rcu_scheduler_active = 1, debug_locks = 1
> > [   31.064042] no locks held by ssh/817.
> > [   31.064043]
> >                 stack backtrace:
> > [   31.064045] CPU: 0 PID: 817 Comm: ssh Not tainted 4.5.0+ #13
> > [   31.064046] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
> > [   31.064047]  0000000000000286 000000006730b46b ffff8800badf7bd0 ffffffff81442b33
> > [   31.064050]  ffff8800b8c78000 0000000000000001 ffff8800badf7c00 ffffffff8110ae75
> > [   31.064052]  ffff880035ea2f00 ffff8800b8e28000 0000000000000003 00000000000004c4
> > [   31.064054] Call Trace:
> > [   31.064058]  [<ffffffff81442b33>] dump_stack+0x85/0xc2
> > [   31.064062]  [<ffffffff8110ae75>] lockdep_rcu_suspicious+0xc5/0x100
> > [   31.064064]  [<ffffffff8173bf57>] __sk_dst_check+0x77/0xb0
> > [   31.064066]  [<ffffffff8182e502>] inet6_sk_rebuild_header+0x52/0x300
> > [   31.064068]  [<ffffffff813bb61e>] ? selinux_skb_peerlbl_sid+0x5e/0xa0
> > [   31.064070]  [<ffffffff813bb69e>] ? selinux_inet_conn_established+0x3e/0x40
> > [   31.064072]  [<ffffffff817c2bad>] tcp_finish_connect+0x4d/0x270
> > [   31.064074]  [<ffffffff817c33f7>] tcp_rcv_state_process+0x627/0xe40
> > [   31.064076]  [<ffffffff81866584>] tcp_v6_do_rcv+0xd4/0x410
> > [   31.064078]  [<ffffffff8173bc65>] release_sock+0x85/0x1c0
> > [   31.064079]  [<ffffffff817e9983>] __inet_stream_connect+0x1c3/0x340
> > [   31.064081]  [<ffffffff8173b089>] ? lock_sock_nested+0x49/0xb0
> > [   31.064083]  [<ffffffff81100270>] ? abort_exclusive_wait+0xb0/0xb0
> > [   31.064084]  [<ffffffff817e9b38>] inet_stream_connect+0x38/0x50
> > [   31.064086]  [<ffffffff8173794f>] SYSC_connect+0xcf/0xf0
> > [   31.064088]  [<ffffffff8110d069>] ? trace_hardirqs_on_caller+0x129/0x1b0
> > [   31.064090]  [<ffffffff8100301b>] ? trace_hardirqs_on_thunk+0x1b/0x1d
> > [   31.064091]  [<ffffffff8173854e>] SyS_connect+0xe/0x10
> > [   31.064094]  [<ffffffff818a0e7c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
> > 
> > Bye,
> > Hannes
> 
> Thanks.
> 
> As you can see, release_sock() messes badly lockdep (once your other
> patches are in )
> 
> Once we properly fix release_sock() and/or __release_sock(), all these
> false positives disappear.

+1. Nice catch.

Eric, what's your take on Hannes's patch 2 ?
Is it more accurate to ask lockdep to check for actual lock
or lockdep can rely on owned flag?
Potentially there could be races between setting the flag and
actual lock... but that code is contained, so unlikely.
Will we find the real issues with this 'stronger' check or
just spend a ton of time adapting to new model like your other
patch for release_sock and whatever may need to come next...


* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Eric Dumazet @ 2016-04-01  1:45 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: davem, netdev, sasha.levin, daniel, alexei.starovoitov, mkubecek
In-Reply-To: <1459474756.6473.248.camel@edumazet-glaptop3.roam.corp.google.com>

On Thu, 2016-03-31 at 18:39 -0700, Eric Dumazet wrote:
> On Fri, 2016-04-01 at 03:36 +0200, Hannes Frederic Sowa wrote:
> > On Fri, Apr 1, 2016, at 03:19, Eric Dumazet wrote:
> > > Thanks.
> > > 
> > > As you can see, release_sock() messes badly lockdep (once your other
> > > patches are in )
> > > 
> > > Once we properly fix release_sock() and/or __release_sock(), all these
> > > false positives disappear.
> > 
> > This was a loopback connection. I need to study release_sock and
> > __release_sock more as I cannot currently see an issue with the lockdep
> > handling.
> 
> Okay, please try :
> 
> diff --git a/net/core/sock.c b/net/core/sock.c
> index b67b9aedb230..570dcd91d64e 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2429,10 +2429,6 @@ EXPORT_SYMBOL(lock_sock_nested);
>  
>  void release_sock(struct sock *sk)
>  {
> -	/*
> -	 * The sk_lock has mutex_unlock() semantics:
> -	 */
> -	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>  
>  	spin_lock_bh(&sk->sk_lock.slock);
>  	if (sk->sk_backlog.tail)
> @@ -2445,6 +2441,10 @@ void release_sock(struct sock *sk)
>  		sk->sk_prot->release_cb(sk);
>  
>  	sock_release_ownership(sk);
> +	/*
> +	 * The sk_lock has mutex_unlock() semantics:
> +	 */
> +	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>  	if (waitqueue_active(&sk->sk_lock.wq))
>  		wake_up(&sk->sk_lock.wq);
>  	spin_unlock_bh(&sk->sk_lock.slock);

Also take a look at commit c3f9b01849ef3bc69024990092b9f42e20df7797

We might need to include the mutex_release() in sock_release_ownership()

Thanks !
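
A minimal userspace sketch of the ordering issue discussed above, under stated assumptions: every name here (toy_sock, toy_backlog_rcv, and so on) is hypothetical and only stands in for the kernel objects, and the "lockdep_held" flag is a crude model of what lockdep believes about sk_lock. It only illustrates why telling lockdep the lock is released before the backlog is processed can produce a false positive, and why moving that step after the ownership release avoids it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, not kernel APIs. */
struct toy_sock {
	bool owned;        /* stands in for sk_lock.owned */
	bool lockdep_held; /* stands in for lockdep's view of sk_lock */
	int splats;        /* count of "suspicious usage" reports */
};

/* Stands in for a backlog handler that performs a lockdep-checked access. */
static void toy_backlog_rcv(struct toy_sock *sk)
{
	if (!sk->lockdep_held)
		sk->splats++;
}

static void toy_lock_sock(struct toy_sock *sk)
{
	sk->owned = true;
	sk->lockdep_held = true;
}

/* Original order: lockdep is told about the release first. */
static int toy_release_sock_early(struct toy_sock *sk)
{
	assert(sk->owned);
	sk->lockdep_held = false; /* release annotation comes first */
	toy_backlog_rcv(sk);      /* the check fires although we still own sk */
	sk->owned = false;
	return sk->splats;
}

/* Order from the proposed patch: annotate after dropping ownership. */
static int toy_release_sock_late(struct toy_sock *sk)
{
	assert(sk->owned);
	toy_backlog_rcv(sk);      /* lockdep still sees the lock as held */
	sk->owned = false;
	sk->lockdep_held = false; /* release annotation comes last */
	return sk->splats;
}
```

In this toy model, calling toy_lock_sock() followed by toy_release_sock_early() reports one splat, while the same sequence with toy_release_sock_late() reports none.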


* RE: [PATCH] fec: Do not access unexisting register in Coldfire
From: Fugang Duan @ 2016-04-01  1:39 UTC (permalink / raw)
  To: Fabio Estevam, davem@davemloft.net
  Cc: troy.kisky@boundarydevices.com, gerg@uclinux.org,
	netdev@vger.kernel.org, Fabio Estevam
In-Reply-To: <1459436717-12809-1-git-send-email-festevam@gmail.com>

From: Fabio Estevam <festevam@gmail.com> Sent: Thursday, March 31, 2016 11:05 PM
> To: davem@davemloft.net
> Cc: Fugang Duan <fugang.duan@nxp.com>; troy.kisky@boundarydevices.com;
> gerg@uclinux.org; netdev@vger.kernel.org; Fabio Estevam
> <fabio.estevam@nxp.com>
> Subject: [PATCH] fec: Do not access unexisting register in Coldfire
> 
> From: Fabio Estevam <fabio.estevam@nxp.com>
> 
> Commit 55cd48c821de ("net: fec: stop the "rcv is not +last, " error
> messages") introduces a write to a register that does not exist in Coldfire.
> 
> Move the FEC_FTRL register access inside the FEC_QUIRK_HAS_RACC 'if' block,
> so that we guarantee it will not be used on Coldfire CPUs.
> 
> Reported-by: Greg Ungerer <gerg@uclinux.org>
> Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
> ---
>  drivers/net/ethernet/freescale/fec_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
> index 37c0815..08243c2 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -943,8 +943,8 @@ fec_restart(struct net_device *ndev)
>  		else
>  			val &= ~FEC_RACC_OPTIONS;
>  		writel(val, fep->hwp + FEC_RACC);
> +		writel(PKT_MAXBUF_SIZE, fep->hwp + FEC_FTRL);
>  	}
> -	writel(PKT_MAXBUF_SIZE, fep->hwp + FEC_FTRL);
>  #endif
> 
>  	/*
> --
> 1.9.1

If you stick with this approach, you must add a comment on the FEC_QUIRK_HAS_RACC quirk flag.
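
A rough userspace sketch of what the patch changes, assuming a toy register file: TOY_QUIRK_HAS_RACC, toy_writel() and the register indices are hypothetical stand-ins, not the driver's definitions. It shows the unguarded write hitting a register that does not exist on one variant, the guarded version avoiding it, and the kind of comment on the quirk flag that would document the extra meaning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical quirk bit for this sketch. Note the comment: in this
 * sketch the flag also implies that the FTRL register exists.
 */
#define TOY_QUIRK_HAS_RACC (1u << 0)

enum toy_reg { TOY_REG_RACC, TOY_REG_FTRL, TOY_REG_COUNT };

struct toy_fec {
	uint32_t quirks;
	bool reg_exists[TOY_REG_COUNT];
	uint32_t regs[TOY_REG_COUNT];
	int bad_writes; /* writes to registers this variant lacks */
};

static void toy_writel(struct toy_fec *fep, enum toy_reg reg, uint32_t val)
{
	assert(reg < TOY_REG_COUNT);
	if (!fep->reg_exists[reg]) {
		fep->bad_writes++;
		return;
	}
	fep->regs[reg] = val;
}

/* Before the patch: FTRL is written regardless of the quirk. */
static void toy_restart_unguarded(struct toy_fec *fep, uint32_t max_frame)
{
	if (fep->quirks & TOY_QUIRK_HAS_RACC)
		toy_writel(fep, TOY_REG_RACC, 1);
	toy_writel(fep, TOY_REG_FTRL, max_frame);
}

/* After the patch: FTRL is written only inside the quirk block. */
static void toy_restart_guarded(struct toy_fec *fep, uint32_t max_frame)
{
	if (fep->quirks & TOY_QUIRK_HAS_RACC) {
		toy_writel(fep, TOY_REG_RACC, 1);
		toy_writel(fep, TOY_REG_FTRL, max_frame);
	}
}
```

With a zero-initialized toy_fec (no quirk, no registers), the unguarded variant records one bad write and the guarded variant records none.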


* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Hannes Frederic Sowa @ 2016-04-01  1:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, netdev, sasha.levin, daniel, alexei.starovoitov, mkubecek
In-Reply-To: <1459474756.6473.248.camel@edumazet-glaptop3.roam.corp.google.com>

On 01.04.2016 03:39, Eric Dumazet wrote:
> On Fri, 2016-04-01 at 03:36 +0200, Hannes Frederic Sowa wrote:
>> On Fri, Apr 1, 2016, at 03:19, Eric Dumazet wrote:
>>> Thanks.
>>>
>>> As you can see, release_sock() messes badly lockdep (once your other
>>> patches are in )
>>>
>>> Once we properly fix release_sock() and/or __release_sock(), all these
>>> false positives disappear.
>>
>> This was a loopback connection. I need to study release_sock and
>> __release_sock more as I cannot currently see an issue with the lockdep
>> handling.
>
> Okay, please try :
>
> diff --git a/net/core/sock.c b/net/core/sock.c
> index b67b9aedb230..570dcd91d64e 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2429,10 +2429,6 @@ EXPORT_SYMBOL(lock_sock_nested);
>
>   void release_sock(struct sock *sk)
>   {
> -	/*
> -	 * The sk_lock has mutex_unlock() semantics:
> -	 */
> -	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>
>   	spin_lock_bh(&sk->sk_lock.slock);
>   	if (sk->sk_backlog.tail)
> @@ -2445,6 +2441,10 @@ void release_sock(struct sock *sk)
>   		sk->sk_prot->release_cb(sk);
>
>   	sock_release_ownership(sk);
> +	/*
> +	 * The sk_lock has mutex_unlock() semantics:
> +	 */
> +	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>   	if (waitqueue_active(&sk->sk_lock.wq))
>   		wake_up(&sk->sk_lock.wq);
>   	spin_unlock_bh(&sk->sk_lock.slock);


Looks much better with your patch already. I slowly begin to understand, 
this is really tricky... :)

Bye,
Hannes


* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Hannes Frederic Sowa @ 2016-04-01  2:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, netdev, sasha.levin, daniel, alexei.starovoitov, mkubecek
In-Reply-To: <1459475124.6473.250.camel@edumazet-glaptop3.roam.corp.google.com>

On 01.04.2016 03:45, Eric Dumazet wrote:
> On Thu, 2016-03-31 at 18:39 -0700, Eric Dumazet wrote:
>> On Fri, 2016-04-01 at 03:36 +0200, Hannes Frederic Sowa wrote:
>>> On Fri, Apr 1, 2016, at 03:19, Eric Dumazet wrote:
>>>> Thanks.
>>>>
>>>> As you can see, release_sock() messes badly lockdep (once your other
>>>> patches are in )
>>>>
>>>> Once we properly fix release_sock() and/or __release_sock(), all these
>>>> false positives disappear.
>>>
>>> This was a loopback connection. I need to study release_sock and
>>> __release_sock more as I cannot currently see an issue with the lockdep
>>> handling.
>>
>> Okay, please try :
>>
>> diff --git a/net/core/sock.c b/net/core/sock.c
>> index b67b9aedb230..570dcd91d64e 100644
>> --- a/net/core/sock.c
>> +++ b/net/core/sock.c
>> @@ -2429,10 +2429,6 @@ EXPORT_SYMBOL(lock_sock_nested);
>>
>>   void release_sock(struct sock *sk)
>>   {
>> -	/*
>> -	 * The sk_lock has mutex_unlock() semantics:
>> -	 */
>> -	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>>
>>   	spin_lock_bh(&sk->sk_lock.slock);
>>   	if (sk->sk_backlog.tail)
>> @@ -2445,6 +2441,10 @@ void release_sock(struct sock *sk)
>>   		sk->sk_prot->release_cb(sk);
>>
>>   	sock_release_ownership(sk);
>> +	/*
>> +	 * The sk_lock has mutex_unlock() semantics:
>> +	 */
>> +	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
>>   	if (waitqueue_active(&sk->sk_lock.wq))
>>   		wake_up(&sk->sk_lock.wq);
>>   	spin_unlock_bh(&sk->sk_lock.slock);
>
> Also take a look at commit c3f9b01849ef3bc69024990092b9f42e20df7797
>
> We might need to include the mutex_release() in sock_release_ownership()

I thought so first, as well. But given the double check for the 
spin_lock and the "mutex" we end up with the same result for the 
lockdep_sock_is_held check.

Do you see other consequences?

Thanks,
Hannes


* Re: [PATCH net-next 1/6] net: skbuff: don't use union for napi_id and sender_cpu
From: Jason Wang @ 2016-04-01  2:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem, mst, netdev, linux-kernel
In-Reply-To: <1459420341.6473.225.camel@edumazet-glaptop3.roam.corp.google.com>



On 03/31/2016 06:32 PM, Eric Dumazet wrote:
> On Thu, 2016-03-31 at 13:50 +0800, Jason Wang wrote:
>> We use a union for napi_id and sender_cpu, this is ok for most of the
>> cases except when we want to support busy polling for tun which needs
>> napi_id to be stored and passed to socket during tun_net_xmit(). In
>> this case, napi_id was overridden with sender_cpu before tun_net_xmit()
>> was called if XPS was enabled. Fixing by not using union for napi_id
>> and sender_cpu.
>>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>  include/linux/skbuff.h | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> index 15d0df9..8aee891 100644
>> --- a/include/linux/skbuff.h
>> +++ b/include/linux/skbuff.h
>> @@ -743,11 +743,11 @@ struct sk_buff {
>>  	__u32			hash;
>>  	__be16			vlan_proto;
>>  	__u16			vlan_tci;
>> -#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
>> -	union {
>> -		unsigned int	napi_id;
>> -		unsigned int	sender_cpu;
>> -	};
>> +#if defined(CONFIG_NET_RX_BUSY_POLL)
>> +	unsigned int		napi_id;
>> +#endif
>> +#if defined(CONFIG_XPS)
>> +	unsigned int		sender_cpu;
>>  #endif
>>  	union {
>>  #ifdef CONFIG_NETWORK_SECMARK
> Hmmm...
>
> This is a serious problem.
>
> Making skb bigger (8 bytes because of alignment) was not considered
> valid for sender_cpu introduction. We worked quite hard to avoid this,
> if you take a look at git history :(
>
> Can you describe more precisely the problem and code path ?
>

The problem is that we want to support busy polling for tun. This needs
napi_id to be passed to the tun socket by sk_mark_napi_id() during
tun_net_xmit(). But before we get there, XPS sets sender_cpu, which
overwrites the shared field, so we can no longer see the correct napi_id.
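
To make the aliasing concrete, here is a small userspace sketch. The structs below are toy stand-ins with hypothetical names, not struct sk_buff; they only mirror the two layouts from the patch (shared union versus separate fields) and show both the clobbering and the size cost of splitting.

```c
#include <assert.h>
#include <stddef.h>

/* Layout before the patch: both values share one slot (toy struct). */
struct toy_skb_shared {
	union {
		unsigned int napi_id;
		unsigned int sender_cpu;
	};
};

/* Layout after the patch: each value gets its own slot (toy struct). */
struct toy_skb_split {
	unsigned int napi_id;
	unsigned int sender_cpu;
};

/* Returns the napi_id that is visible after sender_cpu has been stored. */
static unsigned int toy_napi_id_seen_shared(unsigned int napi_id,
					    unsigned int cpu)
{
	struct toy_skb_shared skb;

	assert(sizeof(skb) == sizeof(unsigned int));
	skb.napi_id = napi_id;
	skb.sender_cpu = cpu;   /* overwrites the same storage */
	return skb.napi_id;     /* now reads back the cpu value */
}

static unsigned int toy_napi_id_seen_split(unsigned int napi_id,
					   unsigned int cpu)
{
	struct toy_skb_split skb;

	skb.napi_id = napi_id;
	skb.sender_cpu = cpu;   /* independent storage */
	return skb.napi_id;     /* still the original napi_id */
}
```

The split layout keeps napi_id intact but is larger than the shared one, which is the growth Eric objects to.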


* Re: [PULL REQUEST] Please pull rdma.git
From: David Miller @ 2016-04-01  2:18 UTC (permalink / raw)
  To: leon-2ukJVAZIZ/Y
  Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA,
	matanb-VPRAkNaXOzVWk0Htik3J/w, ogerlitz-VPRAkNaXOzVWk0Htik3J/w,
	leonro-VPRAkNaXOzVWk0Htik3J/w
In-Reply-To: <20160331233828.GE2670-2ukJVAZIZ/Y@public.gmane.org>

From: Leon Romanovsky <leon-2ukJVAZIZ/Y@public.gmane.org>
Date: Fri, 1 Apr 2016 02:38:28 +0300

> On Wed, Mar 23, 2016 at 09:37:54AM -0400, Doug Ledford wrote:
>> On 03/23/2016 06:57 AM, Leon Romanovsky wrote:
>> > On Sat, Mar 19, 2016 at 02:37:08PM -0700, Linus Torvalds wrote:
>> >> So the *best* situation would be:
>> >>
>> >>  - your two groups talk it over, and figure out what the common commits are
>> >>
>> >>  - you put those common commits as a "base" branch in git
>> >>
>> >>  - you ask the two upper-level maintainers to both pull that base branch
>> >>
>> >>  - you then make sure that you send the later patches (whether as
>> >> emailed patches or as pull requests) based on top of that base branch.
>> > 
>> > Hi David and Doug,
>> > 
>> > Are you OK with the approach suggested by Linus?
>> > We are eager to know it, so we will adopt it as soon
>> > as possible in our development flow.
>> > 
>> > The original thread [1].
>> > 
>> > [1] http://thread.gmane.org/gmane.linux.drivers.rdma/34907
>> > 
>> > Thanks.
>> > 
>> 
>> I'm fine with it.  Since I happen to use topic branches to build my
>> for-next from anyway, I might need to be the one that Dave pulls from
>> versus the other way around.
> 
> Resending to linux-netdev.
> 
> David,
> Can you please express your opinion about Linus's suggestion to
> eliminate merge conflicts in Mellanox related products?

Sure, sounds fine.


* Re: qdisc spin lock
From: David Miller @ 2016-04-01  2:19 UTC (permalink / raw)
  To: make0818; +Cc: xiyou.wangcong, netdev
In-Reply-To: <CAAmHdhxagKnLP1_5ZW7HTsVBu0TSFYKCvNstAEWN-NHrdnvvVQ@mail.gmail.com>

From: Michael Ma <make0818@gmail.com>
Date: Thu, 31 Mar 2016 16:48:43 -0700

> I didn't really know that multiple qdiscs can be isolated using MQ so
 ...

Please stop top-posting.


* Re: Reboot Failed was occurred after v4.4-rc1 during IPv6 Ready Logo Conformance Test
From: Yuki Machida @ 2016-04-01  2:40 UTC (permalink / raw)
  To: netdev
In-Reply-To: <56FCCD18.9050306@jp.fujitsu.com>


On 2016-03-31 16:09, Yuki Machida wrote:
> Hi all,
> 
> A reboot failure has occurred on Linux kernels since v4.4-rc1 during the IPv6 Ready Logo Conformance Test.
> The bug is not fixed in v4.5-rc7 yet.
> I will confirm whether it is fixed in v4.6-rc1.
I confirmed that it does not occur in v4.6-rc1.

I will find the patch that fixes this bug and confirm whether it was applied to the -stable kernels.
If it was not applied to -stable, I will report again.

> Currently, it is under investigation.
> 
> IPv6 Ready Logo
> https://www.ipv6ready.org/
> TAHI Project
> http://www.tahi.org/
> 
> I ran the IPv6 Ready Logo Core Conformance Test on Intel D510MO (Atom D510).
> The userland was built with the Yocto Project.
> 
> Test Environment
> Test Specification          : 4.0.6
> Tool Version                : REL_3_3_2
> Test Program Version        : V6LC_5_0_0
> Target Device               : Intel D510MO (Atom D510)
> 
> I confirmed that random test cases fail.
> (e.g. No.5, No.130, No.131, No.134, No.167 and No.168)
> 
> Regards,
> Yuki Machida
> 


* Re: [PATCH net-next 1/6] net: skbuff: don't use union for napi_id and sender_cpu
From: Jason Wang @ 2016-04-01  2:46 UTC (permalink / raw)
  To: David Miller, eric.dumazet; +Cc: mst, netdev, linux-kernel
In-Reply-To: <20160331.160115.1737831060132252055.davem@davemloft.net>



On 04/01/2016 04:01 AM, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Thu, 31 Mar 2016 03:32:21 -0700
>
>> On Thu, 2016-03-31 at 13:50 +0800, Jason Wang wrote:
>>> We use a union for napi_id and sender_cpu, this is ok for most of the
>>> cases except when we want to support busy polling for tun which needs
>>> napi_id to be stored and passed to socket during tun_net_xmit(). In
>>> this case, napi_id was overridden with sender_cpu before tun_net_xmit()
>>> was called if XPS was enabled. Fixing by not using union for napi_id
>>> and sender_cpu.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>> ---
>>>  include/linux/skbuff.h | 10 +++++-----
>>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>>> index 15d0df9..8aee891 100644
>>> --- a/include/linux/skbuff.h
>>> +++ b/include/linux/skbuff.h
>>> @@ -743,11 +743,11 @@ struct sk_buff {
>>>  	__u32			hash;
>>>  	__be16			vlan_proto;
>>>  	__u16			vlan_tci;
>>> -#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
>>> -	union {
>>> -		unsigned int	napi_id;
>>> -		unsigned int	sender_cpu;
>>> -	};
>>> +#if defined(CONFIG_NET_RX_BUSY_POLL)
>>> +	unsigned int		napi_id;
>>> +#endif
>>> +#if defined(CONFIG_XPS)
>>> +	unsigned int		sender_cpu;
>>>  #endif
>>>  	union {
>>>  #ifdef CONFIG_NETWORK_SECMARK
>> Hmmm...
>>
>> This is a serious problem.
>>
>> Making skb bigger (8 bytes because of alignment) was not considered
>> valid for sender_cpu introduction. We worked quite hard to avoid this,
>> if you take a look at git history :(
>>
>> Can you describe more precisely the problem and code path ?
> From what I can see they are doing busy poll loops in the TX code paths,
> as well as the RX code paths, of vhost.
>
> Doing this in the TX side makes little sense to me.  The busy poll
> implementations in the drivers only process their RX queues when
> ->ndo_busy_poll() is invoked.  So I wonder what this is accomplishing
> for the vhost TX case?

In the vhost TX case, it's possible that new packets arrive at the rx
queue during tx polling. Considering that tx and rx are processed in one
thread, polling rx looks feasible to me.


* Re: [PATCH net-next 1/6] net: skbuff: don't use union for napi_id and sender_cpu
From: Eric Dumazet @ 2016-04-01  2:55 UTC (permalink / raw)
  To: Jason Wang; +Cc: davem, mst, netdev, linux-kernel
In-Reply-To: <56FDD94E.8090908@redhat.com>

On Fri, 2016-04-01 at 10:13 +0800, Jason Wang wrote:


> 
> The problem is that we want to support busy polling for tun. This needs
> napi_id to be passed to the tun socket by sk_mark_napi_id() during
> tun_net_xmit(). But before we get there, XPS sets sender_cpu, which
> overwrites the shared field, so we can no longer see the correct napi_id.
> 

Looks like napi_id should have precedence then ?

Only forwarding should allow the field to be cleared to allow XPS to do
its job.

Maybe skb_sender_cpu_clear() was removed too early (commit
64d4e3431e686dc37ce388ba531c4c4e866fb141)

Look, it is 8pm here, I am pretty sure a solution can be found,
but I also need to take a break, I started at 3am today...
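
One possible reading of the "napi_id has precedence" idea, as a toy userspace sketch. All names are hypothetical, and the convention that 0 means "empty" with the CPU stored as cpu + 1 is an assumption made for this example, not something the thread states.

```c
#include <assert.h>
#include <stddef.h>

/* Toy packet with one slot shared by napi_id and sender_cpu. */
struct toy_skb {
	unsigned int id_or_cpu; /* 0 means "nothing recorded" (assumption) */
};

/* Receive side records the NAPI id in the shared slot. */
static void toy_mark_napi_id(struct toy_skb *skb, unsigned int napi_id)
{
	assert(skb != NULL);
	skb->id_or_cpu = napi_id;
}

/* XPS-like step: only record the CPU if the slot is still empty. */
static void toy_xps_record_cpu(struct toy_skb *skb, unsigned int cpu)
{
	assert(skb != NULL);
	if (skb->id_or_cpu == 0)
		skb->id_or_cpu = cpu + 1;
}

/* Forwarding path: clear the slot so the XPS-like step can use it. */
static void toy_forward_clear(struct toy_skb *skb)
{
	assert(skb != NULL);
	skb->id_or_cpu = 0;
}
```

Under these assumptions, a locally delivered packet keeps its napi_id through the XPS-like step, and only a forwarded packet, whose slot was cleared first, ends up carrying the sender CPU.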


* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Eric Dumazet @ 2016-04-01  3:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Hannes Frederic Sowa, davem, netdev, sasha.levin, daniel,
	mkubecek
In-Reply-To: <20160401014516.GA11017@ast-mbp.thefacebook.com>

On Thu, 2016-03-31 at 18:45 -0700, Alexei Starovoitov wrote:

> Eric, what's your take on Hannes's patch 2 ?
> Is it more accurate to ask lockdep to check for actual lock
> or lockdep can rely on owned flag?
> Potentially there could be races between setting the flag and
> actual lock... but that code is contained, so unlikely.
> Will we find the real issues with this 'stronger' check or
> just spend a ton of time adapting to new model like your other
> patch for release_sock and whatever may need to come next...

More precise lockdep checks are certainly good, I only objected to 4/4
trying to work around another bug.

But why do we rush for 'net' tree ?

This looks net-next material to me.

Locking changes are often subtle, let's take the time to do them
properly.


* Re: [PULL REQUEST] Please pull rdma.git
From: Leon Romanovsky @ 2016-04-01  2:46 UTC (permalink / raw)
  To: David Miller
  Cc: dledford, torvalds, linux-rdma, netdev, matanb, ogerlitz, leonro
In-Reply-To: <20160331.221847.53272383417094737.davem@davemloft.net>

On Thu, Mar 31, 2016 at 10:18:47PM -0400, David Miller wrote:
> From: Leon Romanovsky <leon@leon.nu>
> Date: Fri, 1 Apr 2016 02:38:28 +0300
> 
> > On Wed, Mar 23, 2016 at 09:37:54AM -0400, Doug Ledford wrote:
> >> On 03/23/2016 06:57 AM, Leon Romanovsky wrote:
> >> > On Sat, Mar 19, 2016 at 02:37:08PM -0700, Linus Torvalds wrote:
> >> >> So the *best* situation would be:
> >> >>
> >> >>  - your two groups talk it over, and figure out what the common commits are
> >> >>
> >> >>  - you put those common commits as a "base" branch in git
> >> >>
> >> >>  - you ask the two upper-level maintainers to both pull that base branch
> >> >>
> >> >>  - you then make sure that you send the later patches (whether as
> >> >> emailed patches or as pull requests) based on top of that base branch.
> >> > 
> >> > Hi David and Doug,
> >> > 
> >> > Are you OK with the approach suggested by Linus?
> >> > We are eager to know it, so we will adopt it as soon
> >> > as possible in our development flow.
> >> > 
> >> > The original thread [1].
> >> > 
> >> > [1] http://thread.gmane.org/gmane.linux.drivers.rdma/34907
> >> > 
> >> > Thanks.
> >> > 
> >> 
> >> I'm fine with it.  Since I happen to use topic branches to build my
> >> for-next from anyway, I might need to be the one that Dave pulls from
> >> versus the other way around.
> > 
> > Resending to linux-netdev.
> > 
> > David,
> > Can you please express your opinion about Linus's suggestion to
> > eliminate merge conflicts in Mellanox related products?
> 
> Sure, sounds fine.

Thank you. I really appreciate Doug's and your openness and
willingness to help us eliminate future merge obstacles.


* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Hannes Frederic Sowa @ 2016-04-01  3:06 UTC (permalink / raw)
  To: Eric Dumazet, Alexei Starovoitov
  Cc: davem, netdev, sasha.levin, daniel, mkubecek
In-Reply-To: <1459479818.6473.265.camel@edumazet-glaptop3.roam.corp.google.com>

On Fri, Apr 1, 2016, at 05:03, Eric Dumazet wrote:
> On Thu, 2016-03-31 at 18:45 -0700, Alexei Starovoitov wrote:
> 
> > Eric, what's your take on Hannes's patch 2 ?
> > Is it more accurate to ask lockdep to check for actual lock
> > or lockdep can rely on owned flag?
> > Potentially there could be races between setting the flag and
> > actual lock... but that code is contained, so unlikely.
> > Will we find the real issues with this 'stronger' check or
> > just spend a ton of time adapting to new model like your other
> > patch for release_sock and whatever may need to come next...
> 
> More precise lockdep checks are certainly good, I only objected to 4/4
> trying to work around another bug.
> 
> But why do we rush for 'net' tree ?
> 
> This looks net-next material to me.
> 
> Locking changes are often subtle, let's take the time to do them
> properly.

I certainly can see my mistake now trying to paper over the splats. Do
you object if I send the first patches to fix up the reported lockdep?


* [v7, 0/5] Fix eSDHC host version register bug
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc

This patchset fixes a host version register bug in the T4240-R1.0-R2.0
eSDHC controller. To get the SoC version and revision, a GUTS driver is
needed to access the global utilities registers.

So, the first three patches add the GUTS driver.
The following two patches enable GUTS driver support to get the SVR in the
eSDHC driver and fix the host version for T4240.

Yangbo Lu (5):
  ARM64: dts: ls2080a: add device configuration node
  soc: fsl: add GUTS driver for QorIQ platforms
  dt: move guts devicetree doc out of powerpc directory
  powerpc/fsl: move mpc85xx.h to include/linux/fsl
  mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0

 .../bindings/{powerpc => soc}/fsl/guts.txt         |   3 +
 arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi     |   6 ++
 arch/powerpc/kernel/cpu_setup_fsl_booke.S          |   2 +-
 drivers/clk/clk-qoriq.c                            |   3 +-
 drivers/i2c/busses/i2c-mpc.c                       |   2 +-
 drivers/iommu/fsl_pamu.c                           |   3 +-
 drivers/mmc/host/Kconfig                           |   1 +
 drivers/mmc/host/sdhci-of-esdhc.c                  |  23 ++++
 drivers/net/ethernet/freescale/gianfar.c           |   2 +-
 drivers/soc/Kconfig                                |   2 +-
 drivers/soc/fsl/Kconfig                            |   8 ++
 drivers/soc/fsl/Makefile                           |   1 +
 drivers/soc/fsl/guts.c                             | 119 +++++++++++++++++++++
 include/linux/fsl/guts.h                           |  98 ++++++++---------
 .../asm/mpc85xx.h => include/linux/fsl/svr.h       |   4 +-
 15 files changed, 219 insertions(+), 58 deletions(-)
 rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/guts.txt (91%)
 create mode 100644 drivers/soc/fsl/Kconfig
 create mode 100644 drivers/soc/fsl/guts.c
 rename arch/powerpc/include/asm/mpc85xx.h => include/linux/fsl/svr.h (97%)

-- 
2.1.0.27.g96db324


* [v7, 1/5] ARM64: dts: ls2080a: add device configuration node
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc
In-Reply-To: <1459480051-3701-1-git-send-email-yangbo.lu-3arQi8VN3Tc@public.gmane.org>

Add the dts node for the device configuration unit that provides
general purpose configuration and status for the device.

Signed-off-by: Yangbo Lu <yangbo.lu-3arQi8VN3Tc@public.gmane.org>
---
Changes for v2:
	- None
Changes for v3:
	- None
Changes for v4:
	- None
Changes for v5:
	- Added this patch
Changes for v6:
	- None
Changes for v7:
	- None
---
 arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
index 9d746c6..8724cf1 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls2080a.dtsi
@@ -191,6 +191,12 @@
 			clocks = <&sysclk>;
 		};
 
+		dcfg: dcfg@1e00000 {
+			compatible = "fsl,ls2080a-dcfg", "syscon";
+			reg = <0x0 0x1e00000 0x0 0x10000>;
+			little-endian;
+		};
+
 		serial0: serial@21c0500 {
 			compatible = "fsl,ns16550", "ns16550a";
 			reg = <0x0 0x21c0500 0x0 0x100>;
-- 
2.1.0.27.g96db324


* [v7, 2/5] soc: fsl: add GUTS driver for QorIQ platforms
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc
In-Reply-To: <1459480051-3701-1-git-send-email-yangbo.lu-3arQi8VN3Tc@public.gmane.org>

The global utilities block controls power management, I/O device
enabling, power-on reset (POR) configuration monitoring, alternate
function selection for multiplexed signals, and clock control.

This patch adds a GUTS driver to manage and access the global utilities
block.

Signed-off-by: Yangbo Lu <yangbo.lu-3arQi8VN3Tc@public.gmane.org>
---
Changes for v2:
	- None
Changes for v3:
	- None
Changes for v4:
	- Added this patch
Changes for v5:
	- Modified copyright info
	- Changed MODULE_LICENSE to GPL
	- Changed EXPORT_SYMBOL_GPL to EXPORT_SYMBOL
	- Made FSL_GUTS user-invisible
	- Added a complete compatible list for GUTS
	- Stored guts info in file-scope variable
	- Added mfspr() getting SVR
	- Redefined GUTS APIs
	- Called fsl_guts_init rather than using platform driver
	- Removed useless parentheses
	- Removed useless 'extern' key words
Changes for v6:
	- Made guts thread safe in fsl_guts_init
Changes for v7:
	- Removed 'ifdef' for function declaration in guts.h
---
 drivers/soc/Kconfig      |   2 +-
 drivers/soc/fsl/Kconfig  |   8 ++++
 drivers/soc/fsl/Makefile |   1 +
 drivers/soc/fsl/guts.c   | 119 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fsl/guts.h |  98 +++++++++++++++++++-------------------
 5 files changed, 179 insertions(+), 49 deletions(-)
 create mode 100644 drivers/soc/fsl/Kconfig
 create mode 100644 drivers/soc/fsl/guts.c

diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
index cb58ef0..7106463 100644
--- a/drivers/soc/Kconfig
+++ b/drivers/soc/Kconfig
@@ -2,7 +2,7 @@ menu "SOC (System On Chip) specific Drivers"
 
 source "drivers/soc/bcm/Kconfig"
 source "drivers/soc/brcmstb/Kconfig"
-source "drivers/soc/fsl/qe/Kconfig"
+source "drivers/soc/fsl/Kconfig"
 source "drivers/soc/mediatek/Kconfig"
 source "drivers/soc/qcom/Kconfig"
 source "drivers/soc/rockchip/Kconfig"
diff --git a/drivers/soc/fsl/Kconfig b/drivers/soc/fsl/Kconfig
new file mode 100644
index 0000000..b313759
--- /dev/null
+++ b/drivers/soc/fsl/Kconfig
@@ -0,0 +1,8 @@
+#
+# Freescale SOC drivers
+#
+
+source "drivers/soc/fsl/qe/Kconfig"
+
+config FSL_GUTS
+	bool
diff --git a/drivers/soc/fsl/Makefile b/drivers/soc/fsl/Makefile
index 203307f..02afb7f 100644
--- a/drivers/soc/fsl/Makefile
+++ b/drivers/soc/fsl/Makefile
@@ -4,3 +4,4 @@
 
 obj-$(CONFIG_QUICC_ENGINE)		+= qe/
 obj-$(CONFIG_CPM)			+= qe/
+obj-$(CONFIG_FSL_GUTS)			+= guts.o
diff --git a/drivers/soc/fsl/guts.c b/drivers/soc/fsl/guts.c
new file mode 100644
index 0000000..fa155e6
--- /dev/null
+++ b/drivers/soc/fsl/guts.c
@@ -0,0 +1,119 @@
+/*
+ * Freescale QorIQ Platforms GUTS Driver
+ *
+ * Copyright (C) 2016 Freescale Semiconductor, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/mutex.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/fsl/guts.h>
+
+struct guts {
+	struct ccsr_guts __iomem *regs;
+	bool little_endian;
+};
+
+static struct guts *guts;
+static DEFINE_MUTEX(guts_lock);
+
+u32 fsl_guts_get_svr(void)
+{
+	u32 svr = 0;
+
+	if (!guts || !guts->regs) {
+#ifdef CONFIG_PPC
+		svr =  mfspr(SPRN_SVR);
+#endif
+		return svr;
+	}
+
+	if (guts->little_endian)
+		svr = ioread32(&guts->regs->svr);
+	else
+		svr = ioread32be(&guts->regs->svr);
+
+	return svr;
+}
+EXPORT_SYMBOL(fsl_guts_get_svr);
+
+/*
+ * Table for matching compatible strings, for device tree
+ * guts node, for Freescale QorIQ SOCs.
+ */
+static const struct of_device_id guts_of_match[] = {
+	/* For T4 & B4 Series SOCs */
+	{ .compatible = "fsl,qoriq-device-config-1.0", },
+	/* For P Series SOCs */
+	{ .compatible = "fsl,qoriq-device-config-2.0", },
+	{ .compatible = "fsl,p1010-guts", },
+	{ .compatible = "fsl,p1020-guts", },
+	{ .compatible = "fsl,p1021-guts", },
+	{ .compatible = "fsl,p1022-guts", },
+	{ .compatible = "fsl,p1023-guts", },
+	{ .compatible = "fsl,p2020-guts", },
+	/* For BSC Series SOCs */
+	{ .compatible = "fsl,bsc9131-guts", },
+	{ .compatible = "fsl,bsc9132-guts", },
+	/* For MPC85xx Series SOCs */
+	{ .compatible = "fsl,mpc8536-guts", },
+	{ .compatible = "fsl,mpc8544-guts", },
+	{ .compatible = "fsl,mpc8548-guts", },
+	{ .compatible = "fsl,mpc8568-guts", },
+	{ .compatible = "fsl,mpc8569-guts", },
+	{ .compatible = "fsl,mpc8572-guts", },
+	/* For Layerscape Series SOCs */
+	{ .compatible = "fsl,ls1021a-dcfg", },
+	{ .compatible = "fsl,ls1043a-dcfg", },
+	{ .compatible = "fsl,ls2080a-dcfg", },
+	{}
+};
+
+int fsl_guts_init(void)
+{
+	struct device_node *np;
+	int ret;
+
+	mutex_lock(&guts_lock);
+	/* Initialize guts only once */
+	if (guts) {
+		ret = guts->regs ? 0 : -ENOMEM;
+		goto out;
+	}
+
+	np = of_find_matching_node(NULL, guts_of_match);
+	if (!np) {
+		ret = -ENODEV;
+		goto out;
+	}
+
+	guts = kzalloc(sizeof(*guts), GFP_KERNEL);
+	if (!guts) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	guts->little_endian = of_property_read_bool(np, "little-endian");
+
+	guts->regs = of_iomap(np, 0);
+	if (!guts->regs) {
+		ret = -ENOMEM;
+		kfree(guts);
+		goto out;
+	}
+
+	of_node_put(np);
+	ret = 0;
+out:
+	mutex_unlock(&guts_lock);
+	return ret;
+}
+EXPORT_SYMBOL(fsl_guts_init);
diff --git a/include/linux/fsl/guts.h b/include/linux/fsl/guts.h
index 649e917..ac3103b 100644
--- a/include/linux/fsl/guts.h
+++ b/include/linux/fsl/guts.h
@@ -29,83 +29,85 @@
  * #ifdefs.
  */
 struct ccsr_guts {
-	__be32	porpllsr;	/* 0x.0000 - POR PLL Ratio Status Register */
-	__be32	porbmsr;	/* 0x.0004 - POR Boot Mode Status Register */
-	__be32	porimpscr;	/* 0x.0008 - POR I/O Impedance Status and Control Register */
-	__be32	pordevsr;	/* 0x.000c - POR I/O Device Status Register */
-	__be32	pordbgmsr;	/* 0x.0010 - POR Debug Mode Status Register */
-	__be32	pordevsr2;	/* 0x.0014 - POR device status register 2 */
+	u32	porpllsr;	/* 0x.0000 - POR PLL Ratio Status Register */
+	u32	porbmsr;	/* 0x.0004 - POR Boot Mode Status Register */
+	u32	porimpscr;	/* 0x.0008 - POR I/O Impedance Status and Control Register */
+	u32	pordevsr;	/* 0x.000c - POR I/O Device Status Register */
+	u32	pordbgmsr;	/* 0x.0010 - POR Debug Mode Status Register */
+	u32	pordevsr2;	/* 0x.0014 - POR device status register 2 */
 	u8	res018[0x20 - 0x18];
-	__be32	porcir;		/* 0x.0020 - POR Configuration Information Register */
+	u32	porcir;		/* 0x.0020 - POR Configuration Information Register */
 	u8	res024[0x30 - 0x24];
-	__be32	gpiocr;		/* 0x.0030 - GPIO Control Register */
+	u32	gpiocr;		/* 0x.0030 - GPIO Control Register */
 	u8	res034[0x40 - 0x34];
-	__be32	gpoutdr;	/* 0x.0040 - General-Purpose Output Data Register */
+	u32	gpoutdr;	/* 0x.0040 - General-Purpose Output Data Register */
 	u8	res044[0x50 - 0x44];
-	__be32	gpindr;		/* 0x.0050 - General-Purpose Input Data Register */
+	u32	gpindr;		/* 0x.0050 - General-Purpose Input Data Register */
 	u8	res054[0x60 - 0x54];
-	__be32	pmuxcr;		/* 0x.0060 - Alternate Function Signal Multiplex Control */
-        __be32  pmuxcr2;	/* 0x.0064 - Alternate function signal multiplex control 2 */
-        __be32  dmuxcr;		/* 0x.0068 - DMA Mux Control Register */
+	u32	pmuxcr;		/* 0x.0060 - Alternate Function Signal Multiplex Control */
+	u32	pmuxcr2;	/* 0x.0064 - Alternate function signal multiplex control 2 */
+	u32	dmuxcr;		/* 0x.0068 - DMA Mux Control Register */
         u8	res06c[0x70 - 0x6c];
-	__be32	devdisr;	/* 0x.0070 - Device Disable Control */
+	u32	devdisr;	/* 0x.0070 - Device Disable Control */
 #define CCSR_GUTS_DEVDISR_TB1	0x00001000
 #define CCSR_GUTS_DEVDISR_TB0	0x00004000
-	__be32	devdisr2;	/* 0x.0074 - Device Disable Control 2 */
+	u32	devdisr2;	/* 0x.0074 - Device Disable Control 2 */
 	u8	res078[0x7c - 0x78];
-	__be32  pmjcr;		/* 0x.007c - 4 Power Management Jog Control Register */
-	__be32	powmgtcsr;	/* 0x.0080 - Power Management Status and Control Register */
-	__be32  pmrccr;		/* 0x.0084 - Power Management Reset Counter Configuration Register */
-	__be32  pmpdccr;	/* 0x.0088 - Power Management Power Down Counter Configuration Register */
-	__be32  pmcdr;		/* 0x.008c - 4Power management clock disable register */
-	__be32	mcpsumr;	/* 0x.0090 - Machine Check Summary Register */
-	__be32	rstrscr;	/* 0x.0094 - Reset Request Status and Control Register */
-	__be32  ectrstcr;	/* 0x.0098 - Exception reset control register */
-	__be32  autorstsr;	/* 0x.009c - Automatic reset status register */
-	__be32	pvr;		/* 0x.00a0 - Processor Version Register */
-	__be32	svr;		/* 0x.00a4 - System Version Register */
+	u32	pmjcr;		/* 0x.007c - 4 Power Management Jog Control Register */
+	u32	powmgtcsr;	/* 0x.0080 - Power Management Status and Control Register */
+	u32	pmrccr;		/* 0x.0084 - Power Management Reset Counter Configuration Register */
+	u32	pmpdccr;	/* 0x.0088 - Power Management Power Down Counter Configuration Register */
+	u32	pmcdr;		/* 0x.008c - 4Power management clock disable register */
+	u32	mcpsumr;	/* 0x.0090 - Machine Check Summary Register */
+	u32	rstrscr;	/* 0x.0094 - Reset Request Status and Control Register */
+	u32	ectrstcr;	/* 0x.0098 - Exception reset control register */
+	u32	autorstsr;	/* 0x.009c - Automatic reset status register */
+	u32	pvr;		/* 0x.00a0 - Processor Version Register */
+	u32	svr;		/* 0x.00a4 - System Version Register */
 	u8	res0a8[0xb0 - 0xa8];
-	__be32	rstcr;		/* 0x.00b0 - Reset Control Register */
+	u32	rstcr;		/* 0x.00b0 - Reset Control Register */
 	u8	res0b4[0xc0 - 0xb4];
-	__be32  iovselsr;	/* 0x.00c0 - I/O voltage select status register
+	u32	iovselsr;	/* 0x.00c0 - I/O voltage select status register
 					     Called 'elbcvselcr' on 86xx SOCs */
 	u8	res0c4[0x100 - 0xc4];
-	__be32	rcwsr[16];	/* 0x.0100 - Reset Control Word Status registers
+	u32	rcwsr[16];	/* 0x.0100 - Reset Control Word Status registers
 					     There are 16 registers */
 	u8	res140[0x224 - 0x140];
-	__be32  iodelay1;	/* 0x.0224 - IO delay control register 1 */
-	__be32  iodelay2;	/* 0x.0228 - IO delay control register 2 */
+	u32	iodelay1;	/* 0x.0224 - IO delay control register 1 */
+	u32	iodelay2;	/* 0x.0228 - IO delay control register 2 */
 	u8	res22c[0x604 - 0x22c];
-	__be32	pamubypenr; 	/* 0x.604 - PAMU bypass enable register */
+	u32	pamubypenr;	/* 0x.604 - PAMU bypass enable register */
 	u8	res608[0x800 - 0x608];
-	__be32	clkdvdr;	/* 0x.0800 - Clock Divide Register */
+	u32	clkdvdr;	/* 0x.0800 - Clock Divide Register */
 	u8	res804[0x900 - 0x804];
-	__be32	ircr;		/* 0x.0900 - Infrared Control Register */
+	u32	ircr;		/* 0x.0900 - Infrared Control Register */
 	u8	res904[0x908 - 0x904];
-	__be32	dmacr;		/* 0x.0908 - DMA Control Register */
+	u32	dmacr;		/* 0x.0908 - DMA Control Register */
 	u8	res90c[0x914 - 0x90c];
-	__be32	elbccr;		/* 0x.0914 - eLBC Control Register */
+	u32	elbccr;		/* 0x.0914 - eLBC Control Register */
 	u8	res918[0xb20 - 0x918];
-	__be32	ddr1clkdr;	/* 0x.0b20 - DDR1 Clock Disable Register */
-	__be32	ddr2clkdr;	/* 0x.0b24 - DDR2 Clock Disable Register */
-	__be32	ddrclkdr;	/* 0x.0b28 - DDR Clock Disable Register */
+	u32	ddr1clkdr;	/* 0x.0b20 - DDR1 Clock Disable Register */
+	u32	ddr2clkdr;	/* 0x.0b24 - DDR2 Clock Disable Register */
+	u32	ddrclkdr;	/* 0x.0b28 - DDR Clock Disable Register */
 	u8	resb2c[0xe00 - 0xb2c];
-	__be32	clkocr;		/* 0x.0e00 - Clock Out Select Register */
+	u32	clkocr;		/* 0x.0e00 - Clock Out Select Register */
 	u8	rese04[0xe10 - 0xe04];
-	__be32	ddrdllcr;	/* 0x.0e10 - DDR DLL Control Register */
+	u32	ddrdllcr;	/* 0x.0e10 - DDR DLL Control Register */
 	u8	rese14[0xe20 - 0xe14];
-	__be32	lbcdllcr;	/* 0x.0e20 - LBC DLL Control Register */
-	__be32  cpfor;		/* 0x.0e24 - L2 charge pump fuse override register */
+	u32	lbcdllcr;	/* 0x.0e20 - LBC DLL Control Register */
+	u32	cpfor;		/* 0x.0e24 - L2 charge pump fuse override register */
 	u8	rese28[0xf04 - 0xe28];
-	__be32	srds1cr0;	/* 0x.0f04 - SerDes1 Control Register 0 */
-	__be32	srds1cr1;	/* 0x.0f08 - SerDes1 Control Register 0 */
+	u32	srds1cr0;	/* 0x.0f04 - SerDes1 Control Register 0 */
+	u32	srds1cr1;	/* 0x.0f08 - SerDes1 Control Register 0 */
 	u8	resf0c[0xf2c - 0xf0c];
-	__be32  itcr;		/* 0x.0f2c - Internal transaction control register */
+	u32	itcr;		/* 0x.0f2c - Internal transaction control register */
 	u8	resf30[0xf40 - 0xf30];
-	__be32	srds2cr0;	/* 0x.0f40 - SerDes2 Control Register 0 */
-	__be32	srds2cr1;	/* 0x.0f44 - SerDes2 Control Register 0 */
+	u32	srds2cr0;	/* 0x.0f40 - SerDes2 Control Register 0 */
+	u32	srds2cr1;	/* 0x.0f44 - SerDes2 Control Register 0 */
 } __attribute__ ((packed));
 
+u32 fsl_guts_get_svr(void);
+int fsl_guts_init(void);
 
 /* Alternate function signal multiplex control */
 #define MPC85xx_PMUXCR_QE(x) (0x8000 >> (x))
-- 
2.1.0.27.g96db324

* [v7, 3/5] dt: move guts devicetree doc out of powerpc directory
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc
In-Reply-To: <1459480051-3701-1-git-send-email-yangbo.lu-3arQi8VN3Tc@public.gmane.org>

Move the guts devicetree doc to Documentation/devicetree/bindings/soc/fsl/
since it is used not only by PowerPC but also by ARM. Also add a
specification for the 'little-endian' property.
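
As a rough illustration of what the 'little-endian' property selects, here
is a minimal user-space sketch (not the driver code) that decodes the same
four register bytes in either byte order. guts_decode32() is a hypothetical
helper introduced only for this example; the GUTS driver in this series
makes the equivalent choice between ioread32() and ioread32be().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: assemble a 32-bit register
 * value from its raw bytes, honouring the byte order that the
 * 'little-endian' devicetree property would indicate.
 */
static uint32_t guts_decode32(const uint8_t *p, bool little_endian)
{
	if (little_endian)
		return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
		       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);

	/* Default when the property is absent: big endian. */
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Reading the wrong byte order yields a valid-looking but wrong value, which
is why the binding states the default explicitly.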

Signed-off-by: Yangbo Lu <yangbo.lu-3arQi8VN3Tc@public.gmane.org>
---
Changes for v2:
	- None
Changes for v3:
	- None
Changes for v4:
	- Added this patch
Changes for v5:
	- Modified the description for little-endian property
Changes for v6:
	- None
Changes for v7:
	- None
---
 Documentation/devicetree/bindings/{powerpc => soc}/fsl/guts.txt | 3 +++
 1 file changed, 3 insertions(+)
 rename Documentation/devicetree/bindings/{powerpc => soc}/fsl/guts.txt (91%)

diff --git a/Documentation/devicetree/bindings/powerpc/fsl/guts.txt b/Documentation/devicetree/bindings/soc/fsl/guts.txt
similarity index 91%
rename from Documentation/devicetree/bindings/powerpc/fsl/guts.txt
rename to Documentation/devicetree/bindings/soc/fsl/guts.txt
index b71b203..07adca9 100644
--- a/Documentation/devicetree/bindings/powerpc/fsl/guts.txt
+++ b/Documentation/devicetree/bindings/soc/fsl/guts.txt
@@ -25,6 +25,9 @@ Recommended properties:
  - fsl,liodn-bits : Indicates the number of defined bits in the LIODN
    registers, for those SOCs that have a PAMU device.
 
+ - little-endian : Indicates that the global utilities block is little
+   endian. The default is big endian.
+
 Examples:
 	global-utilities@e0000 {	/* global utilities block */
 		compatible = "fsl,mpc8548-guts";
-- 
2.1.0.27.g96db324

* [v7, 4/5] powerpc/fsl: move mpc85xx.h to include/linux/fsl
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc
In-Reply-To: <1459480051-3701-1-git-send-email-yangbo.lu-3arQi8VN3Tc@public.gmane.org>

Move mpc85xx.h to include/linux/fsl and rename it to svr.h as
a common header file. It has been used for mpc85xx and will
be used for ARM-based SoCs as well.

Signed-off-by: Yangbo Lu <yangbo.lu-3arQi8VN3Tc@public.gmane.org>
Acked-by: Wolfram Sang <wsa-z923LK4zBo2bacvFa/9K2g@public.gmane.org>
---
Changes for v2:
	- None
Changes for v3:
	- None
Changes for v4:
	- None
Changes for v5:
	- Changed to Move mpc85xx.h to include/linux/fsl/
	- Adjusted '#include <linux/fsl/svr.h>' position in file
Changes for v6:
	- None
Changes for v7:
	- Added 'Acked-by: Wolfram Sang' for I2C part
	- Also applied to arch/powerpc/kernel/cpu_setup_fsl_booke.S
---
 arch/powerpc/kernel/cpu_setup_fsl_booke.S                     | 2 +-
 drivers/clk/clk-qoriq.c                                       | 3 +--
 drivers/i2c/busses/i2c-mpc.c                                  | 2 +-
 drivers/iommu/fsl_pamu.c                                      | 3 +--
 drivers/net/ethernet/freescale/gianfar.c                      | 2 +-
 arch/powerpc/include/asm/mpc85xx.h => include/linux/fsl/svr.h | 4 ++--
 6 files changed, 7 insertions(+), 9 deletions(-)
 rename arch/powerpc/include/asm/mpc85xx.h => include/linux/fsl/svr.h (97%)

diff --git a/arch/powerpc/kernel/cpu_setup_fsl_booke.S b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
index 462aed9..2b0284e 100644
--- a/arch/powerpc/kernel/cpu_setup_fsl_booke.S
+++ b/arch/powerpc/kernel/cpu_setup_fsl_booke.S
@@ -13,13 +13,13 @@
  *
  */
 
+#include <linux/fsl/svr.h>
 #include <asm/page.h>
 #include <asm/processor.h>
 #include <asm/cputable.h>
 #include <asm/ppc_asm.h>
 #include <asm/mmu-book3e.h>
 #include <asm/asm-offsets.h>
-#include <asm/mpc85xx.h>
 
 _GLOBAL(__e500_icache_setup)
 	mfspr	r0, SPRN_L1CSR1
diff --git a/drivers/clk/clk-qoriq.c b/drivers/clk/clk-qoriq.c
index 7bc1c45..fc7f722 100644
--- a/drivers/clk/clk-qoriq.c
+++ b/drivers/clk/clk-qoriq.c
@@ -13,6 +13,7 @@
 #include <linux/clk.h>
 #include <linux/clk-provider.h>
 #include <linux/fsl/guts.h>
+#include <linux/fsl/svr.h>
 #include <linux/io.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
@@ -1148,8 +1149,6 @@ bad_args:
 }
 
 #ifdef CONFIG_PPC
-#include <asm/mpc85xx.h>
-
 static const u32 a4510_svrs[] __initconst = {
 	(SVR_P2040 << 8) | 0x10,	/* P2040 1.0 */
 	(SVR_P2040 << 8) | 0x11,	/* P2040 1.1 */
diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c
index 48ecffe..600704c 100644
--- a/drivers/i2c/busses/i2c-mpc.c
+++ b/drivers/i2c/busses/i2c-mpc.c
@@ -27,9 +27,9 @@
 #include <linux/i2c.h>
 #include <linux/interrupt.h>
 #include <linux/delay.h>
+#include <linux/fsl/svr.h>
 
 #include <asm/mpc52xx.h>
-#include <asm/mpc85xx.h>
 #include <sysdev/fsl_soc.h>
 
 #define DRV_NAME "mpc-i2c"
diff --git a/drivers/iommu/fsl_pamu.c b/drivers/iommu/fsl_pamu.c
index a34355f..af8fb27 100644
--- a/drivers/iommu/fsl_pamu.c
+++ b/drivers/iommu/fsl_pamu.c
@@ -21,11 +21,10 @@
 #include "fsl_pamu.h"
 
 #include <linux/fsl/guts.h>
+#include <linux/fsl/svr.h>
 #include <linux/interrupt.h>
 #include <linux/genalloc.h>
 
-#include <asm/mpc85xx.h>
-
 /* define indexes for each operation mapping scenario */
 #define OMI_QMAN        0x00
 #define OMI_FMAN        0x01
diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index d2f917a..2224b10 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -86,11 +86,11 @@
 #include <linux/udp.h>
 #include <linux/in.h>
 #include <linux/net_tstamp.h>
+#include <linux/fsl/svr.h>
 
 #include <asm/io.h>
 #ifdef CONFIG_PPC
 #include <asm/reg.h>
-#include <asm/mpc85xx.h>
 #endif
 #include <asm/irq.h>
 #include <asm/uaccess.h>
diff --git a/arch/powerpc/include/asm/mpc85xx.h b/include/linux/fsl/svr.h
similarity index 97%
rename from arch/powerpc/include/asm/mpc85xx.h
rename to include/linux/fsl/svr.h
index 213f3a8..8d13836 100644
--- a/arch/powerpc/include/asm/mpc85xx.h
+++ b/include/linux/fsl/svr.h
@@ -9,8 +9,8 @@
  * (at your option) any later version.
  */
 
-#ifndef __ASM_PPC_MPC85XX_H
-#define __ASM_PPC_MPC85XX_H
+#ifndef FSL_SVR_H
+#define FSL_SVR_H
 
 #define SVR_REV(svr)	((svr) & 0xFF)		/* SOC design resision */
 #define SVR_MAJ(svr)	(((svr) >>  4) & 0xF)	/* Major revision field*/
-- 
2.1.0.27.g96db324

* [v7, 5/5] mmc: sdhci-of-esdhc: fix host version for T4240-R1.0-R2.0
From: Yangbo Lu @ 2016-04-01  3:07 UTC (permalink / raw)
  To: devicetree-u79uwXL29TY76Z2rM5mHXA,
	linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-clk-u79uwXL29TY76Z2rM5mHXA,
	linux-i2c-u79uwXL29TY76Z2rM5mHXA,
	iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	netdev-u79uwXL29TY76Z2rM5mHXA, linux-mmc-u79uwXL29TY76Z2rM5mHXA
  Cc: ulf.hansson-QSEj5FYQhm4dnm+yROfE0A, Zhao Qiang, Russell King,
	Yangbo Lu, Bhupesh Sharma, Santosh Shilimkar, Jochen Friedrich,
	scott.wood-3arQi8VN3Tc, Rob Herring, Claudiu Manoil, Kumar Gala,
	leoyang.li-3arQi8VN3Tc, xiaobo.xie-3arQi8VN3Tc
In-Reply-To: <1459480051-3701-1-git-send-email-yangbo.lu-3arQi8VN3Tc@public.gmane.org>

The eSDHC of T4240-R1.0-R2.0 has an incorrect vendor version and spec version.
Actually the right version numbers should be VVN=0x13 and SVN=0x1.
This patch adds GUTS driver support to the eSDHC driver to get the SVR (System
Version Register), and fixes the host version so that the incorrect version
numbers do not break ADMA data transfers.
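
The check described above can be sketched in stand-alone C as follows.
This is only an illustration under stated assumptions, not the driver
code: SVR_REV() is the macro from svr.h, while every EX_* name is
hypothetical. In particular the SoC field mask, the T4240 ID value and
the vendor version shift of 8 are assumptions made for this example.

```c
#include <stdint.h>

#define SVR_REV(svr)		((svr) & 0xFF)	/* as in include/linux/fsl/svr.h */

/* Assumptions for illustration; the real macros may differ. */
#define EX_SVR_SOC_VER(svr)	(((svr) >> 8) & 0xFFFFFF)
#define EX_SVR_T4240		0x824000u
#define EX_VENDOR_VER_SHIFT	8
#define EX_VVN_FIXED		0x13u	/* VVN from the description above */
#define EX_SVN_FIXED		0x01u	/* SVN from the description above */

/*
 * Return the host version to report: the corrected VVN/SVN pair for a
 * T4240 with revision <= 2.0, otherwise the value read from hardware.
 */
static uint16_t ex_fix_host_version(uint32_t svr, uint16_t raw_ver)
{
	if (EX_SVR_SOC_VER(svr) == EX_SVR_T4240 && SVR_REV(svr) <= 0x20)
		return (uint16_t)((EX_VVN_FIXED << EX_VENDOR_VER_SHIFT) |
				  EX_SVN_FIXED);
	return raw_ver;
}
```

With these assumptions an SVR of 0x82400020 gets the corrected version,
while revision 0x21 or any other SoC keeps the hardware value.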

Signed-off-by: Yangbo Lu <yangbo.lu-3arQi8VN3Tc@public.gmane.org>
Acked-by: Ulf Hansson <ulf.hansson-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>
---
Changes for v2:
	- Got SVR through iomap instead of dts
Changes for v3:
	- Managed GUTS through syscon instead of iomap in eSDHC driver
Changes for v4:
	- Got SVR by GUTS driver instead of SYSCON
Changes for v5:
	- Changed to get SVR through API fsl_guts_get_svr()
	- Combined patch 4, patch 5 and patch 6 into one
Changes for v6:
	- Added 'Acked-by: Ulf Hansson'
Changes for v7:
	- None
---
 drivers/mmc/host/Kconfig          |  1 +
 drivers/mmc/host/sdhci-of-esdhc.c | 23 +++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 04feea8..5743b05 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -142,6 +142,7 @@ config MMC_SDHCI_OF_ESDHC
 	depends on MMC_SDHCI_PLTFM
 	depends on PPC || ARCH_MXC || ARCH_LAYERSCAPE
 	select MMC_SDHCI_IO_ACCESSORS
+	select FSL_GUTS
 	help
 	  This selects the Freescale eSDHC controller support.
 
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 3f34d35..68cc020 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -18,6 +18,8 @@
 #include <linux/of.h>
 #include <linux/delay.h>
 #include <linux/module.h>
+#include <linux/fsl/svr.h>
+#include <linux/fsl/guts.h>
 #include <linux/mmc/host.h>
 #include "sdhci-pltfm.h"
 #include "sdhci-esdhc.h"
@@ -28,6 +30,8 @@
 struct sdhci_esdhc {
 	u8 vendor_ver;
 	u8 spec_ver;
+	u32 soc_ver;
+	u8 soc_rev;
 };
 
 /**
@@ -73,6 +77,8 @@ static u32 esdhc_readl_fixup(struct sdhci_host *host,
 static u16 esdhc_readw_fixup(struct sdhci_host *host,
 				     int spec_reg, u32 value)
 {
+	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
+	struct sdhci_esdhc *esdhc = sdhci_pltfm_priv(pltfm_host);
 	u16 ret;
 	int shift = (spec_reg & 0x2) * 8;
 
@@ -80,6 +86,13 @@ static u16 esdhc_readw_fixup(struct sdhci_host *host,
 		ret = value & 0xffff;
 	else
 		ret = (value >> shift) & 0xffff;
+
+	/* Workaround for T4240-R1.0-R2.0 eSDHC which has incorrect
+	 * vendor version and spec version information.
+	 */
+	if ((spec_reg == SDHCI_HOST_VERSION) &&
+	    (esdhc->soc_ver == SVR_T4240) && (esdhc->soc_rev <= 0x20))
+		ret = (VENDOR_V_23 << SDHCI_VENDOR_VER_SHIFT) | SDHCI_SPEC_200;
 	return ret;
 }
 
@@ -567,10 +580,20 @@ static void esdhc_init(struct platform_device *pdev, struct sdhci_host *host)
 	struct sdhci_pltfm_host *pltfm_host;
 	struct sdhci_esdhc *esdhc;
 	u16 host_ver;
+	u32 svr;
 
 	pltfm_host = sdhci_priv(host);
 	esdhc = sdhci_pltfm_priv(pltfm_host);
 
+	fsl_guts_init();
+	svr = fsl_guts_get_svr();
+	if (svr) {
+		esdhc->soc_ver = SVR_SOC_VER(svr);
+		esdhc->soc_rev = SVR_REV(svr);
+	} else {
+		dev_err(&pdev->dev, "Failed to get SVR value!\n");
+	}
+
 	host_ver = sdhci_readw(host, SDHCI_HOST_VERSION);
 	esdhc->vendor_ver = (host_ver & SDHCI_VENDOR_VER_MASK) >>
 			     SDHCI_VENDOR_VER_SHIFT;
-- 
2.1.0.27.g96db324

* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Eric Dumazet @ 2016-04-01  3:13 UTC (permalink / raw)
  To: Hannes Frederic Sowa
  Cc: davem, netdev, sasha.levin, daniel, alexei.starovoitov, mkubecek
In-Reply-To: <56FDD67E.2040904@stressinduktion.org>

On Fri, 2016-04-01 at 04:01 +0200, Hannes Frederic Sowa wrote:

> I thought so first, as well. But given the double check for the 
> spin_lock and the "mutex" we end up with the same result for the 
> lockdep_sock_is_held check.
> 
> Do you see other consequences?

Well, we release the spinlock in __release_sock()

So another thread could come and acquire the socket, then call
mutex_acquire() while the first thread has not yet called mutex_release()

So maybe lockdep will complain (but I do not know lockdep well enough to
tell)

So maybe the following would be better:

(Absolutely untested, really I need to take a break)

diff --git a/include/net/sock.h b/include/net/sock.h
index 255d3e03727b..7d5dfa7e1918 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1327,7 +1327,13 @@ static inline void sk_wmem_free_skb(struct sock *sk, struct sk_buff *skb)
 
 static inline void sock_release_ownership(struct sock *sk)
 {
-	sk->sk_lock.owned = 0;
+	if (sk->sk_lock.owned) {
+		/*
+		 * The sk_lock has mutex_unlock() semantics:
+		 */
+		mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
+		sk->sk_lock.owned = 0;
+	}
 }
 
 /*
diff --git a/net/core/sock.c b/net/core/sock.c
index b67b9aedb230..c7ab98e72346 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2429,10 +2429,6 @@ EXPORT_SYMBOL(lock_sock_nested);
 
 void release_sock(struct sock *sk)
 {
-	/*
-	 * The sk_lock has mutex_unlock() semantics:
-	 */
-	mutex_release(&sk->sk_lock.dep_map, 1, _RET_IP_);
 
 	spin_lock_bh(&sk->sk_lock.slock);
 	if (sk->sk_backlog.tail)

* Re: [PATCH net 4/4] tcp: various missing rcu_read_lock around __sk_dst_get
From: Hannes Frederic Sowa @ 2016-04-01  3:31 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, netdev, sasha.levin, daniel, alexei.starovoitov, mkubecek
In-Reply-To: <1459480383.6473.270.camel@edumazet-glaptop3.roam.corp.google.com>



On Fri, Apr 1, 2016, at 05:13, Eric Dumazet wrote:
> On Fri, 2016-04-01 at 04:01 +0200, Hannes Frederic Sowa wrote:
> 
> > I thought so first, as well. But given the double check for the 
> > spin_lock and the "mutex" we end up with the same result for the 
> > lockdep_sock_is_held check.
> > 
> > Do you see other consequences?
> 
> Well, we release the spinlock in __release_sock()
> 
> So another thread could come and acquire the socket, then call
> mutex_acquire() while the first thread has not yet called mutex_release()
> 
> So maybe lockdep will complain (but I do not know lockdep well enough to
> tell)
> 
> So maybe the following would be better:
> 
> (Absolutely untested, really I need to take a break)

I quickly tested the patch and my scripts didn't show any splats so far.
This patch seems more consistent albeit I don't think it is relevant for
lockdep_sock_is_held as we only flip owned while holding slock. But this
definitely needs more review.

Thanks a lot!

* Re: qdisc spin lock
From: John Fastabend @ 2016-04-01  3:44 UTC (permalink / raw)
  To: Michael Ma, Cong Wang; +Cc: Linux Kernel Network Developers
In-Reply-To: <CAAmHdhxagKnLP1_5ZW7HTsVBu0TSFYKCvNstAEWN-NHrdnvvVQ@mail.gmail.com>

On 16-03-31 04:48 PM, Michael Ma wrote:
> I didn't really know that multiple qdiscs can be isolated using MQ so
> that each txq can be associated with a particular qdisc. Also we don't
> really have multiple interfaces...

MQ will assign a default qdisc to each txq and the default qdisc can
be changed to htb or any other qdisc of your choice.

> 
> With this MQ solution we'll still need to assign transmit queues to
> different classes by doing some math on the bandwidth limit if I
> understand correctly, which seems to be less convenient compared with
> a solution purely within HTB.
> 

Agreed.

> I assume that with this solution I can still share qdisc among
> multiple transmit queues - please let me know if this is not the case.

Nope, sorry, it doesn't work that way unless you employ some sort of stacked
netdevice strategy, which does start to get a bit complex. The basic hint
would be to stack some type of virtual netdev on top of a device and
run the htb qdisc there. Push traffic onto the netdev depending on the
class it belongs to. It's ugly, yes.

Noting all that, I posted an RFC patch some time back to allow writing
qdiscs that do not require taking the lock. I'll try to respin these
and submit them when net-next opens again. The next logical step is to
write a "better" HTB, probably using a shared counter and dropping the
requirement that it be exact.

Sorry I didn't get a chance to look at the paper in your post so not
sure if they suggest something similar or not.

Thanks,
John

> 
> 2016-03-31 15:16 GMT-07:00 Cong Wang <xiyou.wangcong@gmail.com>:
>> On Wed, Mar 30, 2016 at 12:20 AM, Michael Ma <make0818@gmail.com> wrote:
>>> As far as I understand the design of TC is to simplify locking schema
>>> and minimize the work in __qdisc_run so that throughput won’t be
>>> affected, especially with large packets. However if the scenario is
>>> that multiple classes in the queueing discipline only have the shaping
>>> limit, there isn’t really a necessary correlation between different
>>> classes. The only synchronization point should be when the packet is
>>> dequeued from the qdisc queue and enqueued to the transmit queue of
>>> the device. My question is – is it worth investing on avoiding the
>>> locking contention by partitioning the queue/lock so that this
>>> scenario is addressed with relatively smaller latency?
>>
>> If your HTB classes don't share bandwidth, why do you still make them
>> under the same hierarchy? IOW, you can just isolate them either with some
>> other qdisc or just separated interfaces.

* Re: [PATCH RFC net-next] net: core: Pass XPS select queue decision to skb_tx_hash
From: John Fastabend @ 2016-04-01  3:49 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Saeed Mahameed, Linux Netdev List, Eric Dumazet, Tom Herbert,
	Jiri Pirko, David S. Miller, John Fastabend
In-Reply-To: <CALzJLG-xJe6_-2a=djpLxBR5xQY562m06eLLCP04GdTrzmWJuQ@mail.gmail.com>

On 16-03-30 11:30 AM, Saeed Mahameed wrote:
> On Wed, Mar 30, 2016 at 8:04 PM, John Fastabend
> <john.fastabend@gmail.com> wrote:
>>
>> OK, so let me see if I get this right now. This was the precedence
>> before the patch in the normal no select queue case,
>>
>>         (1) socket mapping sk_tx_queue_mapping iff !ooo_okay
>>         (2) xps
>>         (3) skb->queue_mapping
>>         (4) qoffset/qcount (hash over tc queues)
>>         (5) hash over num_tx_queues
>>
>> With this patch the precedence is a bit changed because
>> skb_tx_hash is always called.
>>
>>         (1) socket mapping sk_tx_queue_mapping iff !ooo_okay
>>         (2) skb->queue_mapping
>>         (3) qoffset/qcount
>>            (hash over tc queues if xps choice is > qcount)
>>         (4) xps
>>         (5) hash over num_tx_queues
>>
>> Sound right? Nice thing about this with correct configuration
>> of tc with qcount = xps_queues it sort of works as at least
> 
> Yes!
> For qcount = xps_queues, which is how almost all drivers' default
> configurations go, it works like a charm: xps selects the
> exact TC TX queue at the correct offset without any need for further
> SKB hashing.
> And even if XPS was also configured by mistake on a TC TX queue, then
> this patch will detect that the xps hash is out of this TC
> offset/qcount range and will re-hash. But I don't see why a user
> or driver would do such a strange configuration.
> 
>> I expect it to. I think the question is are people OK with
>> letting skb->queue_mapping take precedence. I am at least
>> because it makes the skb edit queue_mapping action from tc
>> easier to use.
>>
> 
> skb->queue_mapping took precedence before this patch as well; the only
> thing this patch changes is how to compute the txq when
> skb->queue_mapping is not present, so we don't need to worry about
> this.
> 

I don't believe that is correct in the general case. Perhaps
in the ndo_select_queue path though. See this line,

        if (queue_index < 0 || skb->ooo_okay ||
            queue_index >= dev->real_num_tx_queues) {
                int new_index = get_xps_queue(dev, skb);
                if (new_index < 0)
                        new_index = skb_tx_hash(dev, skb);

The skb_tx_hash() routine is never called if xps is enabled.
And so we never get into the call to do this,

        if (skb_rx_queue_recorded(skb)) {
                hash = skb_get_rx_queue(skb);
                while (unlikely(hash >= num_tx_queues))
                        hash -= num_tx_queues;
                return hash;
        }

Right? FWIW I think that using queue_mapping before xps is better
because we can then use tc to pick the queue_mapping programmatically
for these special cases if wanted.
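
For reference, the wrap-around in the second snippet can be tried out in
isolation. The following is a minimal stand-alone sketch, not kernel code;
ex_rx_queue_to_tx_queue() is a hypothetical name, and it assumes
num_tx_queues is non-zero (the loop would never terminate otherwise).

```c
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the loop quoted above: fold a
 * recorded rx queue index into the range [0, num_tx_queues) by repeated
 * subtraction. Assumes num_tx_queues != 0.
 */
static uint16_t ex_rx_queue_to_tx_queue(uint16_t rx_queue,
					uint16_t num_tx_queues)
{
	uint16_t hash = rx_queue;

	while (hash >= num_tx_queues)
		hash -= num_tx_queues;
	return hash;
}
```

An rx queue index that is already in range is returned unchanged, so the
loop only matters when there are more rx queues than tx queues.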

>> And just a comment on the code: why not just move get_xps_queue
>> into skb_tx_hash at this point if it's always being called as the
>> "hint"? Then we avoid calling it in the case queue_mapping is
>> set.
>>
> 
> Very good point. The only place that calls skb_tx_hash(dev, skb) other
> than __netdev_pick_tx is the mlx4 driver, and they did it there just
> because they wanted to bypass the XPS configuration if TC QoS is
> configured. With this fix we don't have to bypass XPS at all when
> TC is configured.
> 
> I will change it.
> 

Great, thanks.


* Re: [PATCH] rds: rds-stress show all zeros after few minutes
From: santosh.shilimkar @ 2016-04-01  3:59 UTC (permalink / raw)
  To: shamir rabinovitch, rds-devel, netdev; +Cc: davem
In-Reply-To: <1459385402-28449-1-git-send-email-shamir.rabinovitch@oracle.com>

Hi Shamir,

Nice to see this one on the list so soon.
To make $subject more relevant, how about the below?

RDS: fix congestion map corruption for PAGE_SIZE > 8k

On 3/30/16 5:50 PM, shamir rabinovitch wrote:
> The issue can be seen on platforms that use a page size of 8K and
> above while the rds fragment size is 4K. On those platforms a single
> page is shared between 2 or more rds fragments. Each fragment has its
> own offset, and the rds cong map code needs to take this offset into
> account. Not taking this offset into account leads to reading the data
> fragment as a congestion map fragment and a hang of the rds transmit
> due to far cong map corruption.
>
> Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
> Reviewed-by: Ajaykumar Hotchandani <ajaykumar.hotchandani@oracle.com>
> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Tested-by: Anand Bibhuti <anand.bibhuti@oracle.com>
>
> Signed-off-by: shamir rabinovitch <shamir.rabinovitch@oracle.com>
> ---
>   net/rds/ib_recv.c |    2 +-
>   net/rds/iw_recv.c |    2 +-
>   net/rds/page.c    |    5 +++--
>   3 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
> index 977fb86..abc8cc8 100644
> --- a/net/rds/ib_recv.c
> +++ b/net/rds/ib_recv.c
> @@ -796,7 +796,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
>
>   		addr = kmap_atomic(sg_page(&frag->f_sg));
>
> -		src = addr + frag_off;
> +		src = addr + frag->f_sg.offset + frag_off;
>   		dst = (void *)map->m_page_addrs[map_page] + map_off;
>   		for (k = 0; k < to_copy; k += 8) {
>   			/* Record ports that became uncongested, ie
> diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
If you refresh the patch against 4.6-rc1, you won't need to
patch iw_recv.c :-)
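
For reference, the ib_recv.c hunk above can be illustrated with a small
userspace sketch. The sketch_ names and the 8K/4K sizes are assumptions
chosen for illustration only; the point is just that a fragment sharing
a page has to start reading at its own offset within that page.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192u	/* assumed page size */
#define SKETCH_FRAG_SIZE 4096u	/* assumed rds fragment size */

/* Hypothetical stand-in for the page/offset pair of a scatterlist entry. */
struct sketch_sg {
	unsigned char *page;
	unsigned int offset;	/* where this fragment starts in the page */
};

/* Old behaviour: ignores the fragment's own offset within the page. */
static const unsigned char *sketch_src_old(const struct sketch_sg *sg,
					   unsigned int frag_off)
{
	return sg->page + frag_off;
}

/* Patched behaviour: the offset is added before frag_off. */
static const unsigned char *sketch_src_new(const struct sketch_sg *sg,
					   unsigned int frag_off)
{
	return sg->page + sg->offset + frag_off;
}

/* Two fragments share one page; the second starts at SKETCH_FRAG_SIZE. */
void sketch_selfcheck(void)
{
	static unsigned char page[SKETCH_PAGE_SIZE];
	struct sketch_sg second = { page, SKETCH_FRAG_SIZE };

	memset(page, 0xAA, SKETCH_FRAG_SIZE);
	memset(page + SKETCH_FRAG_SIZE, 0xBB, SKETCH_FRAG_SIZE);

	assert(*sketch_src_old(&second, 16) == 0xAA);	/* wrong fragment */
	assert(*sketch_src_new(&second, 16) == 0xBB);	/* right fragment */
}
```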


> diff --git a/net/rds/page.c b/net/rds/page.c
> index 5a14e6d..715cbaa 100644
> --- a/net/rds/page.c
> +++ b/net/rds/page.c
> @@ -135,8 +135,9 @@ int rds_page_remainder_alloc(struct scatterlist *scat, unsigned long bytes,
>   			if (rem->r_offset != 0)
>   				rds_stats_inc(s_page_remainder_hit);
>
> -			rem->r_offset += bytes;
> -			if (rem->r_offset == PAGE_SIZE) {
> +			/* some hw (e.g. sparc) require aligned memory */
> +			rem->r_offset += ALIGN(bytes, 8);
> +			if (rem->r_offset >= PAGE_SIZE) {
>   				__free_page(rem->r_page);
>   				rem->r_page = NULL;
>   			}
>
Looks like I missed this hunk. It doesn't belong in the $subject
patch. Could you please put it in a separate patch? I will need more
justification than just "some hw (e.g. sparc) require aligned memory".
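
As a rough illustration of what this hunk changes, here is a userspace
sketch. SKETCH_ALIGN is written out to round up to a power-of-two
boundary the way the kernel's ALIGN() does; the struct, the function
and the 4K page size are hypothetical and only model the two changed
lines.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096ul	/* assumed page size */

/* Round x up to the next multiple of a (a must be a power of two). */
#define SKETCH_ALIGN(x, a) \
	(((unsigned long)(x) + ((unsigned long)(a) - 1)) & \
	 ~((unsigned long)(a) - 1))

/* Hypothetical stand-in for the per-cpu page remainder. */
struct sketch_rem {
	unsigned long r_offset;
	int has_page;		/* models r_page != NULL */
};

/* Models the patched lines: advance by the aligned size, then check. */
static void sketch_advance(struct sketch_rem *rem, unsigned long bytes)
{
	rem->r_offset += SKETCH_ALIGN(bytes, 8);
	if (rem->r_offset >= SKETCH_PAGE_SIZE)
		rem->has_page = 0;	/* models __free_page() + NULL */
}

/* A 13-byte request consumes 16 bytes, keeping the offset 8-aligned. */
void sketch_selfcheck(void)
{
	struct sketch_rem rem = { 0, 1 };

	sketch_advance(&rem, 13);
	assert(rem.r_offset == 16 && rem.has_page == 1);
	sketch_advance(&rem, 4080);
	assert(rem.r_offset == 4096 && rem.has_page == 0);
}
```

The sketch only shows the arithmetic; it says nothing about why the
alignment is needed, which is what the separate patch should explain.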

Once you fix these, please repost the updated version, and I will add
them to the 4.7 queue. Thanks !!

Regards,
Santosh


