From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752109AbbGaCEB (ORCPT <rfc822;w@1wt.eu>);
	Thu, 30 Jul 2015 22:04:01 -0400
Received: from g1t5424.austin.hp.com ([15.216.225.54]:57808 "EHLO
	g1t5424.austin.hp.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751445AbbGaCEA (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Thu, 30 Jul 2015 22:04:00 -0400
Message-ID: <55BAD78A.1000308@hp.com>
Date: Thu, 30 Jul 2015 22:03:54 -0400
From: Waiman Long <waiman.long@hp.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130109 Thunderbird/10.0.12
MIME-Version: 1.0
To: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
        linux-kernel@vger.kernel.org, mingo@kernel.org, jiangshanlai@gmail.com,
        dipankar@in.ibm.com, akpm@linux-foundation.org,
        mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
        tglx@linutronix.de, rostedt@goodmis.org, dhowells@redhat.com,
        edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com,
        oleg@redhat.com, bobby.prani@gmail.com, dave@stgolabs.net
Subject: Re: [PATCH tip/core/rcu 19/19] rcu: Add fastpath bypassing funnel
 locking
References: <20150717232901.GA22511@linux.vnet.ibm.com> <1437175764-24096-1-git-send-email-paulmck@linux.vnet.ibm.com> <1437175764-24096-19-git-send-email-paulmck@linux.vnet.ibm.com> <20150730144455.GZ19282@twins.programming.kicks-ass.net>
In-Reply-To: <20150730144455.GZ19282@twins.programming.kicks-ass.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/30/2015 10:44 AM, Peter Zijlstra wrote:
> On Fri, Jul 17, 2015 at 04:29:24PM -0700, Paul E. McKenney wrote:
>
>>   	/*
>> +	 * First try directly acquiring the root lock in order to reduce
>> +	 * latency in the common case where expedited grace periods are
>> +	 * rare.  We check mutex_is_locked() to avoid pathological levels of
>> +	 * memory contention on ->exp_funnel_mutex in the heavy-load case.
>> +	 */
>> +	rnp0 = rcu_get_root(rsp);
>> +	if (!mutex_is_locked(&rnp0->exp_funnel_mutex)) {
>> +		if (mutex_trylock(&rnp0->exp_funnel_mutex)) {
>> +			if (sync_exp_work_done(rsp, rnp0, NULL,
>> +					&rsp->expedited_workdone0, s))
>> +				return NULL;
>> +			return rnp0;
>> +		}
>> +	}
> So our 'new' locking primitives do things like:
>
> static __always_inline int queued_spin_trylock(struct qspinlock *lock)
> {
>          if (!atomic_read(&lock->val) &&
>             (atomic_cmpxchg(&lock->val, 0, _Q_LOCKED_VAL) == 0))
>                  return 1;
>          return 0;
> }
>
> mutexes do not do this.
>
> Now I suppose the question is, does that extra read slow down the
> (common) uncontended case? (remember, we should optimize locks for the
> uncontended case, heavy lock contention should be fixed with better
> locking schemes, not lock implementations).

I suppose the extra read may slow down the uncontended case, but I am
not sure by how much, as I haven't run any tests to quantify it.
However, there are use cases where it is advantageous to do a read
first, such as when the lock cacheline is likely to be hot (in the
slowpath, for example). So it depends on how the trylock is being used.
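
To make the two variants concrete, here is a rough userspace sketch
using C11 atomics. It is not kernel code and is not how mutexes or
qspinlocks are actually implemented; the demo_* names and the 0/1 lock
encoding are made up for this illustration. The read-first variant
returns early on a plain load when the lock is already held, so a failed
attempt does not issue an atomic read-modify-write on the cacheline; the
plain variant always issues the compare-and-swap.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = unlocked, 1 = locked. */
struct demo_lock {
	atomic_int val;
};

/* Plain trylock: always attempts the atomic compare-and-swap. */
static bool demo_trylock_plain(struct demo_lock *lock)
{
	int expected = 0;

	return atomic_compare_exchange_strong_explicit(&lock->val, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/*
 * Read-first trylock: bail out on a cheap load if the lock looks held,
 * and only then fall through to the compare-and-swap.
 */
static bool demo_trylock_read_first(struct demo_lock *lock)
{
	if (atomic_load_explicit(&lock->val, memory_order_relaxed) != 0)
		return false;
	return demo_trylock_plain(lock);
}

/* Release the lock so that later trylock attempts can succeed. */
static void demo_unlock(struct demo_lock *lock)
{
	atomic_store_explicit(&lock->val, 0, memory_order_release);
}
```

Both variants give the same result for any given lock state. The only
difference is the extra load, which costs something when the lock is
free and saves the read-modify-write when it is not.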

Cheers,
Longman