From: waiman.long@hp.com (Waiman Long)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 2/9] locking/qrwlock: avoid redundant atomic_add_return on read_lock_slowpath
Date: Tue, 07 Jul 2015 15:28:13 -0400
Message-ID: <559C284D.4090907@hp.com>
In-Reply-To: <20150707181941.GL23879@arm.com>

On 07/07/2015 02:19 PM, Will Deacon wrote:
> On Tue, Jul 07, 2015 at 06:51:54PM +0100, Waiman Long wrote:
>> On 07/07/2015 01:24 PM, Will Deacon wrote:
>>> When a slow-path reader gets to the front of the wait queue outside of
>>> interrupt context, it waits for any writers to drain, increments the
>>> reader count and again waits for any additional writers that may have
>>> snuck in between the initial check and the increment.
>>>
>>> Given that this second check is performed with acquire semantics, there
>>> is no need to perform the increment using atomic_add_return, which acts
>>> as a full barrier.
>>>
>>> This patch changes the slow-path code to use smp_load_acquire and
>>> atomic_add instead of atomic_add_return. Since the check only involves
>>> the writer count, we can perform the acquire after the add.
>>>
>>> Signed-off-by: Will Deacon <will.deacon@arm.com>
>>> ---
>>>    kernel/locking/qrwlock.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
>>> index 96b77d1e0545..4e29bef688ac 100644
>>> --- a/kernel/locking/qrwlock.c
>>> +++ b/kernel/locking/qrwlock.c
>>> @@ -98,7 +98,8 @@ void queued_read_lock_slowpath(struct qrwlock *lock, u32 cnts)
>>>    	while (atomic_read(&lock->cnts) & _QW_WMASK)
>>>    		cpu_relax_lowlatency();
>>>
>>> -	cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
>>> +	atomic_add(_QR_BIAS, &lock->cnts);
>>> +	cnts = smp_load_acquire((u32 *)&lock->cnts);
>>>    	rspin_until_writer_unlock(lock, cnts);
>>>
>>>    	/*
>> An atomic add on x86 is actually a full barrier too. The performance
>> difference between "lock add" and "lock xadd" should be minor. The
>> additional load, however, could potentially cause an extra cacheline
>> load on a contended lock. So do you see an actual performance benefit
>> from this change on ARM?
> I'd need to re-run the numbers, but atomic_add is significantly less
> work on ARM than atomic_add_return, which basically has two full memory
> barriers compared to none for the former.
>
> Will
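
To spell out the trade you are making (both lines below are taken from
the hunk above, with ordering notes added):

     /* before: one full-barrier RMW (a full barrier on each side on ARM) */
     cnts = atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;

     /* after: barrier-free RMW, then a single acquire load */
     atomic_add(_QR_BIAS, &lock->cnts);
     cnts = smp_load_acquire((u32 *)&lock->cnts);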

I think a compromise is to encapsulate that in an inline function that
can be overridden by architecture-specific code. For example:

#ifndef queued_inc_reader_return
static inline u32 queued_inc_reader_return(struct qrwlock *lock)
{
     /* Generic default: a single full-barrier RMW, minus our own bias. */
     return atomic_add_return(_QR_BIAS, &lock->cnts) - _QR_BIAS;
}
#endif

:

cnts = queued_inc_reader_return(lock);
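
An architecture like arm64 could then override it with the cheaper
sequence from your patch, along these lines (just a sketch, following
the usual #ifndef override convention):

static inline u32 queued_inc_reader_return(struct qrwlock *lock)
{
     /*
      * No barrier on the add; the acquire load supplies the ordering.
      * The value returned still includes our own _QR_BIAS, but the
      * caller only checks the writer bits, so that is harmless.
      */
     atomic_add(_QR_BIAS, &lock->cnts);
     return smp_load_acquire((u32 *)&lock->cnts);
}
#define queued_inc_reader_return queued_inc_reader_return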

This also means that you will need to keep the function prototype for 
rspin_until_writer_unlock() in the later patch. Other than that, I don't 
see any other issues in the patch series.

BTW, are you also planning to use qspinlock in the ARM64 code?

Cheers,
Longman
