public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
@ 2014-06-04 20:38 Pranith Kumar
  2014-06-04 20:56 ` Andev
  0 siblings, 1 reply; 7+ messages in thread
From: Pranith Kumar @ 2014-06-04 20:38 UTC (permalink / raw)
  To: peterz; +Cc: linux-kernel

Remove a redundant comparison; cmpxchg() already performs the compare before exchanging.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 kernel/locking/rwsem-xadd.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 1f99664b..6f8bd3c 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
 {
     if (!(count & RWSEM_ACTIVE_MASK)) {
         /* try acquiring the write lock */
-        if (sem->count == RWSEM_WAITING_BIAS &&
-            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
+        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
                 RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
             if (!list_is_singular(&sem->wait_list))
                 rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-04 20:38 [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it Pranith Kumar
@ 2014-06-04 20:56 ` Andev
  2014-06-05  7:22   ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Andev @ 2014-06-04 20:56 UTC (permalink / raw)
  To: Pranith Kumar; +Cc: Peter Zijlstra, LKML

On Wed, Jun 4, 2014 at 4:38 PM, Pranith Kumar <pranith@gatech.edu> wrote:
> remove a redundant comparision
>
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> ---
>  kernel/locking/rwsem-xadd.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> index 1f99664b..6f8bd3c 100644
> --- a/kernel/locking/rwsem-xadd.c
> +++ b/kernel/locking/rwsem-xadd.c
> @@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
>  {
>      if (!(count & RWSEM_ACTIVE_MASK)) {
>          /* try acquiring the write lock */
> -        if (sem->count == RWSEM_WAITING_BIAS &&
> -            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> +        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
>                  RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {

This was mainly done to avoid the cost of a cmpxchg in the case where the
values are not equal. Not sure if it really makes a difference though.

-- 
Andev


* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-04 20:56 ` Andev
@ 2014-06-05  7:22   ` Peter Zijlstra
  2014-06-05 17:54     ` Davidlohr Bueso
  2014-06-06  3:09     ` Pranith Kumar
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Zijlstra @ 2014-06-05  7:22 UTC (permalink / raw)
  To: Andev; +Cc: Pranith Kumar, LKML


On Wed, Jun 04, 2014 at 04:56:50PM -0400, Andev wrote:
> On Wed, Jun 4, 2014 at 4:38 PM, Pranith Kumar <pranith@gatech.edu> wrote:
> > remove a redundant comparision
> >
> > Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> > ---
> >  kernel/locking/rwsem-xadd.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > index 1f99664b..6f8bd3c 100644
> > --- a/kernel/locking/rwsem-xadd.c
> > +++ b/kernel/locking/rwsem-xadd.c
> > @@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
> >  {
> >      if (!(count & RWSEM_ACTIVE_MASK)) {
> >          /* try acquiring the write lock */
> > -        if (sem->count == RWSEM_WAITING_BIAS &&
> > -            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> > +        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> >                  RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
> 
> This was mainly done to avoid the cost of a cmpxchg in case where they
> are not equal. Not sure if it really makes a difference though.

It does: a cache hot cmpxchg instruction is 24 cycles (as is pretty much
any other LOCKed instruction, as measured on my WSM-EP), not to mention that
cmpxchg is an RMW, so it needs to grab the cacheline in exclusive mode.

A plain read, which allows the cacheline to remain in the shared state, and
other non-LOCKed ops are way faster.



* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-05  7:22   ` Peter Zijlstra
@ 2014-06-05 17:54     ` Davidlohr Bueso
  2014-06-05 18:08       ` Davidlohr Bueso
  2014-06-06  3:09     ` Pranith Kumar
  1 sibling, 1 reply; 7+ messages in thread
From: Davidlohr Bueso @ 2014-06-05 17:54 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Andev, Pranith Kumar, LKML

On Thu, 2014-06-05 at 09:22 +0200, Peter Zijlstra wrote:
> On Wed, Jun 04, 2014 at 04:56:50PM -0400, Andev wrote:
> > On Wed, Jun 4, 2014 at 4:38 PM, Pranith Kumar <pranith@gatech.edu> wrote:
> > > remove a redundant comparision
> > >
> > > Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> > > ---
> > >  kernel/locking/rwsem-xadd.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > > index 1f99664b..6f8bd3c 100644
> > > --- a/kernel/locking/rwsem-xadd.c
> > > +++ b/kernel/locking/rwsem-xadd.c
> > > @@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
> > >  {
> > >      if (!(count & RWSEM_ACTIVE_MASK)) {
> > >          /* try acquiring the write lock */
> > > -        if (sem->count == RWSEM_WAITING_BIAS &&
> > > -            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> > > +        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> > >                  RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
> > 
> > This was mainly done to avoid the cost of a cmpxchg in case where they
> > are not equal. Not sure if it really makes a difference though.
> 
> It does, a cache hot cmpxchg instruction is 24 cycles (as is pretty much
> any other LOCKed ins, as measured on my WSM-EP), not to mention that
> cmpxchg is a RMW so it needs to grab the cacheline in exclusive mode.
> 
> A read, which allows the cacheline to remain in shared, and non LOCKed
> ops are way faster.

Yep, and we also do it in mutexes. The numbers and benefits on larger
systems speak for themselves. It would, perhaps, be worth adding a
comment, as the check does seem redundant if you're not thinking about the
cacheline when reading the code.




* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-05 17:54     ` Davidlohr Bueso
@ 2014-06-05 18:08       ` Davidlohr Bueso
  2014-06-06  7:01         ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Davidlohr Bueso @ 2014-06-05 18:08 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Andev, Pranith Kumar, LKML, jason.low2

On Thu, 2014-06-05 at 10:54 -0700, Davidlohr Bueso wrote:
> On Thu, 2014-06-05 at 09:22 +0200, Peter Zijlstra wrote:
> > On Wed, Jun 04, 2014 at 04:56:50PM -0400, Andev wrote:
> > > On Wed, Jun 4, 2014 at 4:38 PM, Pranith Kumar <pranith@gatech.edu> wrote:
> > > > remove a redundant comparision
> > > >
> > > > Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> > > > ---
> > > >  kernel/locking/rwsem-xadd.c | 3 +--
> > > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
> > > > index 1f99664b..6f8bd3c 100644
> > > > --- a/kernel/locking/rwsem-xadd.c
> > > > +++ b/kernel/locking/rwsem-xadd.c
> > > > @@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
> > > >  {
> > > >      if (!(count & RWSEM_ACTIVE_MASK)) {
> > > >          /* try acquiring the write lock */
> > > > -        if (sem->count == RWSEM_WAITING_BIAS &&
> > > > -            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> > > > +        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
> > > >                  RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
> > > 
> > > This was mainly done to avoid the cost of a cmpxchg in case where they
> > > are not equal. Not sure if it really makes a difference though.
> > 
> > It does, a cache hot cmpxchg instruction is 24 cycles (as is pretty much
> > any other LOCKed ins, as measured on my WSM-EP), not to mention that
> > cmpxchg is a RMW so it needs to grab the cacheline in exclusive mode.
> > 
> > A read, which allows the cacheline to remain in shared, and non LOCKed
> > ops are way faster.
> 
> Yep, and we also do it in mutexes. The numbers and benefits on larger
> systems speaks for themselves. It would, perhaps, be worth adding a
> comment as it does seem redundant if you're not thinking about the
> cacheline when reading the code.

I knew I had formally read this technique somewhere:
http://pdos.csail.mit.edu/6.828/2010/readings/mcs.pdf (part 2.1).

Peter, what do you think of adding a new cmp_cmpxchg() or dcmpxchg()
call for such scenarios?




* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-05  7:22   ` Peter Zijlstra
  2014-06-05 17:54     ` Davidlohr Bueso
@ 2014-06-06  3:09     ` Pranith Kumar
  1 sibling, 0 replies; 7+ messages in thread
From: Pranith Kumar @ 2014-06-06  3:09 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Andev, LKML, davidlohr, jason.low2

On Thu, Jun 5, 2014 at 3:22 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Jun 04, 2014 at 04:56:50PM -0400, Andev wrote:
>> On Wed, Jun 4, 2014 at 4:38 PM, Pranith Kumar <pranith@gatech.edu> wrote:
>> > remove a redundant comparision
>> >
>> > Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
>> > ---
>> >  kernel/locking/rwsem-xadd.c | 3 +--
>> >  1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
>> > index 1f99664b..6f8bd3c 100644
>> > --- a/kernel/locking/rwsem-xadd.c
>> > +++ b/kernel/locking/rwsem-xadd.c
>> > @@ -249,8 +249,7 @@ static inline bool rwsem_try_write_lock(long count, struct rw_semaphore *sem)
>> >  {
>> >      if (!(count & RWSEM_ACTIVE_MASK)) {
>> >          /* try acquiring the write lock */
>> > -        if (sem->count == RWSEM_WAITING_BIAS &&
>> > -            cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
>> > +        if (cmpxchg(&sem->count, RWSEM_WAITING_BIAS,
>> >                  RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_WAITING_BIAS) {
>>
>> This was mainly done to avoid the cost of a cmpxchg in case where they
>> are not equal. Not sure if it really makes a difference though.
>
> It does, a cache hot cmpxchg instruction is 24 cycles (as is pretty much
> any other LOCKed ins, as measured on my WSM-EP), not to mention that
> cmpxchg is a RMW so it needs to grab the cacheline in exclusive mode.
>
> A read, which allows the cacheline to remain in shared, and non LOCKed
> ops are way faster.


OK, this means that we should use more of these checked swaps on highly
contended paths. As Davidlohr suggested later on, I think it would be
a good idea to document this and add an API.

-- 
Pranith


* Re: [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it
  2014-06-05 18:08       ` Davidlohr Bueso
@ 2014-06-06  7:01         ` Peter Zijlstra
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Zijlstra @ 2014-06-06  7:01 UTC (permalink / raw)
  To: Davidlohr Bueso; +Cc: Andev, Pranith Kumar, LKML, jason.low2


On Thu, Jun 05, 2014 at 11:08:23AM -0700, Davidlohr Bueso wrote:
> I knew I had formally read this technique somewhere:
> http://pdos.csail.mit.edu/6.828/2010/readings/mcs.pdf (part 2.1).
> 
> Peter, what do you think of adding a new cmp_cmpxchg() or dcmpxchg()
> call for such scenarios?

Don't like dcmpxchg(), too easy to confuse with double-cmpxchg or
somesuch.

That said, I'm not entirely sure we want this primitive; the thing is,
people might use it ;-)

And it's somewhat dangerous in that it explicitly does not provide any
kind of memory barrier on the fail path, whereas cmpxchg() is an
unconditional full memory barrier.

Also, you really don't want to use it in loops.

So I think it makes more sense to leave things as are and simply apply
the pattern where safe and meaningful.



End of thread (last message: 2014-06-06  7:01 UTC)

Thread overview: 7+ messages
2014-06-04 20:38 [RFC PATCH 1/1] remove redundant compare, cmpxchg already does it Pranith Kumar
2014-06-04 20:56 ` Andev
2014-06-05  7:22   ` Peter Zijlstra
2014-06-05 17:54     ` Davidlohr Bueso
2014-06-05 18:08       ` Davidlohr Bueso
2014-06-06  7:01         ` Peter Zijlstra
2014-06-06  3:09     ` Pranith Kumar
