From: Alex Shi <alex.shi@intel.com>
To: Peter Hurley <peter@hurleysoftware.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Michel Lespinasse <walken@google.com>,
	Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andi Kleen <andi@firstfloor.org>,
	Davidlohr Bueso <davidlohr.bueso@hp.com>,
	Matthew R Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Rik van Riel <riel@redhat.com>,
	linux-kernel@vger.kernel.org, linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] rwsem: check the lock before cpmxchg in down_write_trylock and rwsem_do_wake
Date: Sun, 23 Jun 2013 09:16:13 +0800
Message-ID: <51C64C5D.5090400@intel.com>
In-Reply-To: <51C55082.5040500@hurleysoftware.com>

On 06/22/2013 03:21 PM, Peter Hurley wrote:
> On 06/21/2013 07:51 PM, Tim Chen wrote:
>> Doing cmpxchg will cause cache bouncing when checking
>> sem->count. This could cause a scalability issue
>> on a large machine (e.g. an 80-core box).
>>
>> A pre-read of sem->count can mitigate this.
>>
>> Signed-off-by: Alex Shi <alex.shi@intel.com>
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> ---
>>   include/asm-generic/rwsem.h |    8 ++++----
>>   lib/rwsem.c                 |   21 +++++++++++++--------
>>   2 files changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/asm-generic/rwsem.h b/include/asm-generic/rwsem.h
>> index bb1e2cd..052d973 100644
>> --- a/include/asm-generic/rwsem.h
>> +++ b/include/asm-generic/rwsem.h
>> @@ -70,11 +70,11 @@ static inline void __down_write(struct rw_semaphore *sem)
>>
>>   static inline int __down_write_trylock(struct rw_semaphore *sem)
>>   {
>> -    long tmp;
>> +    if (unlikely(&sem->count != RWSEM_UNLOCKED_VALUE))
>                      ^^^^^^^^^^^
> 
> This is probably not what you want.
> 

The logic of this function is quite simple: checking sem->count before
the cmpxchg does no harm to it.

So could you tell us what we should want instead?
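
For reference, the caret in the quoted hunk sits under `&sem->count`. That
expression is the address of the counter, so the check compares a pointer
with RWSEM_UNLOCKED_VALUE instead of comparing the counter's value. Below
is a minimal userspace sketch of what the pre-read was presumably meant to
do. It uses C11 atomics as a stand-in for the kernel's cmpxchg(); the names
demo_rwsem and demo_down_write_trylock and the DEMO_ constant values are
assumptions made for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative values only; the kernel's real bias constants are arch-dependent. */
#define DEMO_UNLOCKED_VALUE     0L
#define DEMO_ACTIVE_WRITE_BIAS  (-0xffffL)

/* Hypothetical stand-in for struct rw_semaphore: just the counter. */
struct demo_rwsem {
    atomic_long count;
};

/*
 * Try to take the write lock without sleeping.
 * Returns 1 on success, 0 if the lock is not free.
 */
static int demo_down_write_trylock(struct demo_rwsem *sem)
{
    long expected = DEMO_UNLOCKED_VALUE;

    assert(sem != NULL);

    /*
     * Pre-read the VALUE of the counter (sem->count, not &sem->count).
     * A plain load can be served from a shared cache line, so callers
     * that would fail anyway return without issuing the atomic
     * read-modify-write.
     */
    if (atomic_load_explicit(&sem->count, memory_order_relaxed)
            != DEMO_UNLOCKED_VALUE)
        return 0;

    /* The lock looked free: now attempt the compare-and-swap. */
    return atomic_compare_exchange_strong(&sem->count, &expected,
                                          DEMO_ACTIVE_WRITE_BIAS);
}
```

In this sketch a first call on an unlocked semaphore succeeds and a second
call returns 0 from the pre-read, without reaching the compare-and-swap.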

> 
>> +        return 0;
>>
>> -    tmp = cmpxchg(&sem->count, RWSEM_UNLOCKED_VALUE,
>> -              RWSEM_ACTIVE_WRITE_BIAS);
>> -    return tmp == RWSEM_UNLOCKED_VALUE;
>> +    return cmpxchg(&sem->count, RWSEM_UNLOCKED_VALUE,
>> +        RWSEM_ACTIVE_WRITE_BIAS) == RWSEM_UNLOCKED_VALUE;
>>   }
>>
>>   /*
>> diff --git a/lib/rwsem.c b/lib/rwsem.c
>> index 19c5fa9..2072af5 100644
>> --- a/lib/rwsem.c
>> +++ b/lib/rwsem.c
>> @@ -75,7 +75,7 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
>>                * will block as they will notice the queued writer.
>>                */
>>               wake_up_process(waiter->task);
>> -        goto out;
>> +        return sem;
> 
> Please put these flow control changes in a separate patch.

I have sent the split patches to Tim and Davidlohr. They will send them
out as a single patchset.
> 
> 
>>       }
>>
>>       /* Writers might steal the lock before we grant it to the next reader.
>> @@ -85,15 +85,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
>>       adjustment = 0;
>>       if (wake_type != RWSEM_WAKE_READ_OWNED) {
>>           adjustment = RWSEM_ACTIVE_READ_BIAS;
>> - try_reader_grant:
>> -        oldcount = rwsem_atomic_update(adjustment, sem) - adjustment;
>> -        if (unlikely(oldcount < RWSEM_WAITING_BIAS)) {
>> -            /* A writer stole the lock. Undo our reader grant. */
>> +        while (1) {
>> +            /* A writer stole the lock. */
>> +            if (sem->count < RWSEM_WAITING_BIAS)
>> +                return sem;
> 
> I'm all for structured looping instead of goto labels but this optimization
> is only useful on the 1st iteration. IOW, on the second iteration you
> already know that you need to try for reclaiming the lock.
> 

Sorry, could you explain more clearly what you mean by the 1st and 2nd
iterations?
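
One possible reading of the review comment, sketched in userspace C: the
pre-read of the counter only saves an atomic update the first time through.
Once the loop has undone a failed grant and seen that the last active locker
has left, it already knows it must retry the atomic update, so the pre-read
can sit in front of the loop. The names demo_rwsem, demo_atomic_update and
demo_try_reader_grant and the DEMO_ constants are assumptions made for
illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative bias layout; the kernel's real constants are arch-dependent. */
#define DEMO_ACTIVE_BIAS       1L
#define DEMO_ACTIVE_MASK       0xffffL
#define DEMO_WAITING_BIAS      (-DEMO_ACTIVE_MASK - 1)
#define DEMO_ACTIVE_READ_BIAS  DEMO_ACTIVE_BIAS

/* Hypothetical stand-in for struct rw_semaphore: just the counter. */
struct demo_rwsem {
    atomic_long count;
};

/* Stand-in for rwsem_atomic_update(): add delta, return the new value. */
static long demo_atomic_update(long delta, struct demo_rwsem *sem)
{
    return atomic_fetch_add(&sem->count, delta) + delta;
}

/* Returns 1 if the reader grant stands, 0 if a writer holds the lock. */
static int demo_try_reader_grant(struct demo_rwsem *sem)
{
    const long adjustment = DEMO_ACTIVE_READ_BIAS;
    long oldcount;

    assert(sem != NULL);

    /* Pre-read once, before the first atomic update only. */
    if (atomic_load_explicit(&sem->count, memory_order_relaxed)
            < DEMO_WAITING_BIAS)
        return 0;

    for (;;) {
        oldcount = demo_atomic_update(adjustment, sem) - adjustment;
        if (oldcount >= DEMO_WAITING_BIAS)
            return 1;

        /* A writer stole the lock. Undo our reader grant. */
        if (demo_atomic_update(-adjustment, sem) & DEMO_ACTIVE_MASK)
            return 0;

        /* Last active locker left. Retry the update without a pre-read. */
    }
}
```

With the counter at DEMO_WAITING_BIAS (waiters queued, nobody active) the
grant succeeds on the first pass. With a writer active on top of that, the
function returns 0 from the pre-read and leaves the counter untouched.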
> 
>> +
>> +            oldcount = rwsem_atomic_update(adjustment, sem)
>> +                                - adjustment;
>> +            if (likely(oldcount >= RWSEM_WAITING_BIAS))
>> +                break;
>> +
>> +             /* A writer stole the lock.  Undo our reader grant. */
>>               if (rwsem_atomic_update(-adjustment, sem) &
>>                           RWSEM_ACTIVE_MASK)
>> -                goto out;
>> +                return sem;
>>               /* Last active locker left. Retry waking readers. */
>> -            goto try_reader_grant;
>>           }
>>       }
>>
>> @@ -136,7 +142,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, enum rwsem_wake_type wake_type)
>>       sem->wait_list.next = next;
>>       next->prev = &sem->wait_list;
>>
>> - out:
>>       return sem;
>>   }
> 
> 
> Alex and Tim,
> 
> Was there a v1 of this series; ie., is this v2 (or higher)?
> 
> How are you validating lock correctness/behavior with this series?

We ran some benchmarks against this patch, mainly aim7, and also reviewed
it by eye. We didn't change the logic except to check the lock value
before taking the lock.
> 
> Regards,
> Peter Hurley
> 


-- 
Thanks
    Alex


