From: Waiman Long <waiman.long@hp.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-arch@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Richard Weinberger <richard@nod.at>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Matt Fleming <matt.fleming@intel.com>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Akinobu Mita <akinobu.mita@gmail.com>,
	Rusty Russell <rusty@rustcorp.com.au>,
	Michel Lespinasse <walken@google.com>,
	Andi Kleen <andi@firstfloor.org>, Rik van Riel <riel@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>,
	George Spelvin <linux@horizon.com>
Subject: Re: [PATCH RFC 1/2] qrwlock: A queue read/write lock implementation
Date: Tue, 23 Jul 2013 20:03:36 -0400
Message-ID: <51EF19D8.2090307@hp.com>
In-Reply-To: <20130722103402.GA1991@gmail.com>

On 07/22/2013 06:34 AM, Ingo Molnar wrote:
> * Waiman Long<waiman.long@hp.com>  wrote:
>
>> I had run some performance tests using the fserver and new_fserver
>> benchmarks (on ext4 filesystems) of the AIM7 test suite on an 80-core
>> DL980 with HT on. The following kernels were used:
>>
>> 1. Modified 3.10.1 kernel with mb_cache_spinlock in fs/mbcache.c
>>     replaced by a rwlock
>> 2. Modified 3.10.1 kernel + modified __read_lock_failed code as suggested
>>     by Ingo
>> 3. Modified 3.10.1 kernel + queue read/write lock
>> 4. Modified 3.10.1 kernel + queue read/write lock in classic read/write
>>     lock behavior
>>
>> The last one is with the read lock stealing flag set in the qrwlock
>> structure to give priority to readers and behave more like the classic
>> read/write lock with less fairness.
>>
>> The following table shows the averaged results in the 200-1000
>> user ranges:
>>
>> +-----------------+--------+--------+--------+--------+
>> |  Kernel         |    1   |    2   |    3   |   4    |
>> +-----------------+--------+--------+--------+--------+
>> | fserver JPM     | 245598 | 274457 | 403348 | 411941 |
>> | % change from 1 |   0%   | +11.8% | +64.2% | +67.7% |
>> +-----------------+--------+--------+--------+--------+
>> | new-fserver JPM | 231549 | 269807 | 399093 | 399418 |
>> | % change from 1 |   0%   | +16.5% | +72.4% | +72.5% |
>> +-----------------+--------+--------+--------+--------+
> So it's not just herding that is a problem.
>
> I'm wondering, how sensitive is this particular benchmark to fairness?
> I.e. do the 200-1000 simulated users each perform the same number of ops,
> so that any smearing of execution time via unfairness gets amplified?
>
> I.e. does steady-state throughput go up by 60%+ too with your changes?

For this particular benchmark, there is an interplay of different locks
that determines the overall performance of the system. Yes, I got a
steady-state performance gain of 60%+ with the qrwlock change combined
with the modified mbcache.c. Without the modified mbcache.c file, the
performance gain drops to 20-30%. I am still trying to find out more
about the performance variations in different situations.
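
For reference, the "% change from 1" rows in the quoted table can be
re-derived from the JPM figures. The sketch below is only an
illustration: the JPM numbers are copied from the table, while the
names JPM, pct_change and changes_from_kernel1 are hypothetical helpers
made up for this example and are not part of the benchmark or the patch.

```python
# JPM values copied from the quoted table, in kernel order 1..4.
# The dictionary and function names are hypothetical, for illustration only.
JPM = {
    "fserver": [245598, 274457, 403348, 411941],
    "new-fserver": [231549, 269807, 399093, 399418],
}


def pct_change(baseline, value):
    """Percent change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def changes_from_kernel1(results):
    """Percent change of each kernel relative to kernel 1, rounded to 0.1."""
    base = results[0]
    return [round(pct_change(base, v), 1) for v in results]


if __name__ == "__main__":
    for name, results in JPM.items():
        row = ", ".join(f"{c:+.1f}%" for c in changes_from_kernel1(results))
        print(f"{name}: {row}")
```

Rounded to one decimal place, this reproduces the percentages in the
table, including the 60%+ gain for kernels 3 and 4 over kernel 1.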

Regards,
Longman
