Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP

From: Waiman Long <waiman.long@hpe.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-alpha@vger.kernel.org, linux-ia64@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-doc@vger.kernel.org, Jason Low <jason.low2@hp.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Scott J Norton <scott.norton@hpe.com>,
	Douglas Hatch <doug.hatch@hpe.com>
Subject: Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP
Date: Fri, 7 Oct 2016 17:45:29 -0400
Message-ID: <57F81779.4050101@hpe.com>
In-Reply-To: <20161006214751.GU27872@dastard>

On 10/06/2016 05:47 PM, Dave Chinner wrote:
> On Thu, Oct 06, 2016 at 11:17:18AM -0700, Davidlohr Bueso wrote:
>> On Thu, 18 Aug 2016, Waiman Long wrote:
>>
>>> Currently, when down_read() fails, the active read locking isn't undone
>>> until the rwsem_down_read_failed() function grabs the wait_lock. If the
>>> wait_lock is contended, it may take a while to get the lock. During
>>> that period, writer lock stealing will be disabled because of the
>>> active read lock.
>>>
>>> This patch will release the active read lock ASAP so that writer lock
>>> stealing can happen sooner. The only downside is when the reader is
>>> the first one in the wait queue as it has to issue another atomic
>>> operation to update the count.
>>>
>>> On a 4-socket Haswell machine running a 4.7-rc1 tip-based kernel,
>>> fio multithreaded randrw and randwrite tests were run on the same
>>> file on an XFS partition on top of an NVDIMM with DAX. The
>>> aggregated bandwidths before and after the patch were as follows:
>>>
>>> Test      BW before patch     BW after patch  % change
>>> ----      ---------------     --------------  --------
>>> randrw        1210 MB/s          1352 MB/s      +12%
>>> randwrite     1622 MB/s          1710 MB/s      +5.4%
>> Yeah, this is really a bad workload to make decisions on locking
>> heuristics imo - if I'm thinking of the same workload. Mainly because
>> concurrent buffered io to the same file isn't very realistic and you
>> end up pathologically pounding on i_rwsem (which was i_mutex until
>> recently, before Al's parallel lookup/readdir). Obviously write
>> lock stealing wins in this case.
> Except that it's DAX, and in 4.7-rc1 that used shared locking at the
> XFS level and never took exclusive locks.
>
> *However*, the DAX IO path locking in XFS has changed in 4.9-rc1 to
> match the buffered IO single writer POSIX semantics - the test is a
> bad test based on the fact it exercised a path that is under heavy
> development and so can't be used as a regression test across
> multiple kernels.
>
> If you want to stress concurrent access to a single file, please
> use direct IO, not DAX or buffered IO.

Thanks for the update. I will change the test when I update this patch.

Cheers,
Longman
