From: Waiman Long <waiman.long@hpe.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-arch@vger.kernel.org, linux-s390@vger.kernel.org,
Davidlohr Bueso <dave@stgolabs.net>,
linux-ia64@vger.kernel.org, Scott J Norton <scott.norton@hpe.com>,
Peter Zijlstra <peterz@infradead.org>,
x86@kernel.org, linux-kernel@vger.kernel.org, xfs@oss.sgi.com,
Ingo Molnar <mingo@redhat.com>,
linux-alpha@vger.kernel.org, Douglas Hatch <doug.hatch@hpe.com>,
Jason Low <jason.low2@hp.com>
Subject: Re: [RFC PATCH-tip 6/6] xfs: Enable reader optimistic spinning for DAX inodes
Date: Wed, 15 Jun 2016 14:55:11 -0400
Message-ID: <5761A48F.1040304@hpe.com>
In-Reply-To: <20160614230613.GB26977@dastard>
On 06/14/2016 07:06 PM, Dave Chinner wrote:
> On Tue, Jun 14, 2016 at 02:12:39PM -0400, Waiman Long wrote:
>> This patch enables reader optimistic spinning for inodes that are
>> under a DAX-based mount point.
>>
>> On a 4-socket Haswell machine running on a 4.7-rc1 tip-based kernel,
>> the fio test with multithreaded randrw and randwrite tests on the
>> same file on a XFS partition on top of a NVDIMM with DAX were run,
>> the aggregated bandwidths before and after the patch were as follows:
>>
>> Test        BW before patch   BW after patch   % change
>> ----        ---------------   --------------   --------
>> randrw         1352 MB/s         2164 MB/s       +60%
>> randwrite      1710 MB/s         2550 MB/s       +49%
>>
>> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
>> ---
>> fs/xfs/xfs_icache.c | 9 +++++++++
>> 1 files changed, 9 insertions(+), 0 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>> index 99ee6ee..09f284f 100644
>> --- a/fs/xfs/xfs_icache.c
>> +++ b/fs/xfs/xfs_icache.c
>> @@ -71,6 +71,15 @@ xfs_inode_alloc(
>>
>> mrlock_init(&ip->i_iolock, MRLOCK_BARRIER, "xfsio", ip->i_ino);
>>
>> +	/*
>> +	 * Enable reader spinning for DAX mount point
>> +	 */
>> +	if (mp->m_flags & XFS_MOUNT_DAX) {
>> +		rwsem_set_rspin_threshold(&ip->i_iolock.mr_lock);
>> +		rwsem_set_rspin_threshold(&ip->i_mmaplock.mr_lock);
>> +		rwsem_set_rspin_threshold(&ip->i_lock.mr_lock);
>> +	}
> That's wrong. DAX is a per-inode flag, not a mount wide flag. This
> needs to be done once the inode has been fully initialised and
> IS_DAX(inode) can be run.
>
> Also, the benchmark doesn't show that all these locks are being
> tested by this benchmark. e.g. the i_mmaplock isn't involved in
> the benchmark's IO paths at all. It's only taken in page faults and
> truncate paths....
>
> I'd also like to see how much of the gain comes from the iolock vs
> the ilock, as the ilock is nested inside the iolock and so
> contention is much rarer....
This patch has now been superseded by a second one in which changes to
the xfs code are no longer needed. The new patch will enable reader
spinning for all rwsems and dynamically disable it depending on past
history.
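
To illustrate the general idea of "enable by default, disable based on
past history", here is a minimal userspace sketch. It is not taken from
either patch: the struct, the function names and the credit cap are all
hypothetical, and the real rwsem code may track history quite
differently. Each lock starts with a credit; failed spin attempts use
the credit up, and spinning is skipped once it reaches zero.

```c
#include <stdbool.h>

/*
 * Hypothetical per-lock spinning history. This is NOT the kernel's
 * struct rw_semaphore; every name here is made up for illustration.
 */
struct rspin_history {
	int credit;	/* > 0 means reader spinning is currently allowed */
};

#define RSPIN_CREDIT_MAX 8	/* assumed cap, chosen arbitrarily */

/* Spinning starts out enabled for every lock. */
static void rspin_init(struct rspin_history *h)
{
	h->credit = RSPIN_CREDIT_MAX;
}

/* Should a reader try optimistic spinning on this lock? */
static bool rspin_allowed(const struct rspin_history *h)
{
	return h->credit > 0;
}

/*
 * Record the outcome of one attempt: a successful acquisition earns
 * credit back (up to the cap), a failed one spends it.
 */
static void rspin_record(struct rspin_history *h, bool acquired)
{
	if (acquired) {
		if (h->credit < RSPIN_CREDIT_MAX)
			h->credit++;
	} else if (h->credit > 0) {
		h->credit--;
	}
}
```

A caller would check rspin_allowed() before spinning and call
rspin_record() afterwards. How and when a lock with no credit left
gets to spin again is a policy question this sketch leaves open.
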
> As it is, I'm *extremely* paranoid when it comes to changes to core
> locking like this. Performance is secondary to correctness, and we
> need much more than just a few benchmarks to verify there aren't
> locking bugs being introduced....
The core rwsem locking logic hasn't been changed. There are, however,
some minor changes to the RWSEM_WAITING_BIAS value that is used, and
those need more eyeballs to make sure that no new bug has been
introduced.
Cheers,
Longman