From mboxrd@z Thu Jan 1 00:00:00 1970
From: Waiman Long
Subject: Re: [RFC PATCH-tip v4 02/10] locking/rwsem: Stop active read lock ASAP
Date: Fri, 7 Oct 2016 17:45:29 -0400
Message-ID: <57F81779.4050101@hpe.com>
References: <1471554672-38662-1-git-send-email-Waiman.Long@hpe.com>
 <1471554672-38662-3-git-send-email-Waiman.Long@hpe.com>
 <20161006181718.GA14967@linux-80c1.suse>
 <20161006214751.GU27872@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset="ISO-8859-1"; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <20161006214751.GU27872@dastard>
Sender: linux-kernel-owner@vger.kernel.org
To: Dave Chinner
Cc: Davidlohr Bueso, Peter Zijlstra, Ingo Molnar,
 linux-kernel@vger.kernel.org, x86@kernel.org, linux-alpha@vger.kernel.org,
 linux-ia64@vger.kernel.org, linux-s390@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-doc@vger.kernel.org, Jason Low,
 Jonathan Corbet, Scott J Norton, Douglas Hatch
List-Id: linux-arch.vger.kernel.org

On 10/06/2016 05:47 PM, Dave Chinner wrote:
> On Thu, Oct 06, 2016 at 11:17:18AM -0700, Davidlohr Bueso wrote:
>> On Thu, 18 Aug 2016, Waiman Long wrote:
>>
>>> Currently, when down_read() fails, the active read locking isn't undone
>>> until the rwsem_down_read_failed() function grabs the wait_lock. If the
>>> wait_lock is contended, it may take a while to get the lock. During
>>> that period, writer lock stealing will be disabled because of the
>>> active read lock.
>>>
>>> This patch releases the active read lock ASAP so that writer lock
>>> stealing can happen sooner. The only downside is when the reader is
>>> the first one in the wait queue, as it has to issue another atomic
>>> operation to update the count.
>>>
>>> On a 4-socket Haswell machine running a 4.7-rc1 tip-based kernel,
>>> fio multithreaded randrw and randwrite tests were run against the
>>> same file on an XFS partition on top of an NVDIMM with DAX. The
>>> aggregated bandwidths before and after the patch were as follows:
>>>
>>>   Test       BW before patch   BW after patch   % change
>>>   ----       ---------------   --------------   --------
>>>   randrw     1210 MB/s         1352 MB/s        +12%
>>>   randwrite  1622 MB/s         1710 MB/s        +5.4%
>>
>> Yeah, this is really a bad workload to make decisions on locking
>> heuristics imo - if I'm thinking of the same workload. Mainly because
>> concurrent buffered io to the same file isn't very realistic, and you
>> end up pathologically pounding on i_rwsem (which until recently was
>> i_mutex, before Al's parallel lookup/readdir work). Obviously write
>> lock stealing wins in this case.
> Except that it's DAX, and in 4.7-rc1 that used shared locking at the
> XFS level and never took exclusive locks.
>
> *However*, the DAX IO path locking in XFS has changed in 4.9-rc1 to
> match the buffered IO single-writer POSIX semantics. The test is a
> bad one because it exercised a path that is under heavy development,
> so it can't be used as a regression test across multiple kernels.
>
> If you want to stress concurrent access to a single file, please
> use direct IO, not DAX or buffered IO.

Thanks for the update. I will change the test when I update this patch.
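
Something like the following direct IO fio job is what I have in mind (a
rough sketch only - the file path, size, and job count below are
placeholders, not the final test parameters):

  [global]
  ioengine=libaio
  direct=1              ; O_DIRECT, per your suggestion, so the test no
                        ; longer depends on the DAX/buffered locking paths
  filename=/mnt/xfs/testfile    ; placeholder - one shared file on XFS
  size=4g
  bs=4k
  iodepth=32
  numjobs=32            ; placeholder thread count
  runtime=60
  time_based
  group_reporting

  [randrw]
  rw=randrw

  [randwrite]
  stonewall             ; start only after the randrw phase completes
  rw=randwrite

All the threads hammer a single file, so the i_rwsem contention pattern
should be similar, just through the direct IO path this time.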

Cheers,
Longman