From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-yh0-f45.google.com ([209.85.213.45]:61937 "EHLO
    mail-yh0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1751697Ab3IQBQB (ORCPT ); Mon, 16 Sep 2013 21:16:01 -0400
Message-ID: <5237AB9A.1030604@gmail.com>
Date: Mon, 16 Sep 2013 18:08:42 -0700
From: David Daney
MIME-Version: 1.0
To: Peter Hurley
CC: Josef Bacik, Andrew Morton, linux-btrfs@vger.kernel.org,
    walken@google.com, mingo@elte.hu, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rwsem: add rwsem_is_contended
References: <1377872041-390-1-git-send-email-jbacik@fusionio.com>
    <20130916160547.371b74f91511a42ac263449e@linux-foundation.org>
    <20130917000516.GJ2446@localhost.localdomain>
    <5237A257.1070303@gmail.com>
    <5237A461.3010802@hurleysoftware.com>
In-Reply-To: <5237A461.3010802@hurleysoftware.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 09/16/2013 05:37 PM, Peter Hurley wrote:
> On 09/16/2013 08:29 PM, David Daney wrote:
>> On 09/16/2013 05:05 PM, Josef Bacik wrote:
>>> On Mon, Sep 16, 2013 at 04:05:47PM -0700, Andrew Morton wrote:
>>>> On Fri, 30 Aug 2013 10:14:01 -0400 Josef Bacik wrote:
>>>>
>>>>> Btrfs uses an rwsem to control access to its extent tree. Threads
>>>>> will hold a read lock on this rwsem while they scan the extent
>>>>> tree, and if need_resched() they will drop the lock and schedule.
>>>>> The transaction commit needs to take a write lock for this rwsem
>>>>> for a very short period to switch out the commit roots. If there
>>>>> are a lot of threads doing this caching operation we can starve
>>>>> out the committers which slows everybody down. To address this we
>>>>> want to add this functionality to see if our rwsem has anybody
>>>>> waiting to take a write lock so we can drop it and schedule for a
>>>>> bit to allow the commit to continue.
>>>>> Thanks,
>>>>>
>>>>
>>>> This sounds rather nasty and hacky. Rather than working around a
>>>> locking shortcoming in a caller it would be better to fix/enhance
>>>> the core locking code. What would such a change need to do?
>>>>
>>>> Presently rwsem waiters are fifo-queued, are they not? So the
>>>> commit thread will eventually get that lock. Apparently that's not
>>>> working adequately for you but I don't fully understand what it is
>>>> about these dynamics which is causing observable problems.
>>>>
>>>
>>> So the problem is not that it's normal lock starvation, it's more
>>> our particular use case that is causing the starvation. We can have
>>> lots of people holding readers and simply never give them up for
>>> long periods of time, which is why we need this is_contended helper
>>> so we know to drop things and let the committer through. Thanks,
>>
>> You could easily achieve the same thing by putting an "is_contending"
>> flag in parallel with the rwsem and testing that:
>
> Which adds a bunch more bus-locked operations to contend over

Would that be a problem in this particular case? Has it been measured?

> , when
> an unlocked if (list_empty()) is sufficient.

I don't object to adding rwsem_is_contended() *if* it is required. I
was just pointing out that there may be other options.

The patch adds a bunch of new semantics to rwsem. There is a trade-off
between increased complexity of core code, and generalizing
subsystem-specific optimizations that may not be globally useful.

Is it worth it in this case? I do not know.

David Daney