linux-btrfs.vger.kernel.org archive mirror
From: Peter Hurley <peter@hurleysoftware.com>
To: Josef Bacik <jbacik@fusionio.com>
Cc: David Daney <ddaney.cavm@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-btrfs@vger.kernel.org, walken@google.com, mingo@elte.hu,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rwsem: add rwsem_is_contended
Date: Mon, 16 Sep 2013 21:22:12 -0400
Message-ID: <5237AEC4.4050404@hurleysoftware.com>
In-Reply-To: <20130917011150.GK2446@localhost.localdomain>

On 09/16/2013 09:11 PM, Josef Bacik wrote:
> On Mon, Sep 16, 2013 at 06:08:42PM -0700, David Daney wrote:
>> On 09/16/2013 05:37 PM, Peter Hurley wrote:
>>> On 09/16/2013 08:29 PM, David Daney wrote:
>>>> On 09/16/2013 05:05 PM, Josef Bacik wrote:
>>>>> On Mon, Sep 16, 2013 at 04:05:47PM -0700, Andrew Morton wrote:
>>>>>> On Fri, 30 Aug 2013 10:14:01 -0400 Josef Bacik <jbacik@fusionio.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Btrfs uses an rwsem to control access to its extent tree.  Threads
>>>>>>> will hold a read lock on this rwsem while they scan the extent tree,
>>>>>>> and if need_resched() they will drop the lock and schedule.  The
>>>>>>> transaction commit needs to take a write lock for this rwsem for a
>>>>>>> very short period to switch out the commit roots.  If there are a lot
>>>>>>> of threads doing this caching operation we can starve out the
>>>>>>> committers, which slows everybody down.  To address this we want to
>>>>>>> add this functionality to see if our rwsem has anybody waiting to
>>>>>>> take a write lock so we can drop it and schedule for a bit to allow
>>>>>>> the commit to continue.
>>>>>>> Thanks,
>>>>>>>
>>>>>>
>>>>>> This sounds rather nasty and hacky.  Rather than working around a
>>>>>> locking shortcoming in a caller it would be better to fix/enhance the
>>>>>> core locking code.  What would such a change need to do?
>>>>>>
>>>>>> Presently rwsem waiters are fifo-queued, are they not?  So the commit
>>>>>> thread will eventually get that lock.  Apparently that's not working
>>>>>> adequately for you but I don't fully understand what it is about these
>>>>>> dynamics which is causing observable problems.
>>>>>>
>>>>>
>>>>> So the problem is not that it's normal lock starvation, it's more our
>>>>> particular use case that is causing the starvation.  We can have lots
>>>>> of people holding readers and simply never give them up for long
>>>>> periods of time, which is why we need this is_contended helper so we
>>>>> know to drop things and let the committer through.  Thanks,
>>>>
>>>> You could easily achieve the same thing by putting an "is_contending"
>>>> flag in parallel with the rwsem and testing that:
>>>
>>> Which adds a bunch more bus-locked operations to contend over
>>
>> Would that be a problem in this particular case?  Has it been measured?
>>
>>> , when
>>> an unlocked if (list_empty()) is sufficient.
>>
>> I don't object to adding rwsem_is_contended() *if* it is required.  I was
>> just pointing out that there may be other options.
>>
>> The patch adds a bunch of new semantics to rwsem.  There is a trade off
>> between increased complexity of core code, and generalizing subsystem
>> specific optimizations that may not be globally useful.
>>
>> Is it worth it in this case?  I do not know.
>>
>
> So what you suggested is actually what we did in order to prove that this was
> what the problem was.  I'm ok with continuing to do that, I just figured adding
> something like rwsem_is_contended() would be nice in case anybody else runs into
> the issue in the future, plus it would save me an atomic_t in an already large
> structure.

I saw the original patch you linked to earlier in the discussion, and
I agree that for your use case adding a contention test is cleaner and clearer
than other options.
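
For comparison, the flag-in-parallel workaround discussed above might look
roughly like the userspace sketch below. It is only an illustration under
stated assumptions: it uses a pthread rwlock and a C11 atomic counter rather
than the kernel rwsem, and the names tracked_rwlock, tracked_write_lock(),
tracked_is_contended() and scan_items() are made up for this example; they
are not taken from the btrfs code or the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wrapper: an rwlock plus a count of writers waiting for it. */
struct tracked_rwlock {
	pthread_rwlock_t lock;
	atomic_int writers_waiting;	/* the extra atomic kept next to the lock */
};

void tracked_init(struct tracked_rwlock *t)
{
	pthread_rwlock_init(&t->lock, NULL);
	atomic_init(&t->writers_waiting, 0);
}

/* A writer announces itself before blocking and retracts once it owns the lock. */
void tracked_write_lock(struct tracked_rwlock *t)
{
	atomic_fetch_add(&t->writers_waiting, 1);
	pthread_rwlock_wrlock(&t->lock);
	atomic_fetch_sub(&t->writers_waiting, 1);
}

void tracked_write_unlock(struct tracked_rwlock *t)
{
	pthread_rwlock_unlock(&t->lock);
}

/* Advisory only: the answer can be stale by the time the caller acts on it. */
bool tracked_is_contended(struct tracked_rwlock *t)
{
	return atomic_load(&t->writers_waiting) > 0;
}

/*
 * Long-running reader: visit nr_items under the read lock, but drop the
 * lock and yield whenever a writer has announced itself.
 * Returns how many times the lock was dropped.
 */
int scan_items(struct tracked_rwlock *t, int nr_items)
{
	int yields = 0;

	assert(nr_items >= 0);
	pthread_rwlock_rdlock(&t->lock);
	for (int i = 0; i < nr_items; i++) {
		/* ... examine item i while holding the read lock ... */
		if (tracked_is_contended(t)) {
			pthread_rwlock_unlock(&t->lock);
			sched_yield();		/* give the writer a chance to run */
			pthread_rwlock_rdlock(&t->lock);
			yields++;
		}
	}
	pthread_rwlock_unlock(&t->lock);
	return yields;
}
```

Every writer pays two extra atomic read-modify-write operations here, and
every reader check touches the same cache line, which is the cost being
weighed against simply testing the semaphore's own wait list.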

That said, I think this extension is only useful for readers: writers should be
getting their business done and releasing the sem.

Also, I think the comment above the function should be clearer that the lock
must already be held by the caller; IOW, this is not a trylock replacement.
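
To make the point about the comment concrete, here is a rough userspace
model of a wait-list based contention test with the kind of wording I have
in mind. This is a sketch, not the patch itself: model_rwsem,
model_list_head and the model_* helpers are hypothetical stand-ins written
for this example, and only the waiter queue is modeled, not the locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modeled loosely on the kernel's list_head. */
struct model_list_head {
	struct model_list_head *next, *prev;
};

void model_list_init(struct model_list_head *head)
{
	head->next = head;
	head->prev = head;
}

bool model_list_empty(const struct model_list_head *head)
{
	return head->next == head;
}

/* Queue @entry at the tail, i.e. in FIFO order behind existing waiters. */
void model_list_add_tail(struct model_list_head *entry,
			 struct model_list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

void model_list_del(struct model_list_head *entry)
{
	assert(entry->next != entry);	/* entry must currently be queued */
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
	entry->prev = entry;
}

/* Hypothetical stand-in for the semaphore: only the waiter queue is modeled. */
struct model_rwsem {
	struct model_list_head wait_list;
};

/*
 * model_rwsem_is_contended - is anybody queued behind the current holder?
 *
 * The caller must already hold @sem. This is an unlocked heuristic: the
 * result may be stale as soon as it is returned, so use it only to decide
 * whether to drop the lock early. It is not a trylock replacement and says
 * nothing about whether the lock could be acquired.
 */
bool model_rwsem_is_contended(const struct model_rwsem *sem)
{
	return !model_list_empty(&sem->wait_list);
}
```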

Regards,
Peter Hurley



Thread overview: 18+ messages
2013-08-30 14:14 [PATCH] rwsem: add rwsem_is_contended Josef Bacik
2013-08-31 14:51 ` Peter Zijlstra
2013-09-03 15:49   ` Josef Bacik
2013-09-01  8:32 ` Michel Lespinasse
2013-09-02 17:18   ` Peter Hurley
2013-09-03 13:18     ` Josef Bacik
2013-09-04 11:46       ` Peter Hurley
2013-09-04 12:13         ` Josef Bacik
2013-09-03 15:47   ` Josef Bacik
2013-09-04 12:11     ` Peter Hurley
2013-09-16 23:05 ` Andrew Morton
2013-09-17  0:05   ` Josef Bacik
2013-09-17  0:29     ` David Daney
2013-09-17  0:37       ` Peter Hurley
2013-09-17  1:08         ` David Daney
2013-09-17  1:11           ` Josef Bacik
2013-09-17  1:22             ` Peter Hurley [this message]
2013-09-17  6:53   ` Ingo Molnar
