From: Josef Bacik <jbacik@fusionio.com>
To: Peter Hurley <peter@hurleysoftware.com>
Cc: Michel Lespinasse <walken@google.com>,
Josef Bacik <jbacik@fusionio.com>, <linux-btrfs@vger.kernel.org>,
<mingo@elte.hu>, <akpm@linux-foundation.org>,
<linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] rwsem: add rwsem_is_contended
Date: Tue, 3 Sep 2013 09:18:05 -0400
Message-ID: <20130903131805.GA15634@localhost.localdomain>
In-Reply-To: <5224C850.2060103@hurleysoftware.com>

On Mon, Sep 02, 2013 at 01:18:08PM -0400, Peter Hurley wrote:
> On 09/01/2013 04:32 AM, Michel Lespinasse wrote:
> >Hi Josef,
> >
> >On Fri, Aug 30, 2013 at 7:14 AM, Josef Bacik <jbacik@fusionio.com> wrote:
> >>Btrfs uses an rwsem to control access to its extent tree. Threads will hold a
> >>read lock on this rwsem while they scan the extent tree, and if need_resched()
> >>they will drop the lock and schedule. The transaction commit needs to take a
> >>write lock for this rwsem for a very short period to switch out the commit
> >>roots. If there are a lot of threads doing this caching operation we can starve
> >>out the committers which slows everybody down. To address this we want to add
> >>this functionality to see if our rwsem has anybody waiting to take a write lock
> >>so we can drop it and schedule for a bit to allow the commit to continue.
> >>Thanks,
> >>
> >>Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> >
> >FYI, I once tried to introduce something like this before, but my use
> >case was pretty weak so it was not accepted at the time. I don't think
> >there were any objections to the API itself though, and I think it's
> >potentially a good idea if your use case justifies it.
>
> Exactly, I'm concerned about the use case: readers can't starve writers.
> Of course, lots of existing readers can temporarily prevent a writer from
> acquiring, but those readers would already have the lock. Any new readers
> wouldn't be able to prevent a waiting writer from obtaining the lock.
>
> Josef,
> Could you be more explicit, maybe with some detailed numbers about the
> condition you report?
>
Sure, this came from a community member:

http://article.gmane.org/gmane.comp.file-systems.btrfs/28081

With the old approach we could block for 1-2 seconds waiting for this rwsem,
and with the new approach, where we allow many more of these caching threads, we
were starving out the writer for 80 seconds.

So what happens is these threads will scan our extent tree to put together the
free space cache, and they'll hold this lock while they are doing the scanning.
The only way they will drop this lock is if we hit need_resched(), but because
these threads are going to do quite a bit of IO I imagine we're not ever being
flagged with need_resched() because we schedule while waiting for IO. So these
threads will hold onto this lock for bloody ever without giving it up so the
committer can take the write lock. His patch to "fix" the problem was to have
an atomic that let us know somebody was waiting for a write lock and then we'd
drop the reader lock and schedule.

So really we're just using a rwsem in a really mean way for writers. I'm open
to other suggestions but I think this is probably the cleanest way. Thanks,

Josef