From: Liu Bo <liubo2009@cn.fujitsu.com>
To: Chris Mason <chris.mason@oracle.com>,
Josef Bacik <josef@redhat.com>,
linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: complete page writeback before doing ordered extents
Date: Wed, 25 Apr 2012 15:52:07 +0800
Message-ID: <4F97AD27.2070506@cn.fujitsu.com>
In-Reply-To: <20120424141529.GI22794@shiny>

On 04/24/2012 10:15 PM, Chris Mason wrote:
> On Tue, Apr 24, 2012 at 09:50:39AM +0800, Liu Bo wrote:
>> On 04/24/2012 01:33 AM, Josef Bacik wrote:
>>
>>> We can deadlock waiting for pages to end writeback because we are doing an
>>> allocation while holding a tree lock since the ordered extent stuff will
>>> require tree locks. A quick easy way to fix this is to end page writeback
>>> before we do our ordered io stuff, which works fine since we don't really
>>> need the page for this to work. Eventually we want to make this work happen
>>> as soon as the io is completed and then push the ordered extent stuff off to
>>> a worker thread, but at this stage we need this deadlock fixed while changing
>>> as little as possible. Thanks,
>>>
>>
>> Hi Josef,
>>
>> I'm ok with the patch, but could you show us more details about the deadlock between allocation and endio?
>
> Josef and I have been talking about this one off-list for a while. It's
> a deadlock I tracked down in my overnight stress runs.
>
> Basically what we have is the io-less dirty throttling code saying there
> are too many pages in writeback, and so new allocations are backing up
> and waiting for pages to leave writeback.
>
> But the pages can't leave writeback because we're waiting on more memory
> to complete the metadata changes at endio time. Strictly speaking the
> VM is doing something wrong here, our NOFS allocations shouldn't be
> waiting for writeback to finish.
>
> But, strictly speaking we're doing something wrong too, we're doing too
> many allocations with pages tied up in writeback.
>
> So this splits the page from the metadata changes. We're still doing
> the metadata changes after the IO is complete, but we're doing them
> after we've let the pages go.
>
> -chris
>
Now it's clear, thanks for the explanation. :)
thanks,
liubo
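
The ordering issue Chris describes can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Btrfs or VM code: `ToyVM`, `try_alloc`, `endio_metadata_first`, `endio_page_first`, and `drain` are hypothetical names invented for illustration, and dirty throttling is reduced to a single counter with a fixed limit. It shows why doing the metadata allocation before releasing the page can stall, while releasing the page first lets completions proceed.

```python
class ToyVM:
    """Toy model: allocations stall while too many pages are in writeback."""

    def __init__(self, writeback_limit):
        self.writeback_limit = writeback_limit
        self.pages_in_writeback = 0

    def try_alloc(self):
        # Stand-in for dirty throttling: report "would block" (False) when
        # the number of pages in writeback is at or above the limit.
        return self.pages_in_writeback < self.writeback_limit


def endio_metadata_first(vm):
    """Old ordering: allocate for the metadata work, then end page writeback."""
    if not vm.try_alloc():
        # Stuck: the page that would relieve the pressure is still held.
        return False
    vm.pages_in_writeback -= 1
    return True


def endio_page_first(vm):
    """Patched ordering: end page writeback, then allocate for metadata work."""
    vm.pages_in_writeback -= 1
    return vm.try_alloc()


def drain(vm, endio, pages):
    """Run endio for `pages` pages in writeback; return how many completed."""
    vm.pages_in_writeback = pages
    done = 0
    for _ in range(pages):
        if not endio(vm):
            break
        done += 1
    return done
```

In this model, with the limit equal to the number of pages in writeback, `drain(ToyVM(4), endio_metadata_first, 4)` makes no progress, while `drain(ToyVM(4), endio_page_first, 4)` completes all four pages. The real kernel behaviour involves blocking allocations and tree locks rather than a boolean check; the sketch only captures the order of the two steps.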

Thread overview: 4+ messages
2012-04-23 17:33 [PATCH] Btrfs: complete page writeback before doing ordered extents Josef Bacik
2012-04-24 1:50 ` Liu Bo
2012-04-24 14:15 ` Chris Mason
2012-04-25 7:52 ` Liu Bo [this message]