From: Chris Mason <chris.mason@oracle.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>,
"jens.axboe@oracle.com" <jens.axboe@oracle.com>,
LKML <linux-kernel@vger.kernel.org>,
Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH] fs: Fix busyloop in wb_writeback()
Date: Mon, 21 Sep 2009 10:45:51 -0400
Message-ID: <20090921144551.GA10825@think>
In-Reply-To: <20090921143107.GA6567@localhost>

On Mon, Sep 21, 2009 at 10:31:07PM +0800, Wu Fengguang wrote:
> On Mon, Sep 21, 2009 at 10:19:10PM +0800, Chris Mason wrote:
> > On Mon, Sep 21, 2009 at 10:11:09PM +0800, Wu Fengguang wrote:
> > > On Mon, Sep 21, 2009 at 09:45:11PM +0800, Jan Kara wrote:
> > > > On Mon 21-09-09 09:08:59, Wu Fengguang wrote:
> > > > > On Mon, Sep 21, 2009 at 01:43:56AM +0800, Jan Kara wrote:
> > > > > > So when we see inode under writeback, we put it to b_more_io. So I think
> > > > > > my patch really fixes the issue when two threads are racing on writing the
> > > > > > same inode.
> > > > >
> > > > > Ah OK. So it busy loops when there are more syncing threads than dirty
> > > > > files. For example, one bdi flush thread plus one process running
> > > > > balance_dirty_pages().
> > > > Yes.
> > > >
> > > > > > > > The busy loop does exist, when bdi is congested.
> > > > > > > In this case, write_cache_pages() will refuse to write anything,
> > > > > > > we used to be calling congestion_wait() to take a breath, but now
> > > > > > > wb_writeback() purged that call and thus created a busy loop.
> > > > > > > I don't think congestion is an issue here. The device needn't be
> > > > > > congested for the busyloop to happen.
> > > > >
> > > > > > bdi congestion is a different case. When there is only one syncing
> > > > > thread, b_more_io inodes won't have I_SYNC, so your patch is a no-op.
> > > > > wb_writeback() or any of its sub-routines must wait/yield for a while
> > > > > to avoid busy looping on the congestion. Where is the wait with Jens'
> > > > > new code?
> > > > I agree someone must wait when we bail out due to congestion. But we bail
> > > > out only when wbc->nonblocking is set.
> > >
> > > Here is another problem. wbc->nonblocking used to be set for kupdate
> > > and background writebacks, but now it's gone. So they will be blocked
> > > in get_request_wait(). That's fine, no busy loops.
> > >
> > > However this inverts the priority. pageout() still has nonblocking=1.
> > > So now vmscan can easily be live locked by heavy background writebacks.
> >
> > The important part of the nonblocking check for pageout is really to
> > make sure that it doesn't get stuck locking a buffer that is actually
> > under IO, which happens in ext3/reiserfs data=ordered mode.
>
> OK.
>
> > Having pageout wait for a request is fine. It's just as likely to wait
> > for a request when it does actually start the IO, regardless of the
> > congestion checks earlier in the call chain.
>
> There are fundamental differences. The congestion wait is live lock for
> pageout, while wait_on_page_writeback() will finish in bounded time.
>
> > I'd drop any congestion checks in the nooks and crannies of the
> > writeback paths.
>
> Let's work on a better solution then?

Today, wbc->nonblocking and congestion are checked together:

1) in writeback_inodes_wb before we call writeback_single_inode
2) in write_cache_pages, before we call writepage
3) in write_cache_pages, after we call writepage

If we delete all 3, we get rid of the livelock but keep the check that
makes sure we don't wait on locked buffers that are under IO.

If we delete #1 and #2, we'll get rid of the livelock but pageout will
still stop trying to do IO on this backing dev once it finds some
congestion.

I think either way is fine ;)
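
To make the two options concrete, here is a rough userspace sketch of the
three check sites. It is a toy model under stated assumptions, not the kernel
code: the struct names, the function names, and the CHECK_* flags are all
hypothetical, a "page write" is just a counter increment, and the real paths
carry more state (for example wbc->encountered_congestion) than is shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. Every name here is hypothetical. */
struct wbc_model { bool nonblocking; };
struct bdi_model { bool congested; };

/* One bit per check site listed in the mail. */
enum congestion_check {
	CHECK_BEFORE_INODE     = 1 << 0, /* #1: before writeback_single_inode */
	CHECK_BEFORE_WRITEPAGE = 1 << 1, /* #2: before writepage */
	CHECK_AFTER_WRITEPAGE  = 1 << 2, /* #3: after writepage */
	CHECK_ALL = CHECK_BEFORE_INODE | CHECK_BEFORE_WRITEPAGE |
		    CHECK_AFTER_WRITEPAGE,
};

/* A site bails out only if it is enabled, nonblocking is set, and the
 * backing device is congested. */
bool should_bail(const struct wbc_model *wbc, const struct bdi_model *bdi,
		 unsigned int enabled, enum congestion_check site)
{
	return (enabled & site) && wbc->nonblocking && bdi->congested;
}

/* Stand-in for write_cache_pages(): returns the number of pages "written". */
int model_write_cache_pages(const struct wbc_model *wbc,
			    const struct bdi_model *bdi,
			    unsigned int enabled, int npages)
{
	int written = 0;

	assert(wbc && bdi);
	for (int i = 0; i < npages; i++) {
		if (should_bail(wbc, bdi, enabled, CHECK_BEFORE_WRITEPAGE))
			break;
		written++; /* stands in for the writepage call */
		if (should_bail(wbc, bdi, enabled, CHECK_AFTER_WRITEPAGE))
			break;
	}
	return written;
}

/* Stand-in for writeback_inodes_wb(): walks ninodes inodes of npages each. */
int model_writeback_inodes(const struct wbc_model *wbc,
			   const struct bdi_model *bdi,
			   unsigned int enabled, int ninodes, int npages)
{
	int total = 0;

	assert(wbc && bdi);
	for (int i = 0; i < ninodes; i++) {
		if (should_bail(wbc, bdi, enabled, CHECK_BEFORE_INODE))
			break;
		total += model_write_cache_pages(wbc, bdi, enabled, npages);
	}
	return total;
}
```

In this model, with nonblocking set and the device congested, a call with
CHECK_ALL over 3 inodes of 4 pages writes 0 pages, so a caller that retries
without waiting makes no progress. Passing 0 (all three checks deleted)
writes all 12 pages. Passing only CHECK_AFTER_WRITEPAGE (#1 and #2 deleted)
writes one page per inode before backing off, 3 in total.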
-chris