From: Jan Kara <jack@suse.cz>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Jan Kara <jack@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>,
"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
Dave Chinner <david@fromorbit.com>,
Chris Mason <chris.mason@oracle.com>,
Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Christoph Hellwig <hch@infradead.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Mel Gorman <mel@csn.ul.ie>, Minchan Kim <minchan.kim@gmail.com>
Subject: Re: [PATCH 2/5] writeback: stop periodic/background work on seeing sync works
Date: Tue, 3 Aug 2010 14:39:22 +0200
Message-ID: <20100803123922.GC3322@quack.suse.cz>
In-Reply-To: <20100803105520.GA3322@quack.suse.cz>
[-- Attachment #1: Type: text/plain, Size: 2521 bytes --]
On Tue 03-08-10 12:55:20, Jan Kara wrote:
> On Tue 03-08-10 11:01:25, Wu Fengguang wrote:
> > On Tue, Aug 03, 2010 at 04:51:52AM +0800, Jan Kara wrote:
> > > On Fri 30-07-10 12:03:06, Wu Fengguang wrote:
> > > > On Fri, Jul 30, 2010 at 12:20:27AM +0800, Jan Kara wrote:
> > > > > On Thu 29-07-10 19:51:44, Wu Fengguang wrote:
> > > > > > The periodic/background writeback can run forever. So when any
> > > > > > sync work is enqueued, increase bdi->sync_works to notify the
> > > > > > active non-sync works to exit. Non-sync works queued after sync
> > > > > > works won't be affected.
> > > > > Hmm, wouldn't it be simpler logic to just make for_kupdate and
> > > > > for_background work always yield when there's some other work to do (as
> > > > > they are livelockable from the definition of the target they have) and
> > > > > make sure any other work isn't livelockable?
> > > >
> > > > Good idea!
> > > >
> > > > > The only downside is that
> > > > > non-livelockable work cannot be "fair" in the sense that we cannot switch
> > > > > inodes after writing MAX_WRITEBACK_PAGES.
> > > >
> > > > Cannot switch inodes _before_ finishing with the current
> > > > MAX_WRITEBACK_PAGES batch?
> > > Well, even after writing all those MAX_WRITEBACK_PAGES. Because what you
> > > want to do in a non-livelockable work is: take inode, write it, never look at
> > > it again for this work. Because if you later return to the inode, it can
> > > have newer dirty pages and thus you cannot really avoid livelock. Of
> > > course, this all assumes .nr_to_write isn't set to something small. That
> > > avoids the livelock as well.
> >
> > I do have a poor man's solution that can handle this case.
> > https://kerneltrap.org/mailarchive/linux-fsdevel/2009/10/7/6476473/thread
> > It may do some extra work, but will stop the livelock in theory.
> So I don't think sync work on its own is a problem. There we can just
> give up any fairness and just go inode by inode. IMHO it's much simpler that
> way. The remaining types of work we have are "for_reclaim" and then ones
> triggered by filesystems to get rid of delayed allocated data. These cases
> can easily have well defined and low nr_to_write so they wouldn't be
> livelockable either. What do you think?
Fengguang, how about also merging the attached simple patch together with
my fix? With both patches applied, I'm not able to trigger any sync livelock,
while without either one of them I hit it quite easily...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-mm-Avoid-resetting-wb_start-after-each-writeback-ro.patch --]
[-- Type: text/x-patch, Size: 1877 bytes --]
From e4b708115825bca6a1020eed2356e2aab0567e3a Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Tue, 3 Aug 2010 10:35:02 +0000
Subject: [PATCH] mm: Avoid resetting wb_start after each writeback round

WB_SYNC_NONE writeback is done in rounds of 1024 pages so that we don't spend
too long writing out some huge inode while starving writeout of other inodes.
To avoid livelocks, we record the time we started writeback in wbc->wb_start
and do not write out inodes which were dirtied after this time. But currently,
writeback_inodes_wb() resets wb_start each time it is called, thus effectively
invalidating this logic and making any WB_SYNC_NONE writeback prone to
livelocks.

This patch makes sure wb_start is set only once when we start writeback.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6bdc924..aa59394 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -530,7 +530,8 @@ void writeback_inodes_wb(struct bdi_writeback *wb,
 {
 	int ret = 0;
 
-	wbc->wb_start = jiffies; /* livelock avoidance */
+	if (!wbc->wb_start)
+		wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
 	if (!wbc->for_kupdate || list_empty(&wb->b_io))
 		queue_io(wb, wbc->older_than_this);
@@ -559,7 +560,6 @@ static void __writeback_inodes_sb(struct super_block *sb,
 {
 	WARN_ON(!rwsem_is_locked(&sb->s_umount));
 
-	wbc->wb_start = jiffies; /* livelock avoidance */
 	spin_lock(&inode_lock);
 	if (!wbc->for_kupdate || list_empty(&wb->b_io))
 		queue_io(wb, wbc->older_than_this);
@@ -625,6 +625,7 @@ static long wb_writeback(struct bdi_writeback *wb,
 		wbc.range_end = LLONG_MAX;
 	}
 
+	wbc.wb_start = jiffies; /* livelock avoidance */
 	for (;;) {
 		/*
 		 * Stop writeback when nr_pages has been consumed
--
1.6.0.2
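
The effect described in the patch changelog can be illustrated with a small
userspace simulation. This is only a sketch under stated assumptions, not
kernel code: struct sim_inode, struct sim_wbc, sim_round() and
sim_rounds_until_idle() are hypothetical names invented for the illustration,
time is a plain counter instead of jiffies, and the comparison ignores counter
wraparound. It models one inode that a concurrent writer redirties between
writeback rounds, and compares setting wb_start once against resetting it on
every round.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel structures. */
struct sim_inode {
	unsigned long dirtied_when;	/* tick at which the inode was last dirtied */
	bool dirty;
};

struct sim_wbc {
	unsigned long wb_start;		/* 0 means "not set yet" */
};

/*
 * One writeback round: write every dirty inode that was dirtied at or
 * before wb_start and skip inodes dirtied later. Returns the number of
 * inodes written. If reset_each_round is true, wb_start is overwritten on
 * every call (the behaviour the changelog complains about); otherwise it
 * is set only the first time.
 */
int sim_round(struct sim_inode *inodes, size_t n, struct sim_wbc *wbc,
	      unsigned long now, bool reset_each_round)
{
	int written = 0;

	if (reset_each_round || !wbc->wb_start)
		wbc->wb_start = now;

	for (size_t i = 0; i < n; i++) {
		if (inodes[i].dirty && inodes[i].dirtied_when <= wbc->wb_start) {
			inodes[i].dirty = false;	/* "written out" */
			written++;
		}
	}
	return written;
}

/*
 * Run rounds until one of them writes nothing. Returns the number of the
 * first idle round, or max_rounds + 1 if writeback never goes idle within
 * the budget (standing in for a livelock).
 */
int sim_rounds_until_idle(bool reset_each_round, int max_rounds)
{
	struct sim_inode inode = { .dirtied_when = 1, .dirty = true };
	struct sim_wbc wbc = { .wb_start = 0 };
	unsigned long now = 2;

	for (int round = 1; round <= max_rounds; round++) {
		if (sim_round(&inode, 1, &wbc, now, reset_each_round) == 0)
			return round;

		/* A concurrent writer redirties the inode before the next round. */
		now++;
		inode.dirty = true;
		inode.dirtied_when = now;
		now++;
	}
	return max_rounds + 1;
}
```

In this model, sim_rounds_until_idle(false, 100) goes idle in round 2, because
the redirtied inode is newer than the wb_start recorded once at the start. By
contrast, sim_rounds_until_idle(true, 100) exhausts its budget, because each
round moves wb_start past the latest redirtying.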