linux-mm.kvack.org archive mirror
From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Christoph Hellwig <hch@infradead.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mel Gorman <mel@csn.ul.ie>, Minchan Kim <minchan.kim@gmail.com>
Subject: Re: [PATCH 2/5] writeback: stop periodic/background work on seeing sync works
Date: Tue, 3 Aug 2010 20:59:24 +0800
Message-ID: <20100803125924.GA31827@localhost>
In-Reply-To: <20100803123922.GC3322@quack.suse.cz>

On Tue, Aug 03, 2010 at 08:39:22PM +0800, Jan Kara wrote:
> On Tue 03-08-10 12:55:20, Jan Kara wrote:
> > On Tue 03-08-10 11:01:25, Wu Fengguang wrote:
> > > On Tue, Aug 03, 2010 at 04:51:52AM +0800, Jan Kara wrote:
> > > > On Fri 30-07-10 12:03:06, Wu Fengguang wrote:
> > > > > On Fri, Jul 30, 2010 at 12:20:27AM +0800, Jan Kara wrote:
> > > > > > On Thu 29-07-10 19:51:44, Wu Fengguang wrote:
> > > > > > > The periodic/background writeback can run forever. So when any
> > > > > > > sync work is enqueued, increase bdi->sync_works to notify the
> > > > > > > active non-sync works to exit. Non-sync works queued after sync
> > > > > > > works won't be affected.
> > > > > >   Hmm, wouldn't it be simpler logic to just make for_kupdate and
> > > > > > for_background work always yield when there's some other work to do (as
> > > > > > they are livelockable from the definition of the target they have) and
> > > > > > make sure any other work isn't livelockable?
> > > > > 
> > > > > Good idea!
> > > > > 
> > > > > > The only downside is that
> > > > > > non-livelockable work cannot be "fair" in the sense that we cannot switch
> > > > > > inodes after writing MAX_WRITEBACK_PAGES.
> > > > > 
> > > > > > Cannot switch inodes _before_ finishing with the current
> > > > > > MAX_WRITEBACK_PAGES batch?
> > > >   Well, even after writing all those MAX_WRITEBACK_PAGES. Because what you
> > > > want to do in a non-livelockable work is: take inode, write it, never look at
> > > > it again for this work. Because if you later return to the inode, it can
> > > > have newer dirty pages and thus you cannot really avoid livelock. Of
> > > > course, this all assumes .nr_to_write isn't set to something small. That
> > > > avoids the livelock as well.
> > > 
> > > I do have a poor man's solution that can handle this case.
> > > https://kerneltrap.org/mailarchive/linux-fsdevel/2009/10/7/6476473/thread
> > > It may do some extra work, but it should stop the livelock in theory.
> >   So I don't think sync work on its own is a problem. There we can just
> > give up any fairness and just go inode by inode. IMHO it's much simpler that
> > way. The remaining types of work we have are "for_reclaim" and then ones
> > triggered by filesystems to get rid of delayed allocated data. These cases
> > can easily have well defined and low nr_to_write so they wouldn't be
> > livelockable either. What do you think?
>   Fengguang, how about merging also the attached simple patch together with
> my fix? With these two patches, I'm not able to trigger any sync livelock
> while without one of them I hit them quite easily...

This looks OK. However, note that redirty_tail() can modify
dirtied_when unexpectedly. So the more we rely on wb_start, the more
likely it is that an inode is (wrongly) skipped by sync. I have a bunch
of patches to remove redirty_tail(). However, they may not be good
candidates for 2.6.36.
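
To illustrate the concern, here is a minimal toy model in user-space C,
not kernel code. It assumes that redirty_tail() refreshes dirtied_when
to the current time when the inode is older than the newest entry
already on the dirty list, and that writeback skips any inode dirtied
after wb_start. The names toy_inode, toy_redirty_tail() and
toy_skipped() are hypothetical, and plain integer comparisons stand in
for the jiffies helpers (wraparound is ignored).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the one inode field that matters here. */
struct toy_inode {
	unsigned long dirtied_when;	/* tick at which the inode was dirtied */
};

/*
 * Toy version of the timestamp bump assumed for redirty_tail(): if the
 * inode is older than the newest inode on the dirty list, its
 * dirtied_when is replaced by the current tick.
 */
static void toy_redirty_tail(struct toy_inode *inode,
			     unsigned long newest_on_list,
			     unsigned long now)
{
	if (inode->dirtied_when < newest_on_list)
		inode->dirtied_when = now;
}

/* Toy version of the wb_start check: skip inodes dirtied after wb_start. */
static bool toy_skipped(const struct toy_inode *inode, unsigned long wb_start)
{
	return inode->dirtied_when > wb_start;
}
```

In this model an inode dirtied at tick 5 is eligible for a writeback
that started at tick 10. If it is requeued at tick 12 and its timestamp
is bumped, the same writeback now skips it even though it holds data
that was dirty before the writeback began.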

Thanks,
Fengguang


> 								Honza
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR

> From e4b708115825bca6a1020eed2356e2aab0567e3a Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@suse.cz>
> Date: Tue, 3 Aug 2010 10:35:02 +0000
> Subject: [PATCH] mm: Avoid resetting wb_start after each writeback round
> 
> WB_SYNC_NONE writeback is done in rounds of 1024 pages so that we don't write
> out some huge inode for too long while starving writeout of other inodes. To
> avoid livelocks, we record time we started writeback in wbc->wb_start and do
> not write out inodes which were dirtied after this time. But currently,
> writeback_inodes_wb() resets wb_start each time it is called thus effectively
> invalidating this logic and making any WB_SYNC_NONE writeback prone to
> livelocks.
> 
> This patch makes sure wb_start is set only once when we start writeback.
> 
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/fs-writeback.c |    5 +++--
>  1 files changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 6bdc924..aa59394 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -530,7 +530,8 @@ void writeback_inodes_wb(struct bdi_writeback *wb,
>  {
>  	int ret = 0;
>  
> -	wbc->wb_start = jiffies; /* livelock avoidance */
> +	if (!wbc->wb_start)
> +		wbc->wb_start = jiffies; /* livelock avoidance */
>  	spin_lock(&inode_lock);
>  	if (!wbc->for_kupdate || list_empty(&wb->b_io))
>  		queue_io(wb, wbc->older_than_this);
> @@ -559,7 +560,6 @@ static void __writeback_inodes_sb(struct super_block *sb,
>  {
>  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>  
> -	wbc->wb_start = jiffies; /* livelock avoidance */
>  	spin_lock(&inode_lock);
>  	if (!wbc->for_kupdate || list_empty(&wb->b_io))
>  		queue_io(wb, wbc->older_than_this);
> @@ -625,6 +625,7 @@ static long wb_writeback(struct bdi_writeback *wb,
>  		wbc.range_end = LLONG_MAX;
>  	}
>  
> +	wbc.wb_start = jiffies; /* livelock avoidance */
>  	for (;;) {
>  		/*
>  		 * Stop writeback when nr_pages has been consumed
> -- 
> 1.6.0.2
> 
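
The effect described in the patch changelog can be sketched with a toy
simulation in user-space C. This is only an illustration under stated
assumptions, not the kernel implementation: toy_inode, toy_round() and
toy_writeback() are hypothetical names, two inodes stand in for the
dirty list, and a concurrent writer is assumed to re-dirty one inode
after every round.

```c
#include <stdbool.h>

/* Hypothetical toy inode: a dirtied timestamp plus a dirty flag. */
struct toy_inode {
	unsigned long dirtied_when;
	bool dirty;
};

/* One round: write every dirty inode dirtied at or before wb_start. */
static int toy_round(struct toy_inode *inodes, int n, unsigned long wb_start)
{
	int written = 0;

	for (int i = 0; i < n; i++) {
		if (inodes[i].dirty && inodes[i].dirtied_when <= wb_start) {
			inodes[i].dirty = false;
			written++;
		}
	}
	return written;
}

/*
 * Run rounds until one writes nothing or max_rounds is reached.
 * Returns the number of rounds that wrote at least one inode.
 */
static int toy_writeback(bool reset_each_round, int max_rounds)
{
	struct toy_inode inodes[2] = { { 1, true }, { 2, true } };
	unsigned long now = 10;
	unsigned long wb_start = now;	/* set once, as in the patch */
	int rounds = 0;

	while (rounds < max_rounds) {
		if (reset_each_round)
			wb_start = now;	/* behaviour before the patch */
		if (toy_round(inodes, 2, wb_start) == 0)
			break;
		rounds++;
		/* A concurrent writer dirties inode 0 again after the round. */
		now++;
		inodes[0].dirtied_when = now;
		inodes[0].dirty = true;
	}
	return rounds;
}
```

With wb_start set once, toy_writeback(false, 100) stops after a single
productive round, because the re-dirtied inode is newer than wb_start.
With the per-round reset, toy_writeback(true, 100) only stops at the
round cap, which is the livelock in miniature.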

