linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>, Jan Kara <jack@suse.cz>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 04/17] writeback: try more writeback as long as something was written
Date: Mon, 9 May 2011 18:05:38 +0200
Message-ID: <20110509160538.GR4122@quack.suse.cz>
In-Reply-To: <20110506031612.294824260@intel.com>

On Fri 06-05-11 11:08:25, Wu Fengguang wrote:
> writeback_inodes_wb()/__writeback_inodes_sb() are not aggressive in that
> they only populate possibly a subset of eligible inodes into b_io at
> entrance time. When the queued set of inodes are all synced, they just
> return, possibly with all queued inode pages written but still
> wbc.nr_to_write > 0.
> 
> For kupdate and background writeback, there may be more eligible inodes
> sitting in b_dirty when the current set of b_io inodes are completed. So
> it is necessary to try another round of writeback as long as we made some
> progress in this round. When there are no more eligible inodes, no more
> inodes will be enqueued in queue_io(), hence nothing could/will be
> synced and we may safely bail.
> 
> For example, imagine 100 inodes
> 
>         i0, i1, i2, ..., i90, i91, ..., i99
> 
> At queue_io() time, i90-i99 happen to be expired and moved to b_io for
> IO. When finished successfully, if their total size is less than
> MAX_WRITEBACK_PAGES, nr_to_write will be > 0. Then wb_writeback() will
> quit the background work (w/o this patch) while it's still over
> background threshold. This will be a fairly normal/frequent case I guess.
> 
> Jan raised the concern
> 
> 	I'm just afraid that in some pathological cases this could
> 	result in bad writeback pattern - like if there is some process
> 	which manages to dirty just a few pages while we are doing
> 	writeout, this looping could result in writing just a few pages
> 	in each round which is bad for fragmentation etc.
> 
> However, it takes very precise timing for that to happen continuously.
> In practice such a pattern is very hard to produce, even though it is
> possible in theory. I actually tried to write 1 page per 1ms with this
> command
> 
> 	write-and-fsync -n10000 -S 1000 -c 4096 /fs/test
> 
> and do sync(1) at the same time. The sync completes quickly on ext4,
> xfs and btrfs. Readers can try other write-and-sleep patterns and
> check whether any of them can block sync for a longer time.
  After some thought I realized that i_dirtied_when is going to be updated
in these cases and so we stop writing back the inode soon. So I think we
should be fine in the end. You can add:
Acked-by: Jan Kara <jack@suse.cz>

								Honza
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> ---
>  fs/fs-writeback.c |   16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> --- linux-next.orig/fs/fs-writeback.c	2011-05-05 23:30:24.000000000 +0800
> +++ linux-next/fs/fs-writeback.c	2011-05-05 23:30:25.000000000 +0800
> @@ -739,23 +739,23 @@ static long wb_writeback(struct bdi_writ
>  		wrote += write_chunk - wbc.nr_to_write;
>  
>  		/*
> -		 * If we consumed everything, see if we have more
> +		 * Did we write something? Try for more
> +		 *
> +		 * Dirty inodes are moved to b_io for writeback in batches.
> +		 * The completion of the current batch does not necessarily
> +		 * mean the overall work is done. So we keep looping as long
> +		 * as made some progress on cleaning pages or inodes.
>  		 */
> -		if (wbc.nr_to_write <= 0)
> +		if (wbc.nr_to_write < write_chunk)
>  			continue;
>  		if (wbc.inodes_cleaned)
>  			continue;
>  		/*
> -		 * Didn't write everything and we don't have more IO, bail
> +		 * No more inodes for IO, bail
>  		 */
>  		if (!wbc.more_io)
>  			break;
>  		/*
> -		 * Did we write something? Try for more
> -		 */
> -		if (wbc.nr_to_write < write_chunk)
> -			continue;
> -		/*
>  		 * Nothing written. Wait for some inode to
>  		 * become available for writeback. Otherwise
>  		 * we'll just busyloop.
> 
> 
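To make the effect of the reordered checks in the hunk above concrete, here
is a small user-space sketch of the loop-exit logic. It is only a toy model
under stated assumptions, not the kernel code: simulate(), BATCH and
PAGES_EACH are hypothetical names and values, queue_io() is modelled as
moving at most BATCH inodes per round, every batch is fully written (so
more_io is always 0), and the wbc.inodes_cleaned check and the wait path are
left out.

```c
#define NR_INODES   100   /* dirty inodes, as in the i0..i99 example */
#define BATCH       10    /* inodes queued per round (assumed) */
#define PAGES_EACH  10    /* dirty pages per inode (assumed) */
#define WRITE_CHUNK 1024  /* stand-in for MAX_WRITEBACK_PAGES */

enum policy { OLD_ORDER, NEW_ORDER };

/*
 * Toy model of the wb_writeback() loop. Returns the number of pages
 * written before the loop bails and stores the round count in *rounds.
 */
static long simulate(enum policy p, int *rounds)
{
	int next = 0;		/* index of the first inode still dirty */
	long wrote = 0;

	*rounds = 0;
	for (;;) {
		long nr_to_write = WRITE_CHUNK;
		int queued = NR_INODES - next;
		int more_io = 0;	/* batch always completes in this model */
		int i;

		if (queued > BATCH)
			queued = BATCH;

		/* "Write" every queued inode in full. */
		for (i = 0; i < queued && nr_to_write > 0; i++) {
			nr_to_write -= PAGES_EACH;
			next++;
		}

		wrote += WRITE_CHUNK - nr_to_write;
		(*rounds)++;

		if (p == OLD_ORDER) {
			/* Check order before the patch. */
			if (nr_to_write <= 0)
				continue;
			if (!more_io)
				break;
			if (nr_to_write < WRITE_CHUNK)
				continue;
		} else {
			/* Check order after the patch. */
			if (nr_to_write < WRITE_CHUNK)
				continue;
			if (!more_io)
				break;
		}
		break;	/* the kernel would wait for an inode here */
	}
	return wrote;
}
```

In this model simulate(OLD_ORDER, &rounds) stops after one round with 100
pages written and 900 still dirty, while simulate(NEW_ORDER, &rounds) keeps
going until a round writes nothing, i.e. 1000 pages over 11 rounds.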

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
