linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH 2/2] writeback: Replace some redirty_tail() calls with requeue_io()
Date: Thu, 13 Oct 2011 22:30:19 +0800	[thread overview]
Message-ID: <20111013143019.GC6938@localhost> (raw)
In-Reply-To: <1318453043-32057-3-git-send-email-jack@suse.cz>

Jan,

This looks fine, too.

If no objections, I'd like to push the patches (preferably with
updated patch 1) to linux-next ASAP.

Thanks,
Fengguang

On Thu, Oct 13, 2011 at 04:57:23AM +0800, Jan Kara wrote:
> Calling redirty_tail() can put off inode writeback for upto 30 seconds (or
> whatever dirty_expire_centisecs is). This is unnecessarily big delay in some
> cases and in other cases it is a really bad thing. In particular XFS tries to
> be nice to writeback and when ->write_inode is called for an inode with locked
> ilock, it just redirties the inode and returns EAGAIN. That currently causes
> writeback_single_inode() to redirty_tail() the inode. As contended ilock is
> common thing with XFS while extending files the result can be that inode
> writeout is put off for a really long time.
> 
> Now that we have more robust busyloop prevention in wb_writeback() we can
> call requeue_io() in cases where quick retry is required without fear of
> raising CPU consumption too much.
> 
> CC: Christoph Hellwig <hch@infradead.org>
> Acked-by: Wu Fengguang <fengguang.wu@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  fs/fs-writeback.c |   56 +++++++++++++++++++++++++++-------------------------
>  1 files changed, 29 insertions(+), 27 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index b619f3a..094afcd 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -356,6 +356,7 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
>  	long nr_to_write = wbc->nr_to_write;
>  	unsigned dirty;
>  	int ret;
> +	bool inode_written = false;
>  
>  	assert_spin_locked(&wb->list_lock);
>  	assert_spin_locked(&inode->i_lock);
> @@ -420,6 +421,8 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
>  	/* Don't write the inode if only I_DIRTY_PAGES was set */
>  	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
>  		int err = write_inode(inode, wbc);
> +		if (!err)
> +			inode_written = true;
>  		if (ret == 0)
>  			ret = err;
>  	}
> @@ -430,17 +433,20 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
>  	if (!(inode->i_state & I_FREEING)) {
>  		/*
>  		 * Sync livelock prevention. Each inode is tagged and synced in
> -		 * one shot. If still dirty, it will be redirty_tail()'ed below.
> -		 * Update the dirty time to prevent enqueue and sync it again.
> +		 * one shot. If still dirty, update dirty time and put it back
> +		 * to dirty list to prevent enqueue and syncing it again.
>  		 */
>  		if ((inode->i_state & I_DIRTY) &&
> -		    (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages))
> +		    (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)) {
>  			inode->dirtied_when = jiffies;
> -
> -		if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
> +			redirty_tail(inode, wb);
> +		} else if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
>  			/*
> -			 * We didn't write back all the pages.  nfs_writepages()
> -			 * sometimes bales out without doing anything.
> +			 * We didn't write back all the pages. We may have just
> +			 * run out of our writeback slice, or nfs_writepages()
> +			 * sometimes bales out without doing anything, or e.g.
> +			 * btrfs ignores for_kupdate writeback requests for
> +			 * metadata inodes.
>  			 */
>  			inode->i_state |= I_DIRTY_PAGES;
>  			if (wbc->nr_to_write <= 0) {
> @@ -450,11 +456,9 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
>  				requeue_io(inode, wb);
>  			} else {
>  				/*
> -				 * Writeback blocked by something other than
> -				 * congestion. Delay the inode for some time to
> -				 * avoid spinning on the CPU (100% iowait)
> -				 * retrying writeback of the dirty page/inode
> -				 * that cannot be performed immediately.
> +				 * Writeback blocked by something. Put inode
> +				 * back to dirty list to prevent livelocking of
> +				 * writeback.
>  				 */
>  				redirty_tail(inode, wb);
>  			}
> @@ -463,9 +467,19 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
>  			 * Filesystems can dirty the inode during writeback
>  			 * operations, such as delayed allocation during
>  			 * submission or metadata updates after data IO
> -			 * completion.
> +			 * completion. Also inode could have been dirtied by
> +			 * some process aggressively touching metadata.
> +			 * Finally, filesystem could just fail to write the
> +			 * inode for some reason. We have to distinguish the
> +			 * last case from the previous ones - in the last case
> +			 * we want to give the inode quick retry, in the
> +			 * other cases we want to put it back to the dirty list
> +			 * to avoid livelocking of writeback.
>  			 */
> -			redirty_tail(inode, wb);
> +			if (inode_written)
> +				redirty_tail(inode, wb);
> +			else
> +				requeue_io(inode, wb);
>  		} else {
>  			/*
>  			 * The inode is clean.  At this point we either have
> @@ -581,13 +595,6 @@ static long writeback_sb_inodes(struct super_block *sb,
>  		wrote += write_chunk - wbc.nr_to_write;
>  		if (!(inode->i_state & I_DIRTY))
>  			wrote++;
> -		if (wbc.pages_skipped) {
> -			/*
> -			 * writeback is not making progress due to locked
> -			 * buffers.  Skip this inode for now.
> -			 */
> -			redirty_tail(inode, wb);
> -		}
>  		spin_unlock(&inode->i_lock);
>  		spin_unlock(&wb->list_lock);
>  		iput(inode);
> @@ -618,12 +625,7 @@ static long __writeback_inodes_wb(struct bdi_writeback *wb,
>  		struct super_block *sb = inode->i_sb;
>  
>  		if (!grab_super_passive(sb)) {
> -			/*
> -			 * grab_super_passive() may fail consistently due to
> -			 * s_umount being grabbed by someone else. Don't use
> -			 * requeue_io() to avoid busy retrying the inode/sb.
> -			 */
> -			redirty_tail(inode, wb);
> +			requeue_io(inode, wb);
>  			continue;
>  		}
>  		wrote += writeback_sb_inodes(sb, wb, work);
> -- 
> 1.7.1

  reply	other threads:[~2011-10-13 14:30 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-10-12 20:57 [PATCH 0/2 v4] writeback: Improve busyloop prevention and inode requeueing Jan Kara
2011-10-12 20:57 ` [PATCH 1/2] writeback: Improve busyloop prevention Jan Kara
2011-10-13 14:26   ` Wu Fengguang
2011-10-13 20:13     ` Jan Kara
2011-10-14  7:18       ` Christoph Hellwig
2011-10-14 19:31         ` Chris Mason
     [not found]     ` <20111013143939.GA9691@localhost>
2011-10-13 20:18       ` Jan Kara
2011-10-14 16:00         ` Wu Fengguang
2011-10-14 16:28           ` Wu Fengguang
2011-10-18  0:51             ` Jan Kara
2011-10-18 14:35               ` Wu Fengguang
2011-10-19 11:56                 ` Jan Kara
2011-10-19 13:25                   ` Wu Fengguang
2011-10-19 13:30                   ` Wu Fengguang
2011-10-19 13:35                   ` Wu Fengguang
2011-10-20 12:09                   ` Wu Fengguang
2011-10-20 12:33                     ` Wu Fengguang
2011-10-20 13:39                       ` Wu Fengguang
2011-10-20 22:26                         ` Jan Kara
2011-10-22  4:20                           ` Wu Fengguang
2011-10-24 15:45                             ` Jan Kara
     [not found]                           ` <20111027063133.GA10146@localhost>
2011-10-27 20:31                             ` Jan Kara
     [not found]                               ` <20111101134231.GA31718@localhost>
2011-11-01 21:53                                 ` Jan Kara
2011-11-02 17:25                                   ` Wu Fengguang
     [not found]                               ` <20111102185603.GA4034@localhost>
2011-11-03  1:51                                 ` Jan Kara
2011-11-03 14:52                                   ` Wu Fengguang
     [not found]                                   ` <20111104152054.GA11577@localhost>
2011-11-08 23:52                                     ` Jan Kara
2011-11-09 13:51                                       ` Wu Fengguang
2011-11-10 14:50                                       ` Jan Kara
2011-12-05  8:02                                         ` Wu Fengguang
2011-12-07 10:13                                           ` Jan Kara
2011-12-07 11:45                                             ` Wu Fengguang
     [not found]                           ` <20111027064745.GA14017@localhost>
2011-10-27 20:50                             ` Jan Kara
2011-10-20  9:46               ` Christoph Hellwig
2011-10-20 15:32                 ` Jan Kara
2011-10-15 12:41           ` Wu Fengguang
2011-10-12 20:57 ` [PATCH 2/2] writeback: Replace some redirty_tail() calls with requeue_io() Jan Kara
2011-10-13 14:30   ` Wu Fengguang [this message]
2011-10-13 14:15 ` [PATCH 0/2 v4] writeback: Improve busyloop prevention and inode requeueing Wu Fengguang
  -- strict thread matches above, loose matches on Subject: below --
2011-09-08  0:44 [PATCH 1/2] writeback: Improve busyloop prevention Jan Kara
2011-09-08  0:44 ` [PATCH 2/2] writeback: Replace some redirty_tail() calls with requeue_io() Jan Kara
2011-09-08  1:22   ` Wu Fengguang
2011-09-08 15:03     ` Jan Kara
2011-09-18 14:07       ` Wu Fengguang
2011-10-05 17:39         ` Jan Kara
2011-10-07 13:43           ` Wu Fengguang
2011-10-07 14:22             ` Jan Kara
2011-10-07 14:29               ` Wu Fengguang
2011-10-07 14:45                 ` Jan Kara
2011-10-07 15:29                   ` Wu Fengguang
2011-10-08  4:00                   ` Wu Fengguang
2011-10-08 11:52                     ` Wu Fengguang
2011-10-08 13:49                       ` Wu Fengguang
2011-10-09  0:27                         ` Wu Fengguang
2011-10-09  8:44                           ` Wu Fengguang
2011-10-10 11:21                     ` Jan Kara
2011-10-10 11:31                       ` Wu Fengguang
2011-10-10 23:30                         ` Jan Kara
2011-10-11  2:36                           ` Wu Fengguang
2011-10-11 21:53                             ` Jan Kara
2011-10-12  2:44                               ` Wu Fengguang
2011-10-12 19:34                                 ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111013143019.GC6938@localhost \
    --to=fengguang.wu@intel.com \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).