From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Christoph Hellwig <hch@infradead.org>
Subject: Re: [PATCH 2/2] writeback: Replace some redirty_tail() calls with requeue_io()
Date: Sun, 18 Sep 2011 22:07:37 +0800
Message-ID: <20110918140737.GA15366@localhost>
In-Reply-To: <20110908150340.GB28149@quack.suse.cz>

On Thu, Sep 08, 2011 at 11:03:40PM +0800, Jan Kara wrote:
> On Thu 08-09-11 09:22:36, Wu Fengguang wrote:
> > Jan,
> > 
> > > @@ -420,6 +421,8 @@ writeback_single_inode(struct inode *inode, struct bdi_writeback *wb,
> > >  	/* Don't write the inode if only I_DIRTY_PAGES was set */
> > >  	if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC)) {
> > >  		int err = write_inode(inode, wbc);
> > > +		if (!err)
> > > +			inode_written = true;
> > >  		if (ret == 0)
> > >  			ret = err;
> > >  	}
> > 
> > write_inode() typically returns an error after redirtying the inode.
> > So the conditions inode_written=false and (inode->i_state & I_DIRTY)
> > are mostly on/off together. For the cases they disagree, it's probably
> > a filesystem bug -- at least I don't think some FS will deliberately 
> > return success while redirtying the inode, or the reverse.
>   There is a possibility someone else redirties the inode between the moment
> I_DIRTY bits are cleared in writeback_single_inode() and the check for
> I_DIRTY is done after ->write_inode() is called. Especially when
> write_inode() blocks waiting for some IO this isn't that hard to happen. So
> there are valid (although relatively rare) cases when inode_written is
> different from the result of I_DIRTY check.

Ah yes, that's a good point.
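
To make the window concrete, here is a toy userspace model (not the
kernel code). The toy_* names and flag values are made up for
illustration; the only thing taken from the patch is the order of
events: dirty bits are cleared, write_inode() runs, then I_DIRTY is
rechecked. The "typical" filesystem behaviour of redirtying on error is
modelled as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel flags; the values are arbitrary. */
#define TOY_I_DIRTY_SYNC  0x1u
#define TOY_I_DIRTY_PAGES 0x4u
#define TOY_I_DIRTY       (TOY_I_DIRTY_SYNC | TOY_I_DIRTY_PAGES)

struct toy_inode {
	unsigned int i_state;
};

struct toy_result {
	bool inode_written;	/* write_inode() returned success */
	bool still_dirty;	/* outcome of the later I_DIRTY check */
};

/*
 * write_err: return value of the modelled write_inode().
 * redirtied_meanwhile: another process dirties the inode while
 * write_inode() is blocked, i.e. inside the race window.
 */
struct toy_result toy_writeback(struct toy_inode *inode, int write_err,
				bool redirtied_meanwhile)
{
	struct toy_result res = { false, false };
	unsigned int dirty;

	assert(inode);

	/* The dirty bits are sampled and cleared before the write. */
	dirty = inode->i_state & TOY_I_DIRTY;
	inode->i_state &= ~TOY_I_DIRTY;

	if (dirty & TOY_I_DIRTY_SYNC) {
		if (redirtied_meanwhile)
			inode->i_state |= TOY_I_DIRTY_SYNC;
		if (write_err == 0)
			res.inode_written = true;
		else	/* assumed: the fs redirties before failing */
			inode->i_state |= TOY_I_DIRTY_SYNC;
	}

	/* The check that runs after ->write_inode() has returned. */
	res.still_dirty = (inode->i_state & TOY_I_DIRTY) != 0;
	return res;
}
```

With write_err == 0 and redirtied_meanwhile == true the model yields
inode_written == true together with still_dirty == true, which is the
case where the two conditions disagree without any filesystem bug.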

> > >  		} else if (inode->i_state & I_DIRTY) {
> > >  			/*
> > >  			 * Filesystems can dirty the inode during writeback
> > >  			 * operations, such as delayed allocation during
> > >  			 * submission or metadata updates after data IO
> > > -			 * completion.
> > > +			 * completion. Also inode could have been dirtied by
> > > +			 * some process aggressively touching metadata.
> > > +			 * Finally, filesystem could just fail to write the
> > > +			 * inode for some reason. We have to distinguish the
> > > +			 * last case from the previous ones - in the last case
> > > +			 * we want to give the inode quick retry, in the
> > > +			 * other cases we want to put it back to the dirty list
> > > +			 * to avoid livelocking of writeback.
> > >  			 */
> > > -			redirty_tail(inode, wb);
> > > +			if (inode_written)
> > > +				redirty_tail(inode, wb);
> > 
> > Can you elaborate the livelock in the below inode_written=true case?
> > Why the sleep in the wb_writeback() loop is not enough?
>   In case someone would be able to consistently trigger the race window and
> redirty the inode before we check here, we would loop for a long time
> always writing just this inode and thus effectively stalling other
> writeback. That's why I push redirtied inode behind other inodes in the
> dirty list.

Agreed. All that's left to do is to confirm whether this addresses
Christoph's original problem.
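
For the record, the starvation argument can be sketched with a
deliberately simplified single-queue model in userspace C. This is an
assumption-laden toy, not how fs/fs-writeback.c is structured: the
toy_* names are hypothetical, inode 0 stands for the inode that is
redirtied on every pass, and "push behind others" stands in for what
redirty_tail() is used for in the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_INODES 4

/*
 * Return how many distinct inodes get written in `rounds` passes when
 * inode 0 is redirtied every single time it is written.
 *
 * push_behind_others == true:  the redirtied inode goes to the tail.
 * push_behind_others == false: the redirtied inode is retried first.
 */
int toy_inodes_serviced(bool push_behind_others, int rounds)
{
	int queue[TOY_NR_INODES];
	bool written[TOY_NR_INODES] = { false };
	int len = TOY_NR_INODES;
	int serviced = 0;

	assert(rounds >= 0);

	for (int i = 0; i < TOY_NR_INODES; i++)
		queue[i] = i;

	for (int r = 0; r < rounds && len > 0; r++) {
		int ino = queue[0];

		if (!written[ino]) {
			written[ino] = true;
			serviced++;
		}

		if (ino != 0) {
			/* Written and clean: drop it from the queue. */
			for (int i = 1; i < len; i++)
				queue[i - 1] = queue[i];
			len--;
		} else if (push_behind_others) {
			/* Still dirty: move it behind all other inodes. */
			for (int i = 1; i < len; i++)
				queue[i - 1] = queue[i];
			queue[len - 1] = ino;
		}
		/* Otherwise inode 0 stays at the head and is retried next. */
	}
	return serviced;
}
```

In this model toy_inodes_serviced(false, 8) services only inode 0,
while toy_inodes_serviced(true, 8) services all four inodes, which is
the reason for pushing the redirtied inode behind the others.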

Acked-by: Wu Fengguang <fengguang.wu@intel.com>

Thanks,
Fengguang
