From: Wu Fengguang <fengguang.wu@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: Eric Sandeen <sandeen@sandeen.net>,
    Andrew Morton <akpm@linux-foundation.org>,
    LKML <linux-kernel@vger.kernel.org>,
    Masayoshi MIZUMA <m.mizuma@jp.fujitsu.com>,
    "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
    "viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>,
    Nick Piggin <npiggin@suse.de>, Jeff Layton <jlayton@redhat.com>
Subject: Re: [PATCH] skip I_CLEAR state inodes
Date: Wed, 3 Jun 2009 22:10:21 +0800
Message-ID: <20090603141021.GB5738@localhost>
In-Reply-To: <20090602113736.GB15010@duck.suse.cz>

On Tue, Jun 02, 2009 at 07:37:36PM +0800, Jan Kara wrote:
> On Tue 02-06-09 16:55:23, Wu Fengguang wrote:
> > On Tue, Jun 02, 2009 at 05:38:35AM +0800, Eric Sandeen wrote:
> > > Wu Fengguang wrote:
> > > > Add I_CLEAR tests to drop_pagecache_sb(), generic_sync_sb_inodes() and
> > > > add_dquot_ref().
> > > >
> > > > clear_inode() will switch inode state from I_FREEING to I_CLEAR,
> > > > and do so _outside_ of inode_lock. So any I_FREEING testing is
> > > > incomplete without the testing of I_CLEAR.
> > > >
> > > > Masayoshi MIZUMA first discovered the bug in drop_pagecache_sb() and
> > > > Jan Kara reminded me to fix the other two cases. Thanks!
> > >
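
As a rough illustration of the point in the changelog quoted above, here
is a minimal user-space sketch. It is not kernel code: the demo_ names
and the flag values are made up for the example, and the real
clear_inode() does much more than flip the state. It only shows why a
test for I_FREEING alone stops matching once the state has moved on to
I_CLEAR.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are placeholders, not the kernel's. */
enum {
	I_FREEING = 1u << 0,
	I_CLEAR   = 1u << 1,
};

/* Hypothetical stand-in for struct inode, holding only the state word. */
struct demo_inode {
	unsigned int i_state;
};

/* Simplified model of the transition: I_FREEING is replaced by I_CLEAR. */
static void demo_clear_inode(struct demo_inode *inode)
{
	assert(inode->i_state & I_FREEING);
	inode->i_state = I_CLEAR;
}

/* Incomplete test: no longer matches after demo_clear_inode(). */
static bool demo_freeing_only(const struct demo_inode *inode)
{
	return (inode->i_state & I_FREEING) != 0;
}

/* Complete test: matches during both phases of teardown. */
static bool demo_freeing_or_clear(const struct demo_inode *inode)
{
	return (inode->i_state & (I_FREEING | I_CLEAR)) != 0;
}
```

Calling demo_clear_inode() on an inode in I_FREEING state makes
demo_freeing_only() return false while demo_freeing_or_clear() still
returns true, which is the gap the extra I_CLEAR test is meant to close.
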
> > > Is there a reason it's not done for __sync_single_inode as well?
> >
> > I overlooked it because the line doesn't have an obvious '|' in it ;)
> >
> > > Jeff Layton asked the question and I'm following it up :)
> > >
> > > __sync_single_inode currently only tests I_FREEING, but I think we are
> > > safe because __sync_single_inode sets I_SYNC, and clear_inode waits for
> > > I_SYNC to be cleared before it changes I_STATE.
> >
> > But I_SYNC is removed just before the I_FREEING test, so we still have
> > a small race window?
> >
> > > On the other hand, testing I_CLEAR here probably would be safe anyway,
> > > and it'd be bonus points for consistency?
> >
> > So let's add the I_CLEAR test?
> >
> > > Same basic question for generic_sync_sb_inodes, which has a
> > > BUG_ON(inode->i_state & I_FREEING), seems like this could check I_CLEAR
> > > as well?
> >
> > Yes, we can add I_CLEAR here to catch more error conditions.
> >
> > Thanks,
> > Fengguang
> >
> > ---
> > skip I_CLEAR state inodes in writeback routines
> >
> > The I_FREEING test in __sync_single_inode() is racy because
> > clear_inode() can set i_state to I_CLEAR between the clear of I_SYNC
> > and the test of I_FREEING.
> >
> > Also extend the coverage of BUG_ON(I_FREEING) to I_CLEAR.
> >
> > Reported-by: Jeff Layton <jlayton@redhat.com>
> > Reported-by: Eric Sandeen <sandeen@sandeen.net>
> > Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
> > ---
> > fs/fs-writeback.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > --- linux.orig/fs/fs-writeback.c
> > +++ linux/fs/fs-writeback.c
> > @@ -316,7 +316,7 @@ __sync_single_inode(struct inode *inode,
> >          spin_lock(&inode_lock);
> >          WARN_ON(inode->i_state & I_NEW);
> >          inode->i_state &= ~I_SYNC;
> > -        if (!(inode->i_state & I_FREEING)) {
> > +        if (!(inode->i_state & (I_FREEING | I_CLEAR))) {
> >                  if (!(inode->i_state & I_DIRTY) &&
> >                      mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
> Is the whole if needed? I had the impression that every caller of
> __sync_single_inode() had better make sure it does not race with inode
> freeing... So a WARN_ON would be more appropriate IMHO.
>
> > /*
> > @@ -518,7 +518,7 @@ void generic_sync_sb_inodes(struct super
> >                  if (current_is_pdflush() && !writeback_acquire(bdi))
> >                          break;
> >
> > -                BUG_ON(inode->i_state & I_FREEING);
> > +                BUG_ON(inode->i_state & (I_FREEING | I_CLEAR));
> >                  __iget(inode);
> >                  pages_skipped = wbc->pages_skipped;
> >                  __writeback_single_inode(inode, wbc);
> Looking at this code, it looks a bit suspicious. What prevents this s_io
> list scan from racing with inode freeing? In particular generic_forget_inode()
Good catch.
> can drop inode_lock to write the inode and in the meantime
> generic_sync_sb_inodes() can come, get a reference to the inode and start
> its writeback... Subsequent iput() would then call generic_forget_inode()

Another possibility:

        generic_forget_inode
          inode->i_state |= I_WILL_FREE;
          spin_unlock(&inode_lock);
                                        generic_sync_sb_inodes()
                                          spin_lock(&inode_lock);
                                          __iget(inode);
                                          __writeback_single_inode
                                            // see non zero i_count
                                            WARN_ON(inode->i_state & I_WILL_FREE);

I'm wondering why we haven't seen any reports of that last WARN_ON().
Did we miss something?

> on the inode again. So shouldn't we skip I_FREEING|I_CLEAR|I_WILL_FREE|I_NEW
> inodes in this scan like we do later in the function for another scan?
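
The skip Jan proposes for the s_io scan could be sketched in user space
as below. This is only an illustration under assumptions: the flag
values, DEMO_SKIP_MASK, struct demo_inode and the demo_ helpers are all
hypothetical, the array stands in for the s_io list, and the reference
count bump merely stands in for __iget(). No locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the values are placeholders, not the kernel's. */
enum {
	I_NEW       = 1u << 0,
	I_WILL_FREE = 1u << 1,
	I_FREEING   = 1u << 2,
	I_CLEAR     = 1u << 3,
	I_DIRTY     = 1u << 4,
};

/* States in which the scan should leave the inode alone. */
#define DEMO_SKIP_MASK (I_NEW | I_WILL_FREE | I_FREEING | I_CLEAR)

/* Hypothetical stand-in for struct inode. */
struct demo_inode {
	unsigned int i_state;
	int i_count;
};

static bool demo_should_skip(const struct demo_inode *inode)
{
	return (inode->i_state & DEMO_SKIP_MASK) != 0;
}

/*
 * Walk the array, take a reference only on inodes that are neither new
 * nor on their way out, and return how many were picked for writeback.
 */
static size_t demo_scan(struct demo_inode *inodes, size_t n)
{
	size_t picked = 0;

	assert(inodes != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (demo_should_skip(&inodes[i]))
			continue;
		inodes[i].i_count++;	/* stands in for __iget() */
		picked++;
	}
	return picked;
}
```

With such a check in front of the reference grab, an inode that
generic_forget_inode() has already marked I_WILL_FREE is never picked
up by the scan, so the WARN_ON() in the scenario above would not be
reached through this path.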