From: Al Viro <viro@ZenIV.linux.org.uk>
To: Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 16/21] fs: Protect inode->i_state with the inode->i_lock
Date: Sun, 24 Oct 2010 20:17:35 +0100 [thread overview]
Message-ID: <20101024191735.GU19804@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20101024162131.GA23677@infradead.org>
On Sun, Oct 24, 2010 at 12:21:31PM -0400, Christoph Hellwig wrote:
> On Sun, Oct 24, 2010 at 10:13:10AM -0400, Christoph Hellwig wrote:
> > On Sat, Oct 23, 2010 at 10:37:52PM +0100, Al Viro wrote:
> > > * invalidate_inodes() - collect I_FREEING/I_WILL_FREE on a separate
> > > list, then (after we'd evicted the stuff we'd decided to evict) wait until
> > > they get freed by whatever's freeing them already.
> >
> > Note that we would only have to do this for the umount case. For others
> > it's pretty pointless.
>
> Now that I've looked into it I think we're basically fine right now.
>
> If we're in umount there should be no other I_FREEING inodes.
>
> - concurrent prune_icache is prevented by iprune_sem.
> - concurrent other invalidate_inodes should not happen because we're
> in unmount and the filesystem should not be reachable any more,
> and even if it did iprune_sem would protect us.
> - how could a concurrent iput_final happen? filesystem is not
> accessible anymore, and iput of fs internal inodes is single-threaded
> with the rest of the actual umount process.
>
> So just skipping over I_FREEING inodes here should be fine for
> non-umount callers, and for umount we could even do a WARN_ON.
FWIW, I think we should kill most of invalidate_inodes() callers. Look:
* call in generic_shutdown_super() is legitimate. The first one,
that is. The second should be replaced with a check for ->s_inodes being
non-empty. Note that after the first pass we should have kicked out
everything with zero i_count. Everything that gets dropped to zero
i_count after that (i.e. during ->put_super()) will be evicted immediately
and won't stay. I.e. the second call will evict *nothing*; it's just
an overblown way to check if there are any inodes left.
* call in ext2_remount() is hogwash - we do that with at least
root inode pinned down, so it will fail, along with the remount attempt.
* ntfs_fill_super() call - no-op. MS_ACTIVE hasn't been set
yet, so there will be no inodes with zero i_count sitting around.
* gfs2 calls - same story (no MS_ACTIVE yet in fill_super(),
MS_ACTIVE already removed *and* invalidate_inodes() already called
in gfs2_put_super())
	* smb reconnect logic. AFAICS, that's complete crap; we *never*
retain inodes on smbfs. IOW, nothing for invalidate_inodes() to do, other
than evict fsnotify marks. Which is to say, we are calling the wrong
function there, even assuming that fsnotify should try to work there.
* finally, __invalidate_device(). Which has a slew of callers of
its own and is *very* different from normal situation. Here we have
underlying device gone bad.
So I'm going to do the following:
1) split evict_inodes() off invalidate_inodes() and simplify it.
2) switch generic_shutdown_super() to that sucker, called once.
	3) kill all calls of invalidate_inodes() except the
__invalidate_device() one.
4) think hard about __invalidate_device() situation.
evict_inodes() should *not* see any inodes with
I_NEW/I_FREEING/I_WILL_FREE. Just skip. It might see I_DIRTY/I_SYNC,
but that's OK - evict_inode() will wait for that.
OTOH, invalidate_inodes() from __invalidate_device() can run in
parallel with e.g. final iput(). Currently it's not a problem, but
we'll need to start skipping I_FREEING/I_WILL_FREE ones there if we want
to change iput() locking.
	And yes, iprune_sem is trouble waiting to happen - one fs stuck
in e.g. truncate_inode_pages() and we are seriously fucked; any non-lazy
umount() will get stuck as well.
Thread overview: 58+ messages
2010-10-21 0:49 Inode Lock Scalability V6 Dave Chinner
2010-10-21 0:49 ` [PATCH 01/21] fs: switch bdev inode bdi's correctly Dave Chinner
2010-10-21 0:49 ` [PATCH 02/21] kernel: add bl_list Dave Chinner
2010-10-21 0:49 ` [PATCH 03/21] fs: Convert nr_inodes and nr_unused to per-cpu counters Dave Chinner
2010-10-21 0:49 ` [PATCH 04/21] fs: Implement lazy LRU updates for inodes Dave Chinner
2010-10-21 2:14 ` Christian Stroetmann
2010-10-21 10:07 ` Nick Piggin
2010-10-21 12:22 ` Christoph Hellwig
2010-10-23 9:32 ` Al Viro
2010-10-21 0:49 ` [PATCH 05/21] fs: inode split IO and LRU lists Dave Chinner
2010-10-21 0:49 ` [PATCH 06/21] fs: Clean up inode reference counting Dave Chinner
2010-10-21 1:41 ` Christoph Hellwig
2010-10-21 0:49 ` [PATCH 07/21] exofs: use iput() for inode reference count decrements Dave Chinner
2010-10-21 0:49 ` [PATCH 08/21] fs: rework icount to be a locked variable Dave Chinner
2010-10-21 19:40 ` Al Viro
2010-10-21 22:32 ` Dave Chinner
2010-10-21 0:49 ` [PATCH 09/21] fs: Factor inode hash operations into functions Dave Chinner
2010-10-21 0:49 ` [PATCH 10/21] fs: Stop abusing find_inode_fast in iunique Dave Chinner
2010-10-21 0:49 ` [PATCH 11/21] fs: move i_ref increments into find_inode/find_inode_fast Dave Chinner
2010-10-21 0:49 ` [PATCH 12/21] fs: remove inode_add_to_list/__inode_add_to_list Dave Chinner
2010-10-21 0:49 ` [PATCH 13/21] fs: Introduce per-bucket inode hash locks Dave Chinner
2010-10-21 0:49 ` [PATCH 14/21] fs: add a per-superblock lock for the inode list Dave Chinner
2010-10-21 0:49 ` [PATCH 15/21] fs: split locking of inode writeback and LRU lists Dave Chinner
2010-10-21 0:49 ` [PATCH 16/21] fs: Protect inode->i_state with the inode->i_lock Dave Chinner
2010-10-22 1:56 ` Al Viro
2010-10-22 2:26 ` Nick Piggin
2010-10-22 3:14 ` Dave Chinner
2010-10-22 10:37 ` Al Viro
2010-10-22 11:40 ` Christoph Hellwig
2010-10-23 21:40 ` Al Viro
2010-10-23 21:37 ` Al Viro
2010-10-24 14:13 ` Christoph Hellwig
2010-10-24 16:21 ` Christoph Hellwig
2010-10-24 19:17 ` Al Viro [this message]
2010-10-24 20:04 ` Christoph Hellwig
2010-10-24 20:36 ` Al Viro
2010-10-24 2:18 ` Nick Piggin
2010-10-21 0:49 ` [PATCH 17/21] fs: protect wake_up_inode with inode->i_lock Dave Chinner
2010-10-21 2:17 ` Christoph Hellwig
2010-10-21 13:16 ` Nick Piggin
2010-10-21 0:49 ` [PATCH 18/21] fs: introduce a per-cpu last_ino allocator Dave Chinner
2010-10-21 0:49 ` [PATCH 19/21] fs: icache remove inode_lock Dave Chinner
2010-10-21 2:14 ` Christian Stroetmann
2010-10-21 0:49 ` [PATCH 20/21] fs: Reduce inode I_FREEING and factor inode disposal Dave Chinner
2010-10-21 0:49 ` [PATCH 21/21] fs: do not assign default i_ino in new_inode Dave Chinner
2010-10-21 5:04 ` Inode Lock Scalability V7 (was V6) Dave Chinner
2010-10-21 13:20 ` Nick Piggin
2010-10-21 23:52 ` Dave Chinner
2010-10-22 0:45 ` Nick Piggin
2010-10-22 2:20 ` Al Viro
2010-10-22 2:34 ` Nick Piggin
2010-10-22 2:41 ` Nick Piggin
2010-10-22 2:48 ` Nick Piggin
2010-10-22 3:12 ` Al Viro
2010-10-22 4:48 ` Nick Piggin
2010-10-22 3:07 ` Al Viro
2010-10-22 4:46 ` Nick Piggin
2010-10-22 5:01 ` Nick Piggin