public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Jan Kara <jack@suse.cz>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org, Al Viro <viro@ZenIV.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	dchinner@redhat.com, sandeen@redhat.com,
	Kamal Mostafa <kamal@canonical.com>, Ben Myers <bpm@sgi.com>,
	Alex Elder <elder@kernel.org>,
	xfs@oss.sgi.com
Subject: Re: [PATCH 11/19] xfs: Convert to new freezing code
Date: Mon, 12 Mar 2012 09:45:37 +1100	[thread overview]
Message-ID: <20120311224537.GV5091@dastard> (raw)
In-Reply-To: <20120309142253.GB14159@quack.suse.cz>

On Fri, Mar 09, 2012 at 03:22:53PM +0100, Jan Kara wrote:
> On Fri 09-03-12 10:20:41, Dave Chinner wrote:
> > On Mon, Mar 05, 2012 at 05:01:09PM +0100, Jan Kara wrote:
> > > Generic code now blocks all writers from standard write paths. So we block all
> > > writers coming from ioctl and replace blocking of transactions on frozen
> > > filesystem with a debugging check. As a bonus, we get a protection of ioctl
> > > against racing remount read-only. We also convert xfs_file_aio_write() to a
> > > non-racy freeze protection.
> > ....
> > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> > > index 329b06a..6468a2a 100644
> > > --- a/fs/xfs/xfs_trans.c
> > > +++ b/fs/xfs/xfs_trans.c
> > > @@ -577,7 +577,6 @@ xfs_trans_alloc(
> > >  	xfs_mount_t	*mp,
> > >  	uint		type)
> > >  {
> > > -	xfs_wait_for_freeze(mp, SB_FREEZE_TRANS);
> > >  	return _xfs_trans_alloc(mp, type, KM_SLEEP);
> > >  }
> > 
> > So what is there to stop internal XFS threads from starting
> > transactions when the filesystem is frozen? Previously this
> > SB_FREEZE_TRANS would guarantee even internal fucntions would get
> > stopped, but now there's nothing?
>   I've checked XFS now and I didn't find any work that would be done on a
> clean filesystem. I found:
> _xfs_mru_cache_reap() - doesn't seem to do any IO

Causes iput() on inodes, which if dropping the last reference to the
inode can cause them to enter reclaim and hence have transactions
run on them to either truncate blocks beyonds EOF or to do the
second phase of an unlink(). So it will definitely break the freeze
model.

> xfs_end_io(), xfs_buf_iodone_work() - shouldn't be a problem either since only
>   reads can happen on a frozen filesystem
> xfs_reclaim_worker() - everything is clean so no IO should happen

It will do work as other things push inodes into reclaim. e.g.
filestreams inode expiry. And it will still run to reclaim clean
inodes...

> xfs_flush_worker() - everything is clean so no IO should happen
> xfs_sync_worker() - again the same if I understand the code right

xfs_sync_worker() will always trigger a log force, so if there is
anything that has dirtied the journal, it will trigger IO. We have
no protection against the journal being dirtied anymore, so no
guarantees can be given here.

Basically, your patchset creates a "shell" around the outside of the
filesystem that catches all the external modifications that can
occur through the VFS and ioctls. The "shell" is now the only layer
of defense because the patchset removes the layer of protection that
prevents internal modifications from occurring.

For example, the XFS patch added a bunch of protection to ioctls on
XFS that only modify file metadata and so never were protected
against data freezes - they were all implicitly protected by the inner
transaction subsystem freeze. All of the above cases were protected
by the inner layer of defense, which is now gone.

Not to mention the shrinkers. The inode cache shrinkers can
cause inodes to enter reclaim which can trigger EOF truncation just
like _xfs_mru_cache_reap(). The dquot cache shrinker can also issue
IO, and dquots will be dirtied by EOF truncation. Same with buffers,
and the buffer cache shrinker.

I'm sure other filesystems have just as complex or even more complex
internal paths that trigger dirtying and IO that the "freeze shell"
model does not prevent. I don't think auditing is good enough - the
shell model is, IMO, too easy to break and too hard to test for
breakage...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2012-03-11 22:46 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-05 16:00 [PATCH 00/19] Fix filesystem freezing deadlocks Jan Kara
2012-03-05 16:00 ` [PATCH 01/19] mm: Make default vm_ops provide ->page_mkwrite handler Jan Kara
2012-03-23 22:45   ` Andrew Morton
2012-03-27  7:55     ` Jan Kara
2012-03-27 21:38       ` Andrew Morton
2012-03-27 22:08         ` Jan Kara
2012-03-27 22:45           ` Dave Chinner
2012-03-28  9:48             ` Jan Kara
2012-03-05 16:01 ` [PATCH 02/19] fs: Push mnt_want_write() outside of i_mutex Jan Kara
2012-03-05 16:01 ` [PATCH 03/19] fat: " Jan Kara
2012-03-05 16:01 ` [PATCH 04/19] btrfs: " Jan Kara
2012-03-05 16:01 ` [PATCH 05/19] nfsd: " Jan Kara
2012-03-05 16:01 ` [PATCH 06/19] fs: Improve filesystem freezing handling Jan Kara
2012-03-05 16:01 ` [PATCH 07/19] fs: Add freezing handling to mnt_want_write() / mnt_drop_write() Jan Kara
2012-03-05 16:01 ` [PATCH 08/19] fs: Skip atime update on frozen filesystem Jan Kara
2012-03-05 16:01 ` [PATCH 09/19] fs: Protect write paths by sb_start_write - sb_end_write Jan Kara
2012-03-05 16:01 ` [PATCH 10/19] ext4: Convert to new freezing mechanism Jan Kara
2012-03-07 22:32   ` Kamal Mostafa
2012-03-08  9:05     ` Jan Kara
2012-03-05 16:01 ` [PATCH 11/19] xfs: Convert to new freezing code Jan Kara
2012-03-08 23:20   ` Dave Chinner
2012-03-09  8:23     ` Jan Kara
2012-03-09 14:22     ` Jan Kara
2012-03-11 22:45       ` Dave Chinner [this message]
2012-03-12 17:55         ` Jan Kara
2012-03-12 23:48           ` Dave Chinner
2012-03-13 21:30             ` Jan Kara
2012-03-14  3:00               ` Dave Chinner
2012-03-05 16:01 ` [PATCH 12/19] ocfs2: Convert to new freezing mechanism Jan Kara
2012-03-05 16:01 ` [PATCH 13/19] gfs2: " Jan Kara
2012-03-05 16:01 ` [PATCH 14/19] fuse: " Jan Kara
2012-03-05 16:01 ` [PATCH 15/19] ntfs: " Jan Kara
2012-03-05 16:01 ` [PATCH 16/19] nilfs2: " Jan Kara
2012-03-05 16:01 ` [PATCH 17/19] btrfs: " Jan Kara
2012-03-05 16:01 ` [PATCH 18/19] fs: Remove old " Jan Kara
2012-03-05 16:01 ` [PATCH 19/19] fs: Refuse to freeze filesystem with open but unlinked files Jan Kara
2012-03-11 20:22 ` [PATCH 00/19] Fix filesystem freezing deadlocks Kamal Mostafa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120311224537.GV5091@dastard \
    --to=david@fromorbit.com \
    --cc=bpm@sgi.com \
    --cc=dchinner@redhat.com \
    --cc=elder@kernel.org \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=kamal@canonical.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sandeen@redhat.com \
    --cc=viro@ZenIV.linux.org.uk \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox