linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 01/12] xfs: free COW staging extents when freezing filesystem
Date: Thu, 17 Jan 2019 09:24:28 -0800	[thread overview]
Message-ID: <20190117172428.GA4424@magnolia> (raw)
In-Reply-To: <20190111162837.GA33275@bfoster>

On Fri, Jan 11, 2019 at 11:28:38AM -0500, Brian Foster wrote:
> On Mon, Dec 31, 2018 at 06:16:57PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > When we're freezing the filesystem, free all the COW staging extents
> > before we shut the log down so that we can minimize the amount of
> > recovery work that will be necessary during the next mount.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_icache.c |   17 ++++++++++++-----
> >  fs/xfs/xfs_super.c  |   18 ++++++++++++++++++
> >  2 files changed, 30 insertions(+), 5 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> > index 245483cc282b..36d986087abb 100644
> > --- a/fs/xfs/xfs_icache.c
> > +++ b/fs/xfs/xfs_icache.c
> > @@ -1741,6 +1741,7 @@ xfs_inode_free_cowblocks(
> >  	void			*args)
> >  {
> >  	struct xfs_eofblocks	*eofb = args;
> > +	uint			lock_mode = 0;
> >  	int			match;
> >  	int			ret = 0;
> >  
> > @@ -1761,9 +1762,15 @@ xfs_inode_free_cowblocks(
> >  			return 0;
> >  	}
> >  
> > -	/* Free the CoW blocks */
> > -	xfs_ilock(ip, XFS_IOLOCK_EXCL);
> > -	xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> > +	/*
> > +	 * Free the CoW blocks.  We don't need to lock the inode if we're in
> > +	 * the process of freezing the filesystem because we've already locked
> > +	 * out writes and page faults.
> > +	 */
> > +	if (ip->i_mount->m_super->s_writers.frozen == SB_UNFROZEN)
> > +		lock_mode = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
> 
> Ok, but is there a problem with actually locking in this context? If so,
> I'd expect the comment to say something like: "Free the COW blocks.
> Don't lock the inode if we're in the process of freezing the filesystem
> because <some bad thing happens>. This is safe because we've already
> locked out writes and page faults."

Hmmm, it's now been so long since I wrote this patch I don't remember
why I did this.  Wait something's coming back...

> > +	if (lock_mode)
> > +		xfs_ilock(ip, lock_mode);
> >  
> >  	/*
> >  	 * Check again, nobody else should be able to dirty blocks or change
> > @@ -1772,8 +1779,8 @@ xfs_inode_free_cowblocks(
> >  	if (xfs_prep_free_cowblocks(ip))
> >  		ret = xfs_reflink_cancel_cow_range(ip, 0, NULLFILEOFF, false);
> >  
> > -	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
> > -	xfs_iunlock(ip, XFS_IOLOCK_EXCL);
> > +	if (lock_mode)
> > +		xfs_iunlock(ip, lock_mode);
> >  
> >  	return ret;
> >  }
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 6903b36eac5d..bb4953cfd683 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -1102,6 +1102,24 @@ xfs_fs_sync_fs(
> >  	if (!wait)
> >  		return 0;
> >  
> > +	/*
> > +	 * If we're in the middle of freezing the filesystem, this is our last
> > +	 * chance to run regular transactions.  Clear all the COW staging
> > +	 * extents so that we can freeze the filesystem with as little recovery
> > +	 * work to do at the next mount as possible.  It's safe to do this
> > +	 * without locking inodes because the freezer code flushed all the
> > +	 * dirty data from the page cache and locked out writes and page faults.
> > +	 */
> 
> Well, we have XFS_TRANS_NO_WRITECOUNT. It seems we could use an eofb
> freeze flag or some such to notify the scan that we're in a special
> freeze context and in turn use that to tweak the locking (instead of the
> SB_FREEZE_* checks) and avoid blocking on transaction allocation. Would
> that allow us to run this in xfs_fs_freeze() context?

...oh, right.  Another thread could have taken the io/mmap locks (say
for fallocate) before the fs entered SB_FROZEN, and is now blocked
in xfs_trans_alloc waiting for the fs to unfreeze.  I can't remember the
exact circumstances though.

> I'm also a little curious about the high-level tradeoff we're making.
> This patch means that a freeze, and thus something like an LVM snapshot,
> would clear/reset all COW blocks in the fs, right? If so, ISTM that
> could be annoying if the user had a bunch of reflinked vdisk images or
> something on the fs.

Yeah, it's sort of painful to drop all those speculative preallocations.

> Is the tradeoff just that if the user freezes and then doesn't unfreeze
> before the next mount that log recovery will have to deal with COW
> blocks, or are there considerations for a typical freeze/unfreeze cycle
> as well? Is this more expensive at recovery time than at freeze time?

It's entirely to reduce the recovery work necessary after freezing the
fs, snapshotting the fs (via lvm or whatever) and then mounting the
snapshot separately.  TBH this patch has languished for years now
because it's only necessary to optimize this one usage.

Maybe it's easier to drop this one, unless anyone feels particularly
passionate about clearing out cow staging extents before a freeze?

--D

> Brian
> 
> > +	if (sb->s_writers.frozen == SB_FREEZE_PAGEFAULT) {
> > +		int		error;
> > +
> > +		error = xfs_icache_free_cowblocks(mp, NULL);
> > +		if (error) {
> > +			xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE);
> > +			return error;
> > +		}
> > +	}
> > +
> >  	xfs_log_force(mp, XFS_LOG_SYNC);
> >  	if (laptop_mode) {
> >  		/*
> > 

  reply	other threads:[~2019-01-17 17:24 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-01  2:16 [PATCH 00/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:16 ` [PATCH 01/12] xfs: free COW staging extents when freezing filesystem Darrick J. Wong
2019-01-11 16:28   ` Brian Foster
2019-01-17 17:24     ` Darrick J. Wong [this message]
2019-01-17 18:14       ` Brian Foster
2019-01-17 20:20         ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 02/12] xfs: refactor the predicate part of xfs_free_eofblocks Darrick J. Wong
2019-01-11 19:05   ` Brian Foster
2019-01-17 17:33     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 03/12] xfs: decide if inode needs inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 04/12] xfs: track unlinked inactive inode fs summary counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 05/12] xfs: track unlinked inactive inode quota counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 06/12] xfs: refactor walking of per-AG RECLAIM inodes Darrick J. Wong
2019-01-11 19:06   ` Brian Foster
2019-01-17 17:43     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 07/12] xfs: refactor eofblocks inode match code Darrick J. Wong
2019-01-02  9:50   ` Nikolay Borisov
2019-01-17 18:05     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 08/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 09/12] xfs: retry fs writes when there isn't space Darrick J. Wong
2019-01-01  2:17 ` [PATCH 10/12] xfs: force inactivation before fallocate when space is low Darrick J. Wong
2019-01-01  2:17 ` [PATCH 11/12] xfs: parallelize inode inactivation Darrick J. Wong
2019-01-01  2:18 ` [PATCH 12/12] xfs: wait for deferred inactivation when destroying unlinked inodes Darrick J. Wong
2019-01-03 12:46   ` Dave Chinner
2019-01-17 18:41     ` Darrick J. Wong
2019-01-17 22:21       ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190117172428.GA4424@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).