linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/3] xfs: proper replay of deferred ops queued during log recovery
Date: Mon, 4 May 2020 20:06:51 -0700	[thread overview]
Message-ID: <20200505030651.GE5716@magnolia> (raw)
In-Reply-To: <20200505023305.GM2040@dread.disaster.area>

On Tue, May 05, 2020 at 12:33:05PM +1000, Dave Chinner wrote:
> On Mon, May 04, 2020 at 06:13:39PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > When we replay unfinished intent items that have been recovered from the
> > log, it's possible that the replay will cause the creation of more
> > deferred work items.  As outlined in commit 509955823cc9c ("xfs: log
> > recovery should replay deferred ops in order"), later work items have an
> > implicit ordering dependency on earlier work items.  Therefore, recovery
> > must replay the items (both recovered and created) in the same order
> > that they would have been during normal operation.
> > 
> > For log recovery, we enforce this ordering by using an empty transaction
> > to collect deferred ops that get created in the process of recovering a
> > log intent item to prevent them from being committed before the rest of
> > the recovered intent items.  After we finish committing all the
> > recovered log items, we allocate a transaction with an enormous block
> > reservation, splice our huge list of created deferred ops into that
> > transaction, and commit it, thereby finishing all those ops.
> > 
> > This is /really/ hokey -- it's the one place in XFS where we allow
> > nested transactions; the splicing of the defer ops list is is inelegant
> > and has to be done twice per recovery function; and the broken way we
> > handle inode pointers and block reservations cause subtle use-after-free
> > and allocator problems that will be fixed by this patch and the two
> > patches after it.
> > 
> > Therefore, replace the hokey empty transaction with a structure designed
> > to capture each chain of deferred ops that are created as part of
> > recovering a single unfinished log intent.  Finally, refactor the loop
> > that replays those chains to do so using one transaction per chain.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> FWIW, I don't like the "freezer" based naming here. It's too easily
> confused with freezing and thawing the filesystem....
> 
> I know, "delayed deferred ops" isn't much better, but at least it
> won't get confused with existing unrelated functionality.

xfs_defer_{freeze,thaw} -> xfs_defer_{capture,relink} ?

> I've barely looked at the code, so no real comments on that yet,
> but I did notice this:
> 
> > @@ -2495,35 +2515,59 @@ xlog_recover_process_data(
> >  /* Take all the collected deferred ops and finish them in order. */
> >  static int
> >  xlog_finish_defer_ops(
> > -	struct xfs_trans	*parent_tp)
> > +	struct xfs_mount	*mp,
> > +	struct list_head	*dfops_freezers)
> >  {
> > -	struct xfs_mount	*mp = parent_tp->t_mountp;
> > +	struct xfs_defer_freezer *dff, *next;
> >  	struct xfs_trans	*tp;
> >  	int64_t			freeblks;
> >  	uint			resblks;
> ....
> > +		resblks = min_t(int64_t, UINT_MAX, freeblks);
> > +		resblks = (resblks * 15) >> 4;
> 
> Can overflow when freeblks > (UINT_MAX / 15).

D'oh.  Ugh, I hate this whole fugly hack.

TBH I've been thinking that perhaps the freezer function should be
capturing the unused transaction block reservation when we capture the
dfops chain from the transaction.

When we set up the second transaction, we then set t_blk_res to the
captured block reservation.  So long as the recovery function is smart
enough to set up sufficient reservation we should avoid hitting ENOSPC,
right?

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

  reply	other threads:[~2020-05-05  3:06 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-05  1:13 [PATCH v2 0/3] xfs: fix inode use-after-free during log recovery Darrick J. Wong
2020-05-05  1:13 ` [PATCH 1/3] xfs: proper replay of deferred ops queued " Darrick J. Wong
2020-05-05  2:33   ` Dave Chinner
2020-05-05  3:06     ` Darrick J. Wong [this message]
2020-05-05  5:10       ` Dave Chinner
2020-05-05 15:08         ` Darrick J. Wong
2020-05-05  1:13 ` [PATCH 2/3] xfs: reduce log recovery transaction block reservations Darrick J. Wong
2020-05-05  1:13 ` [PATCH 3/3] xfs: fix an incore inode UAF in xfs_bui_recover Darrick J. Wong
2020-05-05 14:11   ` Brian Foster
2020-05-06  0:34     ` Darrick J. Wong
2020-05-06 13:56       ` Brian Foster
2020-05-06 17:01         ` Darrick J. Wong
2020-05-07  9:53           ` Brian Foster
2020-05-07 15:09             ` Darrick J. Wong
2020-05-07 16:58               ` Brian Foster
  -- strict thread matches above, loose matches on Subject: below --
2020-09-17  3:29 [PATCH 0/3] xfs: fix how we deal with new intents during recovery Darrick J. Wong
2020-09-17  3:29 ` [PATCH 1/3] xfs: proper replay of deferred ops queued during log recovery Darrick J. Wong
2020-09-23  7:48   ` Christoph Hellwig
2020-09-23 15:46     ` Darrick J. Wong
2020-04-22  2:08 [PATCH 0/3] xfs: fix inode use-after-free " Darrick J. Wong
2020-04-22  2:08 ` [PATCH 1/3] xfs: proper replay of deferred ops queued " Darrick J. Wong
2020-04-24 14:02   ` Brian Foster
2020-04-28 22:28     ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200505030651.GE5716@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).