Re: [PATCH 01/14] xfs: repair the AGF and AGFL

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 01/14] xfs: repair the AGF and AGFL
Date: Wed, 6 Jun 2018 14:06:24 +1000	[thread overview]
Message-ID: <20180606040624.GO10363@dastard> (raw)
In-Reply-To: <20180605231856.GO9437@magnolia>

On Tue, Jun 05, 2018 at 04:18:56PM -0700, Darrick J. Wong wrote:
> On Mon, Jun 04, 2018 at 11:52:55AM +1000, Dave Chinner wrote:
> > On Wed, May 30, 2018 at 12:30:45PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Regenerate the AGF and AGFL from the rmap data.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > [...]
> > 
> > > +/* Repair the AGF. */
> > > +int
> > > +xfs_repair_agf(
> > > +	struct xfs_scrub_context	*sc)
> > > +{
> > > +	struct xfs_repair_find_ag_btree	fab[] = {
> > > +		{
> > > +			.rmap_owner = XFS_RMAP_OWN_AG,
> > > +			.buf_ops = &xfs_allocbt_buf_ops,
> > > +			.magic = XFS_ABTB_CRC_MAGIC,
> > > +		},
> > > +		{
> > > +			.rmap_owner = XFS_RMAP_OWN_AG,
> > > +			.buf_ops = &xfs_allocbt_buf_ops,
> > > +			.magic = XFS_ABTC_CRC_MAGIC,
> > > +		},
> > > +		{
> > > +			.rmap_owner = XFS_RMAP_OWN_AG,
> > > +			.buf_ops = &xfs_rmapbt_buf_ops,
> > > +			.magic = XFS_RMAP_CRC_MAGIC,
> > > +		},
> > > +		{
> > > +			.rmap_owner = XFS_RMAP_OWN_REFC,
> > > +			.buf_ops = &xfs_refcountbt_buf_ops,
> > > +			.magic = XFS_REFC_CRC_MAGIC,
> > > +		},
> > > +		{
> > > +			.buf_ops = NULL,
> > > +		},
> > > +	};
> > > +	struct xfs_repair_agf_allocbt	raa;
> > > +	struct xfs_agf			old_agf;
> > > +	struct xfs_mount		*mp = sc->mp;
> > > +	struct xfs_buf			*agf_bp;
> > > +	struct xfs_buf			*agfl_bp;
> > > +	struct xfs_agf			*agf;
> > > +	struct xfs_btree_cur		*cur = NULL;
> > > +	struct xfs_perag		*pag;
> > > +	xfs_agblock_t			blocks;
> > > +	xfs_agblock_t			freesp_blocks;
> > > +	int64_t				delta_fdblocks = 0;
> > > +	int				error;
> > > +
> > > +	/* We require the rmapbt to rebuild anything. */
> > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > > +	pag = sc->sa.pag;
> > > +	memset(&raa, 0, sizeof(raa));
> > > +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> > > +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGF_DADDR(mp)),
> > > +			XFS_FSS_TO_BB(mp, 1), 0, &agf_bp, NULL);
> > > +	if (error)
> > > +		return error;
> > > +	agf_bp->b_ops = &xfs_agf_buf_ops;
> > > +
> > > +	/*
> > > +	 * Load the AGFL so that we can screen out OWN_AG blocks that
> > > +	 * are on the AGFL now; these blocks might have once been part
> > > +	 * of the bno/cnt/rmap btrees but are not now.
> > > +	 */
> > > +	error = xfs_alloc_read_agfl(mp, sc->tp, sc->sa.agno, &agfl_bp);
> > > +	if (error)
> > > +		return error;
> > > +	error = xfs_agfl_walk(sc->mp, XFS_BUF_TO_AGF(agf_bp), agfl_bp,
> > > +			xfs_repair_agf_check_agfl_block, sc);
> > > +	if (error)
> > > +		return error;
> > 
> > THis is a bit of a chicken/egg situation, isn't it? We haven't
> > repaired the AGFL yet, so how do we know what is valid here?
> 
> Yep.  The AGF is corrupt, so we have to trust the AGFL contents because
> we can't do any serious cross-referencing with any of the btrees rooted
> in the AGF.  If the AGFL contents are obviously bad then we'll bail out.

Can you add that as a comment here?

> > Can we factor this function allow rebuild operation lines?
> 
> Yes...
> 
> > That will help document all the different pieces it is putting
> > together. E.g move the AGF header init to before
> > xfs_repair_find_ag_btree_roots(), and then pass it into
> > xfs_repair_agf_rebuild_roots() which contains the above fab specific
> > code.
> 
> ...however, that's the second (and admittedly not well documented)
> second chicken-egg -- we find the agf btree roots by probing the rmapbt,
> which is rooted in the agf.  So xfs_repair_find_ag_btree_roots has to be
> fed the old agf_bp buffer, and if that blows up then we bail out without
> changing anything.

Same again - factoring and adding comments to explain things like
this will make it much easier to understand.

> > > +/* Record all freespace information. */
> > > +STATIC int
> > > +xfs_repair_agfl_rmap_fn(
> > > +	struct xfs_btree_cur		*cur,
> > > +	struct xfs_rmap_irec		*rec,
> > > +	void				*priv)
> > > +{
> > > +	struct xfs_repair_agfl		*ra = priv;
> > > +	struct xfs_buf			*bp;
> > > +	xfs_fsblock_t			fsb;
> > > +	int				i;
> > > +	int				error = 0;
> > > +
> > > +	if (xfs_scrub_should_terminate(ra->sc, &error))
> > > +		return error;
> > > +
> > > +	/* Record all the OWN_AG blocks... */
> > > +	if (rec->rm_owner == XFS_RMAP_OWN_AG) {
> > > +		fsb = XFS_AGB_TO_FSB(cur->bc_mp, cur->bc_private.a.agno,
> > > +				rec->rm_startblock);
> > > +		error = xfs_repair_collect_btree_extent(ra->sc,
> > > +				&ra->freesp_list, fsb, rec->rm_blockcount);
> > > +		if (error)
> > > +			return error;
> > > +	}
> > > +
> > > +	/* ...and all the rmapbt blocks... */
> > > +	for (i = 0; i < cur->bc_nlevels && cur->bc_ptrs[i] == 1; i++) {
> > 
> > What is the significance of "cur->bc_ptrs[i] == 1"?
> > 
> > This loop looks like it is walking the btree path to this leaf, but
> > bc_ptrs[] will only have a "1" in it if we are at the left-most edge
> > of the tree, right? so what about all the other btree blocks?
> 
> Close.  We're walking up the tree from the leaf towards the root.  For
> each level, we assume that if bc_ptrs[level] == 1, then this is the
> first time we've seen the block at that level, so we remember that we
> saw this rmapbt block.  bc_ptrs is the offset within a block, not the
> offset for the entire level.
> 
> So if our rmapbt tree is:
> 
>    4
>  / | \
> 1  2  3
> 
> Pretend for this example that each leaf block has 100 rmap records.  For
> the first rmap record, we'll observe that bc_ptrs[0] == 1, so we record
> that we saw block 1.  Then we observe that bc_ptrs[1] == 1, so we record
> block 4.  agmeta_list is [1, 4].
> 
> For the second rmap record, we see that bc_ptrs[0] == 2, so we exit the
> loop.  agmeta_list remains [1, 4].
> 
> For the 101st rmap record, we've moved onto leaf block 2.  Now
> bc_ptrs[0] == 1 again, so we record that we saw block 2.  We see that
> bc_ptrs[1] == 2, so we exit the loop.  agmeta_list = [1, 4, 2].
> 
> For the 102nd rmap, bc_ptrs[0] == 2, so we exit.
> 
> For the 201st rmap record, we've moved on to leaf block 3.  bc_ptrs[0]
> == 1, so we add 3 to agmeta_list.  [1, 4, 2, 3].

And that is crying out for either an iterator macro or a helper
function with that explanation above it :P

> > > +
> > > +	/* We require the rmapbt to rebuild anything. */
> > > +	if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	xfs_scrub_perag_get(sc->mp, &sc->sa);
> > > +	xfs_repair_init_extent_list(&ra.freesp_list);
> > > +	xfs_repair_init_extent_list(&ra.agmeta_list);
> > > +	ra.sc = sc;
> > > +
> > > +	error = xfs_alloc_read_agf(mp, sc->tp, sc->sa.agno, 0, &agf_bp);
> > > +	if (error)
> > > +		return error;
> > > +	if (!agf_bp)
> > > +		return -ENOMEM;
> > > +
> > > +	error = xfs_trans_read_buf(mp, sc->tp, mp->m_ddev_targp,
> > > +			XFS_AG_DADDR(mp, sc->sa.agno, XFS_AGFL_DADDR(mp)),
> > > +			XFS_FSS_TO_BB(mp, 1), 0, &agfl_bp, NULL);
> > > +	if (error)
> > > +		return error;
> > > +	agfl_bp->b_ops = &xfs_agfl_buf_ops;
> > 
> > Be nice to have a __xfs_alloc_read_agfl() function that didn't set
> > the ops, and have this and xfs_alloc_read_agfl() both call it.
> 
> Huh?  xfs_alloc_read_agfl always reads the agfl buffer with
> &xfs_agfl_buf_ops, why would we want to call it without the verifier?

You wouldn't:

xfs_alloc_read_agfl()
{
	return __xfs_alloc_read_agfl(..., &xfs_agfl_buf_ops);
}

And then the above simply becomes

	error = __xfs_alloc_read_agfl(..., NULL);
	if (error)
		return error;
	agfl_bp->b_ops = &xfs_agfl_buf_ops;

I'm more concerned about open coding of things we have currently
centralised in helpers, and trying to see if there's ways to keep
the functions centralised.

Don't worry about it - it was really just a comment about "it would
be nice to have...".

> It's only scrub that gets to do screwy things like read buffers with no
> verifier.  libxfs functions should never do that.

I didn't know (well, I don't recall that) we have this rule. Can you
point me at the discussion so I can read up on it?  IMO libxfs is
for centralising common operations, not for enforcing boundaries or
rules on how we access objects.


> > > +int
> > > +xfs_repair_mod_fdblocks(
> > > +	struct xfs_scrub_context	*sc,
> > > +	int64_t				delta_fdblocks)
> > > +{
> > > +	int				error;
> > > +
> > > +	if (delta_fdblocks == 0)
> > > +		return 0;
> > > +
> > > +	if (delta_fdblocks < 0) {
> > > +		error = xfs_trans_reserve_more(sc->tp, -delta_fdblocks, 0);
> > > +		if (error)
> > > +			return error;
> > > +	}
> > > +
> > > +	xfs_trans_mod_sb(sc->tp, XFS_TRANS_SB_FDBLOCKS, delta_fdblocks);
> > 
> > This seems a little hacky - it's working around a transaction
> > reservation overflow warning, right?
> 
> More than that -- we're trying to avoid the situation where the incore
> free block counter goes negative.

Which will only happen if we overflow the transaction reservation,
yes?

> Things go south pretty quickly when
> that happens because transaction reservations succeed when there's not
> enough free space to accomodate them.  We'd rather error out to
> userspace and have the admin unmount and xfs_repair than risk letting
> the fs really blow up.

Sure, but I really don't like retrospective modification of
transaction reservations.  The repair code is already supposed to
have a reservation that is big enough to rebuild the AG trees, so
why should we need to reserve more space while rebuilding the AG
trees?

> Note that this function has to be called before repair dirties anything
> in the repair transaction so we're still at a place where we could back
> out with no harm done.

Still doesn't explain to me what the problem is that this code works
around. And because I don't understand why it is necessary, this just
seems like a hack....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

next prev parent reply	other threads:[~2018-06-06  4:06 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-30 19:30 [PATCH v15.2 00/14] xfs-4.18: online repair support Darrick J. Wong
2018-05-30 19:30 ` [PATCH 01/14] xfs: repair the AGF and AGFL Darrick J. Wong
2018-06-04  1:52   ` Dave Chinner
2018-06-05 23:18     ` Darrick J. Wong
2018-06-06  4:06       ` Dave Chinner [this message]
2018-06-06  4:56         ` Darrick J. Wong
2018-06-07  0:31           ` Dave Chinner
2018-06-07  4:42             ` Darrick J. Wong
2018-06-08  0:55               ` Dave Chinner
2018-06-08  1:23                 ` Darrick J. Wong
2018-05-30 19:30 ` [PATCH 02/14] xfs: repair the AGI Darrick J. Wong
2018-06-04  1:56   ` Dave Chinner
2018-06-05 23:54     ` Darrick J. Wong
2018-05-30 19:30 ` [PATCH 03/14] xfs: repair free space btrees Darrick J. Wong
2018-06-04  2:12   ` Dave Chinner
2018-06-06  1:50     ` Darrick J. Wong
2018-06-06  3:34       ` Dave Chinner
2018-06-06  4:01         ` Darrick J. Wong
2018-05-30 19:31 ` [PATCH 04/14] xfs: repair inode btrees Darrick J. Wong
2018-06-04  3:41   ` Dave Chinner
2018-06-06  3:55     ` Darrick J. Wong
2018-06-06  4:32       ` Dave Chinner
2018-06-06  4:58         ` Darrick J. Wong
2018-05-30 19:31 ` [PATCH 05/14] xfs: repair the rmapbt Darrick J. Wong
2018-05-31  5:42   ` Amir Goldstein
2018-06-06 21:13     ` Darrick J. Wong
2018-05-30 19:31 ` [PATCH 06/14] xfs: repair refcount btrees Darrick J. Wong
2018-05-30 19:31 ` [PATCH 07/14] xfs: repair inode records Darrick J. Wong
2018-05-30 19:31 ` [PATCH 08/14] xfs: zap broken inode forks Darrick J. Wong
2018-05-30 19:31 ` [PATCH 09/14] xfs: repair inode block maps Darrick J. Wong
2018-05-30 19:31 ` [PATCH 10/14] xfs: repair damaged symlinks Darrick J. Wong
2018-05-30 19:31 ` [PATCH 11/14] xfs: repair extended attributes Darrick J. Wong
2018-05-30 19:31 ` [PATCH 12/14] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
2018-05-30 19:32 ` [PATCH 13/14] xfs: repair quotas Darrick J. Wong
2018-05-30 19:32 ` [PATCH 14/14] xfs: implement live quotacheck as part of quota repair Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180606040624.GO10363@dastard \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.