public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Brian Foster <bfoster@redhat.com>
Cc: sandeen@sandeen.net, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist
Date: Mon, 29 Jun 2020 16:21:40 -0700	[thread overview]
Message-ID: <20200629232140.GV7606@magnolia> (raw)
In-Reply-To: <20200629122228.GB10449@bfoster>

On Mon, Jun 29, 2020 at 08:22:28AM -0400, Brian Foster wrote:
> On Thu, Jun 25, 2020 at 01:52:39PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > In commit 9851fd79bfb1, we added a slight amount of slack to the free
> > space btrees being reconstructed so that the initial fix_freelist call
> > (which is run against a totally empty AGFL) would never have to split
> > either free space btree in order to populate the free list.
> > 
> > The new btree bulk loading code in xfs_repair can re-create this
> > situation because it can set the slack values to zero if the filesystem
> > is very full.  However, these days repair has the infrastructure needed
> > to ensure that overestimations of the btree block counts end up on the
> > AGFL or get freed back into the filesystem at the end of phase 5.
> > 
> > Fix this problem by reserving blocks to a separate AGFL block
> > reservation, and checking that between this new reservation and any
> > overages in the bnobt/cntbt fakeroots, we have enough blocks sitting
> > around to populate the AGFL with the minimum number of blocks it needs
> > to handle a split in the bno/cnt/rmap btrees.
> > 
> > Note that we reserve blocks for the new bnobt/cntbt/AGFL at the very end
> > of the reservation steps in phase 5, so the extra allocation should not
> > cause repair to fail if it can't find blocks for btrees.
> > 
> > Fixes: 9851fd79bfb1 ("repair: AGFL rebuild fails if btree split required")
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  repair/agbtree.c |   78 +++++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 68 insertions(+), 10 deletions(-)
> > 
> > 
> > diff --git a/repair/agbtree.c b/repair/agbtree.c
> > index 339b1489..7a4f316c 100644
> > --- a/repair/agbtree.c
> > +++ b/repair/agbtree.c
> ...
> > @@ -262,25 +286,59 @@ _("Unable to compute free space by block btree geometry, error %d.\n"), -error);
> ...
> > +
> > +		/*
> > +		 * Now try to fill the bnobt/cntbt cursors with extra blocks to
> > +		 * populate the AGFL.  If we don't get all the blocks we want,
> > +		 * stop trying to fill the AGFL.
> > +		 */
> > +		wanted = (int64_t)btr_bno->bload.nr_blocks +
> > +				(min_agfl_len / 2) - bno_blocks;
> > +		if (wanted > 0 && fill_agfl) {
> > +			got = reserve_agblocks(sc->mp, agno, btr_bno, wanted);
> > +			if (wanted > got)
> > +				fill_agfl = false;
> > +			btr_bno->bload.nr_blocks += got;
> > +		}
> > +
> > +		wanted = (int64_t)btr_cnt->bload.nr_blocks +
> > +				(min_agfl_len / 2) - cnt_blocks;
> > +		if (wanted > 0 && fill_agfl) {
> > +			got = reserve_agblocks(sc->mp, agno, btr_cnt, wanted);
> > +			if (wanted > got)
> > +				fill_agfl = false;
> > +			btr_cnt->bload.nr_blocks += got;
> > +		}
> 
> It's a little hard to follow this with the nr_blocks sampling and
> whatnot, but I think I get the idea. What's the reason for splitting the
> AGFL res requirement evenly across the two cursors? These AGFL blocks
> all fall into the same overflow pool, right? I was wondering why we
> couldn't just attach the overflow to one, or check one for the full res
> and then the other if more blocks are needed.

I chose to stuff the excess blocks into the bnobt and cntbt bulkload
cursors to avoid having to initialize a semi-phony "bulkload cursor" for
the agfl, and I decided to split them evenly between the two cursors so
that I wouldn't have someday to deal with a bug report about how one
cursor somehow ran out of blocks but the other one had plenty more.

> In thinking about it a bit more, wouldn't the whole algorithm be more
> simple if we reserved the min AGFL requirement first, optionally passed
> 'agfl_res' to reserve_btblocks() such that subsequent reservations can
> steal from it (and then fail if it depletes), then stuff what's left in
> one (or both, if there's a reason for that) of the cursors at the end?

Hmm.  I hadn't thought about that.  In general I wanted the AGFL
reservations to go last because I'd rather we set off with an underfull
AGFL than totally fail because we couldn't get blocks for the
bnobt/cntbt, but I suppose you're right that we could steal from it as
needed to prevent repair failure.

So, uh, I could rework this patch to create a phony agfl bulk load
cursor, fill it before the loop, steal blocks from it to fill the
bnobt/cntbt to satisfy failed allocations, and then dump any remainders
into the bnobt/cntbt cursors afterwards.  How does that sound?

--D

> Brian
> 
> >  
> >  		/* Ok, now how many free space records do we have? */
> >  		*nr_extents = count_bno_extents_blocks(agno, &num_freeblocks);
> >  	} while (1);
> > -
> > -	*extra_blocks = (bno_blocks - btr_bno->bload.nr_blocks) +
> > -			(cnt_blocks - btr_cnt->bload.nr_blocks);
> >  }
> >  
> >  /* Rebuild the free space btrees. */
> > 
> 

  reply	other threads:[~2020-06-29 23:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-25 20:52 [PATCH 0/2] xfs_repair: more fixes Darrick J. Wong
2020-06-25 20:52 ` [PATCH 1/2] xfs_repair: complain about ag header crc errors Darrick J. Wong
2020-06-29 12:20   ` Brian Foster
2020-06-29 23:11     ` Darrick J. Wong
2020-06-25 20:52 ` [PATCH 2/2] xfs_repair: try to fill the AGFL before we fix the freelist Darrick J. Wong
2020-06-29 12:22   ` Brian Foster
2020-06-29 23:21     ` Darrick J. Wong [this message]
2020-06-30 10:52       ` Brian Foster
2020-07-01 16:19         ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200629232140.GV7606@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox