From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 02/12] xfs: check deferred refcount op continuation parameters
Date: Thu, 27 Oct 2022 16:25:36 -0700
Message-ID: <Y1sTcJx4qCdZ6MkF@magnolia>
In-Reply-To: <20221027222403.GB3600936@dread.disaster.area>

On Fri, Oct 28, 2022 at 09:24:03AM +1100, Dave Chinner wrote:
> On Thu, Oct 27, 2022 at 02:32:42PM -0700, Darrick J. Wong wrote:
> > On Fri, Oct 28, 2022 at 07:49:57AM +1100, Dave Chinner wrote:
> > > On Thu, Oct 27, 2022 at 10:14:14AM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > >
> > > > If we're in the middle of a deferred refcount operation and decide to
> > > > roll the transaction to avoid overflowing the transaction space, we need
> > > > to check the new agbno/aglen parameters that we're about to record in
> > > > the new intent. Specifically, we need to check that the new extent is
> > > > completely within the filesystem, and that continuation does not put us
> > > > into a different AG.
> > > >
> > > > If the keys of a node block are wrong, the lookup to resume an
> > > > xfs_refcount_adjust_extents operation can put us into the wrong record
> > > > block. If this happens, we might not find that we run out of aglen at
> > > > an exact record boundary, which will cause the loop control to do the
> > > > wrong thing.
> > > >
> > > > The previous patch should take care of that problem, but let's add this
> > > > extra sanity check to stop corruption problems sooner rather than later.
> > > >
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > ---
> > > > fs/xfs/libxfs/xfs_refcount.c | 48 ++++++++++++++++++++++++++++++++++++++++--
> > > > 1 file changed, 46 insertions(+), 2 deletions(-)
> > > >
> > > >
> > > > diff --git a/fs/xfs/libxfs/xfs_refcount.c b/fs/xfs/libxfs/xfs_refcount.c
> > > > index 831353ba96dc..c6aa832a8713 100644
> > > > --- a/fs/xfs/libxfs/xfs_refcount.c
> > > > +++ b/fs/xfs/libxfs/xfs_refcount.c
> > > > @@ -1138,6 +1138,44 @@ xfs_refcount_finish_one_cleanup(
> > > > 	xfs_trans_brelse(tp, agbp);
> > > > }
> > > >
> > > > +/*
> > > > + * Set up a continuation of a deferred refcount operation by updating the intent.
> > > > + * Checks to make sure we're not going to run off the end of the AG.
> > > > + */
> > > > +static inline int
> > > > +xfs_refcount_continue_op(
> > > > +	struct xfs_btree_cur	*cur,
> > > > +	xfs_fsblock_t		startblock,
> > > > +	xfs_agblock_t		new_agbno,
> > > > +	xfs_extlen_t		new_len,
> > > > +	xfs_fsblock_t		*fsbp)
> > > > +{
> > > > +	struct xfs_mount	*mp = cur->bc_mp;
> > > > +	struct xfs_perag	*pag = cur->bc_ag.pag;
> > > > +	xfs_fsblock_t		new_fsbno;
> > > > +	xfs_agnumber_t		old_agno;
> > > > +
> > > > +	old_agno = XFS_FSB_TO_AGNO(mp, startblock);
> > > > +	new_fsbno = XFS_AGB_TO_FSB(mp, pag->pag_agno, new_agbno);
> > > > +
> > > > +	/*
> > > > +	 * If we don't have any work left to do, then there's no need
> > > > +	 * to perform the validation of the new parameters.
> > > > +	 */
> > > > +	if (!new_len)
> > > > +		goto done;
> > >
> > > Shouldn't we be validating new_fsbno rather than just returning
> > > whatever we calculated here?
> >
> > No. Imagine that the deferred work is performed against the last 30
> > blocks of the last AG in the filesystem. Let's say that the last AG is
> > AG 3 and the AG has 100 blocks. fsblock 3:99 is the last fsblock in the
> > filesystem.
> >
> > Before we start the deferred work, startblock == 3:70 and
> > blockcount == 30. We adjust the refcount of those 30 blocks, so we're
> > done now. The adjust function passes out new_agbno == 70 + 30 and
> > new_len == 30 - 30.
> >
> > The agbno to fsbno conversion sets new_fsbno to 3:100 and new_len is 0.
> > However, fsblock 3:100 is one block past the end of both AG 3 and the
> > filesystem, so the check below will fail:
>
> Sure, but my point here is that the function returns this invalid
> fsbno in *fsbp and assumes that the caller will handle it correctly.
>
> If the caller knows that we aren't going to continue past the
> "new_len == 0" condition, then why is it even calling this function?
> i.e. this isn't a "decide if we are going to continue" function,
> it's a "calculate and validate next fsbno" function...
>
> i.e. the intent doesn't match the name of the function.

<nod> Well I've already moved the if test to the callsite, so I hope
that'll be less confusing.

>
> > > > +	if (XFS_IS_CORRUPT(mp, !xfs_verify_fsbext(mp, new_fsbno, new_len)))
> > > > +		return -EFSCORRUPTED;
> > > > +
> > > > +	if (XFS_IS_CORRUPT(mp, old_agno != XFS_FSB_TO_AGNO(mp, new_fsbno)))
> > > > +		return -EFSCORRUPTED;
> > >
> > > We already know what agno new_fsbno sits in - we calculated it
> > > directly from pag->pag_agno above, so this can just check against
> > > pag->pag_agno directly, right?
> >
> > We don't actually know what agno new_fsbno sits in because of the way
> > that the agblock -> fsblock conversion works:
> >
> > #define XFS_AGB_TO_FSB(mp,agno,agbno)	\
> > 	(((xfs_fsblock_t)(agno) << (mp)->m_sb.sb_agblklog) | (agbno))
>
> Sure, but FSBs are *sparse* and there is unused, unchecked address
> space between the AGs that agbno overruns can fall into. And when we
> look at XFS_FSB_TO_AGNO():
>
> #define XFS_FSB_TO_AGNO(mp,fsbno)	\
> 	((xfs_agnumber_t)((fsbno) >> (mp)->m_sb.sb_agblklog))
>
> we can see that it simply truncates away the agbno portion to get
> back to the agno.
>
> IOWs:
>
> 0                          sb_agblocks
> +--------------------------+------------+
>                               (1 << sb_agblklog)
>                            +------------+
>                            invalid agbnos!
>
> Hence the agbno needs to be checked against sb_agblocks to capture AG
> overruns, not converted to a FSB and back to an AGNO as this will
> claim agbnos in the inaccessible address space region between AGs
> are valid....
>
> > Notice how we don't mask off the bits of agbno above sb_agblklog? If
> > sb_agblklog is (say) 20 but agbno has bit 31 set, that bit 31 will bump
> > the AG number by 2^11 AGs.
>
> Yes, but that's only a side effect of the agbno having the high bit
> set - it could have many other bits set and still be out of range.
> i.e. converting to fsb and back to agno doesn't actually capture all
> cases where the next calculated agbno/fsbno could be invalid.
>
> xfs_verify_fsbext() may capture this by chance because it checks
> the entire agbno portion of the fsb (via XFS_FSB_TO_AGBNO) against
> xfs_ag_block_count(agno), but it won't capture the overruns that
> only bump the AGNO portion of the FSB.
>
> Hence I really think we should be checking new_agbno for validity
> here, not relying on side effects of converting to/from FSBs and
> verifying fsb extents to capture ag block count overruns in the
> supplied agbno....

Oh, ok. So you want to check the new agbno and new aglen to make sure
that both are within the filesystem and whatnot *before* we call
XFS_AGB_TO_FSB, rather than checking the fsblock after the conversion?
I can do that.

--D

>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
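
To illustrate the point about sparse fsblock numbers discussed above,
here is a minimal userspace sketch in C. It is not kernel code: struct
demo_geom and the demo_* helpers are hypothetical names made up for this
example, and only the shift/or arithmetic is taken from the
XFS_AGB_TO_FSB and XFS_FSB_TO_AGNO macros quoted in the thread.
demo_agbext_ok() is one assumed way to check agbno and length against
the AG size before any conversion; the actual fix may look different.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t xfs_fsblock_t;
typedef uint32_t xfs_agblock_t;
typedef uint32_t xfs_agnumber_t;
typedef uint32_t xfs_extlen_t;

/* Hypothetical stand-in for the two superblock geometry fields. */
struct demo_geom {
	uint8_t		agblklog;	/* like sb_agblklog: log2 of AG size, rounded up */
	xfs_agblock_t	agblocks;	/* like sb_agblocks: real blocks per AG */
};

/* Same shift/or arithmetic as XFS_AGB_TO_FSB; agbno is not masked. */
static xfs_fsblock_t
demo_agb_to_fsb(const struct demo_geom *g, xfs_agnumber_t agno,
		xfs_agblock_t agbno)
{
	return ((xfs_fsblock_t)agno << g->agblklog) | agbno;
}

/* Same shift as XFS_FSB_TO_AGNO; the agbno bits are simply dropped. */
static xfs_agnumber_t
demo_fsb_to_agno(const struct demo_geom *g, xfs_fsblock_t fsbno)
{
	return (xfs_agnumber_t)(fsbno >> g->agblklog);
}

/*
 * Assumed direct check: does [agbno, agbno + len) fit inside one AG?
 * The subtraction form avoids overflow in agbno + len.
 */
static bool
demo_agbext_ok(const struct demo_geom *g, xfs_agblock_t agbno,
	       xfs_extlen_t len)
{
	if (agbno >= g->agblocks)
		return false;
	return len <= g->agblocks - agbno;
}
```

With agblklog = 20 and agblocks = 1000000, an agbno of 1040000 converts
to an fsblock and back to the same AG number, so comparing AG numbers
alone does not catch it, while demo_agbext_ok() rejects it. An extent
whose agbno equals agblocks with a length of zero is rejected as well,
which is why the "no work left" case has to be handled before
validation.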