From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Chandan Babu R <chandan.babu@oracle.com>,
Christoph Hellwig <hch@lst.de>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 05/15] xfs: support dynamic btree cursor heights
Date: Wed, 13 Oct 2021 09:52:18 -0700
Message-ID: <20211013165218.GV24307@magnolia>
In-Reply-To: <20211013053122.GX2361455@dread.disaster.area>

On Wed, Oct 13, 2021 at 04:31:22PM +1100, Dave Chinner wrote:
> On Tue, Oct 12, 2021 at 04:33:01PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> >
> > Split out the btree level information into a separate struct and put it
> > at the end of the cursor structure as a VLA. The realtime rmap btree
> > (which is rooted in an inode) will require the ability to support many
> > more levels than a per-AG btree cursor, which means that we're going to
> > create two btree cursor caches to conserve memory for the more common
> > case.
> >
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > Reviewed-by: Chandan Babu R <chandan.babu@oracle.com>
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > ---
> > fs/xfs/libxfs/xfs_alloc.c | 6 +-
> > fs/xfs/libxfs/xfs_bmap.c | 10 +--
> > fs/xfs/libxfs/xfs_btree.c | 168 +++++++++++++++++++++++----------------------
> > fs/xfs/libxfs/xfs_btree.h | 28 ++++++--
> > fs/xfs/scrub/bitmap.c | 22 +++---
> > fs/xfs/scrub/bmap.c | 2 -
> > fs/xfs/scrub/btree.c | 47 +++++++------
> > fs/xfs/scrub/trace.c | 7 +-
> > fs/xfs/scrub/trace.h | 10 +--
> > fs/xfs/xfs_super.c | 2 -
> > fs/xfs/xfs_trace.h | 2 -
> > 11 files changed, 164 insertions(+), 140 deletions(-)
>
> Hmmm - subject of the patch doesn't really match the changes being
> made - there's nothing here that makes the btree cursor heights
> dynamic. It's just a structure layout change...

How about "xfs: prepare xfs_btree_cur for dynamic cursor heights"?

>
> > @@ -415,9 +415,9 @@ xfs_btree_dup_cursor(
> > * For each level current, re-get the buffer and copy the ptr value.
> > */
> > for (i = 0; i < new->bc_nlevels; i++) {
> > - new->bc_ptrs[i] = cur->bc_ptrs[i];
> > - new->bc_ra[i] = cur->bc_ra[i];
> > - bp = cur->bc_bufs[i];
> > + new->bc_levels[i].ptr = cur->bc_levels[i].ptr;
> > + new->bc_levels[i].ra = cur->bc_levels[i].ra;
> > + bp = cur->bc_levels[i].bp;
> > if (bp) {
> > error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp,
> > xfs_buf_daddr(bp), mp->m_bsize,
> > @@ -429,7 +429,7 @@ xfs_btree_dup_cursor(
> > return error;
> > }
> > }
> > - new->bc_bufs[i] = bp;
> > + new->bc_levels[i].bp = bp;
> > }
> > *ncur = new;
> > return 0;
>
> ObHuh: that dup_cursor code seems like a really obtuse way of doing:
>
> bip = cur->bc_levels[i].bp->b_log_item;
> bip->bli_recur++;
> new->bc_levels[i] = cur->bc_levels[i];
>
> But that's not a problem this patch needs to solve. Just something
> that made me go hmmmm...

Yeah, I noticed that too while I was checking the results of my sed
script.

> > @@ -922,11 +922,11 @@ xfs_btree_readahead(
> > (lev == cur->bc_nlevels - 1))
> > return 0;
> >
> > - if ((cur->bc_ra[lev] | lr) == cur->bc_ra[lev])
> > + if ((cur->bc_levels[lev].ra | lr) == cur->bc_levels[lev].ra)
> > return 0;
>
> That's whacky logic. Surely that's just:
>
> if (cur->bc_levels[lev].ra & lr)
> return 0;

This is an early-exit test, so the careful check is necessary.
If (some day) someone calls this function with (LEFTRA|RIGHTRA) to
read ahead both siblings on a btree level where one sibling has been ra'd
but not the other, we must avoid taking the branch.
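
To make the difference concrete, here is a minimal userspace sketch of the
two tests. The helper names ra_all_done() and ra_any_done() and the
self-check function are made up for illustration and are not kernel code;
only the two flag names and their bit values come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit values as in the patch (1 and 2), written as shifts. */
#define XFS_BTCUR_LEFTRA	(1 << 0)	/* left sibling has been read-ahead */
#define XFS_BTCUR_RIGHTRA	(1 << 1)	/* right sibling has been read-ahead */

/* Existing test: true only if every requested direction is already done. */
static bool ra_all_done(uint16_t ra, int lr)
{
	return (ra | lr) == ra;
}

/* Suggested shortcut: true if any requested direction is already done. */
static bool ra_any_done(uint16_t ra, int lr)
{
	return (ra & lr) != 0;
}

/* Hypothetical self-check; call it from a test harness. */
static void ra_check_demo(void)
{
	uint16_t ra = XFS_BTCUR_LEFTRA;	/* only the left sibling was ra'd */
	int both = XFS_BTCUR_LEFTRA | XFS_BTCUR_RIGHTRA;

	/* For single-direction requests the two tests agree. */
	assert(ra_all_done(ra, XFS_BTCUR_LEFTRA) ==
	       ra_any_done(ra, XFS_BTCUR_LEFTRA));
	assert(ra_all_done(ra, XFS_BTCUR_RIGHTRA) ==
	       ra_any_done(ra, XFS_BTCUR_RIGHTRA));

	/* For a combined request they diverge. */
	assert(!ra_all_done(ra, both));	/* right sibling still needs ra */
	assert(ra_any_done(ra, both));	/* shortcut would skip it */
}
```

With only single-bit callers the two forms behave the same, so the choice
only matters once a caller passes both bits at once.
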
> > diff --git a/fs/xfs/libxfs/xfs_btree.h b/fs/xfs/libxfs/xfs_btree.h
> > index 1018bcc43d66..f31f057bec9d 100644
> > --- a/fs/xfs/libxfs/xfs_btree.h
> > +++ b/fs/xfs/libxfs/xfs_btree.h
> > @@ -212,6 +212,19 @@ struct xfs_btree_cur_ino {
> > #define XFS_BTCUR_BMBT_INVALID_OWNER (1 << 1)
> > };
> >
> > +struct xfs_btree_level {
> > + /* buffer pointer */
> > + struct xfs_buf *bp;
> > +
> > + /* key/record number */
> > + uint16_t ptr;
> > +
> > + /* readahead info */
> > +#define XFS_BTCUR_LEFTRA 1 /* left sibling has been read-ahead */
> > +#define XFS_BTCUR_RIGHTRA 2 /* right sibling has been read-ahead */
> > + uint16_t ra;
> > +};
>
> The ra variable is a bit field. Can we define the values obviously
> as bit fields with (1 << 0) and (1 << 1) instead of 1 and 2?

Done.

> > @@ -242,8 +250,17 @@ struct xfs_btree_cur
> > struct xfs_btree_cur_ag bc_ag;
> > struct xfs_btree_cur_ino bc_ino;
> > };
> > +
> > + /* Must be at the end of the struct! */
> > + struct xfs_btree_level bc_levels[];
> > };
> >
> > +static inline size_t
> > +xfs_btree_cur_sizeof(unsigned int nlevels)
> > +{
> > + return struct_size((struct xfs_btree_cur *)NULL, bc_levels, nlevels);
> > +}
>
> Ooooh, yeah, we really need comments explaining how many btree
> levels these VLAs are tracking, because this one doesn't have a "-
> 1" in it like the previous one I commented on....

/*
* Compute the size of a btree cursor that can handle a btree of a given
* height. The bc_levels array handles node and leaf blocks, so its
* size is exactly nlevels.
*/
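
For anyone who wants to see the arithmetic outside the kernel, here is a
rough userspace sketch of what that size computation amounts to. The
demo_* names are stand-ins invented for this example, the cursor is cut
down to one field plus the array, and the helper omits the overflow
checking that the kernel's struct_size() adds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xfs_btree_level. */
struct demo_level {
	void		*bp;	/* buffer pointer */
	uint16_t	ptr;	/* key/record number */
	uint16_t	ra;	/* readahead info */
};

/* Hypothetical, heavily trimmed stand-in for struct xfs_btree_cur. */
struct demo_cur {
	unsigned int		nlevels;
	/* Must be at the end of the struct: one entry per btree level. */
	struct demo_level	levels[];
};

/* Unchecked equivalent of struct_size(cur, levels, nlevels). */
static size_t demo_cur_sizeof(unsigned int nlevels)
{
	return offsetof(struct demo_cur, levels) +
	       (size_t)nlevels * sizeof(struct demo_level);
}

/* Hypothetical self-check; call it from a test harness. */
static void demo_cur_sizeof_check(void)
{
	/* A zero-level cursor is just the fixed header. */
	assert(demo_cur_sizeof(0) == offsetof(struct demo_cur, levels));
	/* Each extra level costs exactly one array element, no "- 1". */
	assert(demo_cur_sizeof(5) - demo_cur_sizeof(4) ==
	       sizeof(struct demo_level));
}
```

The point of the sketch is the absence of any "- 1": every level, root
included, gets its own array slot.
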
> > diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c
> > index c0ef53fe6611..816dfc8e5a80 100644
> > --- a/fs/xfs/scrub/trace.c
> > +++ b/fs/xfs/scrub/trace.c
> > @@ -21,10 +21,11 @@ xchk_btree_cur_fsbno(
> > struct xfs_btree_cur *cur,
> > int level)
> > {
> > - if (level < cur->bc_nlevels && cur->bc_bufs[level])
> > + if (level < cur->bc_nlevels && cur->bc_levels[level].bp)
> > return XFS_DADDR_TO_FSB(cur->bc_mp,
> > - xfs_buf_daddr(cur->bc_bufs[level]));
> > - if (level == cur->bc_nlevels - 1 && cur->bc_flags & XFS_BTREE_LONG_PTRS)
> > + xfs_buf_daddr(cur->bc_levels[level].bp));
> > + else if (level == cur->bc_nlevels - 1 &&
> > + cur->bc_flags & XFS_BTREE_LONG_PTRS)
>
> No need for an else there as the first if () clause returns.
> Also, needs more () around that "a & b" second line.

TBH I think we check the wrong flag, and that last bit should be:

	if (level == cur->bc_nlevels - 1 &&
	    (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE))
		return XFS_INO_TO_FSB(cur->bc_mp, cur->bc_ino.ip->i_ino);
	return NULLFSBLOCK;

But for now I'll stick to the straight replacement and tack on another
patch to fix that.
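
As a sketch of the control flow that follow-up would end up with, here is
a userspace model. Everything prefixed fsb_ or FSB_ is invented for
illustration (including the flag value and the "null" constant); the
real function works on struct xfs_btree_cur and the XFS conversion
macros, none of which are reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values for illustration only; not the kernel's definitions. */
#define FSB_ROOT_IN_INODE	(1u << 1)
#define FSB_NULLFSBLOCK		((uint64_t)-1)

struct fsb_level {
	int		has_buf;	/* models bc_levels[level].bp != NULL */
	uint64_t	buf_fsb;	/* models the buffer's fs block number */
};

struct fsb_cur {
	unsigned int		flags;
	int			nlevels;
	uint64_t		ino_fsb;	/* models the inode's fs block */
	struct fsb_level	levels[4];
};

static uint64_t fsb_cur_fsbno(const struct fsb_cur *cur, int level)
{
	if (level < cur->nlevels && cur->levels[level].has_buf)
		return cur->levels[level].buf_fsb;
	/* No else needed: the branch above always returns. */
	if (level == cur->nlevels - 1 &&
	    (cur->flags & FSB_ROOT_IN_INODE))
		return cur->ino_fsb;
	return FSB_NULLFSBLOCK;
}

/* Hypothetical self-check; call it from a test harness. */
static void fsb_cur_fsbno_check(void)
{
	struct fsb_cur cur = {
		.flags = FSB_ROOT_IN_INODE,
		.nlevels = 2,
		.ino_fsb = 77,
		.levels = { { 1, 10 }, { 0, 0 } },
	};

	assert(fsb_cur_fsbno(&cur, 0) == 10);	/* leaf has a buffer */
	assert(fsb_cur_fsbno(&cur, 1) == 77);	/* root lives in the inode */
	cur.flags = 0;
	assert(fsb_cur_fsbno(&cur, 1) == FSB_NULLFSBLOCK);
}
```

The extra parentheses around the flag test do not change how C parses the
expression, since & already binds tighter than &&; they are there for the
reader.
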
--D
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com