public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, hch@lst.de
Subject: Re: [PATCH 08/10] xfs: hoist the code that moves the incore inode fork broot memory
Date: Thu, 29 Aug 2024 16:35:07 -0700	[thread overview]
Message-ID: <20240829233507.GU6224@frogsfrogsfrogs> (raw)
In-Reply-To: <Zs/i6JZerKLqTLnt@dread.disaster.area>

On Thu, Aug 29, 2024 at 12:54:32PM +1000, Dave Chinner wrote:
> On Tue, Aug 27, 2024 at 04:35:48PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Whenever we change the size of the memory buffer holding an inode fork
> > btree root block, we have to copy the contents over.  Refactor all this
> > into a single function that handles both, in preparation for making
> > xfs_iroot_realloc more generic.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> >  fs/xfs/libxfs/xfs_inode_fork.c |   87 ++++++++++++++++++++++++++--------------
> >  1 file changed, 56 insertions(+), 31 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> > index 60646a6c32ec7..307207473abdb 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.c
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> > @@ -387,6 +387,50 @@ xfs_iroot_free(
> >  	ifp->if_broot = NULL;
> >  }
> >  
> > +/* Move the bmap btree root from one incore buffer to another. */
> > +static void
> > +xfs_ifork_move_broot(
> > +	struct xfs_inode	*ip,
> > +	int			whichfork,
> > +	struct xfs_btree_block	*dst_broot,
> > +	size_t			dst_bytes,
> > +	struct xfs_btree_block	*src_broot,
> > +	size_t			src_bytes,
> > +	unsigned int		numrecs)
> > +{
> > +	struct xfs_mount	*mp = ip->i_mount;
> > +	void			*dptr;
> > +	void			*sptr;
> > +
> > +	ASSERT(xfs_bmap_bmdr_space(src_broot) <= xfs_inode_fork_size(ip, whichfork));
> 
> We pass whichfork just for this debug check. Can you pull this up
> to the callers?

I guess I could do that, but the rtrmap patchset adds its own broot
shrink/grow function specific to rtrmap btree inodes:

static void
xfs_rtrmapbt_broot_move(...)
{
	...
	ASSERT(xfs_rtrmap_droot_space(src_broot) <=
	       xfs_inode_fork_size(ip, whichfork));

so I didn't want to add yet another indirect call just for an assertion.

> > +
> > +	/*
> > +	 * We always have to move the pointers because they are not butted
> > +	 * against the btree block header.
> > +	 */
> > +	if (numrecs) {
> > +		sptr = xfs_bmap_broot_ptr_addr(mp, src_broot, 1, src_bytes);
> > +		dptr = xfs_bmap_broot_ptr_addr(mp, dst_broot, 1, dst_bytes);
> > +		memmove(dptr, sptr, numrecs * sizeof(xfs_fsblock_t));
> > +	}
> > +
> > +	if (src_broot == dst_broot)
> > +		return;
> 
> Urk. So this is encoding caller logic directly into this function.
> ie. the grow cases uses krealloc() which copies the keys and
> pointers but still needs the pointers moved. The buffer is large
> enough for that, so it passes src and dst as the same buffer and
> this code then jumps out after copying the ptrs (a second time) to
> their final resting place.

<nod>

> > +	/*
> > +	 * If the root is being totally relocated, we have to migrate the block
> > +	 * header and the keys that come after it.
> > +	 */
> > +	memcpy(dst_broot, src_broot, xfs_bmbt_block_len(mp));
> > +
> > +	/* Now copy the keys, which come right after the header. */
> > +	if (numrecs) {
> > +		sptr = xfs_bmbt_key_addr(mp, src_broot, 1);
> > +		dptr = xfs_bmbt_key_addr(mp, dst_broot, 1);
> > +		memcpy(dptr, sptr, numrecs * sizeof(struct xfs_bmbt_key));
> > +	}
> 
> And here we do the key copy for the shrink case where we technically
> don't need separate buffers but we really want to minimise memory
> usage if we can so we reallocate a smaller buffer and free the
> original larger one.
> 
> Given this, I think this code is more natural by doing all the
> allocate/free/copy ourselves instead of using krealloc() and it's
> implicit copy for one of the cases.
> 
> i.e. rename this function xfs_ifork_realloc_broot() and make it do
> this:
> 
> {
> 	struct xfs_btree_block *src = ifp->if_broot;
> 	struct xfs_btree_block *dst = NULL;
> 
> 	if (!numrecs)
> 		goto out_free_src;
> 
> 	dst = kmalloc(new_size);
> 
> 	/* copy block header */
> 	memcpy(dst, src, xfs_bmbt_block_len(mp));

I'm not sure I like replacing krealloc with kmalloc here.  For a grow
operation, if the new and old object sizes are close enough that we
reuse the existing slab object, then we only have to move the pointers.
In the best case, the object expands, so all the bytes we had before are
still live and we touch fewer cachelines.  In the worst case we get a
new object, but that's roughly exponential.

For a shrink operation, we definitely want the alloc -> copy -> free
logic because there's no way to guarantee that krealloc-down isn't a nop
operation, which wastes memory.

But I see that this function isn't very cohesive and could be split into
separate ->grow and ->shrink functions that do their own allocations.

Or seeing how the only callers of xfs_iroot_realloc are the btree code
itself, maybe I should just refactor this into a single ->broot_realloc
function in the btree ops which will cut out a lot of indirect calls
from the iroot code.

Yeah.  I'm gonna go do that.  Disregard patch 5 onwards.

> 	/* copy records */
> 	sptr = xfs_bmbt_key_addr(mp, src, 1);
> 	dptr = xfs_bmbt_key_addr(mp, dst, 1);
> 	memcpy(dptr, sptr, numrecs * sizeof(struct xfs_bmbt_key));
> 
> 	/* copy pointers */
> 	sptr = xfs_bmap_broot_ptr_addr(mp, src_broot, 1, src_bytes);
> 	dptr = xfs_bmap_broot_ptr_addr(mp, dst_broot, 1, dst_bytes);
> 	memmove(dptr, sptr, numrecs * sizeof(xfs_fsblock_t));
> 
> out_free_src:
> 	kfree(src);
> 	ifp->if_broot = dst;
> 	ifp->if_broot_bytes = new_size;
> }
> 
> And the callers are now both:
> 
> 	xfs_ifork_realloc_broot(mp, ifp, new_size, old_size, numrecs);
> 
> This also naturally handles the "reduce to zero size" without
> needing any special case code, it avoids the double pointer copy on
> grow, and the operation logic is simple, obvious and easy to
> understand...

Hmm.  The rtrmap patchset starts by moving xfs_ifork_move_broot to
xfs_bmap_btree.c and virtualizes the broot grow/shrink operation to
become a per-btree type operation.  The rtreflink series expands this
usage.

--D

> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 

  reply	other threads:[~2024-08-29 23:35 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-27 23:33 [PATCHSET v4.1] xfs: cleanups for inode rooted btree code Darrick J. Wong
2024-08-27 23:33 ` [PATCH 01/10] xfs: fix C++ compilation errors in xfs_fs.h Darrick J. Wong
2024-08-28  4:09   ` Christoph Hellwig
2024-08-29  1:29   ` Dave Chinner
2024-09-02 15:20   ` Carlos Maiolino
2024-08-27 23:34 ` [PATCH 02/10] xfs: fix FITRIM reporting again Darrick J. Wong
2024-08-28  4:10   ` Christoph Hellwig
2024-08-27 23:34 ` [PATCH 03/10] xfs: fix a sloppy memory handling bug in xfs_iroot_realloc Darrick J. Wong
2024-08-28  4:10   ` Christoph Hellwig
2024-08-27 23:34 ` [PATCH 04/10] xfs: replace shouty XFS_BM{BT,DR} macros Darrick J. Wong
2024-08-28  4:11   ` Christoph Hellwig
2024-08-29  2:00   ` Dave Chinner
2024-08-29 22:10     ` Darrick J. Wong
2024-08-30  3:43       ` Christoph Hellwig
2024-08-27 23:35 ` [PATCH 05/10] xfs: move the zero records logic into xfs_bmap_broot_space_calc Darrick J. Wong
2024-08-28  4:14   ` Christoph Hellwig
2024-08-29  1:15     ` Darrick J. Wong
2024-08-29  3:51       ` Christoph Hellwig
2024-08-29  2:05     ` Dave Chinner
2024-08-29 22:34       ` Darrick J. Wong
2024-08-27 23:35 ` [PATCH 06/10] xfs: refactor the allocation and freeing of incore inode fork btree roots Darrick J. Wong
2024-08-28  4:15   ` Christoph Hellwig
2024-08-28  4:17   ` Christoph Hellwig
2024-08-28 21:50     ` Darrick J. Wong
2024-08-27 23:35 ` [PATCH 07/10] xfs: refactor creation of bmap " Darrick J. Wong
2024-08-28  4:19   ` Christoph Hellwig
2024-08-29  2:13   ` Dave Chinner
2024-08-29 22:46     ` Darrick J. Wong
2024-08-27 23:35 ` [PATCH 08/10] xfs: hoist the code that moves the incore inode fork broot memory Darrick J. Wong
2024-08-28  4:20   ` Christoph Hellwig
2024-08-29  2:54   ` Dave Chinner
2024-08-29 23:35     ` Darrick J. Wong [this message]
2024-08-27 23:36 ` [PATCH 09/10] xfs: rearrange xfs_iroot_realloc a bit Darrick J. Wong
2024-08-28  4:21   ` Christoph Hellwig
2024-08-27 23:36 ` [PATCH 10/10] xfs: standardize the btree maxrecs function parameters Darrick J. Wong
2024-08-28  4:21   ` Christoph Hellwig
2024-08-27 23:45 ` [RFC PATCH] libxfs: compile with a C++ compiler Darrick J. Wong
2024-08-28  0:01   ` Sam James
2024-08-28  0:10     ` Sam James
2024-08-28 23:53     ` Darrick J. Wong
2024-08-29  0:17       ` Sam James
2024-08-29  1:30         ` Kees Cook
2024-08-28  4:25   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240829233507.GU6224@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox