public inbox for linux-xfs@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: John Garry <john.g.garry@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	brauner@kernel.org, cem@kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	ojaswin@linux.ibm.com, ritesh.list@gmail.com,
	martin.petersen@oracle.com
Subject: Re: [PATCH v5 03/10] xfs: Refactor xfs_reflink_end_cow_extent()
Date: Wed, 12 Mar 2025 16:22:10 -0700
Message-ID: <20250312232210.GD2803730@frogsfrogsfrogs>
In-Reply-To: <62f035a9-05e7-40fc-ae05-3d21255d89f4@oracle.com>

On Wed, Mar 12, 2025 at 10:06:11PM +0000, John Garry wrote:
> On 12/03/2025 15:46, Darrick J. Wong wrote:
> > On Wed, Mar 12, 2025 at 01:35:23AM -0700, Christoph Hellwig wrote:
> > > On Wed, Mar 12, 2025 at 08:27:05AM +0000, John Garry wrote:
> > > > On 12/03/2025 07:24, Christoph Hellwig wrote:
> > > > > On Mon, Mar 10, 2025 at 06:39:39PM +0000, John Garry wrote:
> > > > > > Refactor xfs_reflink_end_cow_extent() into separate parts which process
> > > > > > the CoW range and commit the transaction.
> > > > > > 
> > > > > > This refactoring will be used in future for when it is required to commit
> > > > > > a range of extents as a single transaction, similar to how it was done
> > > > > > pre-commit d6f215f359637.
> > > > > 
> > > > > > Darrick pointed out that if you do more than just a tiny number
> > > > > > of extents per transaction you run out of log reservation very
> > > > > > quickly here:
> > > > > 
> > > > > > https://lore.kernel.org/all/20240329162936.GI6390@frogsfrogsfrogs/
> > > > > 
> > > > > how does your scheme deal with that?
> > > > > 
> > > > The resblks calculation in xfs_reflink_end_atomic_cow() takes care of this,
> > > > right? Or does the log reservation have a hard size limit, regardless of
> > > > that calculation?
> > > 
> > > > The resblks calculated there are the reserved disk blocks and have
> > > > nothing to do with the log reservation, which comes from the
> > > > tr_write field passed in.  There is obviously some kind of upper limit
> > > > on it imposed by the log size, although I'm not sure if we've formalized
> > > > that somewhere.  Dave might be the right person to ask about that.
> > 
> > The (very very rough) upper limit for how many intent items you can
> > attach to a tr_write transaction is:
> > 
> > per_extent_cost = (cui_size + rui_size + bui_size + efi_size + ili_size)
> > max_blocks = tr_write::tr_logres / per_extent_cost
> > 
> > (ili_size is the inode log item size)
> 
> So will it be something like this:
> 
> static size_t
> xfs_compute_awu_max_extents(
> 	struct xfs_mount	*mp)
> {
> 	struct xfs_trans_res	*resp = &M_RES(mp)->tr_write;
> 	size_t			logtotal = xfs_bui_log_format_sizeof(1)+

Might want to call it "per_extent_logres" since that's what it is.

> 				xfs_cui_log_format_sizeof(1) +
> 				xfs_efi_log_format_sizeof(1) +
> 				xfs_rui_log_format_sizeof(1) +
> 				sizeof(struct xfs_inode_log_format);

Something like that, yeah.  You should probably add
xfs_log_dinode_size(ip->i_mount) to that.

What you're really doing is summing the *nbytes output of the
->iop_size() call for each possible log item.  For the four log intent
items it's the xfs_FOO_log_format_sizeof() function like you have above.
For inode items it's:

	*nbytes += sizeof(struct xfs_inode_log_format) +
		   xfs_log_dinode_size(ip->i_mount);

> 	return rounddown_pow_of_two(resp->tr_logres / logtotal);

and like I said earlier, you should double logtotal to leave a 2x
safety margin:

	/* 100% safety margin for safety's sake */
	return rounddown_pow_of_two(resp->tr_logres /
				    (2 * per_extent_logres));

I'm curious what number you get back from this function?  Hopefully it's
at least a few hundred blocks.
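
For anyone who wants to plug numbers in outside the kernel, here is a
minimal user-space sketch of the calculation described above.  Everything
in it is an assumption made for illustration: struct item_sizes,
rounddown_pow2() and max_extents_per_trans() are hypothetical names, and
the sizes must be supplied by the caller because the real values come from
the xfs_FOO_log_format_sizeof() helpers, sizeof(struct xfs_inode_log_format)
and xfs_log_dinode_size().  It also ignores any overhead from the ondisk log
headers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical container for the per-item log sizes, in bytes.  The
 * caller fills these in; nothing here reads real XFS structures.
 */
struct item_sizes {
	size_t	bui;	/* one-extent BUI */
	size_t	cui;	/* one-extent CUI */
	size_t	efi;	/* one-extent EFI */
	size_t	rui;	/* one-extent RUI */
	size_t	ili;	/* inode log format header */
	size_t	dinode;	/* logged dinode */
};

/*
 * Largest power of two <= n, for n > 0.  User-space stand-in for the
 * kernel's rounddown_pow_of_two().
 */
static size_t
rounddown_pow2(size_t n)
{
	size_t	p = 1;

	assert(n > 0);
	while (p <= n / 2)
		p *= 2;
	return p;
}

/*
 * Sum the per-extent log cost, double it as a safety margin, and see
 * how many extents fit in the given log reservation.  Returns 0 if not
 * even one extent fits.
 */
static size_t
max_extents_per_trans(size_t logres, const struct item_sizes *s)
{
	size_t	per_extent_logres = s->bui + s->cui + s->efi + s->rui +
				    s->ili + s->dinode;
	size_t	n;

	assert(per_extent_logres > 0);
	n = logres / (2 * per_extent_logres);
	return n ? rounddown_pow2(n) : 0;
}
```

With made-up sizes that add up to 392 bytes per extent and a made-up
200000 byte reservation, 200000 / 784 = 255, which rounds down to 128.
Those inputs are placeholders, not measurements from a real filesystem.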

Thanks for putting that together.  :)

--D

> }
> 
> static inline void
> xfs_compute_awu_max(
> 	struct xfs_mount	*mp, int jjcount)
> {
> ....
> 	mp->m_awu_max =
> 	min_t(unsigned int, awu_max, xfs_compute_awu_max_extents(mp));
> }
> 
> > 
> > ((I would halve that for the sake of paranoia))
> > 
> > since you have to commit all those intent items into the first
> > transaction in the chain.  The difficulty we've always had is computing
> > the size of an intent item in the ondisk log, since that's a (somewhat
> > minor) layering violation -- it's xfs_cui_log_format_sizeof() for a CUI,
> > > but then there could be overhead for the ondisk log headers themselves.
> > 
> > Maybe we ought to formalize the computation of that since reap.c also
> > has a handwavy XREAP_MAX_DEFER_CHAIN that it uses to roll the scrub
> > transaction periodically... because I'd prefer we not add another
> > hardcoded limit.  My guess is that the software fallback can probably
> > > support any awu_max that the hardware wants to throw at us, but let's
> > actually figure out the min(sw, hw) that we can support and cap it at
> > that.
> > 
> > --D
> 
> 
