linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: John Garry <john.g.garry@oracle.com>
Cc: brauner@kernel.org, djwong@kernel.org, hch@lst.de,
	viro@zeniv.linux.org.uk, jack@suse.cz, cem@kernel.org,
	linux-fsdevel@vger.kernel.org, dchinner@redhat.com,
	linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
	ojaswin@linux.ibm.com, ritesh.list@gmail.com,
	martin.petersen@oracle.com, linux-ext4@vger.kernel.org,
	linux-block@vger.kernel.org, catherine.hoang@oracle.com
Subject: Re: [PATCH v6 11/12] xfs: add xfs_compute_atomic_write_unit_max()
Date: Wed, 9 Apr 2025 08:47:09 +1000	[thread overview]
Message-ID: <Z_WnbfRhKR6RQsSA@dread.disaster.area> (raw)
In-Reply-To: <20250408104209.1852036-12-john.g.garry@oracle.com>

On Tue, Apr 08, 2025 at 10:42:08AM +0000, John Garry wrote:
> Now that CoW-based atomic writes are supported, update the max size of an
> atomic write for the data device.
> 
> The limit of a CoW-based atomic write will be the limit of the number of
> logitems which can fit into a single transaction.

I still think this is the wrong way to define the maximum
size of a COW-based atomic write because it is going to change from
filesystem to filesystem and that variability in supported maximum
length will be exposed to userspace...

i.e. Maximum supported atomic write size really should be defined as
a well documented fixed size (e.g. 16MB). Then the transaction
reservations sizes needed to perform that conversion can be
calculated directly from that maximum size and optimised directly
for the conversion operation that atomic writes need to perform.

.....

> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index b2dd0c0bf509..42b2b7540507 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -615,6 +615,28 @@ xfs_init_mount_workqueues(
>  	return -ENOMEM;
>  }
>  
> +unsigned int
> +xfs_atomic_write_logitems(
> +	struct xfs_mount	*mp)
> +{
> +	unsigned int		efi = xfs_efi_item_overhead(1);
> +	unsigned int		rui = xfs_rui_item_overhead(1);
> +	unsigned int		cui = xfs_cui_item_overhead(1);
> +	unsigned int		bui = xfs_bui_item_overhead(1);
> +	unsigned int		logres = M_RES(mp)->tr_write.tr_logres;
> +
> +	/*
> +	 * Maximum overhead to complete an atomic write ioend in software:
> +	 * remove data fork extent + remove cow fork extent +
> +	 * map extent into data fork
> +	 */
> +	unsigned int		atomic_logitems =
> +		(bui + cui + rui + efi) + (cui + rui) + (bui + rui);

This seems wrong. Unmap from the data fork only logs a (bui + cui)
pair, we don't log a RUI or an EFI until the transaction that
processes the BUI or CUI actually frees an extent from the the BMBT
or removes a block from the refcount btree.

We also need to be able to relog all the intents and everything that
was modified, so we effectively have at least one
xfs_allocfree_block_count() reservation needed here as well. Even
finishing an invalidation BUI can result in BMBT block allocation
occurring if the operation splits an existing extent record and the
insert of the new record causes a BMBT block split....


> +
> +	/* atomic write limits are always a power-of-2 */
> +	return rounddown_pow_of_two(logres / (2 * atomic_logitems));

What is the magic 2 in that division?

> +}

Also this function does not belong in xfs_super.c - that file is for
interfacing with the VFS layer.  Calculating log reservation
constants at mount time is done in xfs_trans_resv.c - I suspect most
of the code in this patch should probably be moved there and run
from xfs_trans_resv_calc()...

-Dave.
-- 
Dave Chinner
david@fromorbit.com

  parent reply	other threads:[~2025-04-08 22:47 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-04-08 10:41 [PATCH v6 00/12] large atomic writes for xfs John Garry
2025-04-08 10:41 ` [PATCH v6 01/12] fs: add atomic write unit max opt to statx John Garry
2025-04-09  2:23   ` Darrick J. Wong
2025-04-09 10:45   ` Christoph Hellwig
2025-04-08 10:41 ` [PATCH v6 02/12] xfs: add helpers to compute log item overhead John Garry
2025-04-08 22:50   ` Dave Chinner
2025-04-08 23:21     ` Darrick J. Wong
2025-04-09  2:25   ` [PATCH v6.1 " Darrick J. Wong
2025-04-09  2:25   ` [PATCH v6.1 RFC 02.1/12] xfs: add helpers to compute transaction reservation for finishing intent items Darrick J. Wong
2025-04-08 10:42 ` [PATCH v6 03/12] xfs: rename xfs_inode_can_atomicwrite() -> xfs_inode_can_hw_atomicwrite() John Garry
2025-04-09  2:02   ` Darrick J. Wong
2025-04-09 10:46   ` Christoph Hellwig
2025-04-08 10:42 ` [PATCH v6 04/12] xfs: allow block allocator to take an alignment hint John Garry
2025-04-08 10:42 ` [PATCH v6 05/12] xfs: refactor xfs_reflink_end_cow_extent() John Garry
2025-04-08 10:42 ` [PATCH v6 06/12] xfs: refine atomic write size check in xfs_file_write_iter() John Garry
2025-04-08 10:42 ` [PATCH v6 07/12] xfs: add xfs_atomic_write_cow_iomap_begin() John Garry
2025-04-08 10:42 ` [PATCH v6 08/12] xfs: add large atomic writes checks in xfs_direct_write_iomap_begin() John Garry
2025-04-08 10:42 ` [PATCH v6 09/12] xfs: commit CoW-based atomic writes atomically John Garry
2025-04-08 10:42 ` [PATCH v6 10/12] xfs: add xfs_file_dio_write_atomic() John Garry
2025-04-08 10:42 ` [PATCH v6 11/12] xfs: add xfs_compute_atomic_write_unit_max() John Garry
2025-04-08 21:28   ` Darrick J. Wong
2025-04-08 22:47   ` Dave Chinner [this message]
2025-04-09  0:41     ` Darrick J. Wong
2025-04-09  5:30       ` Dave Chinner
2025-04-09  8:15         ` John Garry
2025-04-09 22:49           ` Dave Chinner
2025-04-10  8:58             ` John Garry
2025-04-09 23:46         ` Darrick J. Wong
2025-04-08 10:42 ` [PATCH v6 12/12] xfs: update atomic write limits John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_WnbfRhKR6RQsSA@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=brauner@kernel.org \
    --cc=catherine.hoang@oracle.com \
    --cc=cem@kernel.org \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=john.g.garry@oracle.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=ojaswin@linux.ibm.com \
    --cc=ritesh.list@gmail.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).