From: "Darrick J. Wong" <djwong@kernel.org>
To: John Garry <john.g.garry@oracle.com>
Cc: brauner@kernel.org, hch@lst.de, viro@zeniv.linux.org.uk,
jack@suse.cz, cem@kernel.org, linux-fsdevel@vger.kernel.org,
dchinner@redhat.com, linux-xfs@vger.kernel.org,
linux-kernel@vger.kernel.org, ojaswin@linux.ibm.com,
ritesh.list@gmail.com, martin.petersen@oracle.com,
linux-ext4@vger.kernel.org, linux-block@vger.kernel.org,
catherine.hoang@oracle.com
Subject: Re: [PATCH v6 01/12] fs: add atomic write unit max opt to statx
Date: Tue, 8 Apr 2025 19:23:08 -0700
Message-ID: <20250409022308.GJ6283@frogsfrogsfrogs>
In-Reply-To: <20250408104209.1852036-2-john.g.garry@oracle.com>
This probably should have cc'd linux-api...
On Tue, Apr 08, 2025 at 10:41:58AM +0000, John Garry wrote:
> XFS will be able to support large atomic writes (atomic write > 1x block)
> in future. This will be achieved by using different operating methods,
> depending on the size of the write.
>
> Specifically, a new method of operation based on FS atomic extent remapping
> will be supported in addition to the current HW offload-based method.
>
> The FS method will generally perform appreciably slower than the
> HW-offload method. However, the FS method will typically allow a larger
> atomic write unit max limit.
>
> XFS will support a hybrid mode, where the HW offload method is used
> whenever the length of the write is supported by HW offload, and FS-based
> atomic writes are used otherwise.
>
> As such, there is an atomic write length at which the user may experience
> appreciably slower performance.
>
> Advertise this limit in a new statx field, stx_atomic_write_unit_max_opt.
>
> When zero, it means that there is no such performance boundary.
>
> Masks STATX{_ATTR}_WRITE_ATOMIC can be used to get this new field. This is
> ok for older kernels which don't support this new field, as they would
> report 0 in this field (from zeroing in cp_statx()) already. Furthermore
> those older kernels don't support large atomic writes - apart from block
> fops, but there would be consistent performance there for atomic writes
> in range [unit min, unit max].
>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
Seems fine to me, but I imagine others have stronger opinions.
Acked-by: "Darrick J. Wong" <djwong@kernel.org>
--D
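
For readers following along, the performance boundary described in the
commit message can be sketched from the userspace side. This is only an
illustration under stated assumptions: the enum and the helper name
classify_atomic_write_len() are hypothetical and not part of the patch, the
three limits are assumed to have already been read from statx, and only the
length range is checked (alignment and any other RWF_ATOMIC rules are
ignored).

```c
#include <stdint.h>

/* Hypothetical labels for this sketch; not kernel or uapi names. */
enum awu_class {
	AWU_UNSUPPORTED,	/* length is outside [unit_min, unit_max] */
	AWU_FAST,		/* at or below the advertised boundary */
	AWU_SLOW,		/* supported, but above the boundary */
};

/*
 * Classify a write length against the limits reported by statx.
 *
 * unit_max_opt mirrors stx_atomic_write_unit_max_opt as described in
 * the commit message: zero means there is no performance boundary, so
 * every supported length is treated as fast.
 */
static enum awu_class classify_atomic_write_len(uint32_t len,
						uint32_t unit_min,
						uint32_t unit_max,
						uint32_t unit_max_opt)
{
	/* unit_min == 0 means atomic writes are not supported at all. */
	if (!unit_min || len < unit_min || len > unit_max)
		return AWU_UNSUPPORTED;

	if (unit_max_opt == 0 || len <= unit_max_opt)
		return AWU_FAST;

	return AWU_SLOW;
}
```

With made-up limits of unit_min = 4096, unit_max = 65536 and
unit_max_opt = 16384, a 4096-byte write would be classified as fast and a
32768-byte write as slow. With unit_max_opt = 0 (as ext4, the block device
and XFS report in this patch), both would be classified as fast.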
> ---
> block/bdev.c | 3 ++-
> fs/ext4/inode.c | 2 +-
> fs/stat.c | 6 +++++-
> fs/xfs/xfs_iops.c | 2 +-
> include/linux/fs.h | 3 ++-
> include/linux/stat.h | 1 +
> include/uapi/linux/stat.h | 8 ++++++--
> 7 files changed, 18 insertions(+), 7 deletions(-)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index 4844d1e27b6f..b4afc1763e8e 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -1301,7 +1301,8 @@ void bdev_statx(struct path *path, struct kstat *stat,
>
> generic_fill_statx_atomic_writes(stat,
> queue_atomic_write_unit_min_bytes(bd_queue),
> - queue_atomic_write_unit_max_bytes(bd_queue));
> + queue_atomic_write_unit_max_bytes(bd_queue),
> + 0);
> }
>
> stat->blksize = bdev_io_min(bdev);
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 1dc09ed5d403..51a45699112c 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5663,7 +5663,7 @@ int ext4_getattr(struct mnt_idmap *idmap, const struct path *path,
> awu_max = sbi->s_awu_max;
> }
>
> - generic_fill_statx_atomic_writes(stat, awu_min, awu_max);
> + generic_fill_statx_atomic_writes(stat, awu_min, awu_max, 0);
> }
>
> flags = ei->i_flags & EXT4_FL_USER_VISIBLE;
> diff --git a/fs/stat.c b/fs/stat.c
> index f13308bfdc98..c41855f62d22 100644
> --- a/fs/stat.c
> +++ b/fs/stat.c
> @@ -136,13 +136,15 @@ EXPORT_SYMBOL(generic_fill_statx_attr);
> * @stat: Where to fill in the attribute flags
> * @unit_min: Minimum supported atomic write length in bytes
> * @unit_max: Maximum supported atomic write length in bytes
> + * @unit_max_opt: Optimised maximum supported atomic write length in bytes
> *
> * Fill in the STATX{_ATTR}_WRITE_ATOMIC flags in the kstat structure from
> * atomic write unit_min and unit_max values.
> */
> void generic_fill_statx_atomic_writes(struct kstat *stat,
> unsigned int unit_min,
> - unsigned int unit_max)
> + unsigned int unit_max,
> + unsigned int unit_max_opt)
> {
> /* Confirm that the request type is known */
> stat->result_mask |= STATX_WRITE_ATOMIC;
> @@ -153,6 +155,7 @@ void generic_fill_statx_atomic_writes(struct kstat *stat,
> if (unit_min) {
> stat->atomic_write_unit_min = unit_min;
> stat->atomic_write_unit_max = unit_max;
> + stat->atomic_write_unit_max_opt = unit_max_opt;
> /* Initially only allow 1x segment */
> stat->atomic_write_segments_max = 1;
>
> @@ -732,6 +735,7 @@ cp_statx(const struct kstat *stat, struct statx __user *buffer)
> tmp.stx_atomic_write_unit_min = stat->atomic_write_unit_min;
> tmp.stx_atomic_write_unit_max = stat->atomic_write_unit_max;
> tmp.stx_atomic_write_segments_max = stat->atomic_write_segments_max;
> + tmp.stx_atomic_write_unit_max_opt = stat->atomic_write_unit_max_opt;
>
> return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0;
> }
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 756bd3ca8e00..f0e5d83195df 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -610,7 +610,7 @@ xfs_report_atomic_write(
>
> if (xfs_inode_can_atomicwrite(ip))
> unit_min = unit_max = ip->i_mount->m_sb.sb_blocksize;
> - generic_fill_statx_atomic_writes(stat, unit_min, unit_max);
> + generic_fill_statx_atomic_writes(stat, unit_min, unit_max, 0);
> }
>
> STATIC int
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 016b0fe1536e..7b19d8f99aff 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3475,7 +3475,8 @@ void generic_fillattr(struct mnt_idmap *, u32, struct inode *, struct kstat *);
> void generic_fill_statx_attr(struct inode *inode, struct kstat *stat);
> void generic_fill_statx_atomic_writes(struct kstat *stat,
> unsigned int unit_min,
> - unsigned int unit_max);
> + unsigned int unit_max,
> + unsigned int unit_max_opt);
> extern int vfs_getattr_nosec(const struct path *, struct kstat *, u32, unsigned int);
> extern int vfs_getattr(const struct path *, struct kstat *, u32, unsigned int);
> void __inode_add_bytes(struct inode *inode, loff_t bytes);
> diff --git a/include/linux/stat.h b/include/linux/stat.h
> index be7496a6a0dd..e3d00e7bb26d 100644
> --- a/include/linux/stat.h
> +++ b/include/linux/stat.h
> @@ -57,6 +57,7 @@ struct kstat {
> u32 dio_read_offset_align;
> u32 atomic_write_unit_min;
> u32 atomic_write_unit_max;
> + u32 atomic_write_unit_max_opt;
> u32 atomic_write_segments_max;
> };
>
> diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
> index f78ee3670dd5..1686861aae20 100644
> --- a/include/uapi/linux/stat.h
> +++ b/include/uapi/linux/stat.h
> @@ -182,8 +182,12 @@ struct statx {
> /* File offset alignment for direct I/O reads */
> __u32 stx_dio_read_offset_align;
>
> - /* 0xb8 */
> - __u64 __spare3[9]; /* Spare space for future expansion */
> + /* Optimised max atomic write unit in bytes */
> + __u32 stx_atomic_write_unit_max_opt;
> + __u32 __spare2[1];
> +
> + /* 0xc0 */
> + __u64 __spare3[8]; /* Spare space for future expansion */
>
> /* 0x100 */
> };
> --
> 2.31.1
>
>
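
The offset comments in the uapi hunk can be checked with a little
arithmetic. The sketch below mirrors only the tail of struct statx that the
patch touches, starting at the old 0xb8 marker. The struct name, macro names
and helper are hypothetical and exist only for this check; the real
definition lives in include/uapi/linux/stat.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset where the old __spare3[9] array began, per the removed comment. */
#define STATX_TAIL_BASE		0xb8u
/* Total size of struct statx, per the unchanged 0x100 comment. */
#define STATX_TOTAL_SIZE	0x100u

/* Hypothetical mirror of the patched tail of struct statx. */
struct statx_tail_sketch {
	uint32_t atomic_write_unit_max_opt;	/* 0xb8 */
	uint32_t spare2[1];			/* 0xbc */
	uint64_t spare3[8];			/* 0xc0 */
};

/* The two new u32 fields take the place of one old u64 spare slot. */
_Static_assert(STATX_TAIL_BASE +
	       offsetof(struct statx_tail_sketch, spare3) == 0xc0,
	       "spare3 should start at 0xc0");

/* The structure must still end at 0x100, as it did with __spare3[9]. */
_Static_assert(STATX_TAIL_BASE + sizeof(struct statx_tail_sketch) ==
	       STATX_TOTAL_SIZE,
	       "struct statx size must not change");

/* Runtime view of the same sum, convenient for a quick test. */
static size_t statx_tail_end(void)
{
	return STATX_TAIL_BASE + sizeof(struct statx_tail_sketch);
}
```

The old layout gives the same end offset: 0xb8 + 9 * 8 = 0x100. So the patch
consumes one 64-bit spare slot and leaves the overall size unchanged.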