From: John Garry <john.g.garry@oracle.com>
To: brauner@kernel.org, djwong@kernel.org, hch@lst.de,
viro@zeniv.linux.org.uk, jack@suse.cz, cem@kernel.org
Cc: linux-fsdevel@vger.kernel.org, dchinner@redhat.com,
linux-xfs@vger.kernel.org, linux-kernel@vger.kernel.org,
ojaswin@linux.ibm.com, ritesh.list@gmail.com,
martin.petersen@oracle.com, linux-ext4@vger.kernel.org,
linux-block@vger.kernel.org, catherine.hoang@oracle.com,
linux-api@vger.kernel.org, John Garry <john.g.garry@oracle.com>
Subject: [PATCH v11 15/16] xfs: update atomic write limits
Date: Sun, 4 May 2025 08:59:22 +0000 [thread overview]
Message-ID: <20250504085923.1895402-16-john.g.garry@oracle.com> (raw)
In-Reply-To: <20250504085923.1895402-1-john.g.garry@oracle.com>
Update the limits returned from xfs_get_atomic_write_{min, max, max_opt)().
No reflink support always means no CoW-based atomic writes.
For updating xfs_get_atomic_write_min(), we support blocksize only and that
depends on HW or reflink support.
For updating xfs_get_atomic_write_max(), for no reflink, we are limited to
blocksize but only if HW support. Otherwise we are limited to combined
limit in mp->m_atomic_write_unit_max.
For updating xfs_get_atomic_write_max_opt(), ultimately we are limited by
the bdev atomic write limit. If xfs_get_atomic_write_max() does not report
> 1x blocksize, then just continue to report 0 as before.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
[djwong: update comments in the helper functions]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
fs/xfs/xfs_file.c | 2 +-
fs/xfs/xfs_iops.c | 52 +++++++++++++++++++++++++++++++++++++++++------
2 files changed, 47 insertions(+), 7 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index f4a66ff85748..48254a72071b 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1557,7 +1557,7 @@ xfs_file_open(
if (xfs_is_shutdown(XFS_M(inode->i_sb)))
return -EIO;
file->f_mode |= FMODE_NOWAIT | FMODE_CAN_ODIRECT;
- if (xfs_inode_can_hw_atomic_write(XFS_I(inode)))
+ if (xfs_get_atomic_write_min(XFS_I(inode)) > 0)
file->f_mode |= FMODE_CAN_ATOMIC_WRITE;
return generic_file_open(inode, file);
}
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 77a0606e9dc9..8cddbb7c149b 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -605,27 +605,67 @@ unsigned int
xfs_get_atomic_write_min(
struct xfs_inode *ip)
{
- if (!xfs_inode_can_hw_atomic_write(ip))
- return 0;
+ struct xfs_mount *mp = ip->i_mount;
+
+ /*
+ * If we can complete an atomic write via atomic out of place writes,
+ * then advertise a minimum size of one fsblock. Without this
+ * mechanism, we can only guarantee atomic writes up to a single LBA.
+ *
+ * If out of place writes are not available, we can guarantee an atomic
+ * write of exactly one single fsblock if the bdev will make that
+ * guarantee for us.
+ */
+ if (xfs_inode_can_hw_atomic_write(ip) || xfs_can_sw_atomic_write(mp))
+ return mp->m_sb.sb_blocksize;
- return ip->i_mount->m_sb.sb_blocksize;
+ return 0;
}
unsigned int
xfs_get_atomic_write_max(
struct xfs_inode *ip)
{
- if (!xfs_inode_can_hw_atomic_write(ip))
+ struct xfs_mount *mp = ip->i_mount;
+
+ /*
+ * If out of place writes are not available, we can guarantee an atomic
+ * write of exactly one single fsblock if the bdev will make that
+ * guarantee for us.
+ */
+ if (!xfs_can_sw_atomic_write(mp)) {
+ if (xfs_inode_can_hw_atomic_write(ip))
+ return mp->m_sb.sb_blocksize;
return 0;
+ }
- return ip->i_mount->m_sb.sb_blocksize;
+ /*
+ * If we can complete an atomic write via atomic out of place writes,
+ * then advertise a maximum size of whatever we can complete through
+ * that means. Hardware support is reported via max_opt, not here.
+ */
+ if (XFS_IS_REALTIME_INODE(ip))
+ return XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].awu_max);
+ return XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_AG].awu_max);
}
unsigned int
xfs_get_atomic_write_max_opt(
struct xfs_inode *ip)
{
- return 0;
+ unsigned int awu_max = xfs_get_atomic_write_max(ip);
+
+ /* if the max is 1x block, then just keep behaviour that opt is 0 */
+ if (awu_max <= ip->i_mount->m_sb.sb_blocksize)
+ return 0;
+
+ /*
+ * Advertise the maximum size of an atomic write that we can tell the
+ * block device to perform for us. In general the bdev limit will be
+ * less than our out of place write limit, but we don't want to exceed
+ * the awu_max.
+ */
+ return min(awu_max, xfs_inode_buftarg(ip)->bt_bdev_awu_max);
}
static void
--
2.31.1
next prev parent reply other threads:[~2025-05-04 9:02 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-04 8:59 [PATCH v11 00/16] large atomic writes for xfs John Garry
2025-05-04 8:59 ` [PATCH v11 01/16] fs: add atomic write unit max opt to statx John Garry
2025-05-04 8:59 ` [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target John Garry
2025-05-05 5:40 ` Christoph Hellwig
2025-05-05 10:04 ` John Garry
2025-05-05 10:49 ` Christoph Hellwig
2025-05-05 10:55 ` John Garry
2025-05-05 14:22 ` Darrick J. Wong
2025-05-05 14:48 ` John Garry
2025-05-05 15:27 ` John Garry
2025-05-06 4:22 ` Christoph Hellwig
2025-05-06 6:57 ` John Garry
2025-05-04 8:59 ` [PATCH v11 03/16] xfs: add helpers to compute log item overhead John Garry
2025-05-04 8:59 ` [PATCH v11 04/16] xfs: add helpers to compute transaction reservation for finishing intent items John Garry
2025-05-04 8:59 ` [PATCH v11 05/16] xfs: rename xfs_inode_can_atomicwrite() -> xfs_inode_can_hw_atomic_write() John Garry
2025-05-04 8:59 ` [PATCH v11 06/16] xfs: ignore HW which cannot atomic write a single block John Garry
2025-05-05 5:43 ` Christoph Hellwig
2025-05-05 5:45 ` John Garry
2025-05-05 8:12 ` John Garry
2025-05-05 8:30 ` Christoph Hellwig
2025-05-05 14:24 ` Darrick J. Wong
2025-05-04 8:59 ` [PATCH v11 07/16] xfs: allow block allocator to take an alignment hint John Garry
2025-05-04 8:59 ` [PATCH v11 08/16] xfs: refactor xfs_reflink_end_cow_extent() John Garry
2025-05-04 8:59 ` [PATCH v11 09/16] xfs: refine atomic write size check in xfs_file_write_iter() John Garry
2025-05-04 8:59 ` [PATCH v11 10/16] xfs: add xfs_atomic_write_cow_iomap_begin() John Garry
2025-05-04 8:59 ` [PATCH v11 11/16] xfs: add large atomic writes checks in xfs_direct_write_iomap_begin() John Garry
2025-05-04 8:59 ` [PATCH v11 12/16] xfs: commit CoW-based atomic writes atomically John Garry
2025-05-04 8:59 ` [PATCH v11 13/16] xfs: add xfs_file_dio_write_atomic() John Garry
2025-05-04 8:59 ` [PATCH v11 14/16] xfs: add xfs_calc_atomic_write_unit_max() John Garry
2025-05-05 5:25 ` Darrick J. Wong
2025-05-05 6:08 ` John Garry
2025-05-05 8:02 ` John Garry
2025-05-05 14:26 ` Darrick J. Wong
2025-05-04 8:59 ` John Garry [this message]
2025-05-04 8:59 ` [PATCH v11 16/16] xfs: allow sysadmins to specify a maximum atomic write limit at mount time John Garry
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250504085923.1895402-16-john.g.garry@oracle.com \
--to=john.g.garry@oracle.com \
--cc=brauner@kernel.org \
--cc=catherine.hoang@oracle.com \
--cc=cem@kernel.org \
--cc=dchinner@redhat.com \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=jack@suse.cz \
--cc=linux-api@vger.kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=ojaswin@linux.ibm.com \
--cc=ritesh.list@gmail.com \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox