From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@lst.de>
Cc: Carlos Maiolino <cem@kernel.org>,
Hans Holmberg <hans.holmberg@wdc.com>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH 27/43] xfs: implement buffered writes to zoned RT devices
Date: Fri, 13 Dec 2024 14:37:57 -0800
Message-ID: <20241213223757.GQ6678@frogsfrogsfrogs>
In-Reply-To: <20241211085636.1380516-28-hch@lst.de>
On Wed, Dec 11, 2024 at 09:54:52AM +0100, Christoph Hellwig wrote:
> Implement buffered writes including page faults and block zeroing for
> zoned RT devices. Buffered writes to zoned RT devices are split into
> three phases:
>
> 1) a reservation for the worst case data block usage is taken before
> acquiring the iolock. When there are enough free blocks but not
> enough available ones, garbage collection is kicked off to free the
> space before continuing with the write. If there isn't enough
> freeable space, the block reservation is reduced and a short write
> will happen as expected by normal Linux write semantics.
> 2) with the iolock held, the generic iomap buffered write code is
> called, which through the iomap_begin operation usually just inserts
> delalloc extents for the range in a single iteration. Only for
> overwrites of existing data that are not block aligned, or for
> zeroing operations, is the existing extent mapping read to fill out
> the srcmap and to figure out if zeroing is required.
> 3) the ->map_blocks callback to the generic iomap writeback code
> calls into the zoned space allocator to actually allocate on-disk
> space for the range before kicking off the writeback.
>
> Note that because all writes are out of place, truncate or hole punches
> that are not aligned to block size boundaries need to allocate space.
> For block zeroing from truncate, ->setattr is called with the iolock
> (aka i_rwsem) already held, so a hacky deviation from the above
> scheme is needed. In this case the space reservation is taken with
> the iolock held, but is required not to block and can dip into the
> reserved block pool. This can lead to -ENOSPC when truncating a
> file, which is unfortunate. But fixing the calling conventions in
> the VFS is probably much easier with code requiring it already in
> mainline.
Urk, that's messy. :(
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_aops.c | 134 +++++++++++++++++++++++++--
> fs/xfs/xfs_bmap_util.c | 21 +++--
> fs/xfs/xfs_bmap_util.h | 12 ++-
> fs/xfs/xfs_file.c | 202 +++++++++++++++++++++++++++++++++++++----
> fs/xfs/xfs_iomap.c | 186 ++++++++++++++++++++++++++++++++++++-
> fs/xfs/xfs_iomap.h | 6 +-
> fs/xfs/xfs_iops.c | 31 ++++++-
> fs/xfs/xfs_reflink.c | 2 +-
> fs/xfs/xfs_trace.h | 1 +
> 9 files changed, 550 insertions(+), 45 deletions(-)
>
> diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
> index d35ac4c19fb2..67392413216b 100644
> --- a/fs/xfs/xfs_aops.c
> +++ b/fs/xfs/xfs_aops.c
> @@ -1,7 +1,7 @@
> // SPDX-License-Identifier: GPL-2.0
> /*
> * Copyright (c) 2000-2005 Silicon Graphics, Inc.
> - * Copyright (c) 2016-2018 Christoph Hellwig.
> + * Copyright (c) 2016-2023 Christoph Hellwig.
2024, surely?
Or at this point, 2025
> * All Rights Reserved.
> */
> #include "xfs.h"
> @@ -19,6 +19,8 @@
> #include "xfs_reflink.h"
> #include "xfs_errortag.h"
> #include "xfs_error.h"
> +#include "xfs_zone_alloc.h"
> +#include "xfs_rtgroup.h"
>
> struct xfs_writepage_ctx {
> struct iomap_writepage_ctx ctx;
> @@ -85,6 +87,7 @@ xfs_end_ioend(
> {
> struct xfs_inode *ip = XFS_I(ioend->io_inode);
> struct xfs_mount *mp = ip->i_mount;
> + bool is_zoned = xfs_is_zoned_inode(ip);
> xfs_off_t offset = ioend->io_offset;
> size_t size = ioend->io_size;
> unsigned int nofs_flag;
> @@ -115,9 +118,10 @@ xfs_end_ioend(
> error = blk_status_to_errno(ioend->io_bio.bi_status);
> if (unlikely(error)) {
> if (ioend->io_flags & IOMAP_IOEND_SHARED) {
> + ASSERT(!is_zoned);
> xfs_reflink_cancel_cow_range(ip, offset, size, true);
> xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, offset,
> - offset + size);
> + offset + size, NULL);
> }
> goto done;
> }
> @@ -125,7 +129,10 @@ xfs_end_ioend(
> /*
> * Success: commit the COW or unwritten blocks if needed.
> */
> - if (ioend->io_flags & IOMAP_IOEND_SHARED)
> + if (is_zoned)
> + error = xfs_zoned_end_io(ip, offset, size, ioend->io_sector,
> + NULLFSBLOCK);
> + else if (ioend->io_flags & IOMAP_IOEND_SHARED)
> error = xfs_reflink_end_cow(ip, offset, size);
> else if (ioend->io_flags & IOMAP_IOEND_UNWRITTEN)
> error = xfs_iomap_write_unwritten(ip, offset, size, false);
> @@ -175,17 +182,27 @@ xfs_end_io(
> }
> }
>
> -STATIC void
> +static void
> xfs_end_bio(
> struct bio *bio)
> {
> struct iomap_ioend *ioend = iomap_ioend_from_bio(bio);
> struct xfs_inode *ip = XFS_I(ioend->io_inode);
> + struct xfs_mount *mp = ip->i_mount;
> unsigned long flags;
>
> + /*
> + * For Appends record the actually written block number and set the
> + * boundary flag if needed.
> + */
> + if (IS_ENABLED(CONFIG_XFS_RT) && bio_is_zone_append(bio)) {
> + ioend->io_sector = bio->bi_iter.bi_sector;
> + xfs_mark_rtg_boundary(ioend);
> + }
> +
> spin_lock_irqsave(&ip->i_ioend_lock, flags);
> if (list_empty(&ip->i_ioend_list))
> - WARN_ON_ONCE(!queue_work(ip->i_mount->m_unwritten_workqueue,
> + WARN_ON_ONCE(!queue_work(mp->m_unwritten_workqueue,
> &ip->i_ioend_work));
> list_add_tail(&ioend->io_list, &ip->i_ioend_list);
> spin_unlock_irqrestore(&ip->i_ioend_lock, flags);
> @@ -462,7 +479,7 @@ xfs_discard_folio(
> * folio itself and not the start offset that is passed in.
> */
> xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK, pos,
> - folio_pos(folio) + folio_size(folio));
> + folio_pos(folio) + folio_size(folio), NULL);
> }
>
> static const struct iomap_writeback_ops xfs_writeback_ops = {
> @@ -471,14 +488,117 @@ static const struct iomap_writeback_ops xfs_writeback_ops = {
> .discard_folio = xfs_discard_folio,
> };
>
> +struct xfs_zoned_writepage_ctx {
> + struct iomap_writepage_ctx ctx;
> + struct xfs_open_zone *open_zone;
> +};
> +
> +static inline struct xfs_zoned_writepage_ctx *
> +XFS_ZWPC(struct iomap_writepage_ctx *ctx)
> +{
> + return container_of(ctx, struct xfs_zoned_writepage_ctx, ctx);
> +}
> +
> +static int
> +xfs_zoned_map_blocks(
> + struct iomap_writepage_ctx *wpc,
> + struct inode *inode,
> + loff_t offset,
> + unsigned int len)
> +{
> + struct xfs_inode *ip = XFS_I(inode);
> + struct xfs_mount *mp = ip->i_mount;
> + xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
> + xfs_fileoff_t end_fsb = XFS_B_TO_FSB(mp, offset + len);
> + xfs_filblks_t count_fsb;
> + struct xfs_bmbt_irec imap, del;
> + struct xfs_iext_cursor icur;
> +
> + if (xfs_is_shutdown(mp))
> + return -EIO;
> +
> + XFS_ERRORTAG_DELAY(mp, XFS_ERRTAG_WB_DELAY_MS);
> +
> + /*
> + * All dirty data must be covered by delalloc extents. But truncate can
> + * remove delalloc extents underneath us or reduce their size.
> + * Returning a hole tells iomap to not write back any data from this
> + * range, which is the right thing to do in that case.
> + *
> + * Otherwise just tell iomap to treat ranges previously covered by a
> + * delalloc extent as mapped. The actual block allocation will be done
> + * just before submitting the bio.
> + */
> + xfs_ilock(ip, XFS_ILOCK_EXCL);
> + if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &imap))
> + imap.br_startoff = end_fsb; /* fake a hole past EOF */
> + if (imap.br_startoff > offset_fsb) {
> + imap.br_blockcount = imap.br_startoff - offset_fsb;
> + imap.br_startoff = offset_fsb;
> + imap.br_startblock = HOLESTARTBLOCK;
> + imap.br_state = XFS_EXT_NORM;
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> + xfs_bmbt_to_iomap(ip, &wpc->iomap, &imap, 0, 0, 0);
> + return 0;
> + }
> + end_fsb = min(end_fsb, imap.br_startoff + imap.br_blockcount);
> + count_fsb = end_fsb - offset_fsb;
> +
> + del = imap;
> + xfs_trim_extent(&del, offset_fsb, count_fsb);
> + xfs_bmap_del_extent_delay(ip, XFS_COW_FORK, &icur, &imap, &del,
> + XFS_BMAPI_REMAP);
> + xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +
> + wpc->iomap.type = IOMAP_MAPPED;
> + wpc->iomap.flags = IOMAP_F_DIRTY;
> + wpc->iomap.bdev = mp->m_rtdev_targp->bt_bdev;
> + wpc->iomap.offset = offset;
> + wpc->iomap.length = XFS_FSB_TO_B(mp, count_fsb);
> + wpc->iomap.flags = IOMAP_F_ZONE_APPEND;
> + wpc->iomap.addr = 0;
/me wonders if this should be set somewhere other than block 0 just in
case we screw up? That might just be paranoia since I think iomap puts
the bio if it doesn't get to ->submit_bio.
> +
> + trace_xfs_zoned_map_blocks(ip, offset, wpc->iomap.length);
> + return 0;
> +}
> +
> +static int
> +xfs_zoned_submit_ioend(
> + struct iomap_writepage_ctx *wpc,
> + int status)
> +{
> + wpc->ioend->io_bio.bi_end_io = xfs_end_bio;
> + if (status)
> + return status;
> + xfs_zone_alloc_and_submit(wpc->ioend, &XFS_ZWPC(wpc)->open_zone);
> + return 0;
> +}
> +
> +static const struct iomap_writeback_ops xfs_zoned_writeback_ops = {
> + .map_blocks = xfs_zoned_map_blocks,
> + .submit_ioend = xfs_zoned_submit_ioend,
> + .discard_folio = xfs_discard_folio,
> +};
> +
> STATIC int
> xfs_vm_writepages(
> struct address_space *mapping,
> struct writeback_control *wbc)
> {
> + struct xfs_inode *ip = XFS_I(mapping->host);
> struct xfs_writepage_ctx wpc = { };
> + int error;
>
> - xfs_iflags_clear(XFS_I(mapping->host), XFS_ITRUNCATED);
> + xfs_iflags_clear(ip, XFS_ITRUNCATED);
> + if (xfs_is_zoned_inode(ip)) {
> + struct xfs_zoned_writepage_ctx xc = { };
I noticed that the zoned writepage ctx doesn't track data/cow fork
sequence numbers. Why is this not necessary? Can we not race with
someone else doing writeback?
> +
> + error = iomap_writepages(mapping, wbc, &xc.ctx,
> + &xfs_zoned_writeback_ops);
> + if (xc.open_zone)
> + xfs_open_zone_put(xc.open_zone);
> + return error;
> + }
> return iomap_writepages(mapping, wbc, &wpc.ctx, &xfs_writeback_ops);
> }
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index c623688e457c..c87422de2d77 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -30,6 +30,7 @@
> #include "xfs_reflink.h"
> #include "xfs_rtbitmap.h"
> #include "xfs_rtgroup.h"
> +#include "xfs_zone_alloc.h"
>
> /* Kernel only BMAP related definitions and functions */
>
> @@ -436,7 +437,8 @@ xfs_bmap_punch_delalloc_range(
> struct xfs_inode *ip,
> int whichfork,
> xfs_off_t start_byte,
> - xfs_off_t end_byte)
> + xfs_off_t end_byte,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct xfs_mount *mp = ip->i_mount;
> struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork);
> @@ -467,7 +469,10 @@ xfs_bmap_punch_delalloc_range(
> continue;
> }
>
> - xfs_bmap_del_extent_delay(ip, whichfork, &icur, &got, &del, 0);
> + xfs_bmap_del_extent_delay(ip, whichfork, &icur, &got, &del,
> + ac ? XFS_BMAPI_REMAP : 0);
> + if (xfs_is_zoned_inode(ip) && ac)
> + ac->reserved_blocks += del.br_blockcount;
> if (!xfs_iext_get_extent(ifp, &icur, &got))
> break;
> }
> @@ -582,7 +587,7 @@ xfs_free_eofblocks(
> if (ip->i_delayed_blks) {
> xfs_bmap_punch_delalloc_range(ip, XFS_DATA_FORK,
> round_up(XFS_ISIZE(ip), mp->m_sb.sb_blocksize),
> - LLONG_MAX);
> + LLONG_MAX, NULL);
> }
> xfs_inode_clear_eofblocks_tag(ip);
> return 0;
> @@ -825,7 +830,8 @@ int
> xfs_free_file_space(
> struct xfs_inode *ip,
> xfs_off_t offset,
> - xfs_off_t len)
> + xfs_off_t len,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct xfs_mount *mp = ip->i_mount;
> xfs_fileoff_t startoffset_fsb;
> @@ -880,7 +886,7 @@ xfs_free_file_space(
> return 0;
> if (offset + len > XFS_ISIZE(ip))
> len = XFS_ISIZE(ip) - offset;
> - error = xfs_zero_range(ip, offset, len, NULL);
> + error = xfs_zero_range(ip, offset, len, ac, NULL);
> if (error)
> return error;
>
> @@ -968,7 +974,8 @@ int
> xfs_collapse_file_space(
> struct xfs_inode *ip,
> xfs_off_t offset,
> - xfs_off_t len)
> + xfs_off_t len,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct xfs_mount *mp = ip->i_mount;
> struct xfs_trans *tp;
> @@ -981,7 +988,7 @@ xfs_collapse_file_space(
>
> trace_xfs_collapse_file_space(ip);
>
> - error = xfs_free_file_space(ip, offset, len);
> + error = xfs_free_file_space(ip, offset, len, ac);
> if (error)
> return error;
>
> diff --git a/fs/xfs/xfs_bmap_util.h b/fs/xfs/xfs_bmap_util.h
> index b29760d36e1a..c477b3361630 100644
> --- a/fs/xfs/xfs_bmap_util.h
> +++ b/fs/xfs/xfs_bmap_util.h
> @@ -15,6 +15,7 @@ struct xfs_inode;
> struct xfs_mount;
> struct xfs_trans;
> struct xfs_bmalloca;
> +struct xfs_zone_alloc_ctx;
>
> #ifdef CONFIG_XFS_RT
> int xfs_bmap_rtalloc(struct xfs_bmalloca *ap);
> @@ -31,7 +32,8 @@ xfs_bmap_rtalloc(struct xfs_bmalloca *ap)
> #endif /* CONFIG_XFS_RT */
>
> void xfs_bmap_punch_delalloc_range(struct xfs_inode *ip, int whichfork,
> - xfs_off_t start_byte, xfs_off_t end_byte);
> + xfs_off_t start_byte, xfs_off_t end_byte,
> + struct xfs_zone_alloc_ctx *ac);
>
> struct kgetbmap {
> __s64 bmv_offset; /* file offset of segment in blocks */
> @@ -54,13 +56,13 @@ int xfs_bmap_last_extent(struct xfs_trans *tp, struct xfs_inode *ip,
>
> /* preallocation and hole punch interface */
> int xfs_alloc_file_space(struct xfs_inode *ip, xfs_off_t offset,
> - xfs_off_t len);
> + xfs_off_t len);
> int xfs_free_file_space(struct xfs_inode *ip, xfs_off_t offset,
> - xfs_off_t len);
> + xfs_off_t len, struct xfs_zone_alloc_ctx *ac);
> int xfs_collapse_file_space(struct xfs_inode *, xfs_off_t offset,
> - xfs_off_t len);
> + xfs_off_t len, struct xfs_zone_alloc_ctx *ac);
> int xfs_insert_file_space(struct xfs_inode *, xfs_off_t offset,
> - xfs_off_t len);
> + xfs_off_t len);
>
> /* EOF block manipulation functions */
> bool xfs_can_free_eofblocks(struct xfs_inode *ip);
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 827f7819df6a..195cf60a81b0 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -25,6 +25,7 @@
> #include "xfs_iomap.h"
> #include "xfs_reflink.h"
> #include "xfs_file.h"
> +#include "xfs_zone_alloc.h"
>
> #include <linux/dax.h>
> #include <linux/falloc.h>
> @@ -360,7 +361,8 @@ xfs_file_write_zero_eof(
> struct iov_iter *from,
> unsigned int *iolock,
> size_t count,
> - bool *drained_dio)
> + bool *drained_dio,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct xfs_inode *ip = XFS_I(iocb->ki_filp->f_mapping->host);
> loff_t isize;
> @@ -414,7 +416,7 @@ xfs_file_write_zero_eof(
> trace_xfs_zero_eof(ip, isize, iocb->ki_pos - isize);
>
> xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
> - error = xfs_zero_range(ip, isize, iocb->ki_pos - isize, NULL);
> + error = xfs_zero_range(ip, isize, iocb->ki_pos - isize, ac, NULL);
> xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
>
> return error;
> @@ -431,7 +433,8 @@ STATIC ssize_t
> xfs_file_write_checks(
> struct kiocb *iocb,
> struct iov_iter *from,
> - unsigned int *iolock)
> + unsigned int *iolock,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct inode *inode = iocb->ki_filp->f_mapping->host;
> size_t count = iov_iter_count(from);
> @@ -481,7 +484,7 @@ xfs_file_write_checks(
> */
> if (iocb->ki_pos > i_size_read(inode)) {
> error = xfs_file_write_zero_eof(iocb, from, iolock, count,
> - &drained_dio);
> + &drained_dio, ac);
> if (error == 1)
> goto restart;
> if (error)
> @@ -491,6 +494,48 @@ xfs_file_write_checks(
> return kiocb_modified(iocb);
> }
>
> +static ssize_t
> +xfs_zoned_write_space_reserve(
> + struct xfs_inode *ip,
> + struct kiocb *iocb,
> + struct iov_iter *from,
> + unsigned int flags,
> + struct xfs_zone_alloc_ctx *ac)
> +{
> + loff_t count = iov_iter_count(from);
> + int error;
> +
> + if (iocb->ki_flags & IOCB_NOWAIT)
> + flags |= XFS_ZR_NOWAIT;
> +
> + /*
> + * Check the rlimit and LFS boundary first so that we don't over-reserve
> + * by possibly a lot.
> + *
> + * The generic write path will redo this check later, and it might have
> + * changed by then. If it got expanded we'll stick to our earlier
> + * smaller limit, and if it is decreased the new smaller limit will be
> + * used and our extra space reservation will be returned after finishing
> + * the write.
> + */
> + error = generic_write_check_limits(iocb->ki_filp, iocb->ki_pos, &count);
> + if (error)
> + return error;
> +
> + /*
> + * Sloppily round up count to file system blocks.
> + *
> + * This will often reserve an extra block, but that avoids having to look
> + * at the start offset, which isn't stable for O_APPEND until taking the
> + * iolock. Also we need to reserve a block each for zeroing the old
> + * EOF block and the new start block if they are unaligned.
> + *
> + * Any remaining block will be returned after the write.
> + */
> + return xfs_zoned_space_reserve(ip,
> + XFS_B_TO_FSB(ip->i_mount, count) + 1 + 2, flags, ac);
> +}
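For anyone else sanity-checking the reservation math here: a standalone
sketch of the "sloppy" rounding, assuming 4k filesystem blocks.
b_to_fsb() and FSB_SHIFT are my stand-ins for XFS_B_TO_FSB() and the
real superblock geometry, and the "+ 1 + 2" mirrors the patch: one
block of slop for the unknown (O_APPEND) start offset, plus one block
each for zeroing an unaligned old-EOF block and an unaligned new start
block.

```c
#define FSB_SHIFT	12	/* assume 4096-byte filesystem blocks */

/* round a byte count up to whole blocks, like XFS_B_TO_FSB() */
static unsigned long b_to_fsb(unsigned long bytes)
{
	return (bytes + (1UL << FSB_SHIFT) - 1) >> FSB_SHIFT;
}

/* worst-case block reservation for a zoned buffered write of @count bytes */
static unsigned long zoned_write_reservation(unsigned long count)
{
	return b_to_fsb(count) + 1 + 2;
}
```

So even a one-byte write reserves four blocks, which matches the "will
often reserve an extra block" caveat in the comment; the surplus comes
back after the write.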
> +
> static int
> xfs_dio_write_end_io(
> struct kiocb *iocb,
> @@ -597,7 +642,7 @@ xfs_file_dio_write_aligned(
> ret = xfs_ilock_iocb_for_write(iocb, &iolock);
> if (ret)
> return ret;
> - ret = xfs_file_write_checks(iocb, from, &iolock);
> + ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
> if (ret)
> goto out_unlock;
>
> @@ -675,7 +720,7 @@ xfs_file_dio_write_unaligned(
> goto out_unlock;
> }
>
> - ret = xfs_file_write_checks(iocb, from, &iolock);
> + ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
> if (ret)
> goto out_unlock;
>
> @@ -749,7 +794,7 @@ xfs_file_dax_write(
> ret = xfs_ilock_iocb(iocb, iolock);
> if (ret)
> return ret;
> - ret = xfs_file_write_checks(iocb, from, &iolock);
> + ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
> if (ret)
> goto out;
>
> @@ -793,7 +838,7 @@ xfs_file_buffered_write(
> if (ret)
> return ret;
>
> - ret = xfs_file_write_checks(iocb, from, &iolock);
> + ret = xfs_file_write_checks(iocb, from, &iolock, NULL);
> if (ret)
> goto out;
>
> @@ -840,6 +885,67 @@ xfs_file_buffered_write(
> return ret;
> }
>
> +STATIC ssize_t
> +xfs_file_buffered_write_zoned(
> + struct kiocb *iocb,
> + struct iov_iter *from)
> +{
> + struct xfs_inode *ip = XFS_I(iocb->ki_filp->f_mapping->host);
> + struct xfs_mount *mp = ip->i_mount;
> + unsigned int iolock = XFS_IOLOCK_EXCL;
> + bool cleared_space = false;
> + struct xfs_zone_alloc_ctx ac = { };
> + ssize_t ret;
> +
> + ret = xfs_zoned_write_space_reserve(ip, iocb, from, XFS_ZR_GREEDY, &ac);
> + if (ret < 0)
> + return ret;
> +
> + ret = xfs_ilock_iocb(iocb, iolock);
> + if (ret)
> + goto out_unreserve;
> +
> + ret = xfs_file_write_checks(iocb, from, &iolock, &ac);
> + if (ret)
> + goto out_unlock;
> +
> + /*
> + * Truncate the iter to the length that we were actually able to
> + * allocate blocks for. This needs to happen after
> + * xfs_file_write_checks, because that assigns ki_pos for O_APPEND
> + * writes.
> + */
> + iov_iter_truncate(from,
> + XFS_FSB_TO_B(mp, ac.reserved_blocks) -
> + (iocb->ki_pos & mp->m_blockmask));
> + if (!iov_iter_count(from))
> + goto out_unlock;
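To double-check my reading of the iov_iter_truncate above: the number
of bytes the reservation can actually cover works out to the sketch
below (4k blocks assumed; writable_bytes() is my name, not anything in
the patch, and BLOCKMASK stands in for mp->m_blockmask).

```c
#define FSB_SHIFT	12			/* assume 4096-byte blocks */
#define BLOCKMASK	((1UL << FSB_SHIFT) - 1)	/* mp->m_blockmask stand-in */

/* bytes covered by @reserved_blocks for a write starting at @ki_pos */
static unsigned long writable_bytes(unsigned long reserved_blocks,
		unsigned long ki_pos)
{
	/*
	 * Convert blocks to bytes, then subtract the part of the first
	 * block that sits before the (possibly unaligned) write offset,
	 * since that block has to cover both the old data and the write.
	 */
	return (reserved_blocks << FSB_SHIFT) - (ki_pos & BLOCKMASK);
}
```

i.e. an unaligned ki_pos eats into the first reserved block, which is
why this has to run after xfs_file_write_checks has pinned down ki_pos
for O_APPEND.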
> +
> +retry:
> + trace_xfs_file_buffered_write(iocb, from);
> + ret = iomap_file_buffered_write(iocb, from,
> + &xfs_buffered_write_iomap_ops, &ac);
> + if (ret == -ENOSPC && !cleared_space) {
> + /*
^ extra space
> + * Kick off writeback to convert delalloc space and release the
> + * usually too pessimistic indirect block reservations.
> + */
> + xfs_flush_inodes(mp);
> + cleared_space = true;
> + goto retry;
> + }
> +
> +out_unlock:
> + xfs_iunlock(ip, iolock);
> +out_unreserve:
> + xfs_zoned_space_unreserve(ip, &ac);
> + if (ret > 0) {
> + XFS_STATS_ADD(mp, xs_write_bytes, ret);
> + ret = generic_write_sync(iocb, ret);
> + }
> + return ret;
> +}
> +
> STATIC ssize_t
> xfs_file_write_iter(
> struct kiocb *iocb,
> @@ -887,6 +993,8 @@ xfs_file_write_iter(
> return ret;
> }
>
> + if (xfs_is_zoned_inode(ip))
> + return xfs_file_buffered_write_zoned(iocb, from);
> return xfs_file_buffered_write(iocb, from);
> }
>
> @@ -941,7 +1049,8 @@ static int
> xfs_falloc_collapse_range(
> struct file *file,
> loff_t offset,
> - loff_t len)
> + loff_t len,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct inode *inode = file_inode(file);
> loff_t new_size = i_size_read(inode) - len;
> @@ -957,7 +1066,7 @@ xfs_falloc_collapse_range(
> if (offset + len >= i_size_read(inode))
> return -EINVAL;
>
> - error = xfs_collapse_file_space(XFS_I(inode), offset, len);
> + error = xfs_collapse_file_space(XFS_I(inode), offset, len, ac);
> if (error)
> return error;
> return xfs_falloc_setsize(file, new_size);
> @@ -1013,7 +1122,8 @@ xfs_falloc_zero_range(
> struct file *file,
> int mode,
> loff_t offset,
> - loff_t len)
> + loff_t len,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct inode *inode = file_inode(file);
> unsigned int blksize = i_blocksize(inode);
> @@ -1026,7 +1136,7 @@ xfs_falloc_zero_range(
> if (error)
> return error;
>
> - error = xfs_free_file_space(XFS_I(inode), offset, len);
> + error = xfs_free_file_space(XFS_I(inode), offset, len, ac);
> if (error)
> return error;
>
> @@ -1107,12 +1217,29 @@ xfs_file_fallocate(
> struct xfs_inode *ip = XFS_I(inode);
> long error;
> uint iolock = XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL;
> + struct xfs_zone_alloc_ctx ac = { };
>
> if (!S_ISREG(inode->i_mode))
> return -EINVAL;
> if (mode & ~XFS_FALLOC_FL_SUPPORTED)
> return -EOPNOTSUPP;
>
> + /*
> + * For zoned file systems, zeroing the first and last block of a hole
> + * punch requires allocating a new block to rewrite the remaining data
> + * and new zeroes out of place. Get a reservations for those before
> + * taking the iolock. Dip into the reserved pool because we are
> + * expected to be able to punch a hole even on a completely full
> + * file system.
> + */
> + if (xfs_is_zoned_inode(ip) &&
> + (mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE |
> + FALLOC_FL_COLLAPSE_RANGE))) {
> + error = xfs_zoned_space_reserve(ip, 2, XFS_ZR_RESERVED, &ac);
> + if (error)
> + return error;
> + }
> +
> xfs_ilock(ip, iolock);
> error = xfs_break_layouts(inode, &iolock, BREAK_UNMAP);
> if (error)
> @@ -1133,16 +1260,16 @@ xfs_file_fallocate(
>
> switch (mode & FALLOC_FL_MODE_MASK) {
> case FALLOC_FL_PUNCH_HOLE:
> - error = xfs_free_file_space(ip, offset, len);
> + error = xfs_free_file_space(ip, offset, len, &ac);
> break;
> case FALLOC_FL_COLLAPSE_RANGE:
> - error = xfs_falloc_collapse_range(file, offset, len);
> + error = xfs_falloc_collapse_range(file, offset, len, &ac);
> break;
> case FALLOC_FL_INSERT_RANGE:
> error = xfs_falloc_insert_range(file, offset, len);
> break;
> case FALLOC_FL_ZERO_RANGE:
> - error = xfs_falloc_zero_range(file, mode, offset, len);
> + error = xfs_falloc_zero_range(file, mode, offset, len, &ac);
> break;
> case FALLOC_FL_UNSHARE_RANGE:
> error = xfs_falloc_unshare_range(file, mode, offset, len);
> @@ -1160,6 +1287,8 @@ xfs_file_fallocate(
>
> out_unlock:
> xfs_iunlock(ip, iolock);
> + if (xfs_is_zoned_inode(ip))
> + xfs_zoned_space_unreserve(ip, &ac);
> return error;
> }
>
> @@ -1485,9 +1614,10 @@ xfs_dax_read_fault(
> * i_lock (XFS - extent map serialisation)
> */
> static vm_fault_t
> -xfs_write_fault(
> +__xfs_write_fault(
> struct vm_fault *vmf,
> - unsigned int order)
> + unsigned int order,
> + struct xfs_zone_alloc_ctx *ac)
> {
> struct inode *inode = file_inode(vmf->vma->vm_file);
> struct xfs_inode *ip = XFS_I(inode);
> @@ -1515,13 +1645,49 @@ xfs_write_fault(
> ret = xfs_dax_fault_locked(vmf, order, true);
> else
> ret = iomap_page_mkwrite(vmf, &xfs_buffered_write_iomap_ops,
> - NULL);
> + ac);
> xfs_iunlock(ip, lock_mode);
>
> sb_end_pagefault(inode->i_sb);
> return ret;
> }
>
> +static vm_fault_t
> +xfs_write_fault_zoned(
> + struct vm_fault *vmf,
> + unsigned int order)
> +{
> + struct xfs_inode *ip = XFS_I(file_inode(vmf->vma->vm_file));
> + unsigned int len = folio_size(page_folio(vmf->page));
> + struct xfs_zone_alloc_ctx ac = { };
> + int error;
> + vm_fault_t ret;
> +
> + /*
> + * This could over-allocate as it doesn't check for truncation.
> + *
> + * But as the overallocation is limited to less than a folio and will be
> + 8 release instantly that's just fine.
Nit: ^ should be an asterisk.
> + */
> + error = xfs_zoned_space_reserve(ip, XFS_B_TO_FSB(ip->i_mount, len), 0,
> + &ac);
> + if (error < 0)
> + return vmf_fs_error(error);
> + ret = __xfs_write_fault(vmf, order, &ac);
> + xfs_zoned_space_unreserve(ip, &ac);
> + return ret;
> +}
> +
> +static vm_fault_t
> +xfs_write_fault(
> + struct vm_fault *vmf,
> + unsigned int order)
> +{
> + if (xfs_is_zoned_inode(XFS_I(file_inode(vmf->vma->vm_file))))
> + return xfs_write_fault_zoned(vmf, order);
> + return __xfs_write_fault(vmf, order, NULL);
> +}
> +
> static inline bool
> xfs_is_write_fault(
> struct vm_fault *vmf)
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index aa1db0dc1d98..402b253ce3a2 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -31,6 +31,7 @@
> #include "xfs_health.h"
> #include "xfs_rtbitmap.h"
> #include "xfs_icache.h"
> +#include "xfs_zone_alloc.h"
>
> #define XFS_ALLOC_ALIGN(mp, off) \
> (((off) >> mp->m_allocsize_log) << mp->m_allocsize_log)
> @@ -1270,6 +1271,176 @@ xfs_bmapi_reserve_delalloc(
> return error;
> }
>
> +static int
> +xfs_zoned_buffered_write_iomap_begin(
> + struct inode *inode,
> + loff_t offset,
> + loff_t count,
> + unsigned flags,
> + struct iomap *iomap,
> + struct iomap *srcmap)
> +{
> + struct iomap_iter *iter =
> + container_of(iomap, struct iomap_iter, iomap);
I still wonder if iomap should be passing a (const struct iomap_iter *)
to the ->iomap_begin function instead of passing all these variables
separately, so that implementations wouldn't have to container_of their
way back to the iter just to get at iter->private.
> + struct xfs_zone_alloc_ctx *ac = iter->private;
> + struct xfs_inode *ip = XFS_I(inode);
> + struct xfs_mount *mp = ip->i_mount;
> + xfs_fileoff_t offset_fsb = XFS_B_TO_FSBT(mp, offset);
> + xfs_fileoff_t end_fsb = xfs_iomap_end_fsb(mp, offset, count);
> + u16 iomap_flags = IOMAP_F_SHARED;
> + unsigned int lockmode = XFS_ILOCK_EXCL;
> + xfs_filblks_t count_fsb;
> + xfs_extlen_t indlen;
> + struct xfs_bmbt_irec got;
> + struct xfs_iext_cursor icur;
> + int error = 0;
> +
> + ASSERT(!xfs_get_extsz_hint(ip));
> + ASSERT(!(flags & IOMAP_UNSHARE));
> + ASSERT(ac);
> +
> + if (xfs_is_shutdown(mp))
> + return -EIO;
> +
> + error = xfs_qm_dqattach(ip);
> + if (error)
> + return error;
> +
> + error = xfs_ilock_for_iomap(ip, flags, &lockmode);
> + if (error)
> + return error;
> +
> + if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(&ip->i_df)) ||
> + XFS_TEST_ERROR(false, mp, XFS_ERRTAG_BMAPIFORMAT)) {
> + xfs_bmap_mark_sick(ip, XFS_DATA_FORK);
> + error = -EFSCORRUPTED;
> + goto out_unlock;
> + }
> +
> + XFS_STATS_INC(mp, xs_blk_mapw);
> +
> + error = xfs_iread_extents(NULL, ip, XFS_DATA_FORK);
> + if (error)
> + goto out_unlock;
> +
> + /*
> + * For zeroing operations check if there is any data to zero first.
> + *
> + * For regular writes we always need to allocate new blocks, but need to
> + * provide the source mapping when the range is unaligned to support
> + * read-modify-write of the whole block in the page cache.
> + *
> + * In either case we need to limit the reported range to the boundaries
> + * of the source map in the data fork.
> + */
> + if (!IS_ALIGNED(offset, mp->m_sb.sb_blocksize) ||
> + !IS_ALIGNED(offset + count, mp->m_sb.sb_blocksize) ||
> + (flags & IOMAP_ZERO)) {
> + struct xfs_bmbt_irec smap;
> + struct xfs_iext_cursor scur;
> +
> + if (!xfs_iext_lookup_extent(ip, &ip->i_df, offset_fsb, &scur,
> + &smap))
> + smap.br_startoff = end_fsb; /* fake hole until EOF */
> + if (smap.br_startoff > offset_fsb) {
> + /*
> + * We never need to allocate blocks for zeroing a hole.
> + */
> + if (flags & IOMAP_ZERO) {
> + xfs_hole_to_iomap(ip, iomap, offset_fsb,
> + smap.br_startoff);
> + goto out_unlock;
> + }
> + end_fsb = min(end_fsb, smap.br_startoff);
> + } else {
> + end_fsb = min(end_fsb,
> + smap.br_startoff + smap.br_blockcount);
> + xfs_trim_extent(&smap, offset_fsb,
> + end_fsb - offset_fsb);
> + error = xfs_bmbt_to_iomap(ip, srcmap, &smap, flags, 0,
> + xfs_iomap_inode_sequence(ip, 0));
> + if (error)
> + goto out_unlock;
> + }
> + }
> +
> + if (!ip->i_cowfp)
> + xfs_ifork_init_cow(ip);
> +
> + if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got))
> + got.br_startoff = end_fsb;
> + if (got.br_startoff <= offset_fsb) {
> + trace_xfs_reflink_cow_found(ip, &got);
> + goto done;
> + }
> +
> + /*
> + * Cap the maximum length to keep the chunks of work done here somewhat
> + * symmetric with the work writeback does.
> + */
> + end_fsb = min(end_fsb, got.br_startoff);
> + count_fsb = min3(end_fsb - offset_fsb, XFS_MAX_BMBT_EXTLEN,
> + XFS_B_TO_FSB(mp, 1024 * PAGE_SIZE));
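For my own notes: with 4k blocks and 4k pages the min3 cap works out
to 4MB per iteration. A sketch, where the names are stand-ins for the
real macros (XFS_MAX_BMBT_EXTLEN being the 21-bit bmbt blockcount
limit, if I remember right):

```c
#define PAGE_SIZE_ASSUMED	4096UL		/* assume 4k pages */
#define FSB_SHIFT		12		/* assume 4096-byte blocks */
#define MAX_BMBT_EXTLEN		((1UL << 21) - 1)	/* XFS_MAX_BMBT_EXTLEN */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* cap a delalloc chunk the way the zoned iomap_begin path does */
static unsigned long cap_count_fsb(unsigned long want_fsb)
{
	unsigned long per_iter = (1024 * PAGE_SIZE_ASSUMED) >> FSB_SHIFT;

	return min_ul(min_ul(want_fsb, MAX_BMBT_EXTLEN), per_iter);
}
```

So the 1024 * PAGE_SIZE term is the binding limit here, keeping each
delalloc insertion roughly symmetric with the writeback chunking the
comment mentions.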
> +
> + /*
> + * The block reservation is supposed to cover all blocks that the
> + * operation could possible write, but there is a nasty corner case
> + * where blocks could be stolen from underneath us:
> + *
> + * 1) while this thread iterates over a larger buffered write,
> + * 2) another thread is causing a write fault that calls into
> + * ->page_mkwrite in range this thread writes to, using up the
> + * delalloc reservation created by a previous call to this function.
> + * 3) another thread does direct I/O on the range that the write fault
> + * happened on, which causes writeback of the dirty data.
> + * 4) this then set the stale flag, which cuts the current iomap
> + * iteration short, causing the new call to ->iomap_begin that gets
> + * us here again, but now without a sufficient reservation.
> + *
> + * This is a very unusual I/O pattern, and nothing but generic/095 is
> + * known to hit it. There's not really much we can do here, so turn this
> + * into a short write.
Wheeeeee. >:(
> + */
> + if (count_fsb > ac->reserved_blocks) {
> + xfs_warn_ratelimited(mp,
> +"Short write on ino 0x%llx comm %.20s due to three-way race with write fault and direct I/O",
> + ip->i_ino, current->comm);
Aren't continuations ^^^ supposed to be indented two tabs?
> + count_fsb = ac->reserved_blocks;
> + if (!count_fsb) {
> + error = -EIO;
> + goto out_unlock;
> + }
> + }
> +
> + error = xfs_quota_reserve_blkres(ip, count_fsb);
> + if (error)
> + goto out_unlock;
> +
> + indlen = xfs_bmap_worst_indlen(ip, count_fsb);
> + error = xfs_dec_fdblocks(mp, indlen, false);
> + if (error)
> + goto out_unlock;
> + ip->i_delayed_blks += count_fsb;
> + xfs_mod_delalloc(ip, count_fsb, indlen);
> +
> + got.br_startoff = offset_fsb;
> + got.br_startblock = nullstartblock(indlen);
> + got.br_blockcount = count_fsb;
> + got.br_state = XFS_EXT_NORM;
> + xfs_bmap_add_extent_hole_delay(ip, XFS_COW_FORK, &icur, &got);
> + ac->reserved_blocks -= count_fsb;
> + iomap_flags |= IOMAP_F_NEW;
> +
> + trace_xfs_iomap_alloc(ip, offset, XFS_FSB_TO_B(mp, count_fsb),
> + XFS_COW_FORK, &got);
> +done:
> + error = xfs_bmbt_to_iomap(ip, iomap, &got, flags, iomap_flags,
> + xfs_iomap_inode_sequence(ip, IOMAP_F_SHARED));
> +out_unlock:
> + xfs_iunlock(ip, lockmode);
> + return error;
> +}
> +
> static int
> xfs_buffered_write_iomap_begin(
> struct inode *inode,
> @@ -1296,6 +1467,10 @@ xfs_buffered_write_iomap_begin(
> if (xfs_is_shutdown(mp))
> return -EIO;
>
> + if (xfs_is_zoned_inode(ip))
> + return xfs_zoned_buffered_write_iomap_begin(inode, offset,
> + count, flags, iomap, srcmap);
> +
> /* we can't use delayed allocations when using extent size hints */
> if (xfs_get_extsz_hint(ip))
> return xfs_direct_write_iomap_begin(inode, offset, count,
> @@ -1528,10 +1703,13 @@ xfs_buffered_write_delalloc_punch(
> loff_t length,
> struct iomap *iomap)
> {
> + struct iomap_iter *iter =
> + container_of(iomap, struct iomap_iter, iomap);
> +
> xfs_bmap_punch_delalloc_range(XFS_I(inode),
> (iomap->flags & IOMAP_F_SHARED) ?
> XFS_COW_FORK : XFS_DATA_FORK,
> - offset, offset + length);
> + offset, offset + length, iter->private);
> }
>
> static int
> @@ -1768,6 +1946,7 @@ xfs_zero_range(
> struct xfs_inode *ip,
> loff_t pos,
> loff_t len,
> + struct xfs_zone_alloc_ctx *ac,
> bool *did_zero)
> {
> struct inode *inode = VFS_I(ip);
> @@ -1778,13 +1957,14 @@ xfs_zero_range(
> return dax_zero_range(inode, pos, len, did_zero,
> &xfs_dax_write_iomap_ops);
> return iomap_zero_range(inode, pos, len, did_zero,
> - &xfs_buffered_write_iomap_ops, NULL);
> + &xfs_buffered_write_iomap_ops, ac);
> }
>
> int
> xfs_truncate_page(
> struct xfs_inode *ip,
> loff_t pos,
> + struct xfs_zone_alloc_ctx *ac,
> bool *did_zero)
> {
> struct inode *inode = VFS_I(ip);
> @@ -1793,5 +1973,5 @@ xfs_truncate_page(
> return dax_truncate_page(inode, pos, did_zero,
> &xfs_dax_write_iomap_ops);
> return iomap_truncate_page(inode, pos, did_zero,
> - &xfs_buffered_write_iomap_ops, NULL);
> + &xfs_buffered_write_iomap_ops, ac);
> }
> diff --git a/fs/xfs/xfs_iomap.h b/fs/xfs/xfs_iomap.h
> index 8347268af727..bc8a00cad854 100644
> --- a/fs/xfs/xfs_iomap.h
> +++ b/fs/xfs/xfs_iomap.h
> @@ -10,6 +10,7 @@
>
> struct xfs_inode;
> struct xfs_bmbt_irec;
> +struct xfs_zone_alloc_ctx;
>
> int xfs_iomap_write_direct(struct xfs_inode *ip, xfs_fileoff_t offset_fsb,
> xfs_fileoff_t count_fsb, unsigned int flags,
> @@ -24,8 +25,9 @@ int xfs_bmbt_to_iomap(struct xfs_inode *ip, struct iomap *iomap,
> u16 iomap_flags, u64 sequence_cookie);
>
> int xfs_zero_range(struct xfs_inode *ip, loff_t pos, loff_t len,
> - bool *did_zero);
> -int xfs_truncate_page(struct xfs_inode *ip, loff_t pos, bool *did_zero);
> + struct xfs_zone_alloc_ctx *ac, bool *did_zero);
> +int xfs_truncate_page(struct xfs_inode *ip, loff_t pos,
> + struct xfs_zone_alloc_ctx *ac, bool *did_zero);
>
> static inline xfs_filblks_t
> xfs_aligned_fsb_count(
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 990df072ba35..c382f6b6c9e3 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -29,6 +29,7 @@
> #include "xfs_xattr.h"
> #include "xfs_file.h"
> #include "xfs_bmap.h"
> +#include "xfs_zone_alloc.h"
>
> #include <linux/posix_acl.h>
> #include <linux/security.h>
> @@ -852,6 +853,7 @@ xfs_setattr_size(
> uint lock_flags = 0;
> uint resblks = 0;
> bool did_zeroing = false;
> + struct xfs_zone_alloc_ctx ac = { };
>
> xfs_assert_ilocked(ip, XFS_IOLOCK_EXCL | XFS_MMAPLOCK_EXCL);
> ASSERT(S_ISREG(inode->i_mode));
> @@ -887,6 +889,28 @@ xfs_setattr_size(
> */
> inode_dio_wait(inode);
>
> + /*
> + * Normally xfs_zoned_space_reserve is supposed to be called outside the
> + * IOLOCK. For truncate we can't do that since ->setattr is called with
> + * it already held by the VFS. So for now chicken out and try to
> + * allocate space under it.
> + *
> + * To avoid deadlocks this means we can't block waiting for space, which
> + * can lead to spurious -ENOSPC if there are no directly available
> + * blocks. We mitigate this a bit by allowing zeroing to dip into the
> + * reserved pool, but eventually the VFS calling convention needs to
> + * change.
> + */
> + if (xfs_is_zoned_inode(ip)) {
> + error = xfs_zoned_space_reserve(ip, 1,
> + XFS_ZR_NOWAIT | XFS_ZR_RESERVED, &ac);
> + if (error) {
> + if (error == -EAGAIN)
> + return -ENOSPC;
> + return error;
> + }
> + }
> +
> /*
> * File data changes must be complete before we start the transaction to
> * modify the inode. This needs to be done before joining the inode to
> @@ -900,11 +924,14 @@ xfs_setattr_size(
> if (newsize > oldsize) {
> trace_xfs_zero_eof(ip, oldsize, newsize - oldsize);
> error = xfs_zero_range(ip, oldsize, newsize - oldsize,
> - &did_zeroing);
> + &ac, &did_zeroing);
> } else {
> - error = xfs_truncate_page(ip, newsize, &did_zeroing);
> + error = xfs_truncate_page(ip, newsize, &ac, &did_zeroing);
> }
>
> + if (xfs_is_zoned_inode(ip))
> + xfs_zoned_space_unreserve(ip, &ac);
> +
> if (error)
> return error;
>
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index b7dba5ad2f34..b8e55c3badfb 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1538,7 +1538,7 @@ xfs_reflink_zero_posteof(
> return 0;
>
> trace_xfs_zero_eof(ip, isize, pos - isize);
> - return xfs_zero_range(ip, isize, pos - isize, NULL);
> + return xfs_zero_range(ip, isize, pos - isize, NULL, NULL);
Not really a fan of passing the *ac pointers around everywhere, but atm
I can't think of a better option.
--D
> }
>
> /*
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index bbaf9b2665c7..c03ec9bfec1f 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -1694,6 +1694,7 @@ DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write);
> DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_unwritten);
> DEFINE_SIMPLE_IO_EVENT(xfs_end_io_direct_write_append);
> DEFINE_SIMPLE_IO_EVENT(xfs_file_splice_read);
> +DEFINE_SIMPLE_IO_EVENT(xfs_zoned_map_blocks);
>
> DECLARE_EVENT_CLASS(xfs_itrunc_class,
> TP_PROTO(struct xfs_inode *ip, xfs_fsize_t new_size),
> --
> 2.45.2
>
>