From: "Darrick J. Wong" <djwong@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH v2] iomap: replace folio_batch allocation with stack allocation
Date: Tue, 2 Dec 2025 08:54:20 -0800 [thread overview]
Message-ID: <20251202165420.GG89472@frogsfrogsfrogs> (raw)
In-Reply-To: <20251113135404.553339-1-bfoster@redhat.com>
On Thu, Nov 13, 2025 at 08:54:04AM -0500, Brian Foster wrote:
> Zhang Yi points out that the dynamic folio_batch allocation in
> iomap_fill_dirty_folios() is problematic for the ext4 on iomap work
> that is under development because it doesn't sufficiently handle the
> allocation failure case (by allowing a retry, for example).
>
> The dynamic allocation was initially added for simplicity and to
> help indicate whether the batch was used or not by the calling fs.
> To address this issue, put the batch on the stack of
> iomap_zero_range() and use a flag to control whether the batch
> should be used in the iomap folio lookup path. This keeps things
> simple and eliminates the concern for ext4 on iomap.
>
> While here, also clean up the fill helper signature to be more
> consistent with the underlying filemap helper. Pass through the
> return value of the filemap helper (folio count) and update the
> lookup offset via an out param.
>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Acked-by: Dave Chinner <dchinner@redhat.com>
This one got lost in my emailbox, sorry about that :(
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
--D
> ---
>
> v2:
> - Reworked fill function to return folio count and pass flags as param.
> - Updated commit log to note function signature changes.
> v1: https://lore.kernel.org/linux-fsdevel/20251111175047.321869-1-bfoster@redhat.com/
>
> fs/iomap/buffered-io.c | 50 +++++++++++++++++++++++++++++-------------
> fs/iomap/iter.c | 6 ++---
> fs/xfs/xfs_iomap.c | 11 +++++-----
> include/linux/iomap.h | 8 +++++--
> 4 files changed, 50 insertions(+), 25 deletions(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 9b0b9cf7caa7..bc8b2ebb3330 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -772,7 +772,7 @@ static struct folio *__iomap_get_folio(struct iomap_iter *iter,
> if (!mapping_large_folio_support(iter->inode->i_mapping))
> len = min_t(size_t, len, PAGE_SIZE - offset_in_page(pos));
>
> - if (iter->fbatch) {
> + if (iter->iomap.flags & IOMAP_F_FOLIO_BATCH) {
> struct folio *folio = folio_batch_next(iter->fbatch);
>
> if (!folio)
> @@ -869,7 +869,7 @@ static int iomap_write_begin(struct iomap_iter *iter,
> * process so return and let the caller iterate and refill the batch.
> */
> if (!folio) {
> - WARN_ON_ONCE(!iter->fbatch);
> + WARN_ON_ONCE(!(iter->iomap.flags & IOMAP_F_FOLIO_BATCH));
> return 0;
> }
>
> @@ -1483,23 +1483,39 @@ static int iomap_zero_iter(struct iomap_iter *iter, bool *did_zero,
> return status;
> }
>
> -loff_t
> +/**
> + * iomap_fill_dirty_folios - fill a folio batch with dirty folios
> + * @iter: Iteration structure
> + * @start: Start offset of range. Updated based on lookup progress.
> + * @end: End offset of range
> + * @iomap_flags: Flags to set on the associated iomap to track the batch.
> + *
> + * Returns the folio count directly. Also returns the associated control flag if
> + * the the batch lookup is performed and the expected offset of a subsequent
> + * lookup via out params. The caller is responsible to set the flag on the
> + * associated iomap.
> + */
> +unsigned int
> iomap_fill_dirty_folios(
> struct iomap_iter *iter,
> - loff_t offset,
> - loff_t length)
> + loff_t *start,
> + loff_t end,
> + unsigned int *iomap_flags)
> {
> struct address_space *mapping = iter->inode->i_mapping;
> - pgoff_t start = offset >> PAGE_SHIFT;
> - pgoff_t end = (offset + length - 1) >> PAGE_SHIFT;
> + pgoff_t pstart = *start >> PAGE_SHIFT;
> + pgoff_t pend = (end - 1) >> PAGE_SHIFT;
> + unsigned int count;
>
> - iter->fbatch = kmalloc(sizeof(struct folio_batch), GFP_KERNEL);
> - if (!iter->fbatch)
> - return offset + length;
> - folio_batch_init(iter->fbatch);
> + if (!iter->fbatch) {
> + *start = end;
> + return 0;
> + }
>
> - filemap_get_folios_dirty(mapping, &start, end, iter->fbatch);
> - return (start << PAGE_SHIFT);
> + count = filemap_get_folios_dirty(mapping, &pstart, pend, iter->fbatch);
> + *start = (pstart << PAGE_SHIFT);
> + *iomap_flags |= IOMAP_F_FOLIO_BATCH;
> + return count;
> }
> EXPORT_SYMBOL_GPL(iomap_fill_dirty_folios);
>
> @@ -1508,17 +1524,21 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> const struct iomap_ops *ops,
> const struct iomap_write_ops *write_ops, void *private)
> {
> + struct folio_batch fbatch;
> struct iomap_iter iter = {
> .inode = inode,
> .pos = pos,
> .len = len,
> .flags = IOMAP_ZERO,
> .private = private,
> + .fbatch = &fbatch,
> };
> struct address_space *mapping = inode->i_mapping;
> int ret;
> bool range_dirty;
>
> + folio_batch_init(&fbatch);
> +
> /*
> * To avoid an unconditional flush, check pagecache state and only flush
> * if dirty and the fs returns a mapping that might convert on
> @@ -1529,11 +1549,11 @@ iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero,
> while ((ret = iomap_iter(&iter, ops)) > 0) {
> const struct iomap *srcmap = iomap_iter_srcmap(&iter);
>
> - if (WARN_ON_ONCE(iter.fbatch &&
> + if (WARN_ON_ONCE((iter.iomap.flags & IOMAP_F_FOLIO_BATCH) &&
> srcmap->type != IOMAP_UNWRITTEN))
> return -EIO;
>
> - if (!iter.fbatch &&
> + if (!(iter.iomap.flags & IOMAP_F_FOLIO_BATCH) &&
> (srcmap->type == IOMAP_HOLE ||
> srcmap->type == IOMAP_UNWRITTEN)) {
> s64 status;
> diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
> index 8692e5e41c6d..c04796f6e57f 100644
> --- a/fs/iomap/iter.c
> +++ b/fs/iomap/iter.c
> @@ -8,10 +8,10 @@
>
> static inline void iomap_iter_reset_iomap(struct iomap_iter *iter)
> {
> - if (iter->fbatch) {
> + if (iter->iomap.flags & IOMAP_F_FOLIO_BATCH) {
> folio_batch_release(iter->fbatch);
> - kfree(iter->fbatch);
> - iter->fbatch = NULL;
> + folio_batch_reinit(iter->fbatch);
> + iter->iomap.flags &= ~IOMAP_F_FOLIO_BATCH;
> }
>
> iter->status = 0;
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index 04f39ea15898..37a1b33e9045 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -1831,7 +1831,6 @@ xfs_buffered_write_iomap_begin(
> */
> if (flags & IOMAP_ZERO) {
> xfs_fileoff_t eof_fsb = XFS_B_TO_FSB(mp, XFS_ISIZE(ip));
> - u64 end;
>
> if (isnullstartblock(imap.br_startblock) &&
> offset_fsb >= eof_fsb)
> @@ -1851,12 +1850,14 @@ xfs_buffered_write_iomap_begin(
> */
> if (imap.br_state == XFS_EXT_UNWRITTEN &&
> offset_fsb < eof_fsb) {
> - loff_t len = min(count,
> - XFS_FSB_TO_B(mp, imap.br_blockcount));
> + loff_t foffset = offset, fend;
>
> - end = iomap_fill_dirty_folios(iter, offset, len);
> + fend = offset +
> + min(count, XFS_FSB_TO_B(mp, imap.br_blockcount));
> + iomap_fill_dirty_folios(iter, &foffset, fend,
> + &iomap_flags);
> end_fsb = min_t(xfs_fileoff_t, end_fsb,
> - XFS_B_TO_FSB(mp, end));
> + XFS_B_TO_FSB(mp, foffset));
> }
>
> xfs_trim_extent(&imap, offset_fsb, end_fsb - offset_fsb);
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 8b1ac08c7474..ce2c9fbd8e16 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -88,6 +88,9 @@ struct vm_fault;
> /*
> * Flags set by the core iomap code during operations:
> *
> + * IOMAP_F_FOLIO_BATCH indicates that the folio batch mechanism is active
> + * for this operation, set by iomap_fill_dirty_folios().
> + *
> * IOMAP_F_SIZE_CHANGED indicates to the iomap_end method that the file size
> * has changed as the result of this write operation.
> *
> @@ -95,6 +98,7 @@ struct vm_fault;
> * range it covers needs to be remapped by the high level before the operation
> * can proceed.
> */
> +#define IOMAP_F_FOLIO_BATCH (1U << 13)
> #define IOMAP_F_SIZE_CHANGED (1U << 14)
> #define IOMAP_F_STALE (1U << 15)
>
> @@ -352,8 +356,8 @@ bool iomap_dirty_folio(struct address_space *mapping, struct folio *folio);
> int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len,
> const struct iomap_ops *ops,
> const struct iomap_write_ops *write_ops);
> -loff_t iomap_fill_dirty_folios(struct iomap_iter *iter, loff_t offset,
> - loff_t length);
> +unsigned int iomap_fill_dirty_folios(struct iomap_iter *iter, loff_t *start,
> + loff_t end, unsigned int *iomap_flags);
> int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len,
> bool *did_zero, const struct iomap_ops *ops,
> const struct iomap_write_ops *write_ops, void *private);
> --
> 2.51.1
>
>
prev parent reply other threads:[~2025-12-02 16:54 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-13 13:54 [PATCH v2] iomap: replace folio_batch allocation with stack allocation Brian Foster
2025-12-02 13:37 ` Brian Foster
2025-12-02 13:38 ` Christoph Hellwig
2025-12-02 16:54 ` Darrick J. Wong [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20251202165420.GG89472@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=bfoster@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox