public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Brian Foster <bfoster@redhat.com>
Cc: Fengnan Chang <fengnanchang@gmail.com>,
	brauner@kernel.org, hch@infradead.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	lidiangang@bytedance.com,
	Fengnan Chang <changfengnan@bytedance.com>
Subject: Re: [PATCH] iomap: avoid memset iomap when iter is done
Date: Thu, 16 Apr 2026 08:27:05 -0700	[thread overview]
Message-ID: <20260416152705.GC114239@frogsfrogsfrogs> (raw)
In-Reply-To: <aeDjui2LGSidEHcJ@bfoster>

On Thu, Apr 16, 2026 at 09:27:22AM -0400, Brian Foster wrote:
> On Thu, Apr 16, 2026 at 11:06:42AM +0800, Fengnan Chang wrote:
> > When iomap_iter() finishes its iteration (returns <= 0), it is no longer
> > necessary to memset the entire iomap and srcmap structures.
> > 
> > In high-IOPS scenarios (like 4k randread NVMe polling with io_uring),
> > where the majority of I/Os complete in a single extent map, this wasted
> > memory write bandwidth, as the caller will just discard the iterator.
> > 
> > Use this command to test:
> > taskset -c 30 ./t/io_uring -p1 -d512 -b4096 -s32 -c32 -F1 -B1 -R1 -X1
> > -n1 -P1 /mnt/testfile
> > IOPS improve about 5% on ext4 and XFS.
> > 
> > However, we MUST still call iomap_iter_reset_iomap() to release the
> > folio_batch if IOMAP_F_FOLIO_BATCH is set, otherwise we leak page
> > references. Therefore, split the cleanup logic: always release the
> > folio_batch, but skip the memset() when ret <= 0.
> > 
> > Signed-off-by: Fengnan Chang <changfengnan@bytedance.com>
> > ---
> >  fs/iomap/iter.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/iomap/iter.c b/fs/iomap/iter.c
> > index c04796f6e57f..91eb5e6165ff 100644
> > --- a/fs/iomap/iter.c
> > +++ b/fs/iomap/iter.c
> > @@ -15,8 +15,6 @@ static inline void iomap_iter_reset_iomap(struct iomap_iter *iter)
> >  	}
> >  
> >  	iter->status = 0;
> > -	memset(&iter->iomap, 0, sizeof(iter->iomap));
> > -	memset(&iter->srcmap, 0, sizeof(iter->srcmap));
> >  }
> >  
> >  /* Advance the current iterator position and decrement the remaining length */
> > @@ -106,6 +104,9 @@ int iomap_iter(struct iomap_iter *iter, const struct iomap_ops *ops)
> >  	if (ret <= 0)
> >  		return ret;
> >  
> > +	memset(&iter->iomap, 0, sizeof(iter->iomap));
> > +	memset(&iter->srcmap, 0, sizeof(iter->srcmap));
> > +
> 
> This seems reasonable to me in principle, but it feels a little odd to
> leave a reset helper that doesn't really do a "reset." I wonder if this
> should be refactored into an iomap_iter_complete() (i.e. "complete an
> iteration") helper that includes the ret assignment logic just above the
> reset call and returns it, and then maybe leave a oneline comment above
> the memset so somebody doesn't blindly fold it back in the future. So
> for example:
> 
> 	ret = iomap_iter_complete(iter);
> 	if (ret <= 0)
> 		return ret;
> 
> 	/* save cycles and only clear the mappings if we plan to iterate */
> 	memset(..);
> 	...
> 
> We'd probably have to recheck some of the iter state within the new
> helper, but that doesn't seem like a big deal to me. Thoughts?

What kind of computer is this where there's a 5% hit to iops from a
memset of ~150 bytes?

--D

> Brian
> 
> >  begin:
> >  	ret = ops->iomap_begin(iter->inode, iter->pos, iter->len, iter->flags,
> >  			       &iter->iomap, &iter->srcmap);
> > -- 
> > 2.39.5 (Apple Git-154)
> > 
> > 
> 

  reply	other threads:[~2026-04-16 15:27 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-16  3:06 [PATCH] iomap: avoid memset iomap when iter is done Fengnan Chang
2026-04-16 13:27 ` Brian Foster
2026-04-16 15:27   ` Darrick J. Wong [this message]
2026-04-16 22:42     ` Dave Chinner
2026-04-17  2:15       ` changfengnan
2026-04-16 23:20     ` Mateusz Guzik
2026-04-17  2:19       ` changfengnan
2026-04-17  7:27   ` Christoph Hellwig
2026-04-17  9:15     ` changfengnan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260416152705.GC114239@frogsfrogsfrogs \
    --to=djwong@kernel.org \
    --cc=bfoster@redhat.com \
    --cc=brauner@kernel.org \
    --cc=changfengnan@bytedance.com \
    --cc=fengnanchang@gmail.com \
    --cc=hch@infradead.org \
    --cc=lidiangang@bytedance.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox