From: "Darrick J. Wong" <djwong@kernel.org>
To: Matthew Wilcox <willy@infradead.org>
Cc: Joanne Koong <joannelkoong@gmail.com>,
brauner@kernel.org, wegao@suse.com, sashal@kernel.org,
hch@infradead.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v1 1/1] iomap: don't mark folio uptodate if read IO has bytes pending
Date: Fri, 20 Feb 2026 15:45:21 -0800 [thread overview]
Message-ID: <20260220234521.GA11069@frogsfrogsfrogs> (raw)
In-Reply-To: <aZaQO0jQaZXakwOA@casper.infradead.org>
On Thu, Feb 19, 2026 at 04:23:23AM +0000, Matthew Wilcox wrote:
> On Wed, Feb 18, 2026 at 06:45:34PM -0800, Darrick J. Wong wrote:
> > On Wed, Feb 18, 2026 at 04:39:11PM -0800, Joanne Koong wrote:
> > > If a folio has ifs metadata attached to it and the folio is partially
> > > read in through an async IO helper with the rest of it then being read
> > > in through post-EOF zeroing or as inline data, and the helper
> > > successfully finishes the read first, then post-EOF zeroing / reading
> > > inline will mark the folio as uptodate in iomap_set_range_uptodate().
> > >
> > > This is a problem because when the read completion path later calls
> > > iomap_read_end(), it will call folio_end_read(), which sets the uptodate
> > > bit using XOR semantics. Calling folio_end_read() on a folio that was
> > > already marked uptodate clears the uptodate bit.
> >
> > Aha, I wondered if that xor thing was going to come back to bite us.
>
> This isn't "the xor thing has come back to bite us". This is "the iomap
> code is now too complicated and I cannot figure out how to explain to
> Joanne that there's really a simple way to do this".
>
> I'm going to have to set aside my current projects and redo the iomap
> readahead/read_folio code myself, aren't I?
<willy and I had a chat; this is a clumsy non-AI summary of it>
I started looking at folio read state management in iomap, and made a
few observations that (I hope) match what willy's grumpy about.
There are three ways that iomap can be reading into the pagecache:
a) async ->readahead,
b) synchronous ->read_folio (page faults), and
c) synchronous ->read_folio_range (pagecache write).
(Note that (b) can call a different ->read_folio_range than (c), though
all implementations seem to have the same function)
All three of these IO paths share the behavior that they try to fill out
the folio's contents and set the corresponding folio/ifs uptodate bits
if that succeeds. Folio contents can come from anywhere, whether it's:
i) zeroing memory,
ii) copying from an inlinedata buffer, or
iii) asynchronously fetching the contents from somewhere
In the case of (c) above, if the read fails then we fail the write, and
if the read succeeds then we start copying to the pagecache.
However, (a) and (b) have this additional read_bytes_pending field in
the ifs that implements some extra tracking. AFAICT the purpose of this
field is to ensure that we don't call folio_end_read prematurely if
there's an async read in progress. This can happen if iomap_iter
returns a negative errno on a partially processed folio, I think?
read_bytes_pending is initialized to the folio_size() at the start of a
read and subtracted from when parts of the folio are supplied, whether
that's synchronous zeroing or asynchronous read ioend completion. When
the field reaches zero, we can then call folio_end_read().
But then there are twists, like the fact that we only call
iomap_read_init() to set read_bytes_pending if we decide to do an
asynchronous read. Or that iomap_read_end and iomap_finish_folio_read
have awfully similar code. I think in the case of (i) and (ii) we also
don't touch read_pending_bytes at all, and merely set the uptodate bits?
This is confusing to me. It would be more straightforward (I think) if
we just did it for all cases instead of adding more conditionals. IOWs,
how hard would it be to consolidate the read code so that there's one
function that iomap calls when it has filled out part of a folio. Is
that possible, even though we shouldn't be calling folio_end_read during
a pagecache write?
At the end of the day, however, there's a bug in Linus' tree and we need
to fix it, so Joanne's patch is a sufficient bandaid until we can go
clean this up.
--D
next prev parent reply other threads:[~2026-02-20 23:45 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-02-19 0:39 [PATCH v1 0/1] iomap: don't mark folio uptodate if read IO has bytes pending Joanne Koong
2026-02-19 0:39 ` [PATCH v1 1/1] " Joanne Koong
2026-02-19 2:45 ` Darrick J. Wong
2026-02-19 4:23 ` Matthew Wilcox
2026-02-19 6:11 ` Darrick J. Wong
2026-02-19 6:11 ` Christoph Hellwig
2026-02-20 22:13 ` Joanne Koong
2026-02-20 23:45 ` Darrick J. Wong [this message]
2026-02-23 23:53 ` Joanne Koong
2026-02-24 15:16 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260220234521.GA11069@frogsfrogsfrogs \
--to=djwong@kernel.org \
--cc=brauner@kernel.org \
--cc=hch@infradead.org \
--cc=joannelkoong@gmail.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=sashal@kernel.org \
--cc=wegao@suse.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox