linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Joanne Koong <joannelkoong@gmail.com>
Cc: Gao Xiang <hsiangkao@linux.alibaba.com>,
	Christoph Hellwig <hch@infradead.org>,
	brauner@kernel.org, djwong@kernel.org,
	linux-fsdevel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH v1 1/9] iomap: account for unaligned end offsets when truncating read range
Date: Fri, 17 Oct 2025 14:27:13 -0400	[thread overview]
Message-ID: <aPKKgctbV4TQ9vid@bfoster> (raw)
In-Reply-To: <aPJhy4kEw0M7vtz-@bfoster>

On Fri, Oct 17, 2025 at 11:33:31AM -0400, Brian Foster wrote:
> On Thu, Oct 16, 2025 at 03:39:27PM -0700, Joanne Koong wrote:
> > On Thu, Oct 16, 2025 at 4:25 AM Brian Foster <bfoster@redhat.com> wrote:
> > >
> > > On Wed, Oct 15, 2025 at 06:27:10PM -0700, Joanne Koong wrote:
> > > > On Wed, Oct 15, 2025 at 5:36 PM Joanne Koong <joannelkoong@gmail.com> wrote:
> > > > >
> > > > > On Wed, Oct 15, 2025 at 11:39 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> > > > > >
> > > > > > Hi Joanne,
> > > > > >
> > > > > > On 2025/10/16 02:21, Joanne Koong wrote:
> > > > > > > On Wed, Oct 15, 2025 at 11:06 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> > > > > >
> > > > > > ...
> > > > > >
> > > > > > >>>
> > > > > > >>> This is where I encountered it in erofs: [1] for the "WARNING in
> > > > > > >>> iomap_iter_advance" syz repro. (this syzbot report was generated in
> > > > > > >>> response to this patchset version [2]).
> > > > > > >>>
> > > > > > >>> When I ran that syz program locally, I remember seeing pos=116 and length=3980.
> > > > > > >>
> > > > > > >> I just ran the C repro locally with the upstream codebase (but I
> > > > > > >> didn't use the related Kconfig), and it doesn't show anything.
> > > > > > >
> > > > > > > Which upstream commit are you running it on? It needs to be run on top
> > > > > > > of this patchset [1] but without this fix [2]. These changes are in
> > > > > > > Christian's vfs-6.19.iomap branch in his vfs tree but I don't think
> > > > > > > that branch has been published publicly yet so maybe just patching it
> > > > > > > in locally will work best.
> > > > > > >
> > > > > > > When I reproed it last month, I used the syz executor (not the C
> > > > > > > repro, though that should probably work too?) directly with the
> > > > > > > kconfig they had.
> > > > > >
> > > > > > I believe it's a regression somewhere since it's a valid
> > > > > > IOMAP_INLINE extent (since it's inlined, the length is not
> > > > > > block-aligned of course), you could add a print just before
> > > > > > erofs_iomap_begin() returns.
> > > > >
> > > > > Ok, so if erofs is strictly block-aligned except for tail inline data
> > > > > (eg the IOMAP_INLINE extent), then I agree, there is a regression
> > > > > somewhere as we shouldn't be running into the situation where erofs is
> > > > > calling iomap_adjust_read_range() with a non-block-aligned position
> > > > > and length. I'll track the offending commit down tomorrow.
> > > > >
> > > >
> > > > Ok, I think it's commit bc264fea0f6f ("iomap: support incremental
> > > > iomap_iter advances") that changed this behavior for erofs such that
> > > > the read iteration continues even after encountering an IOMAP_INLINE
> > > > extent, whereas before, the iteration stopped after reading in the
> > > > iomap inline extent. This leads erofs to end up in the situation where
> > > > it calls into iomap_adjust_read_range() with a non-block-aligned
> > > > position/length (on that subsequent iteration).
> > > >
> > > > In particular, this change in commit bc264fea0f6f to iomap_iter():
> > > >
> > > > -       if (ret > 0 && !iter->processed && !stale)
> > > > +       if (ret > 0 && !advanced && !stale)
> > > >
> > > > For iomap inline extents, iter->processed is 0, which stopped the
> > > > iteration before. But now, advanced (which is iter->pos -
> > > > iter->iter_start_pos) is used which will continue the iteration (since
> > > > the iter is advanced after reading in the iomap inline extents).
> > > >
> > > > Erofs is able to handle subsequent iterations after iomap_inline
> > > > extents because erofs_iomap_begin() checks the block map and returns
> > > > IOMAP_HOLE if it's not mapped
> > > >         if (!(map.m_flags & EROFS_MAP_MAPPED)) {
> > > >                 iomap->type = IOMAP_HOLE;
> > > >                 return 0;
> > > >         }
> > > >
> > > > but I think what probably would be better is a separate patch that
> > > > reverts this back to the original behavior of stopping the iteration
> > > > after IOMAP_INLINE extents are read in.
> > > >
> > >
> > > Hmm.. so as of commit bc264fea0f6f, it looks like the read_inline() path
> > > still didn't advance the iter at all by that point. It just returned 0
> > > and this caused iomap_iter() to break out of the iteration loop.
> > >
> > > The logic noted above in iomap_iter() is basically saying to break out
> > > if the iteration did nothing, which is a bit of a hacky way to terminate
> > > an IOMAP_INLINE read. The proper thing to do in that path IMO is to
> > > report the bytes processed and then terminate some other way more
> > > naturally. I see Gao actually fixed this sometime later in commit
> > > b26816b4e320 ("iomap: fix inline data on buffered read"), which is when
> > > the inline read path started to advance the iter.
> > 
> > That's a good point, the fix in commit b26816b4e320 is what led to the
> > new erofs behavior, not commit bc264fea0f6f.
> > 
> > >
> > > TBH, the behavior described above where we advance over the inline
> > > mapping and then report any remaining iter length as a hole also sounds
> > > like reasonably appropriate behavior to me. I suppose you could argue
> > > that the inline case should just terminate the iter, which perhaps means
> > > it should call iomap_iter_advance_full() instead. That technically
> > > hardcodes that we will never process mappings beyond an inline mapping
> > > into iomap. That bugs me a little bit, but is also probably always going
> > > to be true so doesn't seem like that big of a deal.
> > 
> > Reporting any remaining iter length as a hole also sounds reasonable
> > to me but it seems that this may add additional work that may not be
> > trivial. For example, I'm looking at erofs's erofs_map_blocks() call
> > which they would need to do to figure out it should be an iomap hole.
> > It seems a bit nonideal to me that any filesystem using iomap inline
> > data would also have to make sure they cover this case, which maybe
> > they already need to do that, I'm not sure, but it seems like an extra
> > thing they would now need to account for.
> > 
> 
> To make sure I follow, you're talking about any potential non-erofs
> cases, right? I thought it was mentioned that erofs already handled this
> correctly by returning a hole. So I take this to mean other users would
> need to handle this similarly.
> 
> Poking around, I see that ext4 uses iomap and IOMAP_INLINE in a couple
> isolated cases. One looks like swapfile activation and the other appears
> to be related to fiemap. Any idea if those are working correctly? At
> first look it kind of looks like they check and return error for offset
> beyond the specified length..
> 

JFYI from digging further.. ext4 returns -ENOENT for this "offset beyond
inline size" case. I didn't see any error returns from fiemap/filefrag
on a quick test against an inline data file. It looks like
iomap_fiemap() actually ignores errors either if the previous mapping is
non-hole, or catches -ENOENT explicitly to handle an attribute mapping
case. So afaict this all seems to work Ok..

I'm not sure this is all that relevant for the swap case. mkswap
apparently requires a file at least 40k in size, which iiuc is beyond
the scope of files with inline data..

Brian

> > One scenario I'm imagining maybe being useful in the future is
> > supporting chained inline mappings, in which case we would still want
> > to continue processing after the first inline mapping, but we could
> > also address that if it does come up.
> > 
> 
> Yeah..
> 
> > >
> > > If we wanted to consider it an optimization so we didn't always do this
> > > extra iter on inline files, perhaps another variant of that could be an
> > > EOF flag or some such that the fs could set to trigger a full advance
> > > after the current mapping. OTOH you could argue that's what inline
> > > already is so maybe that's overthinking it. Just a thought. Hm?
> > >
> > 
> > Would non-inline iomap types also find that flag helpful or is the
> > intention for it to be inline specific? I guess the flag could also be
> > used by non-inline types to stop the iteration short, if there's a use
> > case for that?
> > 
> 
> ... I hadn't really considered that. This was just me thinking out loud
> about trying to handle things more generically.
> 
> Having thought more about it, I think either way it might just make
> sense to fix the inline read case to do the full advance (assuming it is
> verified this fixes the problem) to restore historical iomap behavior.
> Then if there is ever value to support multiple inline mappings somehow
> or another, we could revisit the flag idea.
> 
> Brian
> 
> > Thanks,
> > Joanne
> > 
> > > Brian
> > >
> > > > So I don't think this patch should have a fixes: tag for that commit.
> > > > It seems to me like no one was hitting this path before with a
> > > > non-block-aligned position and offset. Though now there will be a use
> > > > case for it, which is fuse.
> > > >
> > > > Thanks,
> > > > Joanne
> > > >
> > > > >
> > > > > Thanks,
> > > > > Joanne
> > > > >
> > > > > >
> > > > > > Also see my reply:
> > > > > > https://lore.kernel.org/r/cff53c73-f050-44e2-9c61-96552c0e85ab@linux.alibaba.com
> > > > > >
> > > > > > I'm not sure if it caused user-visible regressions since
> > > > > > erofs images work properly with upstream code (unlike a
> > > > > > previous regression fixed by commit b26816b4e320 ("iomap:
> > > > > > fix inline data on buffered read")).
> > > > > >
> > > > > > But a fixes tag is needed since it causes an unexpected
> > > > > > WARNING at least.
> > > > > >
> > > > > > Thanks,
> > > > > > Gao Xiang
> > > > > >
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Joanne
> > > > > > >
> > > > > > > [1] https://lore.kernel.org/linux-fsdevel/20250926002609.1302233-1-joannelkoong@gmail.com/T/#t
> > > > > > > [2] https://lore.kernel.org/linux-fsdevel/20250922180042.1775241-1-joannelkoong@gmail.com/
> > > > > > > [3] https://lore.kernel.org/linux-fsdevel/20250926002609.1302233-1-joannelkoong@gmail.com/T/#m4ce4707bf98077cde4d1d4845425de30cf2b00f6
> > > > > > >
> > > > > > >>
> > > > > > >> I feel strange why pos is unaligned, does this warning show
> > > > > > >> without your patchset on your side?
> > > > > > >>
> > > > > > >> Thanks,
> > > > > > >> Gao Xiang
> > > > > > >>
> > > > > > >>>
> > > > > > >>> Thanks,
> > > > > > >>> Joanne
> > > > > > >>>
> > > > > > >>> [1] https://ci.syzbot.org/series/6845596a-1ec9-4396-b9c4-48bddc606bef
> > > > > > >>> [2] https://lore.kernel.org/linux-fsdevel/68ca71bd.050a0220.2ff435.04fc.GAE@google.com/
> > > > > > >>>
> > > > > > >>>>
> > > > > > >>>> Thanks,
> > > > > > >>>> Gao Xiang
> > > > > > >>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>> Thanks,
> > > > > > >>>>> Joanne
> > > > > > >>>>>
> > > > > > >>>>>>
> > > > > > >>>>>> Otherwise looks good:
> > > > > > >>>>>>
> > > > > > >>>>>> Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > > > > >>>>
> > > > > > >>
> > > > > >
> > > >
> > >
> > 
> 
> 


  reply	other threads:[~2025-10-17 18:23 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-09 22:56 [PATCH v1 0/9] iomap: buffered io changes Joanne Koong
2025-10-09 22:56 ` [PATCH v1 1/9] iomap: account for unaligned end offsets when truncating read range Joanne Koong
2025-10-13  3:00   ` Christoph Hellwig
2025-10-15 17:34     ` Joanne Koong
2025-10-15 17:40       ` Gao Xiang
2025-10-15 17:49         ` Joanne Koong
2025-10-15 18:06           ` Gao Xiang
2025-10-15 18:21             ` Joanne Koong
2025-10-15 18:39               ` Gao Xiang
2025-10-16  0:36                 ` Joanne Koong
2025-10-16  1:27                   ` Joanne Koong
2025-10-16  1:58                     ` Gao Xiang
2025-10-16 22:03                       ` Joanne Koong
2025-10-17  0:03                         ` Gao Xiang
2025-10-17 18:41                           ` Joanne Koong
2025-10-17 22:07                             ` Gao Xiang
2025-10-17 23:22                               ` Joanne Koong
2025-10-18  0:12                                 ` Gao Xiang
2025-10-20 23:25                                   ` Joanne Koong
2025-10-21  1:39                                     ` Gao Xiang
2025-10-16 11:29                     ` Brian Foster
2025-10-16 22:39                       ` Joanne Koong
2025-10-17 15:33                         ` Brian Foster
2025-10-17 18:27                           ` Brian Foster [this message]
2025-10-17 20:19                             ` Joanne Koong
2025-10-18 10:30                               ` Brian Foster
2025-10-20 21:39                                 ` Joanne Koong
2025-10-15 18:28             ` Gao Xiang
2025-10-09 22:56 ` [PATCH v1 2/9] docs: document iomap writeback's iomap_finish_folio_write() requirement Joanne Koong
2025-10-13  3:01   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 3/9] iomap: optimize pending async writeback accounting Joanne Koong
2025-10-13  3:04   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 4/9] iomap: simplify ->read_folio_range() error handling for reads Joanne Koong
2025-10-13  3:06   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 5/9] iomap: simplify when reads can be skipped for writes Joanne Koong
2025-10-13  3:06   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 6/9] iomap: optimize reads for non-block-aligned writes Joanne Koong
2025-10-13  3:08   ` Christoph Hellwig
2025-10-14  0:04     ` Joanne Koong
2025-10-14  4:14       ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 7/9] iomap: use loff_t for file positions and offsets in writeback code Joanne Koong
2025-10-13  3:09   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 8/9] iomap: use find_next_bit() for dirty bitmap scanning Joanne Koong
2025-10-13  3:13   ` Christoph Hellwig
2025-10-09 22:56 ` [PATCH v1 9/9] iomap: use find_next_bit() for uptodate " Joanne Koong
2025-10-13  3:13   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aPKKgctbV4TQ9vid@bfoster \
    --to=bfoster@redhat.com \
    --cc=brauner@kernel.org \
    --cc=djwong@kernel.org \
    --cc=hch@infradead.org \
    --cc=hsiangkao@linux.alibaba.com \
    --cc=joannelkoong@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).