Linux io-uring development
From: Brian Foster <bfoster@redhat.com>
To: Gregg Leventhal <gleventhal@janestreet.com>
Cc: Eric Hagberg <ehagberg@janestreet.com>,
	hch@infradead.org, djwong@kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
	Jens Axboe <axboe@kernel.dk>,
	stable@vger.kernel.org
Subject: Re: [BUG] iomap/io_uring: O_APPEND async buffered write silently re-appends a data chunk (corruption) on XFS, 6.1.y/6.12.y
Date: Wed, 10 Jun 2026 13:34:06 -0400
Message-ID: <aimgDnzB_NYqOTx1@bfoster>
In-Reply-To: <CAFN_u7ELBj3YKncm6HA4-QUNyi-a3qPDEYxuLP+skVhm-r87uw@mail.gmail.com>

On Tue, Jun 09, 2026 at 01:14:40PM -0400, Gregg Leventhal wrote:
> I reproduce it by running ~25 concurrent instances of the attached reproducer,
> each writing its own file, on an otherwise-idle 15 GB VM:
> 
>   DIR=$(mktemp -d /tmp/uring.XXXXXX)
>   for i in {1..25}; do
>       ./repro_uring_dup "$DIR/file_$i" 120 48 &
>   done
> ...
> *** CORRUPTION DETECTED in /tmp/UmgK/file_17.1 ***
>   bytes kernel said it wrote (sum of CQE results): 53621960
>   actual file size:                                56218824
>   extra (duplicated) bytes:                        2596864
>   first mismatching offset: 6791168 (0x67a000)  page_aligned=YES
>     expected u64 848896 but found 524288 (content from byte offset 4194304 reappeared here)
>   (file kept for inspection)
> 
> 
> 
>   wait
> 
> *** CORRUPTION DETECTED in /tmp/Gznx/file_18.2 ***
>   bytes kernel said it wrote (sum of CQE results): 58112616
>   actual file size:                                60303976
>   extra (duplicated) bytes:                        2191360
>   first mismatching offset: 2191360 (0x217000)  page_aligned=YES
>     expected u64 273920 but found 0 (content from byte offset 0 reappeared here)
>   (file kept for inspection)
> 

Thanks. I had to bump up the concurrency a bit and then was able to
reproduce.
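
The figures in the two quoted reports can be cross-checked against each
other. The sketch below does only that arithmetic. It assumes, based purely
on the numbers shown (the reproducer source is not in this message), that
each u64 in the file holds its own byte offset divided by 8, and that the
page size is 4096. The name report_is_consistent is hypothetical.

```python
# Figures copied verbatim from the two corruption reports quoted above.
REPORTS = [
    dict(cqe_sum=53621960, file_size=56218824, extra=2596864,
         mismatch_off=6791168, expected=848896, found=524288,
         source_off=4194304),
    dict(cqe_sum=58112616, file_size=60303976, extra=2191360,
         mismatch_off=2191360, expected=273920, found=0,
         source_off=0),
]

WORD = 8      # assumption: each u64 stores (its byte offset / 8)
PAGE = 4096   # assumption: page size on the test VM


def report_is_consistent(r):
    """Return True if the numbers in one report agree with each other."""
    return (
        # surplus bytes on disk versus what the CQEs added up to
        r["file_size"] - r["cqe_sum"] == r["extra"]
        # the value the checker expected at the mismatching offset
        and r["mismatch_off"] == r["expected"] * WORD
        # the offset where the value actually found belongs
        and r["source_off"] == r["found"] * WORD
        # matches page_aligned=YES in the report
        and r["mismatch_off"] % PAGE == 0
    )


for r in REPORTS:
    print(report_is_consistent(r))
```

Both reports pass these checks. In the second one the number of extra bytes
equals the first mismatching offset and the reappearing content starts at
offset 0, which is what a re-append of the entire first chunk would look
like.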

The patch I sent survived my regression testing but when taking another
look at the upstream patch, I realized something else I had previously
missed. The code in master doesn't actually return -EAGAIN directly
along with partial completion. It just returns the partial completion,
loops again in iomap, and then presumably returns -EAGAIN at that point
which makes its way back to io_uring. I think that is mostly harmless
but technically a bug in the upstream patch as the intent was to be able
to advance the iter, return -EAGAIN, and let the operation unwind from
there.
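
To make the revert vs. partial completion distinction concrete, here is a
toy userspace model. It is a sketch of my reading of this thread, not kernel
code: the function names are hypothetical, the "budget" stands in for
whatever makes the non-blocking path stop early, and the extra loop
iteration on master described above is not modeled. The assumption is that
an append which has already grown the file, then rewinds its iterator and
returns -EAGAIN, gets retried in full at the new EOF.

```python
import errno


def nowait_append(file_data, buf, budget, revert_on_block):
    """Toy non-blocking append: copy up to `budget` bytes, then 'block'.

    Returns (result, consumed), where consumed is how far the caller's
    iterator is left advanced.
    """
    n = min(budget, len(buf))
    file_data += buf[:n]            # bytes land at EOF, so the file grows
    if n == len(buf):
        return n, n                 # everything fit, nothing to retry
    if revert_on_block:
        return -errno.EAGAIN, 0     # iterator rewound: progress is hidden
    return n, n                     # short write: progress is reported


def submit_append(file_data, buf, budget, revert_on_block):
    """Model the caller: finish whatever is left with a blocking append."""
    res, consumed = nowait_append(file_data, buf, budget, revert_on_block)
    done = max(res, 0)
    rest = buf[consumed:]
    file_data += rest               # retry appends at the *current* EOF
    return done + len(rest)         # what the completion would report


payload = bytes(range(16))
for revert in (True, False):
    f = bytearray()
    reported = submit_append(f, payload, budget=4, revert_on_block=revert)
    print(revert, reported, len(f))
# prints "True 16 20" then "False 16 16"
```

In the revert case the completion reports 16 bytes while the file holds 20,
with the first 4 bytes duplicated ahead of the full payload. That has the
same shape as the "sum of CQE results" vs. "actual file size" gap in the
reports. Without the revert, the reported count and the file size agree.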

I think this actually leaves at least a couple of options here. One is that
we could presumably just do the same thing on stable as current master:
forget the flag and just remove the iov revert and direct -EAGAIN return,
at the cost of one more loop iteration before returning to the caller.
Another is to fix up the code in master and use the patch I posted as a
customized stable backport of that.

WRT the latter I suppose we could also just stick with this patch for
stable and I can follow up with a separate patch for the loop thing on
master. Hmm.. I want to think about it a little more so if any iomap
folks have Opinions in the meantime, let me know.

Brian

> 
> On Tue, Jun 9, 2026 at 12:20 PM Brian Foster <bfoster@redhat.com> wrote:
> >
> > On Mon, Jun 08, 2026 at 01:17:10PM -0400, Eric Hagberg wrote:
> > > On Mon, Jun 8, 2026 at 12:03 PM Brian Foster <bfoster@redhat.com> wrote:
> > > > Another idea that came to mind is to try and just replace the -EAGAIN
> > > > return sequence from the low level iterator with a flag that triggers
> > > > -EAGAIN from the next iter advance. The idea here is to allow the write
> > > > to return partial completion (i.e. so no iov_iter revert) without having
> > > > to return an error from the lowest level in the stack. I had claude come
> > > > up with a quick patch [1] for reference/experimentation.
> > > >
> > > > This is based on v6.12 stable and compile tested only. It needs more
> > > > review and testing in general but might be worth throwing your
> > > > reproducer at if you can..?
> > >
> > > With that patch applied, the reproducer runs clean - no errors - and
> > > gets roughly the same performance (maybe slightly better) as when run
> > > against a 6.18 kernel on the same VM.
> > >
> >
> > Thanks for testing. I'll look into some more regression testing of this
> > patch and try to clean it up and post it for proper review for stable.
> >
> > Are you using the reproducer program in your original mail to test? If
> > so, does it require some concurrent memory pressure to reproduce, and
> > are you using anything in particular for that?
> >
> > That test seems small enough that we could potentially include it in
> > fstests, though I'm still not so sure about the mem pressure part..
> > Since you guys wrote the test, any interest in porting into fstests? If
> > not I can look into it.
> >
> > Brian
> >
> > > Thanks,
> > > -Eric
> > >
> >
> 


