From: bugzilla-daemon@bugzilla.kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 208827] [fio io_uring] io_uring write data crc32c verify failed
Date: Tue, 11 Aug 2020 21:59:18 +0000
Message-ID: <bug-208827-201763-w1a7mTghXy@https.bugzilla.kernel.org/>
In-Reply-To: <bug-208827-201763@https.bugzilla.kernel.org/>
https://bugzilla.kernel.org/show_bug.cgi?id=208827
--- Comment #18 from Dave Chinner (david@fromorbit.com) ---
On Tue, Aug 11, 2020 at 07:10:30AM -0600, Jens Axboe wrote:
> On 8/11/20 1:05 AM, Dave Chinner wrote:
> > On Mon, Aug 10, 2020 at 08:19:57PM -0600, Jens Axboe wrote:
> >> On 8/10/20 8:00 PM, Dave Chinner wrote:
> >>> On Mon, Aug 10, 2020 at 07:08:59PM +1000, Dave Chinner wrote:
> >>>> On Mon, Aug 10, 2020 at 05:08:07PM +1000, Dave Chinner wrote:
> >>>>> [cc Jens]
> >>>>>
> >>>>> [Jens, data corruption w/ io_uring and simple fio reproducer. see
> >>>>> the bz link below.]
> >>>
> >>> Looks like an io_uring/fio bug at this point, Jens. All your go fast
> >>> bits turn the buffered read into a short read, and neither fio nor
> >>> the io_uring async buffered read path handles short reads. Details below.
> >>
> >> It's a fio issue. The io_uring engine uses a different path for short
> >> IO completions, and that's being ignored by the backend... Hence the
> >> IO just gets completed and not retried for this case, and that'll then
> >> trigger verification as if it did complete. I'm fixing it up.
> >
> > I just updated fio to:
> >
> > cb7d7abb (HEAD -> master, origin/master, origin/HEAD) io_u: set io_u->verify_offset in fill_io_u()
> >
> > The workload still reports corruption almost instantly. Only this
> > time, the trace is not reporting a short read.
> >
> > File is patterned with:
> >
> > verify_pattern=0x33333333%o-16
> >
> > Offset of "bad" data is 0x1240000.
> >
> > Expected:
> >
> > 00000000: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000010: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000020: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000030: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000040: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000050: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000060: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000070: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000080: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > .....
> > 0000ffd0: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 0000ffe0: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 0000fff0: 33 33 33 33 00 10 24 01 00 00 00 00 f0 ff ff ff 3333............
> >
> >
> > Received:
> >
> > 00000000: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000010: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000020: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000030: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000040: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000050: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000060: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000070: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 00000080: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > .....
> > 0000ffd0: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 0000ffe0: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> > 0000fff0: 33 33 33 33 00 00 24 01 00 00 00 00 f0 ff ff ff 3333............
> >
> >
> > Looks like the data in the expected buffer is wrong - the data
> > pattern in the received buffer is correct according to the defined
> > pattern.
> >
> > Error is 100% reproducible from the same test case. Same bad byte in
> > the expected buffer dump every single time.
>
> What job file are you running? It's not impossible that I broke
> something else in fio, the io_u->verify_offset is a bit risky... I'll
> get it fleshed out shortly.
Details are in the bugzilla I pointed you at. I modified the
original config specified to put per-file and offset identifiers
into the file data rather than using random data. This is
"determining the origin of stale data 101" stuff - the first thing
we _always_ do when trying to diagnose data corruption is identify
where the bad data came from.
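
To illustrate why stamping offsets into the data helps, here is a minimal Python sketch that decodes the two dumps quoted above. It assumes, based only on those dumps, that `%o` expands to the block offset as 8 little-endian bytes and that `-16` expands to a 32-bit little-endian value. The helper names `stamp` and `decode_offset` are hypothetical and are not part of fio.

```python
import struct

def stamp(offset: int) -> bytes:
    """Build one 16-byte pattern unit: 0x33333333, 64-bit LE offset, 32-bit LE -16.

    Layout is an assumption inferred from the hex dumps, not taken from fio source.
    """
    return b"\x33" * 4 + struct.pack("<Q", offset) + struct.pack("<i", -16)

def decode_offset(unit: bytes) -> int:
    """Extract the offset stamped into bytes 4..11 of a 16-byte pattern unit."""
    return struct.unpack_from("<Q", unit, 4)[0]

# First 16 bytes of each buffer, copied from the dumps above.
expected = bytes.fromhex("33333333 00102401 00000000 f0ffffff")
received = bytes.fromhex("33333333 00002401 00000000 f0ffffff")

bad_offset = 0x1240000  # offset of the "bad" data reported by the verify failure

print(hex(decode_offset(received)))  # 0x1240000
print(hex(decode_offset(expected)))  # 0x1241000
print(stamp(bad_offset) == received)  # True
```

Under those assumptions, the received data carries the stamp for offset 0x1240000, which is the offset being verified, while the expected buffer carries the stamp for 0x1241000, which is 0x1000 (4096) bytes further on.
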
Entire config file is below.
Cheers,
Dave.