All of lore.kernel.org
 help / color / mirror / Atom feed
From: bugzilla-daemon@bugzilla.kernel.org
To: linux-xfs@vger.kernel.org
Subject: [Bug 209243] New: fsx IO_URING reading get BAD DATA
Date: Sat, 12 Sep 2020 05:46:32 +0000	[thread overview]
Message-ID: <bug-209243-201763@https.bugzilla.kernel.org/> (raw)

https://bugzilla.kernel.org/show_bug.cgi?id=209243

            Bug ID: 209243
           Summary: fsx IO_URING reading get BAD DATA
           Product: File System
           Version: 2.5
    Kernel Version: 5.9.0-rc4 with xfs-5.10-merge-1
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: XFS
          Assignee: filesystem_xfs@kernel-bugs.kernel.org
          Reporter: zlang@redhat.com
        Regression: No

Description of problem:

I've maken xfstests' fsstress and fsx support IO_URING read/write, refer to
below patchset:
https://marc.info/?l=fstests&m=159980858202447&w=2

But the fsx io_uring test on XFS failed:

# ./ltp/fsx -U -R -W -l $((128000*5)) -o 128000 /mnt/scratch/foo
truncating to largest ever: 0x8d1ef
copying to largest ever: 0x94e29
zero_range to largest ever: 0x9c400
READ BAD DATA: offset = 0xadf4, size = 0x1de1d, fname = /mnt/scratch/foo
OFFSET  GOOD    BAD     RANGE
0x17000 0x7e57  0x0000  0x00000
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17001 0x577e  0x0000  0x00001
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17002 0x7e81  0x0000  0x00002
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17003 0x817e  0x0000  0x00003
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17004 0x7ee5  0x0000  0x00004
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17005 0xe57e  0x0000  0x00005
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17006 0x7ec3  0x0000  0x00006
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17007 0xc37e  0x0000  0x00007
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17008 0x7e60  0x0000  0x00008
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x17009 0x607e  0x0000  0x00009
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700a 0x7ee4  0x0000  0x0000a
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700b 0xe47e  0x0000  0x0000b
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700c 0x7e16  0x0000  0x0000c
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700d 0x167e  0x0000  0x0000d
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700e 0x7ebb  0x0000  0x0000e
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x1700f 0xbb7e  0x0000  0x0000f
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
LOG DUMP (6784 total operations):
1(  1 mod 256): TRUNCATE UP     from 0x0 to 0x8d1ef     ******WWWW
2(  2 mod 256): CLONE 0x74000 thru 0x7ffff      (0xc000 bytes) to 0x11000 thru
0x1cfff  ******JJJJ
3(  3 mod 256): COPY 0x896df thru 0x8d1ee       (0x3b10 bytes) to 0x39fd3 thru
0x3dae2
4(  4 mod 256): READ     0x5c4d2 thru 0x72c7d   (0x167ac bytes)
5(  5 mod 256): COPY 0x4fad2 thru 0x676c1       (0x17bf0 bytes) to 0x213f7 thru
0x38fe6
...
...
6775(119 mod 256): WRITE    0x42e2b thru 0x546bf        (0x11895 bytes)
6776(120 mod 256): TRUNCATE DOWN        from 0x7ebed to 0x4a2e5
6777(121 mod 256): READ     0x1c41 thru 0x10ef9 (0xf2b9 bytes)
6778(122 mod 256): PUNCH    0x2ba67 thru 0x466da        (0x1ac74 bytes)
6779(123 mod 256): INSERT 0x3c000 thru 0x59fff  (0x1e000 bytes)
6780(124 mod 256): ZERO     0x2648e thru 0x3bdfc        (0x1596f bytes)
6781(125 mod 256): WRITE    0x2532d thru 0x3b5ed        (0x162c1 bytes)
6782(126 mod 256): WRITE    0xc049 thru 0x20ea9 (0x14e61 bytes) ***WWWW
6783(127 mod 256): FALLOC   0x2420 thru 0x19c4f (0x1782f bytes) INTERIOR
6784(128 mod 256): READ     0xadf4 thru 0x28c10 (0x1de1d bytes) ***RRRR***
Log of operations saved to "/mnt/scratch/foo.fsxops"; replay with --replay-ops
Correct content saved for comparison
(maybe hexdump "/mnt/scratch/foo" vs "/mnt/scratch/foo.fsxgood")

I'll upload foo foo.fsxgood, foo.fsxops and foo.fsxlog files later.

It's easily to reproduce on my system, by the arguments I used above. But same
test can't reproduce this error on ext4.

Version-Release number of selected component (if applicable):
Fedora with upstream xfs-linux for-next HEAD = commit
18a1031619dece8f46b7deb2da477c8134d0bf89 xfs-5.10-merge-1

How reproducible:
100%

Steps to Reproduce:
1. mkfs.xfs -f /dev/mapper/testvg-scratchdev
meta-data=/dev/mapper/testvg-scratchdev isize=512    agcount=4, agsize=1966080
blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=7864320, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=3840, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
2. git clone latest xfstests-dev, and merge the patchset
https://marc.info/?l=fstests&m=159980858202447&w=2
3. build xfstests
4. mount /dev/mapper/testvg-scratchdev /mnt/scratch
5. ./xfstests/ltp/fsx -U -R -W -l $((128000*5)) -o 128000 /mnt/scratch/foo

Additional info:
1) To avoid the same mistake of
https://bugzilla.kernel.org/show_bug.cgi?id=208827, I've tried to deal with the
short read/write of io_uring in fsx. 

2) General fsx test(fsx -S 0 -U /mnt/scratch/foo) doesn't fail, but only when I
specify the "-o" option of fsx, the test fails.

3) Can't reproduce same failure on ext4 by keep running same test on ext4 one
hour.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

             reply	other threads:[~2020-09-12  5:46 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-12  5:46 bugzilla-daemon [this message]
2020-09-12  5:50 ` [Bug 209243] fsx IO_URING reading get BAD DATA bugzilla-daemon
2020-09-13  7:59 ` bugzilla-daemon
2020-09-13 15:00 ` bugzilla-daemon
2020-09-13 18:05 ` bugzilla-daemon
2020-09-14  7:07 ` [Bug 209243] [regression] " bugzilla-daemon
2020-09-14 13:39 ` bugzilla-daemon
2020-09-14 15:04 ` bugzilla-daemon
2020-09-14 15:27 ` bugzilla-daemon
2020-09-14 18:17 ` bugzilla-daemon
2020-09-14 19:49 ` bugzilla-daemon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bug-209243-201763@https.bugzilla.kernel.org/ \
    --to=bugzilla-daemon@bugzilla.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.