public inbox for linux-xfs@vger.kernel.org
From: Jens Axboe <axboe@kernel.dk>
To: Dave Chinner <david@fromorbit.com>, bugzilla-daemon@bugzilla.kernel.org
Cc: linux-xfs@vger.kernel.org
Subject: Re: [Bug 208827] [fio io_uring] io_uring write data crc32c verify failed
Date: Mon, 10 Aug 2020 19:07:07 -0600
Message-ID: <d35b95df-2d0e-e402-251c-4da4bc254f66@kernel.dk>
In-Reply-To: <20200810070807.GJ2114@dread.disaster.area>

On 8/10/20 1:08 AM, Dave Chinner wrote:
> [cc Jens]
> 
> [Jens, data corruption w/ io_uring and simple fio reproducer. see
> the bz link below.]
> 
> On Mon, Aug 10, 2020 at 01:56:05PM +1000, Dave Chinner wrote:
>> On Mon, Aug 10, 2020 at 10:09:32AM +1000, Dave Chinner wrote:
>>> On Fri, Aug 07, 2020 at 03:12:03AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
>>>> --- Comment #1 from Dave Chinner (david@fromorbit.com) ---
>>>> On Thu, Aug 06, 2020 at 04:57:58AM +0000, bugzilla-daemon@bugzilla.kernel.org
>>>> wrote:
>>>>> https://bugzilla.kernel.org/show_bug.cgi?id=208827
>>>>>
>>>>>             Bug ID: 208827
>>>>>            Summary: [fio io_uring] io_uring write data crc32c verify
>>>>>                     failed
>>>>>            Product: File System
>>>>>            Version: 2.5
>>>>>     Kernel Version: xfs-linux xfs-5.9-merge-7 + v5.8-rc4
>>>
>>> FWIW, I can reproduce this with a vanilla 5.8 release kernel,
>>> so this isn't related to contents of the XFS dev tree at all...
>>>
>>> In fact, this bug isn't a recent regression. AFAICT, it was
>>> introduced between 5.4 and 5.5 - 5.4 did not reproduce, 5.5 did
>>> reproduce. More info once I've finished bisecting it....
>>
>> f67676d160c6ee2ed82917fadfed6d29cab8237c is the first bad commit
>> commit f67676d160c6ee2ed82917fadfed6d29cab8237c
>> Author: Jens Axboe <axboe@kernel.dk>
>> Date:   Mon Dec 2 11:03:47 2019 -0700
>>
>>     io_uring: ensure async punted read/write requests copy iovec

I don't think this commit is related to the issue at all, but I think
we're probably on the same page with that. It's probably just changing
things slightly enough to avoid the race.

> Ok, I went back to vanilla 5.8 to continue debugging and adding
> tracepoints, and it's proving strangely difficult to reproduce now.
> 
> However, I did just hit this:
> 
> [ 4980.136032] ------------[ cut here ]------------
> [ 4980.137665] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000ef911b51>] prepare_to_wait_exclusive+0x3d/0xd0
> [ 4980.141403] WARNING: CPU: 13 PID: 6800 at kernel/sched/core.c:6888 __might_sleep+0x74/0x80
> [ 4980.143940] CPU: 13 PID: 6800 Comm: fio Not tainted 5.8.0-dgc+ #2549
> [ 4980.146147] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> [ 4980.148774] RIP: 0010:__might_sleep+0x74/0x80
> [ 4980.150455] Code: ff 41 5c 41 5d 41 5e 5d c3 48 8b 90 30 22 00 00 48 c7 c7 a8 b9 50 82 c6 05 38 e4 9a 01 01 48 8b 70 10 48 89 d1 e8 fa 5c fc ff <0f> 0b eb c5 0f 1f 84 00 00 00 002
> [ 4980.156255] RSP: 0018:ffffc90005383c58 EFLAGS: 00010282
> [ 4980.158299] RAX: 0000000000000000 RBX: 0000561a18122000 RCX: 0000000000000000
> [ 4980.160817] RDX: ffff88883eca7de0 RSI: ffff88883ec97a80 RDI: ffff88883ec97a80
> [ 4980.163162] RBP: ffffc90005383c70 R08: ffff88883ec97a80 R09: ffff8888070f3000
> [ 4980.165635] R10: ffff8888070f3434 R11: ffff8888070f3434 R12: ffffffff8251f46e
> [ 4980.168115] R13: 00000000000001ba R14: 0000000000000000 R15: ffff888235647740
> [ 4980.170714] FS:  00007f80de7af700(0000) GS:ffff88883ec80000(0000) knlGS:0000000000000000
> [ 4980.173442] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4980.175371] CR2: 00007f80d4005008 CR3: 00000005eb01c004 CR4: 0000000000060ee0
> [ 4980.177607] Call Trace:
> [ 4980.178583]  change_protection+0x827/0x9e0
> [ 4980.180063]  ? kvm_clock_read+0x18/0x30
> [ 4980.181654]  ? kvm_sched_clock_read+0x9/0x20
> [ 4980.183426]  ? sysvec_apic_timer_interrupt+0x46/0x90
> [ 4980.185160]  change_prot_numa+0x19/0x30
> [ 4980.186607]  task_numa_work+0x1c7/0x2e0
> [ 4980.188003]  task_work_run+0x64/0xb0
> [ 4980.189488]  io_cqring_wait+0x118/0x290

I'll get this one:

commit 4c6e277c4cc4a6b3b2b9c66a7b014787ae757cc1
Author: Jens Axboe <axboe@kernel.dk>
Date:   Wed Jul 1 11:29:10 2020 -0600

    io_uring: abstract out task work running

queued up for stable. It should not be related to this at all; current -git
shouldn't run into it.
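
For anyone following along, here is a rough userspace sketch of what the
warning in the trace above is complaining about. It is a toy model, not
kernel code: every name in it (struct task, might_sleep(), the two
run_task_work_*() helpers, simulate_wait()) is a hypothetical stand-in, and
it only assumes what the trace shows, namely that task work which may block
was run from the wait path after prepare_to_wait_exclusive() had set the
task state to 1. Resetting the state to running before running the work
makes the check pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel task states (state=1 in the trace). */
enum task_state { TASK_RUNNING = 0, TASK_INTERRUPTIBLE = 1 };

struct task {
    enum task_state state;
    void (*pending_work)(struct task *); /* models a queued task work item */
    int warnings;                        /* how often might_sleep() fired */
};

/* Models __might_sleep(): flag blocking calls made while !TASK_RUNNING. */
static void might_sleep(struct task *t)
{
    if (t->state != TASK_RUNNING)
        t->warnings++;
}

/* A work item that may block, standing in for task_numa_work(). */
static void blocking_work(struct task *t)
{
    might_sleep(t);
}

/* Models prepare_to_wait_exclusive(): the task is about to sleep. */
static void prepare_to_wait(struct task *t)
{
    t->state = TASK_INTERRUPTIBLE;
}

/* Runs pending work without touching the task state. */
static bool run_task_work_unsafe(struct task *t)
{
    void (*work)(struct task *) = t->pending_work;

    if (!work)
        return false;
    t->pending_work = NULL;
    work(t);
    return true;
}

/* Same, but marks the task running first so blocking work is legal. */
static bool run_task_work_safe(struct task *t)
{
    if (!t->pending_work)
        return false;
    t->state = TASK_RUNNING;
    return run_task_work_unsafe(t);
}

/* One pass through a wait loop; returns the number of warnings seen. */
static int simulate_wait(bool safe)
{
    struct task t = { TASK_RUNNING, blocking_work, 0 };

    prepare_to_wait(&t);
    if (safe)
        run_task_work_safe(&t);
    else
        run_task_work_unsafe(&t);
    return t.warnings;
}

/* Sanity check: only the unsafe variant trips the warning. */
static void self_check(void)
{
    assert(simulate_wait(false) == 1);
    assert(simulate_wait(true) == 0);
}
```

In this model the only difference between the two helpers is the state
reset, which is the point: the work itself is fine, it just must not run
while the task is marked as sleeping.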

-- 
Jens Axboe

