From: Luis Henriques <lhenriques@suse.de>
To: Jens Axboe <axboe@kernel.dk>
Cc: Theodore Ts'o <tytso@mit.edu>,
fstests@vger.kernel.org, fio@vger.kernel.org
Subject: Re: generic/095 failing in ext4 and xfs
Date: Mon, 4 Oct 2021 11:15:59 +0100
Message-ID: <YVrUX+qOJCDKp/Ng@suse.de>
In-Reply-To: <YVrSnW2RKP5hjxx6@suse.de>
On Mon, Oct 04, 2021 at 11:08:29AM +0100, Luis Henriques wrote:
> On Sat, Oct 02, 2021 at 08:59:57AM -0600, Jens Axboe wrote:
> > On 10/2/21 4:16 AM, Luis Henriques wrote:
> > > "Theodore Ts'o" <tytso@mit.edu> writes:
> > >
> > >> On Fri, Oct 01, 2021 at 02:46:09PM -0600, Jens Axboe wrote:
> > >>>
> > >>> Hmm, do older versions fail? I see Ted suggested that 3.27 doesn't, can
> > >>> you give that a go? If that does work, would be great if you could try
> > >>> and bisect it.
> > >>
> > >> I just tried fio 3.28, and it worked for me. So I don't think it's
> > >> fio.
> > >
> > > Awesome, thank you both for checking it out. So, it's definitely
> > > something in my test environment.
> > >
> > >> Luis, could it be related to a kernel config option?
> > >
> > > Yeah, it could be. I've tested this on a rolling release (openSUSE TW),
> > > so it's definitely quite different from Debian 10. It may take me a bit
> > > to figure out what's going on, but I'll start with the kernel config and
> > > report back any findings.
> > >
> > > Again, thank you both for confirming it's working on your side.
> >
> > Do you have a core file from fio? Would be interesting to get a
> > backtrace from it.
>
> Ok, not a lot of progress from my end yet, but here's some info gathered
> with gdb from the core file:
>
> #0 0x000056505966b361 in io_completed (td=0x7f2b0c5437a0, io_u_ptr=0x7ffec2403e48, icd=0x7ffec2403e60) at /usr/src/debug/fio-3.28-1.1.x86_64/io_u.c:2012
> #1 0x000056505966b922 in ios_completed (icd=0x7ffec2403e60, td=0x7f2b0c5437a0) at /usr/src/debug/fio-3.28-1.1.x86_64/io_u.c:2086
> #2 io_u_queued_complete (td=0x7f2b0c5437a0, min_evts=<optimized out>) at /usr/src/debug/fio-3.28-1.1.x86_64/io_u.c:2145
> #3 0x0000565059680e88 in do_io (td=0x7f2b0c5437a0, bytes_done=0x7ffec2404070) at /usr/src/debug/fio-3.28-1.1.x86_64/backend.c:1176
> #4 0x000056505968a8ee in thread_main (data=data@entry=0x56505ae43510) at /usr/src/debug/fio-3.28-1.1.x86_64/backend.c:1870
> #5 0x000056505968ca48 in run_threads (sk_out=0x0) at /usr/src/debug/fio-3.28-1.1.x86_64/backend.c:2460
> #6 0x000056505968cb55 in fio_backend (sk_out=0x0) at /usr/src/debug/fio-3.28-1.1.x86_64/backend.c:2597
> #7 fio_backend (sk_out=0x0) at /usr/src/debug/fio-3.28-1.1.x86_64/backend.c:2558
> #8 0x000056505962fd97 in main (argc=4, argv=0x7ffec240c448, envp=<optimized out>) at /usr/src/debug/fio-3.28-1.1.x86_64/fio.c:60
>
> And here's the io_completed() code where the crash occurs:
>
> 2007    if (io_u->resid) {
> 2008        io_u->xfer_buflen = io_u->resid;
> 2009        io_u->xfer_buf += bytes;
> 2010        io_u->offset += bytes;
> 2011        td->ts.short_io_u[io_u->ddir]++;
> 2012        if (io_u->offset < io_u->file->real_file_size) {
> 2013            requeue_io_u(td, io_u_ptr);
> 2014            return;
> 2015        }
> 2016    }
I forgot to include the kernel log. The page cache invalidation failure
seems relevant and, as I said before, I'm seeing it on both ext4 and xfs:
[ 38.014790] fio[762]: segfault at 30 ip 000056505966b361 sp 00007ffec2403df0 error 4 in fio[56505962e000+84000]
[ 38.016320] Code: c1 48 85 c0 74 2e 48 89 45 68 48 8b 45 40 48 63 55 2c 4c 01 4d 60 4c 01 c8 48 89 45 40 49 83 84 d4 70 5d 02 00 01 48 8b 55 20 <48> 3b 42 30 0f 82 75 026
[ 38.016839] Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!
[ 38.019520] fio[760]: segfault at 30 ip 000056505966b361 sp 00007ffec2403df0 error 4 in fio[56505962e000+84000]
[ 38.020543] File: /mnt/scratch/file1 PID: 754 Comm: fio
[ 38.022056] Code: c1 48 85 c0 74 2e 48 89 45 68 48 8b 45 40 48 63 55 2c 4c 01 4d 60 4c 01 c8 48 89 45 40 49 83 84 d4 70 5d 02 00 01 48 8b 55 20 <48> 3b 42 30 0f 82 75 026
[ 38.052142] fio[761]: segfault at 30 ip 000056505966b361 sp 00007ffec2403df0 error 4 in fio[56505962e000+84000]
[ 38.053545] Code: c1 48 85 c0 74 2e 48 89 45 68 48 8b 45 40 48 63 55 2c 4c 01 4d 60 4c 01 c8 48 89 45 40 49 83 84 d4 70 5d 02 00 01 48 8b 55 20 <48> 3b 42 30 0f 82 75 026
[ 38.058111] fio[759]: segfault at 30 ip 000056505966b361 sp 00007ffec2403df0 error 4 in fio[56505962e000+84000]
[ 38.059511] Code: c1 48 85 c0 74 2e 48 89 45 68 48 8b 45 40 48 63 55 2c 4c 01 4d 60 4c 01 c8 48 89 45 40 49 83 84 d4 70 5d 02 00 01 48 8b 55 20 <48> 3b 42 30 0f 82 75 026
[ 38.065638] fio[758]: segfault at 30 ip 000056505966b361 sp 00007ffec2403df0 error 4 in fio[56505962e000+84000]
[ 38.067055] Code: c1 48 85 c0 74 2e 48 89 45 68 48 8b 45 40 48 63 55 2c 4c 01 4d 60 4c 01 c8 48 89 45 40 49 83 84 d4 70 5d 02 00 01 48 8b 55 20 <48> 3b 42 30 0f 82 75 026
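A note on the fault address, offered as a sketch rather than a confirmed diagnosis: every segfault above reports address 0x30, and the faulting instruction bytes `<48> 3b 42 30` look like a 64-bit compare against memory at offset 0x30 from a register. That would be consistent with io_u->file being NULL at io_u.c:2012 and real_file_size sitting at offset 0x30 in this build; both points are assumptions. The struct below is a hypothetical stand-in, not fio's struct fio_file, and only illustrates how a NULL base plus a member offset yields such an address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a file structure. The 0x30 bytes of padding are
 * an assumption chosen to match the fault address in the kernel log; the
 * real layout of fio's struct is not reproduced here.
 */
struct sketch_file {
	unsigned char earlier_fields[0x30];
	uint64_t real_file_size;
};

/* Compile-time check that the member really sits at offset 0x30 here. */
static_assert(offsetof(struct sketch_file, real_file_size) == 0x30,
	      "real_file_size expected at offset 0x30 in this sketch");

/*
 * Compute the address a load of base->real_file_size would touch, without
 * dereferencing anything. With base == NULL this is just the member offset.
 */
uintptr_t member_load_address(const struct sketch_file *base)
{
	return (uintptr_t)base + offsetof(struct sketch_file, real_file_size);
}
```

Calling member_load_address(NULL) gives the member offset on its own, which is how a NULL structure pointer tends to show up as a small non-zero fault address rather than 0.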
> > In terms of why it's failing, a guess would be that your device is using 4k
> > sectors and the test is trying to do 1k aligned dio. That would fail, but
> > it should not cause fio to crash...
> >
> > --
> > Jens Axboe
> >
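To make the alignment guess above concrete, here is a minimal sketch of the check it implies: a direct I/O request whose offset or length is not a multiple of the device's logical block size is expected to be rejected. The function name and parameters are hypothetical, and the rule is simplified (buffer address alignment is ignored). On Linux the logical block size can be read from a block device with the BLKSSZGET ioctl; here it is simply passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true when both the file offset and the
 * transfer length are multiples of the logical block size. This is a
 * simplified model of the O_DIRECT alignment requirement, not a full one.
 */
bool dio_request_aligned(uint64_t offset, uint64_t length,
			 uint32_t logical_block_size)
{
	if (logical_block_size == 0)
		return false; /* no valid block size to compare against */

	return (offset % logical_block_size == 0) &&
	       (length % logical_block_size == 0);
}
```

With a 4096-byte logical block size, a request at offset 1024 with length 1024 fails this check, while the same request passes on a device with 512-byte blocks.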
Cheers,
--
Luís