From: Matthew Wilcox <willy@infradead.org>
To: Eric Whitney <enwlinux@gmail.com>
Cc: linux-ext4@vger.kernel.org, tytso@mit.edu
Subject: Re: generic/418 regression seen on 5.12-rc3
Date: Thu, 18 Mar 2021 20:15:06 +0000
Message-ID: <20210318201506.GU3420@casper.infradead.org>
In-Reply-To: <20210318181613.GA13891@localhost.localdomain>
On Thu, Mar 18, 2021 at 02:16:13PM -0400, Eric Whitney wrote:
> As mentioned in today's ext4 concall, I've seen generic/418 fail from time to
> time when run on 5.12-rc3 and 5.12-rc1 kernels. This first occurred when
> running the 1k test case using kvm-xfstests. I was then able to bisect the
> failure to a patch landed in the -rc1 merge window:
>
> (bd8a1f3655a7) mm/filemap: support readpage splitting a page
Thanks for letting me know. This failure is new to me.
I don't understand it; this patch changes the behaviour of buffered reads
from waiting on a page with a refcount held to waiting on a page without
the refcount held, then starting the lookup from scratch once the page
is unlocked. I find it hard to believe this introduces a /new/ failure.
Either it makes an existing failure easier to hit, or there's a subtle
bug in the retry logic that I'm not seeing.
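
To make the two waiting strategies described above concrete, here is a toy Python model. It is only a sketch and not the kernel code: `Page`, `read_holding_ref`, `read_with_retry`, and the `while_waiting` hook are hypothetical names, and the scenario in which the cached page is replaced during the wait is an assumption chosen purely to make the difference visible.

```python
class Page:
    """Hypothetical stand-in for a page cache page."""
    def __init__(self, data, locked=True):
        self.data = data
        self.locked = locked
        self.refcount = 0


def read_holding_ref(cache, index, while_waiting):
    """Model of the older behaviour: keep a reference across the wait."""
    page = cache[index]
    page.refcount += 1
    if page.locked:
        while_waiting(cache, page)  # stands in for sleeping until unlock
    data = page.data                # still the page found by the first lookup
    page.refcount -= 1
    return data


def read_with_retry(cache, index, while_waiting):
    """Model of the newer behaviour: wait with no reference, then look up again."""
    while True:
        page = cache[index]
        if not page.locked:
            page.refcount += 1
            data = page.data
            page.refcount -= 1
            return data
        while_waiting(cache, page)  # no reference held while sleeping
        # Fall through to the top of the loop: the lookup starts from scratch.


def replace_page(cache, page):
    """Assumed scenario: while the reader sleeps, the cache entry is replaced."""
    cache[0] = Page("new", locked=False)
    page.locked = False


old_style = read_holding_ref({0: Page("old")}, 0, replace_page)
new_style = read_with_retry({0: Page("old")}, 0, replace_page)
print(old_style, new_style)
```

In this model the first reader keeps using the page it originally found, while the second reader picks up whatever the cache holds after the wait. Whether anything like this replacement happens in the failing test is not established here.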
> Typical test output resulting from a failure looks like:
>
> QA output created by 418
> +cmpbuf: offset 0: Expected: 0x1, got 0x0
> +[6:0] FAIL - comparison failed, offset 3072
> +diotest -w -b 512 -n 8 -i 4 failed at loop 0
> Silence is golden
> ...
>
> I've also been able to reproduce the failure on -rc3 in the 4k test case as
> well. The failure frequency there was 10 out of 100 runs. It was anywhere
> from 2 to 8 failures out of 100 runs in the 1k case.
>
> So, the failure isn't dependent upon block size less than page size.
That's a good data point. I'll take a look at g/418 and see if I can
figure out what race we're hitting. Nice that it happens so often.
I suppose I could get you to put some debugging in -- maybe dumping the
page if we hit a contended case, then again if we're retrying?
I presume it doesn't always happen at the same offset or anything
convenient like that.