From: "Darrick J. Wong" <djwong@kernel.org>
To: Theodore Ts'o <tytso@mit.edu>
Cc: Linux Filesystem Development List <linux-fsdevel@vger.kernel.org>,
linux-block@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: Flaky test: generic/085
Date: Tue, 11 Jun 2024 09:37:01 -0700
Message-ID: <20240611163701.GK52977@frogsfrogsfrogs>
In-Reply-To: <20240611085210.GA1838544@mit.edu>

On Tue, Jun 11, 2024 at 09:52:10AM +0100, Theodore Ts'o wrote:
> Hi, I've recently found a flaky test, generic/085 on 6.10-rc2 and
> fs-next. It's failing on both ext4 and xfs, and it reproduces more
> easily with the dax config:
>
> xfs/4k: 20 tests, 1 failures, 137 seconds
>   Flaky: generic/085: 5% (1/20)
> xfs/dax: 20 tests, 11 failures, 71 seconds
>   Flaky: generic/085: 55% (11/20)
> ext4/4k: 20 tests, 111 seconds
> ext4/dax: 20 tests, 8 failures, 69 seconds
>   Flaky: generic/085: 40% (8/20)
> Totals: 80 tests, 0 skipped, 20 failures, 0 errors, 388s
>
> The failure is caused by a WARN_ON in fs_bdev_thaw() in fs/super.c:
>
> static int fs_bdev_thaw(struct block_device *bdev)
> {
> 	...
> 	sb = get_bdev_super(bdev);
> 	if (WARN_ON_ONCE(!sb))
> 		return -EINVAL;
>
>
> The generic/085 test exercises races between the fs
> freeze/unfreeze and mount/umount code paths, so this appears to be
> either a VFS-level or block device layer bug. Modulo the warning, it
> looks relatively harmless, so I'll just exclude generic/085 from my
> test appliance, at least for now. Hopefully someone will have a
> chance to take a look at it?

I think this can happen if fs_bdev_thaw races with unmount?

Let's say that the _umount $lvdev in the second loop in generic/085
starts the unmount process, which clears SB_ACTIVE from the super_block.
Then the first loop tries to freeze the bdev (and fails), and
immediately tries to thaw the bdev. The thaw code calls fs_bdev_thaw
because the unmount process is still running & so the fs is still
holding the bdev. But get_bdev_super sees that SB_ACTIVE has been
cleared from the super_block so it returns NULL, which trips the
warning.
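
To make the interleaving above concrete, here is a minimal userspace
sketch of it. This is not kernel code: every model_* name is
hypothetical, the "active" field merely stands in for SB_ACTIVE as
described above, and the real check inside get_bdev_super may differ
in detail. It only replays the hypothesized ordering (unmount clears
the flag while the fs still holds the bdev, then a thaw arrives).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
struct model_sb {
	bool active;			/* stands in for SB_ACTIVE */
};

struct model_bdev {
	struct model_sb *holder;	/* fs still holding the bdev, or NULL */
	int warnings;			/* times the WARN_ON_ONCE condition was hit */
};

/* Models get_bdev_super() as described above: an inactive sb yields NULL. */
static struct model_sb *model_get_bdev_super(struct model_bdev *bdev)
{
	struct model_sb *sb = bdev->holder;

	if (!sb || !sb->active)
		return NULL;
	return sb;
}

/* Models the quoted fs_bdev_thaw() fragment: no usable sb means -EINVAL. */
static int model_fs_bdev_thaw(struct model_bdev *bdev)
{
	struct model_sb *sb = model_get_bdev_super(bdev);

	if (!sb) {
		bdev->warnings++;	/* where WARN_ON_ONCE(!sb) would fire */
		return -EINVAL;
	}
	return 0;
}

/* Assumption: thaw only calls into the fs while the fs holds the bdev. */
static int model_bdev_thaw(struct model_bdev *bdev)
{
	if (!bdev->holder)
		return 0;
	return model_fs_bdev_thaw(bdev);
}

/* Replays the hypothesized ordering and returns the result of the thaw. */
static int model_replay_race(struct model_bdev *bdev, struct model_sb *sb)
{
	sb->active = true;		/* fs is mounted on the bdev */
	bdev->holder = sb;
	bdev->warnings = 0;

	sb->active = false;		/* umount started; bdev is still held */
	/* The failed freeze is not modeled; the thaw follows immediately. */
	return model_bdev_thaw(bdev);
}

#ifdef MODEL_DEMO_MAIN
int main(void)
{
	struct model_sb sb = { .active = false };
	struct model_bdev bdev = { .holder = NULL, .warnings = 0 };
	int ret = model_replay_race(&bdev, &sb);

	assert(ret == -EINVAL);
	printf("thaw returned %d, warning hits: %d\n", ret, bdev.warnings);
	return 0;
}
#endif
```

Building with -DMODEL_DEMO_MAIN gives a small program that replays the
sequence once. In this model the thaw fails only because the holder is
still set while the flag is already cleared, which is an expected
transient state during unmount rather than a corrupted one.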

If that's correct, then I think the WARN_ON_ONCE should go away.

--D

> Thanks,
>
> - Ted
>