* Flaky test: generic/085
@ 2024-06-11 8:52 Theodore Ts'o
2024-06-11 16:37 ` Darrick J. Wong
0 siblings, 1 reply; 6+ messages in thread
From: Theodore Ts'o @ 2024-06-11 8:52 UTC (permalink / raw)
To: Linux Filesystem Development List, linux-block, fstests
Hi, I've recently found a flaky test, generic/085, on 6.10-rc2 and
fs-next. It's failing on both ext4 and xfs, and it reproduces more
easily with the dax config:
xfs/4k: 20 tests, 1 failures, 137 seconds
Flaky: generic/085: 5% (1/20)
xfs/dax: 20 tests, 11 failures, 71 seconds
Flaky: generic/085: 55% (11/20)
ext4/4k: 20 tests, 111 seconds
ext4/dax: 20 tests, 8 failures, 69 seconds
Flaky: generic/085: 40% (8/20)
Totals: 80 tests, 0 skipped, 20 failures, 0 errors, 388s
The failure is caused by a WARN_ON in fs_bdev_thaw() in fs/super.c:
static int fs_bdev_thaw(struct block_device *bdev)
{
...
sb = get_bdev_super(bdev);
if (WARN_ON_ONCE(!sb))
return -EINVAL;
The generic/085 test exercises races between the fs
freeze/unfreeze and mount/umount code paths, so this appears to be
either a VFS-level or block device layer bug. Modulo the warning, it
looks relatively harmless, so I'll just exclude generic/085 from my
test appliance, at least for now. Hopefully someone will have a
chance to take a look at it?
Thanks,
- Ted
^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Flaky test: generic/085
  2024-06-11  8:52 Flaky test: generic/085 Theodore Ts'o
@ 2024-06-11 16:37 ` Darrick J. Wong
  2024-06-12 11:25   ` Christian Brauner
  0 siblings, 1 reply; 6+ messages in thread

From: Darrick J. Wong @ 2024-06-11 16:37 UTC (permalink / raw)
To: Theodore Ts'o; +Cc: Linux Filesystem Development List, linux-block, fstests

On Tue, Jun 11, 2024 at 09:52:10AM +0100, Theodore Ts'o wrote:
> Hi, I've recently found a flaky test, generic/085, on 6.10-rc2 and
> fs-next. It's failing on both ext4 and xfs, and it reproduces more
> easily with the dax config:
>
> [test results snipped]
>
> The failure is caused by a WARN_ON in fs_bdev_thaw() in fs/super.c:
>
> static int fs_bdev_thaw(struct block_device *bdev)
> {
> ...
>	sb = get_bdev_super(bdev);
>	if (WARN_ON_ONCE(!sb))
>		return -EINVAL;
>
> The generic/085 test exercises races between the fs
> freeze/unfreeze and mount/umount code paths, so this appears to be
> either a VFS-level or block device layer bug. Modulo the warning, it
> looks relatively harmless, so I'll just exclude generic/085 from my
> test appliance, at least for now. Hopefully someone will have a
> chance to take a look at it?

I think this can happen if fs_bdev_thaw races with unmount?

Let's say that the _umount $lvdev in the second loop in generic/085
starts the unmount process, which clears SB_ACTIVE from the super_block.
Then the first loop tries to freeze the bdev (and fails), and
immediately tries to thaw the bdev. The thaw code calls fs_bdev_thaw
because the unmount process is still running and so the fs is still
holding the bdev. But get_bdev_super sees that SB_ACTIVE has been
cleared from the super_block, so it returns NULL, which trips the
warning.

If that's correct, then I think the WARN_ON_ONCE should go away.

--D

> Thanks,
>
>						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Flaky test: generic/085
  2024-06-11 16:37 ` Darrick J. Wong
@ 2024-06-12 11:25   ` Christian Brauner
  2024-06-12 14:47     ` Theodore Ts'o
  0 siblings, 1 reply; 6+ messages in thread

From: Christian Brauner @ 2024-06-12 11:25 UTC (permalink / raw)
To: Darrick J. Wong
Cc: Theodore Ts'o, Linux Filesystem Development List, linux-block, fstests

On Tue, Jun 11, 2024 at 09:37:01AM -0700, Darrick J. Wong wrote:
> [quote of Ted's report snipped]
>
> I think this can happen if fs_bdev_thaw races with unmount?
>
> Let's say that the _umount $lvdev in the second loop in generic/085
> starts the unmount process, which clears SB_ACTIVE from the super_block.
> Then the first loop tries to freeze the bdev (and fails), and
> immediately tries to thaw the bdev. The thaw code calls fs_bdev_thaw
> because the unmount process is still running and so the fs is still
> holding the bdev. But get_bdev_super sees that SB_ACTIVE has been
> cleared from the super_block, so it returns NULL, which trips the
> warning.
>
> If that's correct, then I think the WARN_ON_ONCE should go away.

I've been trying to reproduce this with pmem yesterday and wasn't able to.

SB_ACTIVE is cleared in generic_shutdown_super(). If we're in there we
know that there are no active references to the superblock anymore. That
includes freeze requests:

* Freezes are nestable from kernel and userspace, but all nested
  freezers share a single active reference in sb->s_active.
* The nested freeze requests are counted in
  sb->s_writers.freeze_{kcount,ucount}.
* The last thaw request (sb->s_writers.freeze_kcount +
  sb->s_writers.freeze_ucount == 0) releases the sb->s_active reference.
* Nested freezes from the block layer via bdev_freeze() are additionally
  counted in bdev->bd_fsfreeze_count, protected by
  bdev->bd_fsfreeze_mutex.

The device mapper suspend logic that generic/085 uses relies on
bdev_freeze() and bdev_thaw() from the block layer. So all those dm
freezes should be counted in bdev->bd_fsfreeze_count. And device mapper
has logic to ensure that only a single freeze request is ever made. So
bdev->bd_fsfreeze_count in that test should be 1.

So when a bdev_thaw() request comes via dm_resume():

* bdev_thaw() is called and encounters bdev->bd_fsfreeze_count == 1.
* As there aren't any fs-initiated freezes, we know that
  sb->s_writers.freeze_kcount == 0 and
  sb->s_writers.freeze_ucount == 1 == bdev->bd_fsfreeze_count.
* When fs_bdev_thaw() runs, the superblock is still valid and we've got
  at least one active reference taken during the bdev_freeze() request.
* get_bdev_super() tries to grab an active reference to the superblock
  but fails. That can indeed happen because SB_ACTIVE is cleared.

But for that to be the case we must've dropped the last active
reference, forgotten to take it during the original freeze, miscounted
bdev->bd_fsfreeze_count, or missed a nested
sb->s_writers.freeze_{kcount,ucount}.

What's the kernel config and test config that's used?

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Flaky test: generic/085
  2024-06-12 11:25 ` Christian Brauner
@ 2024-06-12 14:47   ` Theodore Ts'o
  2024-06-13  9:55     ` Christian Brauner
  0 siblings, 1 reply; 6+ messages in thread

From: Theodore Ts'o @ 2024-06-12 14:47 UTC (permalink / raw)
To: Christian Brauner
Cc: Darrick J. Wong, Linux Filesystem Development List, linux-block, fstests

On Wed, Jun 12, 2024 at 01:25:07PM +0200, Christian Brauner wrote:
> I've been trying to reproduce this with pmem yesterday and wasn't able to.
>
> What's the kernel config and test config that's used?

The kernel config can be found here:

https://github.com/tytso/xfstests-bld/blob/master/kernel-build/kernel-configs/config-6.1

Drop it into .config in the build directory of any kernel sources
newer than 6.1, and then run "make olddefconfig". This is all
automated in the install-kconfig script which I use:

https://github.com/tytso/xfstests-bld/blob/master/kernel-build/install-kconfig

The VM has 4 CPUs and 26GiB of memory, and the kernel is booted with
the command line options "memmap=4G!9G memmap=9G!14G", which set up
fake /dev/pmem0 and /dev/pmem1 devices backed by RAM. This is my poor
engineer's way of testing DAX without needing to get access to
expensive VMs with pmem. :-)

I'm assuming this is a timing-dependent bug which is easiest to
trigger on fast devices, so a ramdisk might also work. FWIW, I also
see failures relatively frequently using the ext4/nojournal
configuration on an SSD-backed cloud block device (GCE's Persistent
Disk SSD product).

As a result, if you grab my xfstests-bld repo from github and then
run "qemu-xfstests -c ext4/nojournal -C 20 generic/085", it should
also reproduce. See Documentation/kvm-quickstart.md for more details.

						- Ted

^ permalink raw reply	[flat|nested] 6+ messages in thread
* Re: Flaky test: generic/085
  2024-06-12 14:47 ` Theodore Ts'o
@ 2024-06-13  9:55   ` Christian Brauner
  2024-06-13 17:17     ` Darrick J. Wong
  0 siblings, 1 reply; 6+ messages in thread

From: Christian Brauner @ 2024-06-13 9:55 UTC (permalink / raw)
To: Theodore Ts'o, Darrick J. Wong
Cc: Linux Filesystem Development List, linux-block, fstests

[-- Attachment #1: Type: text/plain, Size: 3315 bytes --]

On Wed, Jun 12, 2024 at 03:47:16PM GMT, Theodore Ts'o wrote:
> [kernel config and VM setup details snipped]
>
> As a result, if you grab my xfstests-bld repo from github and then
> run "qemu-xfstests -c ext4/nojournal -C 20 generic/085", it should
> also reproduce. See Documentation/kvm-quickstart.md for more details.

Thanks, Ted! Ok, I think I figured it out.

P1: dm_suspend()
    -> bdev_freeze()
       mutex_lock(&bdev->bd_fsfreeze_mutex);
       atomic_inc_return(&bdev->bd_fsfreeze_count); // == 1
       mutex_unlock(&bdev->bd_fsfreeze_mutex);

P2: setup_bdev_super()
      bdev_file_open_by_dev();
      atomic_read(&bdev->bd_fsfreeze_count); // != 0 -> abort mount

P3: dm_resume()
    -> bdev_thaw()
       mutex_lock(&bdev->bd_fsfreeze_mutex);
       atomic_dec_return(&bdev->bd_fsfreeze_count); // == 0
       mutex_unlock(&bdev->bd_fsfreeze_mutex);
       bd_holder_lock();
       // grab passive reference on sb via sb->s_count
       bd_holder_unlock();
       // sleep to be woken when superblock ready or dead

P2: bdev_fput()
      bd_holder_lock();
      // yield bdev
      bd_holder_unlock();
    deactivate_locked_super()
      // notify that superblock is dead

P3: // get woken and see that superblock is dead; fail

In words this basically means that P1 calls dm_suspend() which calls
into bdev_freeze() before the block device has been claimed by the
filesystem. This brings bdev->bd_fsfreeze_count to 1 and no call into
fs_bdev_freeze() is required.

Now P2 tries to mount that frozen block device. It claims it and checks
bdev->bd_fsfreeze_count. As it's elevated, it aborts mounting, holding
sb->s_umount all the time of course.

In the meantime P3 calls dm_resume(). It sees that the block device is
already claimed by a filesystem and calls into fs_bdev_thaw().

It takes a passive reference and realizes that the filesystem isn't
ready yet. So P3 puts itself to sleep to wait for the filesystem to
become ready.

P2 puts the last active reference to the filesystem and marks it as
dying.

Now P3 gets woken, sees that the filesystem is dying, and
get_bdev_super() fails.

So Darrick is correct about the fix but the reasoning is a bit
different. :)

Patch appended and on #vfs.fixes.
[-- Attachment #2: 0001-fs-don-t-misleadingly-warn-during-thaw-operations.patch --]
[-- Type: text/x-diff, Size: 2618 bytes --]

From 35224b919d6778ca5dd11f76659ae849594bd2bf Mon Sep 17 00:00:00 2001
From: Christian Brauner <brauner@kernel.org>
Date: Thu, 13 Jun 2024 11:38:14 +0200
Subject: [PATCH] fs: don't misleadingly warn during thaw operations

The block device may have been frozen before it was claimed by a
filesystem. Concurrently another process might try to mount that
frozen block device and has temporarily claimed the block device for
that purpose, causing a concurrent fs_bdev_thaw() to end up here. The
mounter is already about to abort mounting because they still saw an
elevated bdev->bd_fsfreeze_count, so get_bdev_super() will return
NULL in that case.

For example, P1 calls dm_suspend() which calls into bdev_freeze() before
the block device has been claimed by the filesystem. This brings
bdev->bd_fsfreeze_count to 1 and no call into fs_bdev_freeze() is
required.

Now P2 tries to mount that frozen block device. It claims it and checks
bdev->bd_fsfreeze_count. As it's elevated, it aborts mounting.

In the meantime P3 calls dm_resume(). It sees that the block device is
already claimed by a filesystem and calls into fs_bdev_thaw().

It takes a passive reference and realizes that the filesystem isn't
ready yet. So P3 puts itself to sleep to wait for the filesystem to
become ready.

P2 puts the last active reference to the filesystem and marks it as
dying.

Now P3 gets woken, sees that the filesystem is dying, and
get_bdev_super() fails.

Fixes: 49ef8832fb1a ("bdev: implement freeze and thaw holder operations")
Cc: <stable@vger.kernel.org>
Reported-by: Theodore Ts'o <tytso@mit.edu>
Link: https://lore.kernel.org/r/20240611085210.GA1838544@mit.edu
Signed-off-by: Christian Brauner <brauner@kernel.org>
---
 fs/super.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index b72f1d288e95..095ba793e10c 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -1502,8 +1502,17 @@ static int fs_bdev_thaw(struct block_device *bdev)
 
 	lockdep_assert_held(&bdev->bd_fsfreeze_mutex);
 
+	/*
+	 * The block device may have been frozen before it was claimed by a
+	 * filesystem. Concurrently another process might try to mount that
+	 * frozen block device and has temporarily claimed the block device for
+	 * that purpose causing a concurrent fs_bdev_thaw() to end up here. The
+	 * mounter is already about to abort mounting because they still saw an
+	 * elevated bdev->bd_fsfreeze_count so get_bdev_super() will return
+	 * NULL in that case.
+	 */
 	sb = get_bdev_super(bdev);
-	if (WARN_ON_ONCE(!sb))
+	if (!sb)
 		return -EINVAL;
 
 	if (sb->s_op->thaw_super)
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 6+ messages in thread
* Re: Flaky test: generic/085
  2024-06-13  9:55 ` Christian Brauner
@ 2024-06-13 17:17   ` Darrick J. Wong
  0 siblings, 0 replies; 6+ messages in thread

From: Darrick J. Wong @ 2024-06-13 17:17 UTC (permalink / raw)
To: Christian Brauner
Cc: Theodore Ts'o, Linux Filesystem Development List, linux-block, fstests

On Thu, Jun 13, 2024 at 11:55:37AM +0200, Christian Brauner wrote:
> [P1/P2/P3 race analysis and appended patch snipped]
>
> P2 puts the last active reference to the filesystem and marks it as
> dying.
>
> Now P3 gets woken, sees that the filesystem is dying, and
> get_bdev_super() fails.

Wow, that's twisty. But it makes sense to me, so

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

^ permalink raw reply	[flat|nested] 6+ messages in thread
end of thread, other threads: [~2024-06-13 17:17 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed
-- links below jump to the message on this page --)
2024-06-11  8:52 Flaky test: generic/085 Theodore Ts'o
2024-06-11 16:37 ` Darrick J. Wong
2024-06-12 11:25   ` Christian Brauner
2024-06-12 14:47     ` Theodore Ts'o
2024-06-13  9:55       ` Christian Brauner
2024-06-13 17:17         ` Darrick J. Wong