public inbox for linux-btrfs@vger.kernel.org
* [PATCH] btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure()
@ 2026-01-27  5:16 Qu Wenruo
  2026-01-27  5:49 ` Christoph Hellwig
  2026-02-04 14:07 ` David Sterba
  0 siblings, 2 replies; 3+ messages in thread
From: Qu Wenruo @ 2026-01-27  5:16 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Christoph Hellwig

[BUG]
There is a bug report that when btrfs hits an ENOSPC error in a critical
path, it flips RO (this part is expected, although the ENOSPC bug
still needs to be addressed).

The problem is that after the RO flip, if we trigger a read repair, we
can hit the ASSERT() inside btrfs_repair_io_failure() like the following:

 BTRFS info (device vdc): relocating block group 30408704 flags metadata|raid1
 ------------[ cut here ]------------
 BTRFS: Transaction aborted (error -28)
 WARNING: fs/btrfs/extent-tree.c:3235 at __btrfs_free_extent.isra.0+0x453/0xfd0, CPU#1: btrfs/383844
 Modules linked in: kvm_intel kvm irqbypass
 [...]
 ---[ end trace 0000000000000000 ]---
 BTRFS info (device vdc state EA): 2 enospc errors during balance
 BTRFS info (device vdc state EA): balance: ended with status: -30
 BTRFS error (device vdc state EA): parent transid verify failed on logical 30556160 mirror 2 wanted 8 found 6
 BTRFS error (device vdc state EA): bdev /dev/nvme0n1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
 [...]
 assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
 ------------[ cut here ]------------
 assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
 kernel BUG at fs/btrfs/bio.c:938!
 Oops: invalid opcode: 0000 [#1] SMP NOPTI
 CPU: 0 UID: 0 PID: 868 Comm: kworker/u8:13 Tainted: G        W        N  6.19.0-rc6+ #4788 PREEMPT(full)
 Tainted: [W]=WARN, [N]=TEST
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
 Workqueue: btrfs-endio simple_end_io_work
 RIP: 0010:btrfs_repair_io_failure.cold+0xb2/0x120
 Code: 82 e8 18 52 fc ff 0f 0b 41 b8 aa 03 00 00 48 c7 c1 f7 2b 07 83 31 d2 48 c7 c6 88 68 fb 82 48 c7 c7 f0 54 fa 82 e8 f4 51 fc ff <0f> 0b 41 b8 b4 03 00 00 48 c7 c1 f7 2b 07 83 31 d2 48 c7 c6 3d 2c
 RSP: 0000:ffffc90001d2bcf0 EFLAGS: 00010246
 RAX: 0000000000000051 RBX: 0000000000001000 RCX: 0000000000000000
 RDX: 0000000000000000 RSI: ffffffff8305cf42 RDI: 00000000ffffffff
 RBP: 0000000000000002 R08: 00000000fffeffff R09: ffffffff837fa988
 R10: ffffffff8327a9e0 R11: 6f69747265737361 R12: ffff88813018d310
 R13: ffff888168b8a000 R14: ffffc90001d2bd90 R15: ffff88810a169000
 FS:  0000000000000000(0000) GS:ffff8885e752c000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 ------------[ cut here ]------------

[CAUSE]
The cause of the -ENOSPC error in test case btrfs/124 is still
unknown. It is known, though, that there are still cases where metadata
is over-committed and the reservation can not be fulfilled. If we hit
such an ENOSPC error inside a critical path, we have no choice but to
abort the current transaction.

This will mark the fs read-only.

The problem is that the btrfs_repair_io_failure() path requires the fs
not to be mounted read-only. This is normally fine, but if a read-repair
is in flight while the fs flips RO due to a critical error, we can enter
btrfs_repair_io_failure() with the super block set to read-only, thus
triggering the above crash.

[FIX]
Just replace the ASSERT() with a proper return if the fs is already
read-only.

Reported-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-btrfs/20260126045555.GB31641@lst.de/
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/bio.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/bio.c b/fs/btrfs/bio.c
index d3475d179362..b0e9c46ff25b 100644
--- a/fs/btrfs/bio.c
+++ b/fs/btrfs/bio.c
@@ -928,7 +928,6 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
 	struct bio *bio = NULL;
 	int ret = 0;
 
-	ASSERT(!(fs_info->sb->s_flags & SB_RDONLY));
 	BUG_ON(!mirror_num);
 
 	/* Basic alignment checks. */
@@ -940,6 +939,13 @@ int btrfs_repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 fileoff,
 	ASSERT(step <= length);
 	ASSERT(is_power_of_2(step));
 
+	/*
+	 * The fs is either mounted RO or has hit critical errors, no need
+	 * to continue repairing.
+	 */
+	if (unlikely(sb_rdonly(fs_info->sb)))
+		return 0;
+
 	if (btrfs_repair_one_zone(fs_info, logical))
 		return 0;
 
-- 
2.52.0

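The behavioural change in the patch can be modeled in user space. What follows is a minimal sketch, not kernel code: struct model_fs, repair_with_assert() and repair_with_early_return() are hypothetical names invented for illustration, and the read_only field only stands in for what sb_rdonly(fs_info->sb) reports. It shows why a state that can legitimately change at runtime is better handled with an early return than with an assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of btrfs_fs_info that matter here. */
struct model_fs {
	bool read_only;		/* models sb_rdonly(fs_info->sb) */
	int repair_writes;	/* number of repair writes issued so far */
};

/*
 * Old behaviour: the RO check is an assertion, so entering the repair
 * path after the fs flipped RO aborts the whole program (in the kernel,
 * a BUG() with CONFIG_BTRFS_ASSERT enabled).
 */
int repair_with_assert(struct model_fs *fs)
{
	assert(!fs->read_only);
	fs->repair_writes++;
	return 0;
}

/*
 * New behaviour: a read-only fs is treated as a valid runtime state.
 * The repair write is skipped and the caller sees success.
 */
int repair_with_early_return(struct model_fs *fs)
{
	if (fs->read_only)
		return 0;
	fs->repair_writes++;
	return 0;
}
```

With read_only set, repair_with_early_return() leaves repair_writes untouched and returns 0, whereas repair_with_assert() would abort. Returning 0 mirrors the patch, which treats a skipped repair as a non-error.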


* Re: [PATCH] btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure()
From: Christoph Hellwig @ 2026-01-27  5:49 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Christoph Hellwig

On Tue, Jan 27, 2026 at 03:46:55PM +1030, Qu Wenruo wrote:
> [BUG]
> There is a bug report that when btrfs hits an ENOSPC error in a critical
> path, it flips RO (this part is expected, although the ENOSPC bug
> still needs to be addressed).
> 
> The problem is that after the RO flip, if we trigger a read repair, we
> can hit the ASSERT() inside btrfs_repair_io_failure() like the following:

This makes the assert go away, and now the test seems to fail
consistently with:


output mismatch (see /root/xfstests-dev/results//btrfs/124.out.bad)
    --- tests/btrfs/124.out	2024-08-19 04:21:17.339959767 +0000
    +++ /root/xfstests-dev/results//btrfs/124.out.bad	2026-01-27 05:45:24.341140050 +0000
    @@ -3,5 +3,11 @@
     Write data with degraded mount
     
     Mount normal and balance
    +ERROR: error during balancing '/mnt/scratch': Read-only file system
    +There may be more info in syslog - try dmesg | tail
     
on the mixed size setup.  I guess this counts as:

Tested-by: Christoph Hellwig <hch@lst.de>

?


* Re: [PATCH] btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure()
From: David Sterba @ 2026-02-04 14:07 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, Christoph Hellwig

On Tue, Jan 27, 2026 at 03:46:55PM +1030, Qu Wenruo wrote:
> [BUG]
> There is a bug report that when btrfs hits an ENOSPC error in a critical
> path, it flips RO (this part is expected, although the ENOSPC bug
> still needs to be addressed).
> 
> The problem is that after the RO flip, if we trigger a read repair, we
> can hit the ASSERT() inside btrfs_repair_io_failure() like the following:
> 
>  BTRFS info (device vdc): relocating block group 30408704 flags metadata|raid1
>  ------------[ cut here ]------------
>  BTRFS: Transaction aborted (error -28)
>  WARNING: fs/btrfs/extent-tree.c:3235 at __btrfs_free_extent.isra.0+0x453/0xfd0, CPU#1: btrfs/383844
>  Modules linked in: kvm_intel kvm irqbypass
>  [...]
>  ---[ end trace 0000000000000000 ]---
>  BTRFS info (device vdc state EA): 2 enospc errors during balance
>  BTRFS info (device vdc state EA): balance: ended with status: -30
>  BTRFS error (device vdc state EA): parent transid verify failed on logical 30556160 mirror 2 wanted 8 found 6
>  BTRFS error (device vdc state EA): bdev /dev/nvme0n1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
>  [...]
>  assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
>  ------------[ cut here ]------------
>  assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
>  kernel BUG at fs/btrfs/bio.c:938!
>  Oops: invalid opcode: 0000 [#1] SMP NOPTI
>  CPU: 0 UID: 0 PID: 868 Comm: kworker/u8:13 Tainted: G        W        N  6.19.0-rc6+ #4788 PREEMPT(full)
>  Tainted: [W]=WARN, [N]=TEST
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
>  Workqueue: btrfs-endio simple_end_io_work
>  RIP: 0010:btrfs_repair_io_failure.cold+0xb2/0x120
>  Code: 82 e8 18 52 fc ff 0f 0b 41 b8 aa 03 00 00 48 c7 c1 f7 2b 07 83 31 d2 48 c7 c6 88 68 fb 82 48 c7 c7 f0 54 fa 82 e8 f4 51 fc ff <0f> 0b 41 b8 b4 03 00 00 48 c7 c1 f7 2b 07 83 31 d2 48 c7 c6 3d 2c

Please delete the Code: line from changelogs.

>  RSP: 0000:ffffc90001d2bcf0 EFLAGS: 00010246
>  RAX: 0000000000000051 RBX: 0000000000001000 RCX: 0000000000000000
>  RDX: 0000000000000000 RSI: ffffffff8305cf42 RDI: 00000000ffffffff
>  RBP: 0000000000000002 R08: 00000000fffeffff R09: ffffffff837fa988
>  R10: ffffffff8327a9e0 R11: 6f69747265737361 R12: ffff88813018d310
>  R13: ffff888168b8a000 R14: ffffc90001d2bd90 R15: ffff88810a169000
>  FS:  0000000000000000(0000) GS:ffff8885e752c000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  ------------[ cut here ]------------
> 
> [CAUSE]
> The cause of the -ENOSPC error in test case btrfs/124 is still
> unknown. It is known, though, that there are still cases where metadata
> is over-committed and the reservation can not be fulfilled. If we hit
> such an ENOSPC error inside a critical path, we have no choice but to
> abort the current transaction.
> 
> This will mark the fs read-only.
> 
> The problem is that the btrfs_repair_io_failure() path requires the fs
> not to be mounted read-only. This is normally fine, but if a read-repair
> is in flight while the fs flips RO due to a critical error, we can enter
> btrfs_repair_io_failure() with the super block set to read-only, thus
> triggering the above crash.
> 
> [FIX]
> Just replace the ASSERT() with a proper return if the fs is already
> read-only.
> 
> Reported-by: Christoph Hellwig <hch@lst.de>
> Link: https://lore.kernel.org/linux-btrfs/20260126045555.GB31641@lst.de/
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Reviewed-by: David Sterba <dsterba@suse.com>
