public inbox for linux-btrfs@vger.kernel.org
From: Marc MERLIN <marc@merlins.org>
To: linux-btrfs <linux-btrfs@vger.kernel.org>,
	Boris Burkov <boris@bur.io>, Josef Bacik <josef@toxicpanda.com>,
	QuWenruo <wqu@suse.com>, Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Filipe Manana <fdmanana@kernel.org>
Cc: Chris Murphy <lists@colorremedies.com>,
	Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	Roman Mamedov <rm@romanrm.net>, To: Su Yue <Damenly_Su@gmx.com>,
	Su Yue <suy.fnst@cn.fujitsu.com>;
Subject: Re: BTRFS discard crash: failed to run delayed ref for logical 15506102321152 num_bytes 16384 type 182 action 2 ref_mod 1: -2 6.11.2)
Date: Sat, 11 Apr 2026 18:57:33 -0700
Message-ID: <adr8DUbqkdFyvZsf@merlins.org>
In-Reply-To: <adnBhWfJQ1n3hZC8@merlins.org>

So, btrfs repair is weird. The "real one" just gets OOM-killed even if I
give it 64GB of swap, I guess because it wants gigabytes of RAM in huge
chunks that can't be swapped.

moremagic:~# btrfs check --repair /dev/mapper/crypt_bcache0
enabling repair mode
WARNING:

        Do not use --repair unless you are advised to do so by a developer
        or an experienced user, and then only after having accepted that no
        fsck can successfully repair all types of filesystem corruption. E.g.
        some software or hardware bugs can fatally damage a volume.
        The operation will start in 10 seconds.
        Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
Opening filesystem to check...
Checking filesystem on /dev/mapper/crypt_bcache0
UUID: a97dec85-a0d5-42ab-a0ef-e9b7479fbe43
[1/8] checking log skipped (none written)
[2/8] checking root items
Fixed 0 roots.
[3/8] checking extents
super bytes used 18659783561216 mismatches actual used 18659783544832
super bytes used 18659783544832 mismatches actual used 18659783593984
No device size related problem found
[4/8] checking free space cache
[5/8] checking fs roots
Killed
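
For scale, the two "super bytes used" mismatches above are tiny. A minimal
sketch of the arithmetic, assuming a 16384-byte nodesize (taken from
"num_bytes 16384" in the subject line, not confirmed for this filesystem);
the helper name delta_in_nodes is made up for illustration:

```python
# Values copied from the "[3/8] checking extents" lines above.
NODESIZE = 16384  # assumption: inferred from "num_bytes 16384" in the subject

MISMATCHES = [
    # (super bytes used, actual used)
    (18659783561216, 18659783544832),
    (18659783544832, 18659783593984),
]


def delta_in_nodes(super_used, actual_used, nodesize=NODESIZE):
    """Return (byte difference, whole tree blocks, leftover bytes)."""
    diff = actual_used - super_used
    return diff, diff // nodesize, diff % nodesize


if __name__ == "__main__":
    for super_used, actual_used in MISMATCHES:
        diff, nodes, rest = delta_in_nodes(super_used, actual_used)
        print(f"{diff:+d} bytes = {nodes:+d} block(s) of {NODESIZE}, remainder {rest}")
```

Both differences come out as whole multiples of 16384 (one block and three
blocks), out of roughly 18.6 TB used. That says nothing about what is
actually wrong, only that the accounting gap itself is small.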

But this is strange: lowmem gives a totally different result. It may
just be entirely trashing my FS as I write this, but if it doesn't
succeed, I need to wipe it and start over anyway.
moremagic:~# btrfs check --mode lowmem /dev/mapper/crypt_bcache0
Opening filesystem to check...
Checking filesystem on /dev/mapper/crypt_bcache0
UUID: a97dec85-a0d5-42ab-a0ef-e9b7479fbe43
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
ERROR: extent[16842752 168 4096] has unknown ref type: 172
ERROR: extent[16855040 168 4096] has unknown ref type: 172
ERROR: extent[1121296384 168 8192] has unknown ref type: 172

Gemini said that's simple quotas, which lowmem mode doesn't support.
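
The numbers in these messages can be checked against the kernel headers.
A small sketch that decodes them, using the key type values from
include/uapi/linux/btrfs_tree.h and the delayed ref actions from
fs/btrfs/delayed-ref.h; the helper names are hypothetical, and whether this
btrfs-progs version really lacks lowmem support for type 172 is only
inferred from the error text:

```python
import errno

# Key/ref type numbers from include/uapi/linux/btrfs_tree.h.
KEY_TYPES = {
    168: "EXTENT_ITEM",
    169: "METADATA_ITEM",
    172: "EXTENT_OWNER_REF",  # inline ref introduced with simple quotas
    176: "TREE_BLOCK_REF",
    178: "EXTENT_DATA_REF",
    182: "SHARED_BLOCK_REF",
    184: "SHARED_DATA_REF",
}

# Delayed ref actions from fs/btrfs/delayed-ref.h.
DELAYED_REF_ACTIONS = {
    1: "ADD_DELAYED_REF",
    2: "DROP_DELAYED_REF",
    3: "ADD_DELAYED_EXTENT",
    4: "UPDATE_DELAYED_HEAD",
}


def describe_key(key_type):
    """Map a numeric btrfs key/ref type to its symbolic name."""
    return KEY_TYPES.get(key_type, f"unknown({key_type})")


def describe_action(action):
    """Map a numeric delayed ref action to its symbolic name."""
    return DELAYED_REF_ACTIONS.get(action, f"unknown({action})")


def describe_errno(ret):
    """Map a negative kernel return code such as -2 to its errno name."""
    return errno.errorcode.get(-ret, f"unknown({ret})")


if __name__ == "__main__":
    # lowmem: "extent[16842752 168 4096] has unknown ref type: 172"
    print(describe_key(168), describe_key(172))
    # subject: "type 182 action 2 ref_mod 1: -2"
    print(describe_key(182), describe_action(2), describe_errno(-2))
```

So the lowmem errors are about EXTENT_OWNER_REF items inside EXTENT_ITEMs,
which fits the simple quota explanation, and the original kernel failure
was a DROP of a SHARED_BLOCK_REF that returned ENOENT.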

moremagic:~# btrfstune --remove-simple-quota /dev/mapper/crypt_bcache0
bad eb member end: ptr 0x4000 start 15495212859392 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16568940527616 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16133001379840 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15495296155648 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15641227673600 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16027774648320 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16217827999744 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16217830113280 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15505949786112 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16357413355520 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15495414267904 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 3133210902528 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16027775500288 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16217837060096 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15181688930304 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 15181689208832 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16349905764352 member offset 16384 size 1
bad eb member end: ptr 0x4000 start 16349906010112 member offset 16384 size 1

So basically it looks like I'm kind of screwed unless I move the array
(remote, can't get to it right now) to a system with 64GB of RAM or whatnot.

Back to the original point, this is kind of sad:
1) mdadm raid5 can't close the write hole while intent bitmaps are on
2) native btrfs raid5 has never been stable enough to be used in production
3) RST should eventually be, but nothing I read says it is today
4) btrfs check --repair still apparently requires at least as much RAM
as the filesystem size, which is "problematic"
5) --lowmem is out of date and not usable.

So, am I pretty much screwed and need to wipe, restart, and spend
probably weeks trying to resync all that data over the internet, or is
there a way out?

Thanks,
Marc

On Fri, Apr 10, 2026 at 08:35:33PM -0700, Marc MERLIN wrote:
> [Is there a more appropriate way to report FS corruption? Looks like
> Emails to just linux-btrfs@vger.kernel.org do not get seen amongst all
> the patches hiding a normal Email]
> 
> Howdy,
> 
> I had a btrfs filesystem on top of raid5 with 5 spinning drives.
> I enabled discard by mistake, which caused a crash when the discard thread tried
> to run (no discard on those drives)
> Kernel 6.12
> 
> I worked on recovery using Gemini 3.0 Pro. Mounting read-only is fine, but I need read-write
> or I will waste days (probably weeks) recreating this entire 20TB+ backup over the internet
> 
> I'm not qualified to say if everything Gemini said was correct, but I think summary is:
> 1) discard can apparently kill a filesystem when there are hard drives underneath (it did for me)
> 2) -o skip_balance,usebackuproot didn't help
> 3) no way to mount after space cache has been cleared and block-group-tree is enabled
> 4) still no way to mount read write after removing block-group-tree
> 
> It started with:
> [23345.326321] BTRFS: error (device dm-0 state A) in do_free_extent_accounting:2996: errno=-2 No such entry
> [23345.336394] BTRFS error (device dm-0 state EA): failed to run delayed ref for logical 15506102321152 num_bytes 16384 type 182 action 2 ref_mod 1: -2
> [23345.350299] BTRFS: error (device dm-0 state EA) in btrfs_run_delayed_refs:2215: errno=-2 No such entry
> [23345.360154] BTRFS warning (device dm-0 state EA):
> 
> I ended up with:
> 
> moremagic:~# mount -t btrfs -o rw,skip_balance,space_cache=v2,clear_cache /dev/mapper/crypt_bcache0 /mnt/btrfs_bigbackup
> BTRFS: device label DS6 devid 1 transid 296950 /dev/mapper/crypt_bcache0 (251:0) scanned by mount (6029)
> BTRFS info (device dm-0): first mount of filesystem a97dec85-a0d5-42ab-a0ef-e9b7479fbe43
> BTRFS info (device dm-0): using crc32c (crc32c-generic) checksum algorithm
> BTRFS warning (device dm-0): read-write for sector size 4096 with page size 16384 is experimental
> BTRFS info (device dm-0): bdev /dev/mapper/crypt_bcache0 errs: wr 0, rd 0, flush 0, corrupt 5074, gen 0
> ------------[ cut here ]------------
> BTRFS: Transaction aborted (error -2)
> WARNING: CPU: 3 PID: 6029 at fs/btrfs/extent-tree.c:2996 __btrfs_free_extent.isra.0+0x13a0/0x14a0 [btrfs]
> Modules linked in: dm_crypt dm_mod bcache raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xt_MASQUERADE ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_conntrack xt_LOG nf_log_syslog nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables rfcomm algif_hash algif_skcipher af_alg bnep cp210x brcmfmac_wcc binfmt_misc usbserial hci_uart brcmfmac btbcm vc4 snd_soc_hdmi_codec brcmutil bluetooth drm_display_helper cfg80211 cec drm_dma_helper rpi_hevc_dec ecdh_generic v4l2_mem2mem ecc snd_soc_core pisp_be videobuf2_dma_contig v3d videobuf2_memops videobuf2_v4l2 gpu_sched rfkill videodev drm_shmem_helper snd_compress snd_pcm_dmaengine snd_pcm videobuf2_common rp1_pio snd_timer snd drm_kms_helper mc raspberrypi_gpiomem rp1_fw sg sch_fq_codel ecryptfs fuse drm drm_panel_orientation_quirks backlight nfnetlink ip_tables x_tables raid1 aes_ce_blk aes_ce_cipher ghash_ce gf128mul libaes sha2_ce spidev sha256_arm64 sha1_ce raspberrypi_hwmon sha1_generic ahci i2c_brcmstb spi_bcm2835
>  md_mod gpio_keys libahci pwm_fan rp1_adc libata rp1_mailbox nvmem_rmem uio_pdrv_genirq uio btrfs blake2b_generic xor xor_neon raid6_pq zram lz4_compress ipv6
> CPU: 3 UID: 0 PID: 6029 Comm: mount Not tainted 6.12.47+rpt-rpi-2712 #1  Debian 1:6.12.47-1+rpt1
> Hardware name: Raspberry Pi 5 Model B Rev 1.1 (DT)
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : __btrfs_free_extent.isra.0+0x13a0/0x14a0 [btrfs]
> lr : __btrfs_free_extent.isra.0+0x13a0/0x14a0 [btrfs]
> sp : ffffc000868bb680
> x29: ffffc000868bb720 x28: 0000000000000000 x27: 0000000000002f02
> x26: 000000000000007f x25: ffff8001de833aa0 x24: 0000000000004000
> x23: 0000000000000000 x22: ffff800102b64e70 x21: 0000000000004000
> x20: 00000e1a4bb88000 x19: 00000000fffffffe x18: 0000000000000000
> x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
> x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
> x11: 00000000000000c0 x10: 0000000000001a40 x9 : ffffd06fce4e06c0
> x8 : ffff80011f56e0a0 x7 : 000000042f72a7bd x6 : 0000000000000039
> x5 : 0000000000000001 x4 : 0000000000001ab0 x3 : 0000000000000804
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff80011f56c600
> Call trace:
>  __btrfs_free_extent.isra.0+0x13a0/0x14a0 [btrfs]
>  __btrfs_run_delayed_refs+0x508/0xec0 [btrfs]
>  btrfs_run_delayed_refs+0x48/0x198 [btrfs]
>  btrfs_commit_transaction+0x88/0xe20 [btrfs]
>  btrfs_recover_relocation+0x55c/0x5d0 [btrfs]
>  btrfs_start_pre_rw_mount+0x1d4/0x470 [btrfs]
>  open_ctree+0x101c/0x13b8 [btrfs]
>  btrfs_get_tree+0x5b4/0x800 [btrfs]
>  vfs_get_tree+0x30/0x108
>  fc_mount+0x20/0x68
>  btrfs_get_tree+0x238/0x800 [btrfs]
>  vfs_get_tree+0x30/0x108
>  vfs_cmd_create+0x58/0xf8
>  __arm64_sys_fsconfig+0x444/0x5b8
>  invoke_syscall+0x50/0x120
>  el0_svc_common.constprop.0+0x48/0xf0
>  do_el0_svc+0x24/0x38
>  el0_svc+0x30/0xf8
>  el0t_64_sync_handler+0x120/0x130
>  el0t_64_sync+0x190/0x198
> ---[ end trace 0000000000000000 ]---
> BTRFS: error (device dm-0 state A) in do_free_extent_accounting:2996: errno=-2 No such entry
> BTRFS error (device dm-0 state EA): failed to run delayed ref for logical 15506102321152 num_bytes 16384 type 182 action 2 ref_mod 1: -2
> BTRFS: error (device dm-0 state EA) in btrfs_run_delayed_refs:2215: errno=-2 No such entry
> BTRFS warning (device dm-0 state EA): failed to recover relocation: -2
> BTRFS error (device dm-0 state EA): commit super ret -30
> BTRFS error (device dm-0 state EA): open_ctree failed: -2
> 
> 
> Gemini said
> 
> 1. The Btrfs "Ghost" Accounting
> When you added discard=async to your fstab
> (or remounted with it), you told the Btrfs kernel module to start a specific
> background thread.
> Btrfs's Perspective: "The user told me to use async discard. I will now start a
> list of every extent we delete so I can 'trim' them later in the background."
> The Problem: Btrfs doesn't check if the underlying dm-crypt device actually
> supports discards before it starts its own internal accounting.
> The Result: Btrfs started tracking a massive list of "extents to be discarded"
> in its memory and metadata.
> 
> 2. The "No Such Entry" (-2) Race Condition
> The crash didn't happen because a
> command hit a drive; it happened because of a logic race inside the kernel's
> Btrfs code:
> The Balance Thread: You were running a balance. This thread moves data from "Old
> Block A" to "New Block B."
> The Discard Thread: Because discard=async was on, the discard thread saw "Old
> Block A" get freed. It put "Old Block A" on its "to-do list."
> The Metadata Conflict: The balance thread finished moving the data and
> successfully deleted the reference to "Old Block A" from the extent tree.
> The Crash: A few milliseconds later, the async discard thread woke up and tried
> to "pin" or "process" the metadata for "Old Block A." It looked in the tree,
> found nothing (because the balance already deleted it), and threw an ENOENT
> (Error -2: No such entry).
> Btrfs panicked: "Wait, I was told to discard this block, but it doesn't exist in
> my records anymore! Something is inconsistent!" → Transaction Abort.
> 
> more details:
> backuproot didn't work (read write)
> I was forced to run
> btrfstune --convert-from-block-group-tree /dev/mapper/crypt_bcache0
> because
> When you ran btrfs check --clear-space-cache v2, the tool did exactly
> what it was supposed to do: it deleted the Free Space Tree and removed
> the FREE_SPACE_TREE flag from your superblock.
> The Conflict: Your 23TB array was formatted with the modern
> block-group-tree feature (which speeds up mounting).
> The Kernel Rule: The Btrfs kernel code explicitly dictates: If the Block
> Group Tree is enabled, the Free Space Tree MUST also be enabled.
> The Crash: Because the FREE_SPACE_TREE flag is now missing, the kernel sees
> an "illegal" superblock state and throws a fatal -22 error, refusing to
> proceed to the mount options.
> 
> This was vexing, hours lost removing the block group tree.
> and when it was finally finished, 
> mount -t btrfs -o skip_balance /dev/mapper/crypt_bcache0 /mnt/btrfs_bigbackup/
> did run, but crashed as above
> 
> Now doing a repair in case it can salvage things.
> 
> Marc
> -- 
> "A mouse is a device used to point at the xterm you want to type in" - A.S.R.
>  
> Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08
> 

-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
 
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

Thread overview: 43+ messages
2026-04-11  3:35 BTRFS discard crash: failed to run delayed ref for logical 15506102321152 num_bytes 16384 type 182 action 2 ref_mod 1: -2 6.11.2) Marc MERLIN
2026-04-11  4:47 ` Qu Wenruo
2026-04-11 12:04 ` Roman Mamedov
2026-04-11 16:22   ` Marc MERLIN
2026-04-12  1:57 ` Marc MERLIN [this message]
2026-04-12  1:57   ` Marc MERLIN
2026-04-12  2:28   ` Marc MERLIN
2026-04-12  2:28     ` Marc MERLIN
2026-04-12 17:38     ` Marc MERLIN
2026-04-12 17:38       ` Marc MERLIN
2026-04-12 20:21       ` Marc MERLIN
2026-04-12 20:21         ` Marc MERLIN
2026-04-13  2:14         ` Roman Mamedov
2026-04-13  2:34           ` Marc MERLIN
2026-04-13  2:34             ` Marc MERLIN
2026-04-13 17:52 ` Simple quota unsafe? RIP: 0010:__btrfs_free_extent.isra.0+0xc41/0x1020 [btrfs] / do_free_extent_accounting:2999: errno=-2 No such entry Marc MERLIN
2026-04-13 17:52   ` Marc MERLIN
2026-04-13 18:47   ` Boris Burkov
2026-04-13 19:40     ` Marc MERLIN
2026-04-13 19:40       ` Marc MERLIN
2026-04-15  5:21       ` Marc MERLIN
2026-04-15 17:05         ` Boris Burkov
2026-04-15 17:59           ` Marc MERLIN
2026-04-15 18:44             ` Boris Burkov
2026-04-15 20:22               ` Marc MERLIN
2026-04-15 22:36                 ` Boris Burkov
2026-04-15 22:55                   ` Marc MERLIN
2026-04-15 23:25                     ` Boris Burkov
2026-04-16  0:55                       ` Marc MERLIN
2026-04-16  1:22                         ` Boris Burkov
2026-04-16  0:45                     ` Boris Burkov
2026-04-16  1:08                       ` Marc MERLIN
2026-04-16  1:25                         ` Boris Burkov
2026-04-16 16:51                           ` Simple quota unsafe (FIXED: btrfstune --remove-simple-quota worked) Marc MERLIN
2026-04-16 17:21                           ` Simple quota unsafe? RIP: 0010:__btrfs_free_extent.isra.0+0xc41/0x1020 [btrfs] / do_free_extent_accounting:2999: errno=-2 No such entry Marc MERLIN
2026-04-16 21:36                             ` Boris Burkov
2026-04-16 21:47                               ` Marc MERLIN
2026-04-17 21:51                                 ` Boris Burkov
2026-04-17 22:37                                   ` Marc MERLIN
2026-04-17 23:16                                     ` Boris Burkov
2026-04-18  0:18                                       ` Marc MERLIN
2026-04-17  3:43 ` BTRFS discard crash: failed to run delayed ref for logical 15506102321152 num_bytes 16384 type 182 action 2 ref_mod 1: -2 6.11.2) David Disseldorp
2026-04-17  5:19   ` Marc MERLIN
