From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: Stefan N <stefannnau@gmail.com>
Cc: Qu Wenruo <wqu@suse.com>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>,
Josef Bacik <josef@toxicpanda.com>,
Filipe Manana <fdmanana@kernel.org>
Subject: Re: Out of space loop: skip_balance not working
Date: Sun, 23 Jul 2023 15:23:09 +0800
Message-ID: <c4b714cc-100c-0099-c498-896b815b8e5f@gmx.com>
In-Reply-To: <CA+W5K0oDRo2LZMiUiysYXpcpmfXTvS27hPdjm1pzq4kfq9=vdQ@mail.gmail.com>
On 2023/7/23 14:21, Stefan N wrote:
> Hi Qu,
>
> Many thanks for that new patch, that's done the job.
>
> As I've now got 3 disks with plenty of space, I've converted
> metadata/system to RAID1C3 to mitigate the issue until the 4th disk has
> finished replacing.
>
> Hopefully a fix for the underlying issue is applied by the time I start
> running low on space, though looking at the usage now (below) it looks
> like I might never run out again 😂 Judging by this, is it possible the
> issue I had only existed because I was on LTS with kernel 5.15, and 6.2
> might already have fixed the under-allocation issue that caused this?
Unfortunately I'm not an expert on the extent allocator or on ENOSPC
situations, so I can't help much with the root cause.
Filipe and Josef may be able to provide more help on this.
Thanks,
Qu
>
> Many thanks again,
>
> Stefan
>
> Data,RAID6: Size:65.41TiB, Used:65.22TiB (99.70%)
> /dev/sdf 11.49TiB
> /dev/sdg 10.91TiB
> /dev/sdd 11.46TiB
> /dev/sdj 10.88TiB
> /dev/sde 10.88TiB
> /dev/sdc 10.88TiB
> /dev/sdh 11.47TiB
> /dev/sdb 10.89TiB
>
> Metadata,RAID1C3: Size:133.00GiB, Used:77.74GiB (58.45%)
> /dev/sdf 133.00GiB
> /dev/sdd 133.00GiB
> /dev/sdh 133.00GiB
>
> System,RAID1C3: Size:32.00MiB, Used:5.25MiB (16.41%)
> /dev/sdf 32.00MiB
> /dev/sdd 32.00MiB
> /dev/sdh 32.00MiB
>
> Unallocated:
> /dev/sda 10.91TiB <-- replace target (in progress)
> /dev/sdf 4.75TiB
> /dev/sdg 5.41GiB
> /dev/sdd 4.78TiB
> /dev/sdj 36.49GiB
> /dev/sde 38.53GiB
> /dev/sdc 36.33GiB
> /dev/sdh 4.77TiB
> /dev/sdb 26.01GiB
>
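
A rough sketch of why the RAID1C3 conversion helps, using the Unallocated
figures from the table above. It assumes (this is not stated in the thread)
that a new metadata chunk is about 1GiB, that each copy must sit on a
different device, and that the devices with the most unallocated space are
preferred. pick_devices and its parameters are made-up names for
illustration, not btrfs-progs API. /dev/sda is left out on the assumption
that a replace target cannot take new chunks while the replace is running.

```python
# Unallocated space per device in GiB, copied from the usage table above.
# /dev/sda (the replace target) is deliberately excluded; see the assumption
# stated in the text.
UNALLOCATED_GIB = {
    "/dev/sdf": 4.75 * 1024,
    "/dev/sdg": 5.41,
    "/dev/sdd": 4.78 * 1024,
    "/dev/sdj": 36.49,
    "/dev/sde": 38.53,
    "/dev/sdc": 36.33,
    "/dev/sdh": 4.77 * 1024,
    "/dev/sdb": 26.01,
}

# Number of copies per mirrored profile; every copy needs its own device.
COPIES = {"RAID1": 2, "RAID1C3": 3, "RAID1C4": 4}


def pick_devices(profile, chunk_gib=1.0, unallocated=UNALLOCATED_GIB):
    """Return the devices a new chunk would use, or None if too few qualify."""
    need = COPIES[profile]
    # Keep devices that can hold one chunk, largest unallocated space first.
    eligible = sorted(
        (dev for dev, free in unallocated.items() if free >= chunk_gib),
        key=lambda dev: unallocated[dev],
        reverse=True,
    )
    return eligible[:need] if len(eligible) >= need else None


print(pick_devices("RAID1C3"))  # ['/dev/sdd', '/dev/sdh', '/dev/sdf']
# Hypothetical: if only devices with at least 1TiB unallocated counted,
# three devices would qualify, which is enough for RAID1C3 but not RAID1C4.
print(pick_devices("RAID1C4", chunk_gib=1024.0))  # None
```

With the table's figures this picks sdd, sdh and sdf, the same three devices
that hold the RAID1C3 metadata above.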
>
> On Sat, 22 Jul 2023 at 19:38, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
>
> On 2023/7/22 13:28, Stefan N wrote:
> > Hi again Qu,
> >
> > Thanks for all your help last month, I managed to get things going
> > again and have been slowly adding new disks, but have now ended up
> > with a similar but slightly more complicated problem I need some more
> > assistance with.
> >
> > Since last time: I used loop devices to get the fs operational again,
> > then deleted some files to create space, removed the loop devices,
> > successfully used btrfs replace to replace 3x 12tb disks with 18tbs,
> > and moved to space cache v2 in the hope it'd prevent future issues.
> >
> > The problem: during the 4th replace operation the metadata issue has
> > recurred, the first time self correcting when remounted, but this
> > second time has resulted in a similar paradox to last time. I've
> > rebooted into the patched kernel from last month, but the same
> > solution is now ineffective due to the system failing to detect the
> > replace target, despite no disks having been removed nor changing from
> > /dev/sda and /dev/sdl during the reboots.
> >
> > During the replace process the disks were in use, and while there is
> > now plenty of space for data, it seems enough was written to fill
> > metadata again. In hindsight I should have left the 4 loop devices in
> > place until the replaces had completed to satisfy the RAID1C4
> > requirement for the metadata, as despite deleting files data has not
> > been freed from the existing 12tb disks.
> >
> > The 'missing' replace target is:
> > Disk /dev/sda: 16.37 TiB, 18000207937536 bytes, 35156656128 sectors
>
> The problem seems to be that replace cancel also needs to commit a
> transaction, which is obviously a bad situation under high metadata
> pressure.
>
>
> But the root problem is still why we hit ENOSPC. AFAIK Filipe is working
> on this problem.
>
>
> For now, the problem can be more or less worked around by the same
> method: instead of committing the transaction, we just cancel the current
> one so that you can carry on with the patched device add.
>
> I have updated the branch with a new patch. Please check whether it lets
> you mount with "-o degraded", then try the cancel and add the devices.
>
> https://github.com/adam900710/linux/tree/dev_add_no_commit
>
> Thanks,
> Qu
>
> [...]
> >
> >
> > $ sudo mount -o degraded /mnt/data ; sudo btrfs replace cancel
> > /mnt/data ; sudo btrfs dev add -K -f /dev/loop20 /dev/loop21
> > /dev/loop22 /dev/loop23 /mnt/data ; sudo btrfs fi sync /mnt/data
> > ERROR: error adding device '/dev/loop20': Read-only file system
> > ERROR: error adding device '/dev/loop21': Read-only file system
> > ERROR: error adding device '/dev/loop22': Read-only file system
> > ERROR: error adding device '/dev/loop23': Read-only file system
> > ERROR: Could not sync filesystem: Read-only file system
> > $
> >
> > syslog:
> > BTRFS info (device sdf): using crc32c (crc32c-intel) checksum algorithm
> > BTRFS info (device sdf): allowing degraded mounts
> > BTRFS info (device sdf): using free space tree
> > BTRFS info (device sdf): bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 845, gen 0
> > BTRFS info (device sdf): bdev /dev/sde errs: wr 3, rd 7, flush 0, corrupt 0, gen 0
> > BTRFS info (device sdf): bdev /dev/sdc errs: wr 41, rd 0, flush 0, corrupt 0, gen 0
> > BTRFS info (device sdf): cannot continue dev_replace, tgtdev is missing
> > BTRFS info (device sdf): you may cancel the operation after 'mount -o degraded'
> > BTRFS: Transaction aborted (error -28)
> > WARNING: CPU: 0 PID: 6659 at fs/btrfs/extent-tree.c:3077
> > __btrfs_free_extent+0xa18/0xf50 [btrfs]
> > Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nft_chain_nat
> > xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables
> > nfnetlink br_netfilter bridge stp llc rpcsec_gss_krb5 nfsv4 nfs
> > fscache netfs ipmi_devintf ipmi_msghandler overlay iwlwifi_compat(O)
> > binfmt_misc nls_iso8859_1 intel_rapl_msr snd_hda_codec_realtek
> > snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel
> > snd_intel_dspcfg intel_rapl_common snd_intel_sdw_acpi edac_mce_amd
> > snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep irqbypass snd_pcm
> > rapl wmi_bmof snd_timer k10temp snd ccp soundcore joydev input_leds
> > mac_hid dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua
> bonding tls
> > msr nfsd efi_pstore auth_rpcgss nfs_acl lockd grace sunrpc dmi_sysfs
> > ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456
> > async_raid6_recov async_memcpy async_pq async_xor async_tx xor
> > raid6_pq libcrc32c raid1 raid0 multipath linear
> > hid_logitech_hidpp hid_logitech_dj amdgpu hid_generic iommu_v2
> > drm_buddy gpu_sched drm_ttm_helper ttm drm_display_helper uas cec
> > rc_core usbhid hid usb_storage drm_kms_helper syscopyarea sysfillrect
> > sysimgblt crct10dif_pclmul igb crc32_pclmul polyval_clmulni
> > polyval_generic ghash_clmulni_intel dca sha512_ssse3 aesni_intel
> > crypto_simd drm nvme ahci cryptd libahci qlcnic i2c_algo_bit
> nvme_core
> > mpt3sas xhci_pci video raid_class scsi_transport_sas xhci_pci_renesas
> > nvme_common i2c_piix4 wmi
> > CPU: 0 PID: 6659 Comm: btrfs Tainted: G W O
> > 6.2.0-23-generic #23+btrdebug2c
> > Hardware name: To Be Filled By O.E.M. X570M Pro4/X570M Pro4, BIOS
> > P3.70 02/23/2022
> > RIP: 0010:__btrfs_free_extent+0xa18/0xf50 [btrfs]
> > Code: 48 c7 c6 80 19 71 c1 48 8b 78 50 e8 82 57 0e 00 41 b8 01 00 00
> > 00 e9 58 fe ff ff 8b 75 94 48 c7 c7 a8 19 71 c1 e8 d8 92 4d c7
> <0f> 0b
> > e9 64 fb ff ff 8b 7d 90 e8 b9 04 ff ff 84 c0 0f 85 f1 01 00
> > RSP: 0018:ffffb05e4746fa38 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: 0000b711db1d0000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: ffffb05e4746fad8 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
> > R13: 0000000000000000 R14: ffff88edc031ea90 R15: ffff88edc3ba0230
> > FS: 00007f2b14740d40(0000) GS:ffff88f4e0a00000(0000)
> knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000c000253000 CR3: 00000001e7cc8000 CR4: 00000000003506f0
> > Call Trace:
> > <TASK>
> > run_delayed_tree_ref+0x69/0x1b0 [btrfs]
> > btrfs_run_delayed_refs_for_head+0x3aa/0x520 [btrfs]
> > ? btrfs_create_pending_block_groups+0x280/0x4d0 [btrfs]
> > __btrfs_run_delayed_refs+0xe6/0x1d0 [btrfs]
> > btrfs_run_delayed_refs+0x6d/0x1f0 [btrfs]
> > commit_cowonly_roots+0x1e7/0x240 [btrfs]
> > btrfs_commit_transaction+0x5d2/0xbc0 [btrfs]
> > ? start_transaction+0xc8/0x600 [btrfs]
> > btrfs_dev_replace_cancel+0x168/0x2e0 [btrfs]
> > btrfs_ioctl+0x12ed/0x14d0 [btrfs]
> > ? __handle_mm_fault+0x661/0x720
> > __x64_sys_ioctl+0xa0/0xe0
> > do_syscall_64+0x5b/0x90
> > ? do_user_addr_fault+0x1e8/0x720
> > ? exit_to_user_mode_prepare+0x30/0xb0
> > ? irqentry_exit_to_user_mode+0x9/0x20
> > ? irqentry_exit+0x43/0x50
> > ? exc_page_fault+0x91/0x1b0
> > entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > RIP: 0033:0x7f2b145119ef
> > Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
> > 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05
> <89> c2
> > 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> > RSP: 002b:00007ffcda96ca10 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2b145119ef
> > RDX: 00007ffcda96ca80 RSI: 00000000ca289435 RDI: 0000000000000003
> > RBP: 0000000000000003 R08: 0000000000021001 R09: 0000000000000000
> > R10: fffffffffffff000 R11: 0000000000000246 R12: 00007ffcda96e7eb
> > R13: 000056092aafbe60 R14: 000056092aab3578 R15: 0000000000000000
> > </TASK>
> > ---[ end trace 0000000000000000 ]---
> > BTRFS info (device sdf: state A): dumping space info:
> > BTRFS info (device sdf: state A): space_info DATA has 219646795776 free, is not full
> > BTRFS info (device sdf: state A): space_info total=71845742116864, used=71626091782144, pinned=0, reserved=0, may_use=0, readonly=3538944 zone_unusable=0
> > BTRFS info (device sdf: state A): space_info METADATA has -536821760 free, is full
> > BTRFS info (device sdf: state A): space_info total=83481329664, used=83421233152, pinned=57606144, reserved=2490368, may_use=536821760, readonly=0 zone_unusable=0
> > BTRFS info (device sdf: state A): space_info SYSTEM has 20676608 free, is not full
> > BTRFS info (device sdf: state A): space_info total=26214400, used=5537792, pinned=0, reserved=0, may_use=0, readonly=0 zone_unusable=0
> > BTRFS info (device sdf: state A): global_block_rsv: size 536870912 reserved 536805376
> > BTRFS info (device sdf: state A): trans_block_rsv: size 0 reserved 0
> > BTRFS info (device sdf: state A): chunk_block_rsv: size 0 reserved 0
> > BTRFS info (device sdf: state A): delayed_block_rsv: size 0 reserved 0
> > BTRFS info (device sdf: state A): delayed_refs_rsv: size 523239424 reserved 16384
> > BTRFS: error (device sdf: state A) in __btrfs_free_extent:3077: errno=-28 No space left
> > BTRFS info (device sdf: state EA): forced readonly
> > BTRFS error (device sdf: state EA): failed to run delayed ref for logical 201287318437888 num_bytes 16384 type 176 action 2 ref_mod 1: -28
> > BTRFS: error (device sdf: state EA) in btrfs_run_delayed_refs:2151: errno=-28 No space left
> > BTRFS warning (device sdf: state EA): Skipping commit of aborted transaction.
> > BTRFS: error (device sdf: state EA) in cleanup_transaction:1986: errno=-28 No space left
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 6659 at fs/btrfs/dev-replace.c:1121
> > btrfs_dev_replace_cancel+0x2b0/0x2e0 [btrfs]
> > Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nft_chain_nat
> > xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 xfrm_user xfrm_algo xt_addrtype nft_compat nf_tables
> > nfnetlink br_netfilter bridge stp llc rpcsec_gss_krb5 nfsv4 nfs
> > fscache netfs ipmi_devintf ipmi_msghandler overlay iwlwifi_compat(O)
> > binfmt_misc nls_iso8859_1 intel_rapl_msr snd_hda_codec_realtek
> > snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel
> > snd_intel_dspcfg intel_rapl_common snd_intel_sdw_acpi edac_mce_amd
> > snd_hda_codec kvm_amd snd_hda_core kvm snd_hwdep irqbypass snd_pcm
> > rapl wmi_bmof snd_timer k10temp snd ccp soundcore joydev input_leds
> > mac_hid dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua
> bonding tls
> > msr nfsd efi_pstore auth_rpcgss nfs_acl lockd grace sunrpc dmi_sysfs
> > ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456
> > async_raid6_recov async_memcpy async_pq async_xor async_tx xor
> > raid6_pq libcrc32c raid1 raid0 multipath linear
> > 2023-07-22T14:04:29.956673+09:30 ltsnas kernel: [ 422.690184]
> > hid_logitech_hidpp hid_logitech_dj amdgpu hid_generic iommu_v2
> > drm_buddy gpu_sched drm_ttm_helper ttm drm_display_helper uas cec
> > rc_core usbhid hid usb_storage drm_kms_helper syscopyarea sysfillrect
> > sysimgblt crct10dif_pclmul igb crc32_pclmul polyval_clmulni
> > polyval_generic ghash_clmulni_intel dca sha512_ssse3 aesni_intel
> > crypto_simd drm nvme ahci cryptd libahci qlcnic i2c_algo_bit
> nvme_core
> > mpt3sas xhci_pci video raid_class scsi_transport_sas xhci_pci_renesas
> > nvme_common i2c_piix4 wmi
> > CPU: 0 PID: 6659 Comm: btrfs Tainted: G W O
> > 6.2.0-23-generic #23+btrdebug2c
> > Hardware name: To Be Filled By O.E.M. X570M Pro4/X570M Pro4, BIOS
> > P3.70 02/23/2022
> > RIP: 0010:btrfs_dev_replace_cancel+0x2b0/0x2e0 [btrfs]
> > Code: 4c 89 c2 e8 52 3f 02 00 e8 9d 4a 4e c7 e9 35 ff ff ff 4c 89 e7
> > 48 89 45 d0 e8 bc d5 3f c8 48 8b 45 d0 41 89 c5 e9 38 ff ff ff
> <0f> 0b
> > e9 b9 fe ff ff 41 bd e2 ff ff ff e9 26 ff ff ff 48 c7 c2 74
> > RSP: 0018:ffffb05e4746fd58 EFLAGS: 00010286
> > RAX: 00000000ffffffe4 RBX: ffff88edda916000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> > RBP: ffffb05e4746fd88 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000000 R12: ffff88edda916ab0
> > R13: ffff88eddb627800 R14: ffff88ede7fad000 R15: ffff88edda916ad0
> > FS: 00007f2b14740d40(0000) GS:ffff88f4e0a00000(0000)
> knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000c000253000 CR3: 00000001e7cc8000 CR4: 00000000003506f0
> > Call Trace:
> > <TASK>
> > btrfs_ioctl+0x12ed/0x14d0 [btrfs]
> > ? __handle_mm_fault+0x661/0x720
> > __x64_sys_ioctl+0xa0/0xe0
> > do_syscall_64+0x5b/0x90
> > ? do_user_addr_fault+0x1e8/0x720
> > ? exit_to_user_mode_prepare+0x30/0xb0
> > ? irqentry_exit_to_user_mode+0x9/0x20
> > ? irqentry_exit+0x43/0x50
> > ? exc_page_fault+0x91/0x1b0
> > entry_SYSCALL_64_after_hwframe+0x72/0xdc
> > RIP: 0033:0x7f2b145119ef
> > Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48
> > 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05
> <89> c2
> > 3d 00 f0 ff ff 77 18 48 8b 44 24 18 64 48 2b 04 25 28 00 00
> > RSP: 002b:00007ffcda96ca10 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2b145119ef
> > RDX: 00007ffcda96ca80 RSI: 00000000ca289435 RDI: 0000000000000003
> > RBP: 0000000000000003 R08: 0000000000021001 R09: 0000000000000000
> > R10: fffffffffffff000 R11: 0000000000000246 R12: 00007ffcda96e7eb
> > R13: 000056092aafbe60 R14: 000056092aab3578 R15: 0000000000000000
> > </TASK>
> > ---[ end trace 0000000000000000 ]---
> > BTRFS info (device sdf: state EA): suspended dev_replace from /dev/sdl (devid 4) to <missing disk> canceled
> > BTRFS error (device sdf: state EA): failed to add disk /dev/loop20: -30
> > BTRFS error (device sdf: state EA): failed to add disk /dev/loop21: -30
> > BTRFS error (device sdf: state EA): failed to add disk /dev/loop22: -30
> > BTRFS error (device sdf: state EA): failed to add disk /dev/loop23: -30
> >
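
As a cross-check of the space_info dump above: the "free" figure the kernel
prints appears to be the total minus every counter that already claims
space, may_use included. The sketch below redoes that arithmetic with the
METADATA numbers from the log. The formula is inferred from the numbers
matching, not quoted from the kernel source, and free_bytes is a name made
up for this example.

```python
# Figures copied from the "space_info METADATA" lines of the log above.
metadata = {
    "total": 83481329664,
    "used": 83421233152,
    "pinned": 57606144,
    "reserved": 2490368,
    "may_use": 536821760,
    "readonly": 0,
    "zone_unusable": 0,
}


def free_bytes(info):
    """Total minus the sum of every other counter in the dump."""
    claimed = sum(value for key, value in info.items() if key != "total")
    return info["total"] - claimed


print(free_bytes(metadata))  # -536821760, matching "has -536821760 free"
```

The same subtraction reproduces the DATA (219646795776) and SYSTEM
(20676608) figures. For METADATA, used + pinned + reserved already add up
to exactly the total, so the deficit equals may_use: the outstanding
reservations have no space behind them.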
> > On Mon, 26 Jun 2023 at 22:28, Stefan N <stefannnau@gmail.com> wrote:
> >>
> >> Hi Qu,
> >>
> >> Thanks for all the help, I managed to get it mounted and synced with
> >> 5G loops (2G allocated to metadata, 3G unallocated on each).
> >>
> >> I'm able to read existing files, write new files, and any changes
> >> remain after an unmount and remount.
> >>
> >> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data ; sudo btrfs
> >> dev add -K -f /dev/loop20 /dev/loop21 /dev/loop22 /dev/loop23
> >> /mnt/data ; sudo btrfs fi sync /mnt/data
> >> $ sudo btrfs fi show
> >> Label: none uuid: abc123
> >> Total devices 12 FS bytes used 64.52TiB
> >> devid 1 size 10.91TiB used 10.89TiB path /dev/sdd
> >> devid 2 size 10.91TiB used 10.89TiB path /dev/sdh
> >> devid 3 size 10.91TiB used 10.89TiB path /dev/sdb
> >> devid 4 size 10.91TiB used 10.89TiB path /dev/sdg
> >> devid 5 size 10.91TiB used 10.89TiB path /dev/sdi
> >> devid 6 size 10.91TiB used 10.89TiB path /dev/sde
> >> devid 7 size 10.91TiB used 10.89TiB path /dev/sdf
> >> devid 8 size 10.91TiB used 10.89TiB path /dev/sdc
> >> devid 9 size 5.00GiB used 2.00GiB path /dev/loop20
> >> devid 10 size 5.00GiB used 2.00GiB path /dev/loop21
> >> devid 11 size 5.00GiB used 2.00GiB path /dev/loop22
> >> devid 12 size 5.00GiB used 2.00GiB path /dev/loop23
> >> $
> >>
> >> I'd be keen to know what you'd suggest for next steps. I have two 18T
> >> disks to upgrade two of the existing 12T disks, which I could either
> >> swap in as replacements or add over USB for a while.
> >>
> >> While a random sample of files seem to be perfectly intact, I'd be
> >> keen to verify the integrity to track down any corrupted files.
> >>
> >> Should I perform a scrub before adding/replacing the new disks, or can
> >> this be safely done afterwards? e.g. can I safely add 2x18tb, remove
> >> loops, begin scrub, and then remove 2x 12tb when scrub completes?
> >>
> >> See kernel log below:
> >>
> >> kernel: [ 399.272458] BTRFS info (device sdd): using crc32c (crc32c-intel) checksum algorithm
> >> kernel: [ 399.272476] BTRFS info (device sdd): disk space caching is enabled
> >> kernel: [ 404.855750] BTRFS info (device sdd): bdev /dev/sdh errs: wr 0, rd 0, flush 0, corrupt 845, gen 0
> >> kernel: [ 404.855766] BTRFS info (device sdd): bdev /dev/sdb errs: wr 41089, rd 1556, flush 0, corrupt 0, gen 0
> >> kernel: [ 404.855778] BTRFS info (device sdd): bdev /dev/sdi errs: wr 3, rd 7, flush 0, corrupt 0, gen 0
> >> kernel: [ 404.855785] BTRFS info (device sdd): bdev /dev/sde errs: wr 41, rd 0, flush 0, corrupt 0, gen 0
> >> kernel: [ 630.844173] BTRFS info (device sdd): balance: resume skipped
> >> kernel: [ 630.844185] BTRFS info (device sdd): checking UUID tree
> >> kernel: [ 630.871787] BTRFS info (device sdd): disk added /dev/loop20
> >> kernel: [ 630.881223] BTRFS info (device sdd): disk added /dev/loop21
> >> kernel: [ 630.888817] BTRFS info (device sdd): disk added /dev/loop22
> >> kernel: [ 630.896302] BTRFS info (device sdd): disk added /dev/loop23
> >> kernel: [ 846.849616] INFO: task btrfs-uuid:4834 blocked for more
> >> than 120 seconds.
> >> kernel: [ 846.849660] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 846.849693] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 846.849725] task:btrfs-uuid state:D stack:0
> >> pid:4834 ppid:2 flags:0x00004000
> >> kernel: [ 846.849735] Call Trace:
> >> kernel: [ 846.849739] <TASK>
> >> kernel: [ 846.849747] __schedule+0x2aa/0x610
> >> kernel: [ 846.849761] schedule+0x63/0x110
> >> kernel: [ 846.849769] wait_current_trans+0x100/0x160 [btrfs]
> >> kernel: [ 846.849908] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 846.849920] start_transaction+0x28b/0x600 [btrfs]
> >> kernel: [ 846.850057] btrfs_start_transaction+0x1e/0x30 [btrfs]
> >> kernel: [ 846.850191] btrfs_uuid_scan_kthread+0x314/0x420 [btrfs]
> >> kernel: [ 846.850359] ?
> __pfx_btrfs_uuid_rescan_kthread+0x10/0x10 [btrfs]
> >> kernel: [ 846.850487] btrfs_uuid_rescan_kthread+0x20/0x70 [btrfs]
> >> kernel: [ 846.850614] kthread+0xe9/0x110
> >> kernel: [ 846.850623] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 846.850631] ret_from_fork+0x2c/0x50
> >> kernel: [ 846.850642] </TASK>
> >> kernel: [ 846.850645] INFO: task btrfs:4850 blocked for more
> than 120 seconds.
> >> kernel: [ 846.850676] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 846.850707] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 846.850738] task:btrfs state:D stack:0
> >> pid:4850 ppid:4849 flags:0x00000002
> >> kernel: [ 846.850746] Call Trace:
> >> kernel: [ 846.850749] <TASK>
> >> kernel: [ 846.850752] __schedule+0x2aa/0x610
> >> kernel: [ 846.850760] schedule+0x63/0x110
> >> kernel: [ 846.850765] btrfs_commit_transaction+0x9b7/0xbc0 [btrfs]
> >> kernel: [ 846.850899] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 846.850908] btrfs_sync_fs+0x5a/0x1b0 [btrfs]
> >> kernel: [ 846.851027] btrfs_ioctl+0x643/0x14d0 [btrfs]
> >> kernel: [ 846.851186] ? putname+0x5d/0x80
> >> kernel: [ 846.851195] ? do_sys_openat2+0xab/0x180
> >> kernel: [ 846.851203] ? exit_to_user_mode_prepare+0x30/0xb0
> >> kernel: [ 846.851213] __x64_sys_ioctl+0xa0/0xe0
> >> kernel: [ 846.851221] do_syscall_64+0x5b/0x90
> >> kernel: [ 846.851229] ? exc_page_fault+0x91/0x1b0
> >> kernel: [ 846.851236] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >> kernel: [ 846.851243] RIP: 0033:0x7fbf339119ef
> >> kernel: [ 846.851249] RSP: 002b:00007ffd58427660 EFLAGS: 00000246
> >> ORIG_RAX: 0000000000000010
> >> kernel: [ 846.851255] RAX: ffffffffffffffda RBX: 0000000000000003
> >> RCX: 00007fbf339119ef
> >> kernel: [ 846.851259] RDX: 0000000000000000 RSI: 0000000000009408
> >> RDI: 0000000000000003
> >> kernel: [ 846.851263] RBP: 0000000000000007 R08: 0000000000000000
> >> R09: 0000000000000000
> >> kernel: [ 846.851266] R10: 0000000000000000 R11: 0000000000000246
> >> R12: 00007fbf339f642c
> >> kernel: [ 846.851269] R13: 0000000000000001 R14: 0000557384b29578
> >> R15: 0000000000000000
> >> kernel: [ 846.851277] </TASK>
> >> kernel: [ 967.681770] INFO: task btrfs-uuid:4834 blocked for more
> >> than 241 seconds.
> >> kernel: [ 967.681818] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 967.681852] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 967.681884] task:btrfs-uuid state:D stack:0
> >> pid:4834 ppid:2 flags:0x00004000
> >> kernel: [ 967.681895] Call Trace:
> >> kernel: [ 967.681899] <TASK>
> >> kernel: [ 967.681907] __schedule+0x2aa/0x610
> >> kernel: [ 967.681922] schedule+0x63/0x110
> >> kernel: [ 967.681931] wait_current_trans+0x100/0x160 [btrfs]
> >> kernel: [ 967.682070] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 967.682082] start_transaction+0x28b/0x600 [btrfs]
> >> kernel: [ 967.682219] btrfs_start_transaction+0x1e/0x30 [btrfs]
> >> kernel: [ 967.682353] btrfs_uuid_scan_kthread+0x314/0x420 [btrfs]
> >> kernel: [ 967.682519] ?
> __pfx_btrfs_uuid_rescan_kthread+0x10/0x10 [btrfs]
> >> kernel: [ 967.682645] btrfs_uuid_rescan_kthread+0x20/0x70 [btrfs]
> >> kernel: [ 967.682728] kthread+0xe9/0x110
> >> kernel: [ 967.682734] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 967.682739] ret_from_fork+0x2c/0x50
> >> kernel: [ 967.682746] </TASK>
> >> kernel: [ 967.682749] INFO: task btrfs:4850 blocked for more
> than 241 seconds.
> >> kernel: [ 967.682771] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 967.682793] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 967.682815] task:btrfs state:D stack:0
> >> pid:4850 ppid:4849 flags:0x00000002
> >> kernel: [ 967.682820] Call Trace:
> >> kernel: [ 967.682822] <TASK>
> >> kernel: [ 967.682824] __schedule+0x2aa/0x610
> >> kernel: [ 967.682829] schedule+0x63/0x110
> >> kernel: [ 967.682832] btrfs_commit_transaction+0x9b7/0xbc0 [btrfs]
> >> kernel: [ 967.682918] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 967.682923] btrfs_sync_fs+0x5a/0x1b0 [btrfs]
> >> kernel: [ 967.682999] btrfs_ioctl+0x643/0x14d0 [btrfs]
> >> kernel: [ 967.683085] ? putname+0x5d/0x80
> >> kernel: [ 967.683091] ? do_sys_openat2+0xab/0x180
> >> kernel: [ 967.683096] ? exit_to_user_mode_prepare+0x30/0xb0
> >> kernel: [ 967.683103] __x64_sys_ioctl+0xa0/0xe0
> >> kernel: [ 967.683107] do_syscall_64+0x5b/0x90
> >> kernel: [ 967.683112] ? exc_page_fault+0x91/0x1b0
> >> kernel: [ 967.683116] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >> kernel: [ 967.683121] RIP: 0033:0x7fbf339119ef
> >> kernel: [ 967.683124] RSP: 002b:00007ffd58427660 EFLAGS: 00000246
> >> ORIG_RAX: 0000000000000010
> >> kernel: [ 967.683128] RAX: ffffffffffffffda RBX: 0000000000000003
> >> RCX: 00007fbf339119ef
> >> kernel: [ 967.683130] RDX: 0000000000000000 RSI: 0000000000009408
> >> RDI: 0000000000000003
> >> kernel: [ 967.683132] RBP: 0000000000000007 R08: 0000000000000000
> >> R09: 0000000000000000
> >> kernel: [ 967.683134] R10: 0000000000000000 R11: 0000000000000246
> >> R12: 00007fbf339f642c
> >> kernel: [ 967.683136] R13: 0000000000000001 R14: 0000557384b29578
> >> R15: 0000000000000000
> >> kernel: [ 967.683141] </TASK>
> >> kernel: [ 1088.519959] INFO: task btrfs-uuid:4834 blocked for more
> >> than 362 seconds.
> >> kernel: [ 1088.520006] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1088.520039] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1088.520071] task:btrfs-uuid state:D stack:0
> >> pid:4834 ppid:2 flags:0x00004000
> >> kernel: [ 1088.520082] Call Trace:
> >> kernel: [ 1088.520087] <TASK>
> >> kernel: [ 1088.520094] __schedule+0x2aa/0x610
> >> kernel: [ 1088.520108] schedule+0x63/0x110
> >> kernel: [ 1088.520117] wait_current_trans+0x100/0x160 [btrfs]
> >> kernel: [ 1088.520257] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1088.520269] start_transaction+0x28b/0x600 [btrfs]
> >> kernel: [ 1088.520406] btrfs_start_transaction+0x1e/0x30 [btrfs]
> >> kernel: [ 1088.520539] btrfs_uuid_scan_kthread+0x314/0x420 [btrfs]
> >> kernel: [ 1088.520706] ?
> __pfx_btrfs_uuid_rescan_kthread+0x10/0x10 [btrfs]
> >> kernel: [ 1088.520834] btrfs_uuid_rescan_kthread+0x20/0x70 [btrfs]
> >> kernel: [ 1088.520961] kthread+0xe9/0x110
> >> kernel: [ 1088.520969] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 1088.520977] ret_from_fork+0x2c/0x50
> >> kernel: [ 1088.520987] </TASK>
> >> kernel: [ 1088.520990] INFO: task btrfs:4850 blocked for more
> than 362 seconds.
> >> kernel: [ 1088.521021] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1088.521052] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1088.521084] task:btrfs state:D stack:0
> >> pid:4850 ppid:4849 flags:0x00000002
> >> kernel: [ 1088.521092] Call Trace:
> >> kernel: [ 1088.521095] <TASK>
> >> kernel: [ 1088.521098] __schedule+0x2aa/0x610
> >> kernel: [ 1088.521106] schedule+0x63/0x110
> >> kernel: [ 1088.521111] btrfs_commit_transaction+0x9b7/0xbc0 [btrfs]
> >> kernel: [ 1088.521245] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1088.521254] btrfs_sync_fs+0x5a/0x1b0 [btrfs]
> >> kernel: [ 1088.521372] btrfs_ioctl+0x643/0x14d0 [btrfs]
> >> kernel: [ 1088.521530] ? putname+0x5d/0x80
> >> kernel: [ 1088.521539] ? do_sys_openat2+0xab/0x180
> >> kernel: [ 1088.521548] ? exit_to_user_mode_prepare+0x30/0xb0
> >> kernel: [ 1088.521559] __x64_sys_ioctl+0xa0/0xe0
> >> kernel: [ 1088.521567] do_syscall_64+0x5b/0x90
> >> kernel: [ 1088.521575] ? exc_page_fault+0x91/0x1b0
> >> kernel: [ 1088.521582] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >> kernel: [ 1088.521589] RIP: 0033:0x7fbf339119ef
> >> kernel: [ 1088.521595] RSP: 002b:00007ffd58427660 EFLAGS: 00000246
> >> ORIG_RAX: 0000000000000010
> >> kernel: [ 1088.521602] RAX: ffffffffffffffda RBX: 0000000000000003
> >> RCX: 00007fbf339119ef
> >> kernel: [ 1088.521606] RDX: 0000000000000000 RSI: 0000000000009408
> >> RDI: 0000000000000003
> >> kernel: [ 1088.521610] RBP: 0000000000000007 R08: 0000000000000000
> >> R09: 0000000000000000
> >> kernel: [ 1088.521613] R10: 0000000000000000 R11: 0000000000000246
> >> R12: 00007fbf339f642c
> >> kernel: [ 1088.521616] R13: 0000000000000001 R14: 0000557384b29578
> >> R15: 0000000000000000
> >> kernel: [ 1088.521626] </TASK>
> >> kernel: [ 1209.357423] INFO: task btrfs-uuid:4834 blocked for more
> >> than 483 seconds.
> >> kernel: [ 1209.357473] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1209.357507] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1209.357540] task:btrfs-uuid state:D stack:0
> >> pid:4834 ppid:2 flags:0x00004000
> >> kernel: [ 1209.357551] Call Trace:
> >> kernel: [ 1209.357555] <TASK>
> >> kernel: [ 1209.357563] __schedule+0x2aa/0x610
> >> kernel: [ 1209.357577] schedule+0x63/0x110
> >> kernel: [ 1209.357597] wait_current_trans+0x100/0x160 [btrfs]
> >> kernel: [ 1209.357738] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1209.357750] start_transaction+0x28b/0x600 [btrfs]
> >> kernel: [ 1209.357887] btrfs_start_transaction+0x1e/0x30 [btrfs]
> >> kernel: [ 1209.358021] btrfs_uuid_scan_kthread+0x314/0x420 [btrfs]
> >> kernel: [ 1209.358187] ?
> __pfx_btrfs_uuid_rescan_kthread+0x10/0x10 [btrfs]
> >> kernel: [ 1209.358315] btrfs_uuid_rescan_kthread+0x20/0x70 [btrfs]
> >> kernel: [ 1209.358442] kthread+0xe9/0x110
> >> kernel: [ 1209.358451] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 1209.358458] ret_from_fork+0x2c/0x50
> >> kernel: [ 1209.358468] </TASK>
> >> kernel: [ 1330.195147] INFO: task btrfs-transacti:4088 blocked for
> >> more than 120 seconds.
> >> kernel: [ 1330.195192] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1330.195221] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1330.195250] task:btrfs-transacti state:D stack:0
> >> pid:4088 ppid:2 flags:0x00004000
> >> kernel: [ 1330.195259] Call Trace:
> >> kernel: [ 1330.195263] <TASK>
> >> kernel: [ 1330.195269] __schedule+0x2aa/0x610
> >> kernel: [ 1330.195281] schedule+0x63/0x110
> >> kernel: [ 1330.195288] wait_for_commit+0x14c/0x1b0 [btrfs]
> >> kernel: [ 1330.195413] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1330.195424] btrfs_commit_transaction+0x16c/0xbc0 [btrfs]
> >> kernel: [ 1330.195552] ? start_transaction+0xc8/0x600 [btrfs]
> >> kernel: [ 1330.195676] transaction_kthread+0x14b/0x1c0 [btrfs]
> >> kernel: [ 1330.195795] ? __pfx_transaction_kthread+0x10/0x10
> [btrfs]
> >> kernel: [ 1330.195912] kthread+0xe9/0x110
> >> kernel: [ 1330.195920] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 1330.195927] ret_from_fork+0x2c/0x50
> >> kernel: [ 1330.195937] </TASK>
> >> kernel: [ 1330.195939] INFO: task btrfs-uuid:4834 blocked for more
> >> than 604 seconds.
> >> kernel: [ 1330.195968] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1330.195997] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1330.196026] task:btrfs-uuid state:D stack:0
> >> pid:4834 ppid:2 flags:0x00004000
> >> kernel: [ 1330.196033] Call Trace:
> >> kernel: [ 1330.196036] <TASK>
> >> kernel: [ 1330.196039] __schedule+0x2aa/0x610
> >> kernel: [ 1330.196046] schedule+0x63/0x110
> >> kernel: [ 1330.196051] wait_current_trans+0x100/0x160 [btrfs]
> >> kernel: [ 1330.196169] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1330.196177] start_transaction+0x28b/0x600 [btrfs]
> >> kernel: [ 1330.196298] btrfs_start_transaction+0x1e/0x30 [btrfs]
> >> kernel: [ 1330.196416] btrfs_uuid_scan_kthread+0x314/0x420 [btrfs]
> >> kernel: [ 1330.196565] ?
> __pfx_btrfs_uuid_rescan_kthread+0x10/0x10 [btrfs]
> >> kernel: [ 1330.196680] btrfs_uuid_rescan_kthread+0x20/0x70 [btrfs]
> >> kernel: [ 1330.196794] kthread+0xe9/0x110
> >> kernel: [ 1330.196800] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 1330.196807] ret_from_fork+0x2c/0x50
> >> kernel: [ 1330.196814] </TASK>
> >> kernel: [ 1451.031238] INFO: task btrfs-transacti:4088 blocked for
> >> more than 241 seconds.
> >> kernel: [ 1451.031286] Tainted: G W O
> >> 6.2.0-23-generic #23+btrdebug2c
> >> kernel: [ 1451.031319] "echo 0 >
> >> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >> kernel: [ 1451.031352] task:btrfs-transacti state:D stack:0
> >> pid:4088 ppid:2 flags:0x00004000
> >> kernel: [ 1451.031362] Call Trace:
> >> kernel: [ 1451.031366] <TASK>
> >> kernel: [ 1451.031373] __schedule+0x2aa/0x610
> >> kernel: [ 1451.031388] schedule+0x63/0x110
> >> kernel: [ 1451.031396] wait_for_commit+0x14c/0x1b0 [btrfs]
> >> kernel: [ 1451.031535] ? __pfx_autoremove_wake_function+0x10/0x10
> >> kernel: [ 1451.031548] btrfs_commit_transaction+0x16c/0xbc0 [btrfs]
> >> kernel: [ 1451.031684] ? start_transaction+0xc8/0x600 [btrfs]
> >> kernel: [ 1451.031819] transaction_kthread+0x14b/0x1c0 [btrfs]
> >> kernel: [ 1451.031951] ? __pfx_transaction_kthread+0x10/0x10
> [btrfs]
> >> kernel: [ 1451.032082] kthread+0xe9/0x110
> >> kernel: [ 1451.032091] ? __pfx_kthread+0x10/0x10
> >> kernel: [ 1451.032098] ret_from_fork+0x2c/0x50
> >> kernel: [ 1451.032108] </TASK>
> >>
> >> On Mon, 26 Jun 2023 at 19:48, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>
> >>>
> >>>
> >>> On 2023/6/24 23:29, Stefan N wrote:
> >>>> Whoops, I had left --dry-run on the first debug patch you committed, so
> >>>> that didn't run correctly.
> >>>>
> >>>> I've included the output from both patches, as they result in
> >>>> different output.
> >>>>
> >>>> Rerunning the older patch first, with loop devices (I tried both
> >>>> 4x100mb and 4x1gb) I get the following:
> >>>>
> >>> [...]
> >>>> *** The below is using the newer patch as follows:
> >>>> $ diff fs/btrfs/ ../linux-6.2.0-dist/fs/btrfs/
> >>>> diff fs/btrfs/ioctl.c ../linux-6.2.0-dist/fs/btrfs/ioctl.c
> >>>> 2656,2658d2655
> >>>> < else
> >>>> < btrfs_err(fs_info, "failed to add disk %s: %d",
> >>>> < vol_args->name, ret);
> >>>> diff fs/btrfs/transaction.c
> ../linux-6.2.0-dist/fs/btrfs/transaction.c
> >>>> 1029d1028
> >>>> < /*
> >>>> 1031d1029
> >>>> < */
> >>>> diff fs/btrfs/volumes.c ../linux-6.2.0-dist/fs/btrfs/volumes.c
> >>>> 2677c2677
> >>>> < trans = btrfs_join_transaction(root);
> >>>> ---
> >>>>> trans = btrfs_start_transaction(root, 0);
> >>>> 2680d2679
> >>>> < btrfs_err(fs_info, "failed to start trans:
> %d", ret);
> >>>> 2769d2767
> >>>> < btrfs_err(fs_info, "failed to add dev item:
> %d", ret);
> >>>> 2787,2789c2785
> >>>> < ret = btrfs_end_transaction(trans);
> >>>> < if (ret < 0)
> >>>> < btrfs_err(fs_info, "failed to end trans: %d",
> ret);
> >>>> ---
> >>>>> ret = btrfs_commit_transaction(trans);
> >>>> $
> >>>>
> >>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data ;
> sudo btrfs
> >>>> dev add -K -f /dev/loop12 /dev/loop13 /dev/loop14 /dev/loop15
> >>>> /mnt/data ; sudo btrfs fi sync /mnt/data
> >>>> ERROR: Could not sync filesystem: No space left on device
> >>>
> >>> Is it the same even with 4x1GiB loopback devices?
> >>>
> >>>> $
> >>>>
> >>>> kernel: [ 1811.846087] BTRFS info (device sdc): using crc32c
> >>>> (crc32c-intel) checksum algorithm
> >>>> kernel: [ 1811.846107] BTRFS info (device sdc): disk space
> caching is enabled
> >>>> kernel: [ 1817.852850] BTRFS info (device sdc): bdev /dev/sde
> errs: wr
> >>>> 0, rd 0, flush 0, corrupt 845, gen 0
> >>>> kernel: [ 1817.852866] BTRFS info (device sdc): bdev /dev/sda
> errs: wr
> >>>> 41089, rd 1556, flush 0, corrupt 0, gen 0
> >>>> kernel: [ 1817.852877] BTRFS info (device sdc): bdev /dev/sdh
> errs: wr
> >>>> 3, rd 7, flush 0, corrupt 0, gen 0
> >>>> kernel: [ 1817.852884] BTRFS info (device sdc): bdev /dev/sdd
> errs: wr
> >>>> 41, rd 0, flush 0, corrupt 0, gen 0
> >>>> kernel: [ 2037.562050] BTRFS info (device sdc): balance:
> resume skipped
> >>>> kernel: [ 2037.562064] BTRFS info (device sdc): checking UUID tree
> >>>> kernel: [ 2037.581550] BTRFS info (device sdc): disk added
> /dev/loop12
> >>>> kernel: [ 2037.591163] BTRFS info (device sdc): disk added
> /dev/loop13
> >>>> kernel: [ 2037.599477] BTRFS info (device sdc): disk added
> /dev/loop14
> >>>> kernel: [ 2037.607064] BTRFS info (device sdc): disk added
> /dev/loop15
> >>>> kernel: [ 2176.124630] INFO: task btrfs:7783 blocked for more
> than 120 seconds.
> >>>> kernel: [ 2176.124678] Tainted: G W O
> >>>> 6.2.0-23-generic #23+btrdebug2c
> >>>> kernel: [ 2176.124710] "echo 0 >
> >>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> kernel: [ 2176.124742] task:btrfs state:D stack:0
> >>>> pid:7783 ppid:7782 flags:0x00004002
> >>>> kernel: [ 2176.124753] Call Trace:
> >>>> kernel: [ 2176.124758] <TASK>
> >>>> kernel: [ 2176.124765] __schedule+0x2aa/0x610
> >>>> kernel: [ 2176.124780] schedule+0x63/0x110
> >>>> kernel: [ 2176.124788] btrfs_commit_transaction+0x9b7/0xbc0
> [btrfs]
> >>>
> >>> This means we're doing the real work, but it seems to take too long.
> >>>
> >>> In fact this already looks promising, as we have gone through the
> >>> whole device add part.
> >>>
> >>> We just need to let the final commit finish.
> >>>
> >>>> kernel: [ 2176.124929] ? __pfx_autoremove_wake_function+0x10/0x10
> >>>> kernel: [ 2176.124941] btrfs_sync_fs+0x5a/0x1b0 [btrfs]
> >>>> kernel: [ 2176.125060] btrfs_ioctl+0x643/0x14d0 [btrfs]
> >>>> kernel: [ 2176.125225] __x64_sys_ioctl+0xa0/0xe0
> >>>> kernel: [ 2176.125235] do_syscall_64+0x5b/0x90
> >>>> kernel: [ 2176.125242] ? do_sys_openat2+0xab/0x180
> >>>> kernel: [ 2176.125251] ? exit_to_user_mode_prepare+0x30/0xb0
> >>>> kernel: [ 2176.125260] ? syscall_exit_to_user_mode+0x29/0x50
> >>>> kernel: [ 2176.125268] ? do_syscall_64+0x67/0x90
> >>>> kernel: [ 2176.125275] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >>>> kernel: [ 2176.125282] RIP: 0033:0x7f2e8eb119ef
> >>>> kernel: [ 2176.125288] RSP: 002b:00007ffd632b6aa0 EFLAGS: 00000246
> >>>> ORIG_RAX: 0000000000000010
> >>>> kernel: [ 2176.125295] RAX: ffffffffffffffda RBX: 0000000000000003
> >>>> RCX: 00007f2e8eb119ef
> >>>> kernel: [ 2176.125300] RDX: 0000000000000000 RSI: 0000000000009408
> >>>> RDI: 0000000000000003
> >>>> kernel: [ 2176.125303] RBP: 0000000000000007 R08: 0000000000000000
> >>>> R09: 0000000000000000
> >>>> kernel: [ 2176.125306] R10: 0000000000000000 R11: 0000000000000246
> >>>> R12: 00007f2e8ebf642c
> >>>> kernel: [ 2176.125310] R13: 0000000000000001 R14: 000055cdb7940578
> >>>> R15: 0000000000000000
> >>>> kernel: [ 2176.125318] </TASK>
> >>>> kernel: [ 2296.956781] INFO: task btrfs:7783 blocked for more
> than 241 seconds.
> >>>> kernel: [ 2296.956824] Tainted: G W O
> >>>> 6.2.0-23-generic #23+btrdebug2c
> >>>> kernel: [ 2296.956856] "echo 0 >
> >>>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>>> kernel: [ 2296.956887] task:btrfs state:D stack:0
> >>>> pid:7783 ppid:7782 flags:0x00004002
> >>>> kernel: [ 2296.956898] Call Trace:
> >>>> kernel: [ 2296.956902] <TASK>
> >>>> kernel: [ 2296.956908] __schedule+0x2aa/0x610
> >>>> kernel: [ 2296.956921] schedule+0x63/0x110
> >>>> kernel: [ 2296.956928] btrfs_commit_transaction+0x9b7/0xbc0
> [btrfs]
> >>>> kernel: [ 2296.957069] ? __pfx_autoremove_wake_function+0x10/0x10
> >>>> kernel: [ 2296.957080] btrfs_sync_fs+0x5a/0x1b0 [btrfs]
> >>>> kernel: [ 2296.957200] btrfs_ioctl+0x643/0x14d0 [btrfs]
> >>>> kernel: [ 2296.957366] __x64_sys_ioctl+0xa0/0xe0
> >>>> kernel: [ 2296.957375] do_syscall_64+0x5b/0x90
> >>>> kernel: [ 2296.957383] ? do_sys_openat2+0xab/0x180
> >>>> kernel: [ 2296.957391] ? exit_to_user_mode_prepare+0x30/0xb0
> >>>> kernel: [ 2296.957399] ? syscall_exit_to_user_mode+0x29/0x50
> >>>> kernel: [ 2296.957407] ? do_syscall_64+0x67/0x90
> >>>> kernel: [ 2296.957414] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> >>>> kernel: [ 2296.957420] RIP: 0033:0x7f2e8eb119ef
> >>>> kernel: [ 2296.957426] RSP: 002b:00007ffd632b6aa0 EFLAGS: 00000246
> >>>> ORIG_RAX: 0000000000000010
> >>>> kernel: [ 2296.957433] RAX: ffffffffffffffda RBX: 0000000000000003
> >>>> RCX: 00007f2e8eb119ef
> >>>> kernel: [ 2296.957438] RDX: 0000000000000000 RSI: 0000000000009408
> >>>> RDI: 0000000000000003
> >>>> kernel: [ 2296.957441] RBP: 0000000000000007 R08: 0000000000000000
> >>>> R09: 0000000000000000
> >>>> kernel: [ 2296.957444] R10: 0000000000000000 R11: 0000000000000246
> >>>> R12: 00007f2e8ebf642c
> >>>> kernel: [ 2296.957448] R13: 0000000000000001 R14: 000055cdb7940578
> >>>> R15: 0000000000000000
> >>>> kernel: [ 2296.957468] </TASK>
> >>>> kernel: [ 2314.043258] ------------[ cut here ]------------
> >>>> kernel: [ 2314.043264] BTRFS: Transaction aborted (error -28)
> >>>> kernel: [ 2314.043334] WARNING: CPU: 2 PID: 7739 at
> >>>> fs/btrfs/extent-tree.c:2847 do_free_extent_accounting+0x21a/0x220
> >>>> [btrfs]
> >>>> kernel: [ 2314.043467] Modules linked in: ipmi_devintf
> ipmi_msghandler
> >>>> overlay iwlwifi_compat(O) binfmt_misc nls_iso8859_1 intel_rapl_msr
> >>>> snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio
> >>>> intel_rapl_common snd_hda_codec_hdmi edac_mce_amd snd_hda_intel
> >>>> snd_intel_dspcfg kvm_amd snd_intel_sdw_acpi snd_hda_codec kvm
> >>>> snd_hda_core snd_hwdep snd_pcm snd_timer irqbypass rapl
> wmi_bmof snd
> >>>> k10temp ccp soundcore input_leds mac_hid dm_multipath scsi_dh_rdac
> >>>> scsi_dh_emc scsi_dh_alua bonding tls msr nfsd efi_pstore
> auth_rpcgss
> >>>> nfs_acl lockd grace sunrpc dmi_sysfs ip_tables x_tables
> autofs4 btrfs
> >>>> blake2b_generic raid10 raid456 async_raid6_recov async_memcpy
> async_pq
> >>>> async_xor async_tx xor raid6_pq libcrc32c raid1 raid0
> multipath linear
> >>>> amdgpu iommu_v2 drm_buddy gpu_sched drm_ttm_helper hid_generic ttm
> >>>> drm_display_helper cec uas rc_core usbhid hid drm_kms_helper
> >>>> crct10dif_pclmul syscopyarea usb_storage crc32_pclmul
> polyval_clmulni
> >>>> sysfillrect polyval_generic sysimgblt nvme ghash_clmulni_intel
> >>>> sha512_ssse3
> >>>> kernel: [ 2314.043599] nvme_core aesni_intel crypto_simd
> mpt3sas drm
> >>>> cryptd raid_class ahci i2c_piix4 scsi_transport_sas
> nvme_common igb
> >>>> xhci_pci qlcnic dca xhci_pci_renesas libahci i2c_algo_bit
> video wmi
> >>>> kernel: [ 2314.043631] CPU: 2 PID: 7739 Comm: btrfs-transacti
> Tainted:
> >>>> G W O 6.2.0-23-generic #23+btrdebug2c
> >>>> kernel: [ 2314.043638] Hardware name: To Be Filled By O.E.M. X570M
> >>>> Pro4/X570M Pro4, BIOS P3.70 02/23/2022
> >>>> kernel: [ 2314.043641] RIP:
> 0010:do_free_extent_accounting+0x21a/0x220 [btrfs]
> >>>> kernel: [ 2314.043766] Code: ce 0f 0b eb b8 44 89 e6 48 c7 c7
> a8 39 a0
> >>>> c1 e8 2c d5 1e ce 0f 0b e9 78 ff ff ff 44 89 e6 48 c7 c7 a8 39
> a0 c1
> >>>> e8 16 d5 1e ce <0f> 0b eb b9 66 90 90 90 90 90 90 90 90 90 90
> 90 90 90
> >>>> 90 90 90 90
> >>>> kernel: [ 2314.043771] RSP: 0018:ffffad0b11b7bb38 EFLAGS: 00010246
> >>>> kernel: [ 2314.043777] RAX: 0000000000000000 RBX: ffff9c80e40e8f08
> >>>> RCX: 0000000000000000
> >>>> kernel: [ 2314.043781] RDX: 0000000000000000 RSI: 0000000000000000
> >>>> RDI: 0000000000000000
> >>>> kernel: [ 2314.043784] RBP: ffffad0b11b7bb60 R08: 0000000000000000
> >>>> R09: 0000000000000000
> >>>> kernel: [ 2314.043787] R10: 0000000000000000 R11: 0000000000000000
> >>>> R12: 00000000ffffffe4
> >>>> kernel: [ 2314.043790] R13: 00005e4c359ba000 R14: 0000000000020000
> >>>> R15: ffff9c824d9a58c0
> >>>> kernel: [ 2314.043794] FS: 0000000000000000(0000)
> >>>> GS:ffff9c87a0a80000(0000) knlGS:0000000000000000
> >>>> kernel: [ 2314.043798] CS: 0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> >>>> kernel: [ 2314.043802] CR2: 00007f54adc86000 CR3: 00000001471d8000
> >>>> CR4: 00000000003506e0
> >>>> kernel: [ 2314.043806] Call Trace:
> >>>> kernel: [ 2314.043809] <TASK>
> >>>> kernel: [ 2314.043815] __btrfs_free_extent+0x6bc/0xf50 [btrfs]
> >>>> kernel: [ 2314.043943] run_delayed_data_ref+0x8b/0x180 [btrfs]
> >>>> kernel: [ 2314.044068]
> btrfs_run_delayed_refs_for_head+0x196/0x520 [btrfs]
> >>>> kernel: [ 2314.044192] __btrfs_run_delayed_refs+0xe6/0x1d0
> [btrfs]
> >>>> kernel: [ 2314.044316] btrfs_run_delayed_refs+0x6d/0x1f0 [btrfs]
> >>>> kernel: [ 2314.044439]
> btrfs_start_dirty_block_groups+0x36b/0x530 [btrfs]
> >>>> kernel: [ 2314.044598] btrfs_commit_transaction+0xb3/0xbc0
> [btrfs]
> >>>> kernel: [ 2314.044754] ? start_transaction+0xc8/0x600 [btrfs]
> >>>> kernel: [ 2314.044890] transaction_kthread+0x14b/0x1c0 [btrfs]
> >>>> kernel: [ 2314.045021] ? __pfx_transaction_kthread+0x10/0x10
> [btrfs]
> >>>> kernel: [ 2314.045151] kthread+0xe9/0x110
> >>>> kernel: [ 2314.045162] ? __pfx_kthread+0x10/0x10
> >>>> kernel: [ 2314.045170] ret_from_fork+0x2c/0x50
> >>>> kernel: [ 2314.045180] </TASK>
> >>>> kernel: [ 2314.045182] ---[ end trace 0000000000000000 ]---
> >>>> kernel: [ 2314.045186] BTRFS info (device sdc: state A):
> dumping space info:
> >>>> kernel: [ 2314.045191] BTRFS info (device sdc: state A):
> space_info
> >>>> DATA has 160777674752 free, is not full
> >>>> kernel: [ 2314.045197] BTRFS info (device sdc: state A):
> space_info
> >>>> total=71201958395904, used=71013439856640, pinned=27737325568,
> >>>> reserved=0, may_use=0, readonly=3538944 zone_unusable=0
> >>>> kernel: [ 2314.045205] BTRFS info (device sdc: state A):
> space_info
> >>>> METADATA has -429047808 free, is full
> >>>
> >>> This means we need 500+ MiB of metadata space.
> >>>
> >>> Thus you may want to try 4x1GiB to see if this makes any difference.
> >>>
> >>> Thanks,
> >>> Qu
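
As a rough cross-check of the metadata figure discussed here, the METADATA counters printed in this dump can be combined by hand. The sketch below assumes that the "free" value in the dump is simply total minus every other counter (an assumption that happens to reproduce the logged -429047808), and that the unfilled part of global_block_rsv would be needed on top of that deficit. The function name is illustrative only and is not a kernel API.

```python
# Values copied from the METADATA space_info and global_block_rsv lines
# of the dump in this mail (all in bytes).
total = 83634421760
used = 82789777408
pinned = 244891648
reserved = 599687168
may_use = 429047808
readonly = 65536

global_rsv_size = 536870912
global_rsv_reserved = 428523520

MIB = 2**20


def space_info_free(total, used, pinned, reserved, may_use, readonly):
    """Assumed accounting: free = total minus all other counters."""
    return total - used - pinned - reserved - may_use - readonly


free = space_info_free(total, used, pinned, reserved, may_use, readonly)
deficit_mib = -free / MIB

# Part of the global reserve that has not been filled yet.
unfilled_rsv_mib = (global_rsv_size - global_rsv_reserved) / MIB

print(f"METADATA free: {free} bytes")
print(f"deficit: {deficit_mib:.2f} MiB")
print(f"deficit + unfilled global reserve: {deficit_mib + unfilled_rsv_mib:.2f} MiB")
```

Under these assumptions the deficit alone is roughly 409 MiB, and adding the unfilled global reserve brings it to about 512 MiB, which is one possible reading of the 500+ MiB estimate. It would also be consistent with 4x100MB loop devices being too small.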
> >>>> kernel: [ 2314.045209] BTRFS info (device sdc: state A):
> space_info
> >>>> total=83634421760, used=82789777408, pinned=244891648,
> >>>> reserved=599687168, may_use=429047808, readonly=65536
> zone_unusable=0
> >>>> kernel: [ 2314.045217] BTRFS info (device sdc: state A):
> space_info
> >>>> SYSTEM has 33390592 free, is not full
> >>>> kernel: [ 2314.045221] BTRFS info (device sdc: state A):
> space_info
> >>>> total=38797312, used=5373952, pinned=16384, reserved=16384,
> may_use=0,
> >>>> readonly=0 zone_unusable=0
> >>>> kernel: [ 2314.045227] BTRFS info (device sdc: state A):
> >>>> global_block_rsv: size 536870912 reserved 428523520
> >>>> kernel: [ 2314.045231] BTRFS info (device sdc: state A):
> >>>> trans_block_rsv: size 524288 reserved 524288
> >>>> kernel: [ 2314.045235] BTRFS info (device sdc: state A):
> >>>> chunk_block_rsv: size 0 reserved 0
> >>>> kernel: [ 2314.045239] BTRFS info (device sdc: state A):
> >>>> delayed_block_rsv: size 0 reserved 0
> >>>> kernel: [ 2314.045242] BTRFS info (device sdc: state A):
> >>>> delayed_refs_rsv: size 249756909568 reserved 0
> >>>> kernel: [ 2314.045251] BTRFS: error (device sdc: state A) in
> >>>> do_free_extent_accounting:2847: errno=-28 No space left
> >>>> kernel: [ 2314.045265] BTRFS warning (device sdc: state A):
> >>>> btrfs_uuid_scan_kthread failed -28
> >>>> kernel: [ 2314.045295] BTRFS info (device sdc: state EA):
> forced readonly
> >>>> kernel: [ 2314.045300] BTRFS error (device sdc: state EA):
> failed to
> >>>> run delayed ref for logical 103681409916928 num_bytes 131072
> type 184
> >>>> action 2 ref_mod 1: -28
> >>>> kernel: [ 2314.045360] BTRFS: error (device sdc: state EA) in
> >>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
> >>>> kernel: [ 2314.049204] BTRFS: error (device sdc: state EA) in
> >>>> btrfs_create_pending_block_groups:2487: errno=-28 No space left
> >>>> kernel: [ 2314.049331] BTRFS: error (device sdc: state EA) in
> >>>> btrfs_create_pending_block_groups:2499: errno=-28 No space left
> >>>> kernel: [ 2314.053259] BTRFS: error (device sdc: state EA) in
> >>>> do_free_extent_accounting:2847: errno=-28 No space left
> >>>> kernel: [ 2314.053318] BTRFS error (device sdc: state EA):
> failed to
> >>>> run delayed ref for logical 103681419366400 num_bytes 131072
> type 184
> >>>> action 2 ref_mod 1: -28
> >>>> kernel: [ 2314.053375] BTRFS: error (device sdc: state EA) in
> >>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
> >>>> kernel: [ 2314.053430] BTRFS warning (device sdc: state EA):
> Skipping
> >>>> commit of aborted transaction.
> >>>> kernel: [ 2314.053435] BTRFS: error (device sdc: state EA) in
> >>>> cleanup_transaction:1986: errno=-28 No space left
> >>>>
> >>>>
> >>>>
> >>>> On Fri, 23 Jun 2023 at 19:16, Qu Wenruo <wqu@suse.com> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 2023/6/23 17:00, Stefan N wrote:
> >>>>>> Apologies, I thought I included the log output too, though I
> can't see
> >>>>>> any additional output
> >>>>>>
> >>>>>> From a fresh run, still using the same kernel
> >>>>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data ;
> sudo btrfs
> >>>>>> dev add -f /dev/sdl /dev/sdm /dev/sdn /dev/sdo /mnt/data ;
> sudo btrfs
> >>>>>> fi sync /mnt/data
> >>>>>> ERROR: error adding device '/dev/sdl': Input/output error
> >>>>>> ERROR: error adding device '/dev/sdm': Read-only file system
> >>>>>> ERROR: error adding device '/dev/sdn': Read-only file system
> >>>>>> ERROR: error adding device '/dev/sdo': Read-only file system
> >>>>>> ERROR: Could not sync filesystem: Read-only file system
> >>>>>> $
> >>>>>>
> >>>>>> Output from kern.log, syslog or dmesg -k
> >>>>>>
> >>>>> [...]
> >>>>>
> >>>>> None of the newly added debug lines triggered, so something else is
> >>>>> causing the problem.
> >>>>>
> >>>>> Furthermore, the backtrace is not that helpful; it only shows that
> >>>>> some async metadata reclaim kthread is causing the problem.
> >>>>>
> >>>>> My guess is that the async metadata reclaim is triggered by the
> >>>>> btrfs_start_transaction() call when adding a device.
> >>>>> So I updated my GitHub branch to use btrfs_join_transaction(), which
> >>>>> would not flush any metadata and thus avoids the problem.
> >>>>>
> >>>>> Would you please give it a try again?
> >>>>>
> >>>>>>
> >>>>>> However, now that I've started digging into the logs to check I
> >>>>>> hadn't missed where the errors were being logged, I've found this
> >>>>>> from roughly a week before I started having issues, which I had not
> >>>>>> previously noticed
> >>>>>
> >>>>> You don't need to bother with most error messages after the fs has
> >>>>> flipped RO, as they are known to include some false alerts.
> >>>>>
> >>>>> Thanks,
> >>>>> Qu
> >>>>>
> >>>>>> [ 1990.495861] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 107988943355904 num_bytes 245760 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 1990.518282] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 107989043494912 num_bytes 245760 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 620.104065] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 123187655077888 num_bytes 176128 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 620.126209] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 123190279929856 num_bytes 134217728 type 184 action
> 2 ref_mod
> >>>>>> 1: -28
> >>>>>> [ 620.126241] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 123189970468864 num_bytes 134217728 type 184 action
> 2 ref_mod
> >>>>>> 1: -28
> >>>>>> [ 620.126271] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 123190414409728 num_bytes 134217728 type 184 action
> 2 ref_mod
> >>>>>> 1: -28
> >>>>>> [ 476.565308] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101906434228224 num_bytes 651264 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 476.565932] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101906434031616 num_bytes 180224 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 447.371754] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101946151927808 num_bytes 262144 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 447.372362] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101946083725312 num_bytes 245760 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 439.839007] BTRFS error (device sdj): failed to run
> delayed ref for
> >>>>>> logical 101923102179328 num_bytes 192512 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 439.839578] BTRFS error (device sdj): failed to run
> delayed ref for
> >>>>>> logical 101923401629696 num_bytes 245760 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 466.393884] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101981116137472 num_bytes 245760 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 466.394451] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101981122854912 num_bytes 1720320 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 431.541367] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101876426952704 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 431.542010] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101876427780096 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 597.487948] BTRFS error (device sdj): failed to run
> delayed ref for
> >>>>>> logical 108127459409920 num_bytes 196608 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 597.488539] BTRFS error (device sdj): failed to run
> delayed ref for
> >>>>>> logical 108124677865472 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 534.717509] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101958618710016 num_bytes 1597440 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 534.718494] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 101958756335616 num_bytes 368640 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 508.089394] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 101911627694080 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 508.090007] BTRFS error (device sdk): failed to run
> delayed ref for
> >>>>>> logical 101911627415552 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 1632.112084] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 102203759886336 num_bytes 229376 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>> [ 1632.112885] BTRFS error (device sdh): failed to run
> delayed ref for
> >>>>>> logical 102203764379648 num_bytes 126976 type 184 action 2
> ref_mod 1:
> >>>>>> -28
> >>>>>>
> >>>>>> and today, when leaving the disks mounted read-only for a while, I
> >>>>>> found many occurrences similar to:
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 1 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 2 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 3 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 4 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 1 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 2 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201329754554368 mirror 3 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201350830227456 mirror 4 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201350830227456 mirror 1 wanted 2 found 0
> >>>>>> BTRFS error (device sdc: state EA): level verify failed on
> logical
> >>>>>> 201350830227456 mirror 2 wanted 2 found 0
> >>>>>>
> >>>>>> On Fri, 23 Jun 2023 at 10:27, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 2023/6/23 06:18, Stefan N wrote:
> >>>>>>>> Hi Qu,
> >>>>>>>>
> >>>>>>>> I got one new line this time, but it doesn't seem to match
> your commit
> >>>>>>>> ERROR: zoned: unable to stat /dev/loop/13
> >>>>>>>
> >>>>>>> Please provide the dmesg of that attempt, as all the extra
> debug info is
> >>>>>>> inside dmesg.
> >>>>>>>
> >>>>>>> With that info provided, we can determine what to do next.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Qu
> >>>>>>>
> >>>>>>>>
> >>>>>>>> I tried it on the USB flash drives too and didn't get any
> extra line
> >>>>>>>>
> >>>>>>>> In context
> >>>>>>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data ;
> sudo btrfs
> >>>>>>>> dev add -K -f /dev/loop12 /dev/loop/13 /dev/loop14 /dev/loop15
> >>>>>>>> /mnt/data ; sudo btrfs fi sync /mnt/data
> >>>>>>>> ERROR: error adding device '/dev/loop12': Input/output error
> >>>>>>>> ERROR: zoned: unable to stat /dev/loop/13
> >>>>>>>> ERROR: checking status of /dev/loop/13: No such file or
> directory
> >>>>>>>> ERROR: error adding device '/dev/loop14': Read-only file
> system
> >>>>>>>> ERROR: error adding device '/dev/loop15': Read-only file
> system
> >>>>>>>> ERROR: Could not sync filesystem: Read-only file system
> >>>>>>>> $
> >>>>>>>>
> >>>>>>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data ;
> sudo btrfs
> >>>>>>>> dev add -f /dev/sdl /dev/sdm /dev/sdn /dev/sdo /mnt/data ;
> sudo btrfs
> >>>>>>>> fi sync /mnt/data
> >>>>>>>> ERROR: error adding device '/dev/sdl': Input/output error
> >>>>>>>> ERROR: error adding device '/dev/sdm': Read-only file system
> >>>>>>>> ERROR: error adding device '/dev/sdn': Read-only file system
> >>>>>>>> ERROR: error adding device '/dev/sdo': Read-only file system
> >>>>>>>> ERROR: Could not sync filesystem: Read-only file system
> >>>>>>>> $
> >>>>>>>>
> >>>>>>>> On Thu, 22 Jun 2023 at 18:48, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 2023/6/22 16:33, Stefan N wrote:
> >>>>>>>>>> Hi Qu,
> >>>>>>>>>>
> >>>>>>>>>> Many thanks for the detailed instructions and your patience. I got it
> >>>>>>>>>> working combined with
> >>>>>>>>>> https://wiki.ubuntu.com/Kernel/BuildYourOwnKernel on the main system
> >>>>>>>>>> OS instead, tagged +btrfix
> >>>>>>>>>> $ uname -vr
> >>>>>>>>>> 6.2.0-23-generic #23+btrfix SMP PREEMPT_DYNAMIC Thu Jun 22
> >>>>>>>>>>
> >>>>>>>>>> However, I've not had luck with the commands suggested,
> and would
> >>>>>>>>>> appreciate any further ideas.
> >>>>>>>>>>
> >>>>>>>>>> Outputs follow below, with /mnt/data as the btrfs mount
> point that
> >>>>>>>>>> currently contains 8x disks sd[a-j] with an additional
> 4x 64gb USB
> >>>>>>>>>> flash drives being added sd[l-o]
> >>>>>>>>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data
> ; sudo btrfs
> >>>>>>>>>> dev add -f /dev/sdl /dev/sdm /dev/sdn /dev/sdo /mnt/data
> ; sudo btrfs
> >>>>>>>>>> fi sync /mnt/data
> >>>>>>>>>> ERROR: error adding device '/dev/sdl': Input/output error
> >>>>>>>>>> ERROR: error adding device '/dev/sdm': Read-only file system
> >>>>>>>>>> ERROR: error adding device '/dev/sdn': Read-only file system
> >>>>>>>>>> ERROR: error adding device '/dev/sdo': Read-only file system
> >>>>>>>>>> ERROR: Could not sync filesystem: Read-only file system
> >>>>>>>>>> $
> >>>>>>>>>>
> >>>>>>>>>> The same occurs if I try to add 4x 100MB loop devices (on an SSD, so
> >>>>>>>>>> they're super quick to zero):
> >>>>>>>>>> $ sudo mount -o skip_balance -t btrfs /dev/sde /mnt/data
> ; sudo btrfs
> >>>>>>>>>> dev add -K -f /dev/loop16 /dev/loop17 /dev/loop18
> /dev/loop19
> >>>>>>>>>> /mnt/data ; sudo btrfs fi sync /mnt/data
> >>>>>>>>>> ERROR: error adding device '/dev/loop16': Input/output error
> >>>>>>>>>
> >>>>>>>>> This is the interesting part: it means we're erroring out due to -EIO
> >>>>>>>>> (not -ENOSPC) during the first device add.
> >>>>>>>>>
> >>>>>>>>> And somehow, after the first device add, we have already got the
> >>>>>>>>> transaction abort.
> >>>>>>>>>
> >>>>>>>>> Would you please try the following branch?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> https://github.com/adam900710/linux/tree/dev_add_no_commit
> >>>>>>>>>
> >>>>>>>>> It has not only the patch to skip the commit, but also extra debug
> >>>>>>>>> output for the situation.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Qu
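
For readers matching the user-space messages in this thread against the numeric codes in the kernel log, here is a small sketch that maps the three errno values involved. It assumes a Linux host, where EIO is 5, ENOSPC is 28 and EROFS is 30; the message text comes from the local C library and may differ on other platforms.

```python
import errno
import os

# The kernel log reports negative errno values (for example "error -28"),
# while the btrfs CLI prints the corresponding strerror() text.
KERNEL_CODES = (-errno.EIO, -errno.ENOSPC, -errno.EROFS)


def describe(kernel_code):
    """Return (symbolic name, message) for a negative kernel error code."""
    positive = -kernel_code
    return errno.errorcode[positive], os.strerror(positive)


for code in KERNEL_CODES:
    name, message = describe(code)
    print(f"{code}: {name} ({message})")
```

With this mapping, the "Input/output error" on the first device corresponds to EIO, the "Read-only file system" errors that follow correspond to EROFS, and the "error -28" in the kernel log is ENOSPC.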
> >>>>>>>>>
> >>>>>>>>>> ERROR: error adding device '/dev/loop17': Read-only file
> system
> >>>>>>>>>> ERROR: error adding device '/dev/loop18': Read-only file
> system
> >>>>>>>>>> ERROR: error adding device '/dev/loop19': Read-only file
> system
> >>>>>>>>>> ERROR: Could not sync filesystem: Read-only file system
> >>>>>>>>>> $
> >>>>>>>>>>
> >>>>>>>>>> I confirmed before both these kernel builds that the replaced line was
> >>>>>>>>>> btrfs_end_transaction rather than btrfs_commit_transaction (for anyone
> >>>>>>>>>> else following, I needed to remove the -n in the patch command
> >>>>>>>>>> earlier)
> >>>>>>>>>> $ grep -A3 -ri btrfs_sysfs_update_sprout
> */fs/btrfs/volumes.c*
> >>>>>>>>>> linux-6.2.0-dist/fs/btrfs/volumes.c:
> >>>>>>>>>> btrfs_sysfs_update_sprout_fsid(fs_devices);
> >>>>>>>>>> linux-6.2.0-dist/fs/btrfs/volumes.c- }
> >>>>>>>>>> linux-6.2.0-dist/fs/btrfs/volumes.c-
> >>>>>>>>>> linux-6.2.0-dist/fs/btrfs/volumes.c- ret =
> btrfs_commit_transaction(trans);
> >>>>>>>>>> --
> >>>>>>>>>> linux-6.2.0-v2/fs/btrfs/volumes.c:
> >>>>>>>>>> btrfs_sysfs_update_sprout_fsid(fs_devices);
> >>>>>>>>>> linux-6.2.0-v2/fs/btrfs/volumes.c- }
> >>>>>>>>>> linux-6.2.0-v2/fs/btrfs/volumes.c-
> >>>>>>>>>> linux-6.2.0-v2/fs/btrfs/volumes.c- ret =
> btrfs_end_transaction(trans);
> >>>>>>>>>> --
> >>>>>>>>>> linux-6.2.0-v3/fs/btrfs/volumes.c:
> >>>>>>>>>> btrfs_sysfs_update_sprout_fsid(fs_devices);
> >>>>>>>>>> linux-6.2.0-v3/fs/btrfs/volumes.c- }
> >>>>>>>>>> linux-6.2.0-v3/fs/btrfs/volumes.c-
> >>>>>>>>>> linux-6.2.0-v3/fs/btrfs/volumes.c- ret =
> btrfs_end_transaction(trans);
> >>>>>>>>>> $
> >>>>>>>>>>
> >>>>>>>>>> $ btrfs fi usage /mnt/data
> >>>>>>>>>> Overall:
> >>>>>>>>>> Device size: 87.31TiB
> >>>>>>>>>> Device allocated: 87.31TiB
> >>>>>>>>>> Device unallocated: 1.94GiB
> >>>>>>>>>> Device missing: 0.00B
> >>>>>>>>>> Device slack: 0.00B
> >>>>>>>>>> Used: 87.08TiB
> >>>>>>>>>> Free (estimated): 173.29GiB
> (min: 172.33GiB)
> >>>>>>>>>> Free (statfs, df): 171.84GiB
> >>>>>>>>>> Data ratio: 1.34
> >>>>>>>>>> Metadata ratio: 4.00
> >>>>>>>>>> Global reserve: 512.00MiB
> (used: 371.25MiB)
> >>>>>>>>>> Multiple profiles: no
> >>>>>>>>>>
> >>>>>>>>>> Data,RAID6: Size:64.76TiB, Used:64.59TiB (99.74%)
> >>>>>>>>>> /dev/sdc 10.90TiB
> >>>>>>>>>> /dev/sdf 10.90TiB
> >>>>>>>>>> /dev/sda 10.86TiB
> >>>>>>>>>> /dev/sdg 10.87TiB
> >>>>>>>>>> /dev/sdh 10.86TiB
> >>>>>>>>>> /dev/sdd 10.87TiB
> >>>>>>>>>> /dev/sde 10.88TiB
> >>>>>>>>>> /dev/sdb 10.88TiB
> >>>>>>>>>>
> >>>>>>>>>> Metadata,RAID1C4: Size:77.79GiB, Used:77.11GiB (99.12%)
> >>>>>>>>>> /dev/sdc 15.33GiB
> >>>>>>>>>> /dev/sdf 18.41GiB
> >>>>>>>>>> /dev/sda 49.63GiB
> >>>>>>>>>> /dev/sdg 49.50GiB
> >>>>>>>>>> /dev/sdh 51.52GiB
> >>>>>>>>>> /dev/sdd 48.70GiB
> >>>>>>>>>> /dev/sde 39.09GiB
> >>>>>>>>>> /dev/sdb 39.01GiB
> >>>>>>>>>>
> >>>>>>>>>> System,RAID1C4: Size:37.00MiB, Used:5.11MiB (13.81%)
> >>>>>>>>>> /dev/sdc 1.00MiB
> >>>>>>>>>> /dev/sda 37.00MiB
> >>>>>>>>>> /dev/sdg 37.00MiB
> >>>>>>>>>> /dev/sdh 36.00MiB
> >>>>>>>>>> /dev/sdd 37.00MiB
> >>>>>>>>>>
> >>>>>>>>>> Unallocated:
> >>>>>>>>>> /dev/sdc 1.00MiB
> >>>>>>>>>> /dev/sdf 1.00MiB
> >>>>>>>>>> /dev/sda 1.27GiB
> >>>>>>>>>> /dev/sdg 1.00MiB
> >>>>>>>>>> /dev/sdh 1.00MiB
> >>>>>>>>>> /dev/sdd 687.00MiB
> >>>>>>>>>> /dev/sde 1.00MiB
> >>>>>>>>>> /dev/sdb 1.00MiB
> >>>>>>>>>> $
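The "Data ratio" and "Metadata ratio" figures in the usage output above can be cross-checked against the per-device lists. This is only a rough sketch: it assumes the ratio is simply the sum of the raw per-device allocations divided by the logical "Size" of the profile, and it works from the rounded values as printed, so the last digit may drift.

```python
# Per-device allocations copied from the `btrfs fi usage` output above
# (data in TiB, metadata in GiB), in the order they are listed.
data_per_dev_tib = [10.90, 10.90, 10.86, 10.87, 10.86, 10.87, 10.88, 10.88]
meta_per_dev_gib = [15.33, 18.41, 49.63, 49.50, 51.52, 48.70, 39.09, 39.01]

# Logical sizes from the "Data,RAID6" and "Metadata,RAID1C4" header lines.
data_size_tib = 64.76
meta_size_gib = 77.79


def ratio(per_device, logical_size):
    """Assumed definition: raw space allocated on disk / logical size."""
    return sum(per_device) / logical_size


data_ratio = ratio(data_per_dev_tib, data_size_tib)
meta_ratio = ratio(meta_per_dev_gib, meta_size_gib)

# Raw allocation across both profiles, in TiB (System chunks are negligible).
allocated_tib = sum(data_per_dev_tib) + sum(meta_per_dev_gib) / 1024

print(f"data ratio     ~ {data_ratio:.2f}")
print(f"metadata ratio ~ {meta_ratio:.2f}")
print(f"allocated      ~ {allocated_tib:.2f} TiB")
```

Under that assumption the results land on the reported 1.34 and 4.00, and the summed allocation comes out close to the reported "Device allocated" of 87.31TiB, which is essentially the whole 87.31TiB device size.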
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> This first attempt generated the following syslog output:
> >>>>>>>>>> kernel: [ 868.435387] BTRFS info (device sde): using crc32c
> >>>>>>>>>> (crc32c-intel) checksum algorithm
> >>>>>>>>>> kernel: [ 868.435407] BTRFS info (device sde): disk
> space caching is enabled
> >>>>>>>>>> kernel: [ 874.477712] BTRFS info (device sde): bdev
> /dev/sdg errs: wr
> >>>>>>>>>> 0, rd 0, flush 0, corrupt 845, gen 0
> >>>>>>>>>> kernel: [ 874.477727] BTRFS info (device sde): bdev
> /dev/sdc errs: wr
> >>>>>>>>>> 41089, rd 1556, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 874.477735] BTRFS info (device sde): bdev
> /dev/sdj errs: wr
> >>>>>>>>>> 3, rd 7, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 874.477740] BTRFS info (device sde): bdev
> /dev/sdf errs: wr
> >>>>>>>>>> 41, rd 0, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 1082.645551] BTRFS info (device sde): balance:
> resume skipped
> >>>>>>>>>> kernel: [ 1082.645564] BTRFS info (device sde): checking
> UUID tree
> >>>>>>>>>> kernel: [ 1267.280506] BTRFS: Transaction aborted (error
> -28)
> >>>>>>>>>> kernel: [ 1267.280553] BTRFS: error (device sde: state A) in
> >>>>>>>>>> do_free_extent_accounting:2847: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1267.280604] BTRFS info (device sde: state
> EA): forced readonly
> >>>>>>>>>> kernel: [ 1267.280610] BTRFS error (device sde: state
> EA): failed to
> >>>>>>>>>> run delayed ref for logical 102255404044288 num_bytes
> 294912 type 184
> >>>>>>>>>> action 2 ref_mod 1: -28
> >>>>>>>>>> kernel: [ 1267.280584] WARNING: CPU: 3 PID: 14519 at
> >>>>>>>>>> fs/btrfs/extent-tree.c:2847
> do_free_extent_accounting+0x21a/0x220
> >>>>>>>>>> [btrfs]
> >>>>>>>>>> kernel: [ 1267.280666] BTRFS: error (device sde: state
> EA) in
> >>>>>>>>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1267.280695] BTRFS warning (device sde: state EA):
> >>>>>>>>>> btrfs_uuid_scan_kthread failed -5
> >>>>>>>>>> kernel: [ 1267.280794] Modules linked in: xt_nat
> xt_tcpudp veth
> >>>>>>>>>> xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat
> nf_conntrack_netlink
> >>>>>>>>>> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user
> xfrm_algo
> >>>>>>>>>> xt_addrtype nft_compat nf_tables nfnetlink br_netfilter
> bridge stp llc
> >>>>>>>>>> ipmi_devintf ipmi_msghandler overlay iwlwifi_compat(O)
> binfmt_misc
> >>>>>>>>>> nls_iso8859_1 intel_rapl_msr intel_rapl_common edac_mce_amd
> >>>>>>>>>> snd_hda_codec_realtek kvm_amd snd_hda_codec_generic
> ledtrig_audio kvm
> >>>>>>>>>> snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg
> snd_intel_sdw_acpi
> >>>>>>>>>> snd_hda_codec irqbypass snd_hda_core snd_hwdep rapl
> snd_pcm snd_timer
> >>>>>>>>>> wmi_bmof k10temp snd ccp soundcore input_leds mac_hid
> dm_multipath
> >>>>>>>>>> scsi_dh_rdac scsi_dh_emc scsi_dh_alua bonding tls
> efi_pstore msr nfsd
> >>>>>>>>>> auth_rpcgss nfs_acl lockd grace sunrpc dmi_sysfs
> ip_tables x_tables
> >>>>>>>>>> autofs4 btrfs blake2b_generic raid10 raid456
> async_raid6_recov
> >>>>>>>>>> async_memcpy async_pq async_xor async_tx xor raid6_pq
> libcrc32c raid1
> >>>>>>>>>> raid0 multipath linear hid_generic usbhid hid amdgpu uas
> usb_storage
> >>>>>>>>>> kernel: [ 1267.280994] CPU: 3 PID: 14519 Comm:
> btrfs-transacti
> >>>>>>>>>> Tainted: G W O 6.2.0-23-generic #23+btrfix
> >>>>>>>>>> kernel: [ 1267.281005] RIP:
> 0010:do_free_extent_accounting+0x21a/0x220 [btrfs]
> >>>>>>>>>> kernel: [ 1267.281181] __btrfs_free_extent+0x6bc/0xf50
> [btrfs]
> >>>>>>>>>> kernel: [ 1267.281310] run_delayed_data_ref+0x8b/0x180
> [btrfs]
> >>>>>>>>>> kernel: [ 1267.281444]
> btrfs_run_delayed_refs_for_head+0x196/0x520 [btrfs]
> >>>>>>>>>> kernel: [ 1267.281570]
> __btrfs_run_delayed_refs+0xe6/0x1d0 [btrfs]
> >>>>>>>>>> kernel: [ 1267.281694]
> btrfs_run_delayed_refs+0x6d/0x1f0 [btrfs]
> >>>>>>>>>> kernel: [ 1267.281818]
> btrfs_start_dirty_block_groups+0x36b/0x530 [btrfs]
> >>>>>>>>>> kernel: [ 1267.281976]
> btrfs_commit_transaction+0xb3/0xbc0 [btrfs]
> >>>>>>>>>> kernel: [ 1267.282110] ? start_transaction+0xc8/0x600
> [btrfs]
> >>>>>>>>>> kernel: [ 1267.282244] transaction_kthread+0x14b/0x1c0
> [btrfs]
> >>>>>>>>>> kernel: [ 1267.282375] ?
> __pfx_transaction_kthread+0x10/0x10 [btrfs]
> >>>>>>>>>> kernel: [ 1267.282548] BTRFS info (device sde: state
> EA): dumping space info:
> >>>>>>>>>> kernel: [ 1267.282552] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> DATA has 160777674752 free, is not full
> >>>>>>>>>> kernel: [ 1267.282558] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> total=71201958395904, used=71018191273984,
> pinned=22985908224,
> >>>>>>>>>> reserved=0, may_use=0, readonly=3538944 zone_unusable=0
> >>>>>>>>>> kernel: [ 1267.282566] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> METADATA has -124944384 free, is full
> >>>>>>>>>> kernel: [ 1267.282571] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> total=83530612736, used=82791497728, pinned=242745344,
> >>>>>>>>>> reserved=496369664, may_use=124944384, readonly=0
> zone_unusable=0
> >>>>>>>>>> kernel: [ 1267.282577] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> SYSTEM has 33439744 free, is not full
> >>>>>>>>>> kernel: [ 1267.282582] BTRFS info (device sde: state
> EA): space_info
> >>>>>>>>>> total=38797312, used=5357568, pinned=0, reserved=0,
> may_use=0,
> >>>>>>>>>> readonly=0 zone_unusable=0
> >>>>>>>>>> kernel: [ 1267.282588] BTRFS info (device sde: state EA):
> >>>>>>>>>> global_block_rsv: size 536870912 reserved 124944384
> >>>>>>>>>> kernel: [ 1267.282592] BTRFS info (device sde: state EA):
> >>>>>>>>>> trans_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1267.282595] BTRFS info (device sde: state EA):
> >>>>>>>>>> chunk_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1267.282599] BTRFS info (device sde: state EA):
> >>>>>>>>>> delayed_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1267.282602] BTRFS info (device sde: state EA):
> >>>>>>>>>> delayed_refs_rsv: size 251322957824 reserved 0
> >>>>>>>>>> kernel: [ 1267.282608] BTRFS: error (device sde: state
> EA) in
> >>>>>>>>>> do_free_extent_accounting:2847: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1267.282653] BTRFS error (device sde: state
> EA): failed to
> >>>>>>>>>> run delayed ref for logical 102255401897984 num_bytes
> 126976 type 184
> >>>>>>>>>> action 2 ref_mod 1: -28
> >>>>>>>>>> kernel: [ 1267.282708] BTRFS: error (device sde: state
> EA) in
> >>>>>>>>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
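The "has N free" figures in the space info dump above can be reproduced from the fields printed on the following line. A minimal sketch, assuming the free value is simply total minus every other accounted field (this is inferred from the numbers in the log, not taken from the kernel source):

```python
# Field values copied from the first "dumping space info" block above.
DATA = {
    "total": 71201958395904, "used": 71018191273984, "pinned": 22985908224,
    "reserved": 0, "may_use": 0, "readonly": 3538944, "zone_unusable": 0,
}
METADATA = {
    "total": 83530612736, "used": 82791497728, "pinned": 242745344,
    "reserved": 496369664, "may_use": 124944384, "readonly": 0,
    "zone_unusable": 0,
}


def space_info_free(info):
    """Assumed formula: total minus all other accounted categories."""
    return info["total"] - sum(v for k, v in info.items() if k != "total")


print("DATA free:    ", space_info_free(DATA))
print("METADATA free:", space_info_free(METADATA))
```

With these inputs the METADATA result is negative by exactly the may_use value, which in this dump also equals the "reserved" figure of global_block_rsv. In other words, used + pinned + reserved already consume the entire metadata total, and the reservation sits on top of that.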
> >>>>>>>>>>
> >>>>>>>>>> A couple of kernel recompiles later, the second attempt on the SSD
> >>>>>>>>>> generated similar output:
> >>>>>>>>>> kernel: [ 1472.203470] BTRFS info (device sdc): using crc32c
> >>>>>>>>>> (crc32c-intel) checksum algorithm
> >>>>>>>>>> kernel: [ 1472.203491] BTRFS info (device sdc): disk
> space caching is enabled
> >>>>>>>>>> kernel: [ 1478.155004] BTRFS info (device sdc): bdev
> /dev/sdf errs: wr
> >>>>>>>>>> 0, rd 0, flush 0, corrupt 845, gen 0
> >>>>>>>>>> kernel: [ 1478.155022] BTRFS info (device sdc): bdev
> /dev/sda errs: wr
> >>>>>>>>>> 41089, rd 1556, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 1478.155034] BTRFS info (device sdc): bdev
> /dev/sdh errs: wr
> >>>>>>>>>> 3, rd 7, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 1478.155041] BTRFS info (device sdc): bdev
> /dev/sdd errs: wr
> >>>>>>>>>> 41, rd 0, flush 0, corrupt 0, gen 0
> >>>>>>>>>> kernel: [ 1696.662526] BTRFS info (device sdc): balance:
> resume skipped
> >>>>>>>>>> kernel: [ 1696.662537] BTRFS info (device sdc): checking
> UUID tree
> >>>>>>>>>> kernel: [ 1919.452464] BTRFS: Transaction aborted (error
> -28)
> >>>>>>>>>> kernel: [ 1919.452534] WARNING: CPU: 1 PID: 161 at
> >>>>>>>>>> fs/btrfs/extent-tree.c:2847
> do_free_extent_accounting+0x21a/0x220
> >>>>>>>>>> [btrfs]
> >>>>>>>>>> kernel: [ 1919.452655] Modules linked in: xt_nat
> xt_tcpudp veth
> >>>>>>>>>> xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat
> nf_conntrack_netlink
> >>>>>>>>>> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xfrm_user
> xfrm_algo
> >>>>>>>>>> xt_addrtype nft_compat nf_tables nfnetlink br_netfilter
> bridge stp llc
> >>>>>>>>>> ipmi_devintf ipmi_msghandler overlay iwlwifi_compat(O)
> binfmt_misc
> >>>>>>>>>> nls_iso8859_1 snd_hda_codec_realtek snd_hda_codec_generic
> >>>>>>>>>> ledtrig_audio snd_hda_codec_hdmi snd_hda_intel
> snd_intel_dspcfg
> >>>>>>>>>> snd_intel_sdw_acpi snd_hda_codec intel_rapl_msr snd_hda_core
> >>>>>>>>>> intel_rapl_common edac_mce_amd snd_hwdep kvm_amd snd_pcm
> snd_timer kvm
> >>>>>>>>>> irqbypass rapl wmi_bmof snd k10temp soundcore ccp
> input_leds mac_hid
> >>>>>>>>>> dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua
> bonding tls nfsd
> >>>>>>>>>> msr auth_rpcgss efi_pstore nfs_acl lockd grace sunrpc
> dmi_sysfs
> >>>>>>>>>> ip_tables x_tables autofs4 btrfs blake2b_generic raid10
> raid456
> >>>>>>>>>> async_raid6_recov async_memcpy async_pq async_xor
> async_tx xor
> >>>>>>>>>> raid6_pq libcrc32c raid1 raid0 multipath linear
> hid_generic usbhid
> >>>>>>>>>> amdgpu uas hid iommu_v2
> >>>>>>>>>> kernel: [ 1919.452839] Workqueue: events_unbound
> >>>>>>>>>> btrfs_async_reclaim_metadata_space [btrfs]
> >>>>>>>>>> kernel: [ 1919.452985] RIP:
> 0010:do_free_extent_accounting+0x21a/0x220 [btrfs]
> >>>>>>>>>> kernel: [ 1919.453141] __btrfs_free_extent+0x6bc/0xf50
> [btrfs]
> >>>>>>>>>> kernel: [ 1919.453256] run_delayed_data_ref+0x8b/0x180
> [btrfs]
> >>>>>>>>>> kernel: [ 1919.453368]
> btrfs_run_delayed_refs_for_head+0x196/0x520 [btrfs]
> >>>>>>>>>> kernel: [ 1919.453480]
> __btrfs_run_delayed_refs+0xe6/0x1d0 [btrfs]
> >>>>>>>>>> kernel: [ 1919.453592]
> btrfs_run_delayed_refs+0x6d/0x1f0 [btrfs]
> >>>>>>>>>> kernel: [ 1919.453703] flush_space+0x23c/0x2c0 [btrfs]
> >>>>>>>>>> kernel: [ 1919.453845]
> btrfs_async_reclaim_metadata_space+0x19b/0x2b0 [btrfs]
> >>>>>>>>>> kernel: [ 1919.454034] BTRFS info (device sdc: state A):
> dumping space info:
> >>>>>>>>>> kernel: [ 1919.454038] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> DATA has 160778723328 free, is not full
> >>>>>>>>>> kernel: [ 1919.454043] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> total=71201958395904, used=71017442181120,
> pinned=23733952512,
> >>>>>>>>>> reserved=0, may_use=0, readonly=3538944 zone_unusable=0
> >>>>>>>>>> kernel: [ 1919.454050] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> METADATA has -147570688 free, is full
> >>>>>>>>>> kernel: [ 1919.454054] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> total=83530612736, used=82792185856, pinned=238059520,
> >>>>>>>>>> reserved=500367360, may_use=147570688, readonly=0
> zone_unusable=0
> >>>>>>>>>> kernel: [ 1919.454060] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> SYSTEM has 33439744 free, is not full
> >>>>>>>>>> kernel: [ 1919.454064] BTRFS info (device sdc: state A):
> space_info
> >>>>>>>>>> total=38797312, used=5357568, pinned=0, reserved=0,
> may_use=0,
> >>>>>>>>>> readonly=0 zone_unusable=0
> >>>>>>>>>> kernel: [ 1919.454070] BTRFS info (device sdc: state A):
> >>>>>>>>>> global_block_rsv: size 536870912 reserved 147570688
> >>>>>>>>>> kernel: [ 1919.454074] BTRFS info (device sdc: state A):
> >>>>>>>>>> trans_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1919.454077] BTRFS info (device sdc: state A):
> >>>>>>>>>> chunk_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1919.454080] BTRFS info (device sdc: state A):
> >>>>>>>>>> delayed_block_rsv: size 0 reserved 0
> >>>>>>>>>> kernel: [ 1919.454083] BTRFS info (device sdc: state A):
> >>>>>>>>>> delayed_refs_rsv: size 254292787200 reserved 0
> >>>>>>>>>> kernel: [ 1919.454086] BTRFS: error (device sdc: state A) in
> >>>>>>>>>> do_free_extent_accounting:2847: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1919.454123] BTRFS info (device sdc: state
> EA): forced readonly
> >>>>>>>>>> kernel: [ 1919.454127] BTRFS error (device sdc: state
> EA): failed to
> >>>>>>>>>> run delayed ref for logical 102538713931776 num_bytes
> 245760 type 184
> >>>>>>>>>> action 2 ref_mod 1: -28
> >>>>>>>>>> kernel: [ 1919.454176] BTRFS: error (device sdc: state
> EA) in
> >>>>>>>>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1919.454249] BTRFS warning (device sdc: state EA):
> >>>>>>>>>> btrfs_uuid_scan_kthread failed -5
> >>>>>>>>>> kernel: [ 1919.472381] BTRFS: error (device sdc: state
> EA) in
> >>>>>>>>>> __btrfs_free_extent:3077: errno=-28 No space left
> >>>>>>>>>> kernel: [ 1919.472417] BTRFS error (device sdc: state
> EA): failed to
> >>>>>>>>>> run delayed ref for logical 102538732191744 num_bytes
> 245760 type 184
> >>>>>>>>>> action 2 ref_mod 1: -28
> >>>>>>>>>> kernel: [ 1919.472442] BTRFS: error (device sdc: state
> EA) in
> >>>>>>>>>> btrfs_run_delayed_refs:2151: errno=-28 No space left
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Sat, 17 Jun 2023 at 15:00, Qu Wenruo <wqu@suse.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 2023/6/17 13:11, Stefan N wrote:
> >>>>>>>>>>>> Hi Qu,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I believe I've got this environment ready, with the 6.2.0 kernel as
> >>>>>>>>>>>> before using the Ubuntu kernel, but can switch to vanilla if required.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've not done any kernel modifications for a solid decade, so
> >>>>>>>>>>>> would be keen for a bit of guidance.
> >>>>>>>>>>>
> >>>>>>>>>>> Sure no problem.
> >>>>>>>>>>>
> >>>>>>>>>>> Please fetch the kernel source tarball (6.2.x) first, decompress it, then
> >>>>>>>>>>> apply the attached one-line patch by:
> >>>>>>>>>>>
> >>>>>>>>>>> $ tar xf linux*.tar.xz
> >>>>>>>>>>> $ cd linux*/
> >>>>>>>>>>> $ patch -Np1 -i <the patch file>
> >>>>>>>>>>>
> >>>>>>>>>>> Then use your running system kernel config if possible:
> >>>>>>>>>>>
> >>>>>>>>>>> $ cp /proc/config.gz .
> >>>>>>>>>>> $ gunzip config.gz
> >>>>>>>>>>> $ mv config .config
> >>>>>>>>>>> $ make olddefconfig
> >>>>>>>>>>>
> >>>>>>>>>>> Then you can start compiling your kernel. Since you're using your
> >>>>>>>>>>> distro's default config, it will include tons of drivers and thus be
> >>>>>>>>>>> very slow. (Replace the number with something more suitable for your
> >>>>>>>>>>> system; using all CPU cores can run very hot.)
> >>>>>>>>>>>
> >>>>>>>>>>> $ make -j12
> >>>>>>>>>>>
> >>>>>>>>>>> Finally you need to install the modules/kernel.
> >>>>>>>>>>>
> >>>>>>>>>>> Unfortunately this is distro specific, but if you're using Ubuntu, it
> >>>>>>>>>>> may be much easier:
> >>>>>>>>>>>
> >>>>>>>>>>> $ make bindeb-pkg
> >>>>>>>>>>>
> >>>>>>>>>>> Then install the generated deb packages, I guess? I have never tried
> >>>>>>>>>>> building a kernel using deb/rpm, only manual installation, which is also
> >>>>>>>>>>> distro dependent in the initramfs generation part.
> >>>>>>>>>>>
> >>>>>>>>>>> # cp arch/x86/boot/bzImage /boot/vmlinuz-custom
> >>>>>>>>>>> # make modules_install
> >>>>>>>>>>> # mkinitcpio -k /boot/vmlinuz-custom -g /boot/initramfs-custom.img
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> The last step is to update your bootloader to add the new kernel, which
> >>>>>>>>>>> is not only distro dependent but also bootloader dependent.
> >>>>>>>>>>>
> >>>>>>>>>>> In my case, I go with systemd-boot with manually crafted entries.
> >>>>>>>>>>> But if you go with Ubuntu, I believe just installing the kernel deb
> >>>>>>>>>>> packages would have everything handled?
> >>>>>>>>>>>
> >>>>>>>>>>> Finally you can try rebooting into the newer kernel, then try device add
> >>>>>>>>>>> (you need to add 4 disks), then sync and see if things work as expected.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Qu
> >>>>>>>>>>>>
> >>>>>>>>>>>> I will recover a 1TB SSD and partition it into 4 in a USB enclosure,
> >>>>>>>>>>>> but failing this will use 4x loop devices.
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, 13 Jun 2023 at 11:28, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> >>>>>>>>>>>>> In your particular case, since you're running RAID1C4 you need to add 4
> >>>>>>>>>>>>> devices in one transaction.
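The "Unallocated:" list from the usage output quoted earlier illustrates why four devices are needed. A simplified sketch, assuming a RAID1C4 chunk needs room on four distinct devices and that the 1.00MiB shown on most devices is not usable; the 2MiB threshold, the device names new1..new4 and the 250GiB size are hypothetical and only serve the illustration (the real allocator has further constraints):

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

RAID1C4_COPIES = 4      # one copy on each of four different devices
MIN_USABLE = 2 * MIB    # hypothetical cut-off for "has usable room"

# Unallocated space per device, from the `btrfs fi usage` output.
unallocated = {
    "sdc": 1 * MIB, "sdf": 1 * MIB, "sda": int(1.27 * GIB), "sdg": 1 * MIB,
    "sdh": 1 * MIB, "sdd": 687 * MIB, "sde": 1 * MIB, "sdb": 1 * MIB,
}


def devices_with_room(unalloc, min_usable=MIN_USABLE):
    """Devices that still have at least min_usable bytes unallocated."""
    return sorted(dev for dev, free in unalloc.items() if free >= min_usable)


def can_allocate_chunk(unalloc, copies=RAID1C4_COPIES, min_usable=MIN_USABLE):
    """True if enough distinct devices have room for every copy."""
    return len(devices_with_room(unalloc, min_usable)) >= copies


# Hypothetical: four new 250GiB devices (e.g. a 1TB SSD split into four).
after_add = dict(unallocated)
for name in ("new1", "new2", "new3", "new4"):
    after_add[name] = 250 * GIB

print(devices_with_room(unallocated))
print(can_allocate_chunk(unallocated))
print(can_allocate_chunk(after_add))
```

Only two of the eight existing devices have any meaningful room, so under these assumptions a new RAID1C4 metadata chunk cannot be placed until more devices with free space are present.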
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I can easily craft a patch to avoid committing the transaction, but
> >>>>>>>>>>>>> you'll still need to add at least 4 disks, and then sync to see if
> >>>>>>>>>>>>> things work.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Furthermore this means you need a liveCD with full kernel compiling
> >>>>>>>>>>>>> environment.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> If you want to go this path, I can send you the patch when you've
> >>>>>>>>>>>>> prepared the needed environment.
>
Thread overview: 22+ messages
2023-06-12 4:47 Out of space loop: skip_balance not working Stefan N
2023-06-12 5:20 ` Qu Wenruo
2023-06-12 10:31 ` Stefan N
2023-06-12 10:46 ` Qu Wenruo
2023-06-12 13:02 ` Stefan N
2023-06-13 1:29 ` Paul Jones
2023-06-13 1:54 ` Stefan N
2023-06-13 1:58 ` Qu Wenruo
2023-06-17 5:11 ` Stefan N
2023-06-17 5:30 ` Qu Wenruo
2023-06-22 8:33 ` Stefan N
2023-06-22 9:18 ` Qu Wenruo
2023-06-22 22:18 ` Stefan N
2023-06-23 0:57 ` Qu Wenruo
2023-06-23 9:00 ` Stefan N
2023-06-23 9:46 ` Qu Wenruo
2023-06-24 15:29 ` Stefan N
2023-06-26 10:18 ` Qu Wenruo
2023-06-26 12:58 ` Stefan N
2023-07-22 5:28 ` Stefan N
2023-07-22 10:08 ` Qu Wenruo
[not found] ` <CA+W5K0oDRo2LZMiUiysYXpcpmfXTvS27hPdjm1pzq4kfq9=vdQ@mail.gmail.com>
2023-07-23 7:23 ` Qu Wenruo [this message]