Linux Btrfs filesystem development
From: Andrei Borzenkov <arvidjaar@gmail.com>
To: Kristupas Savickas <kristupas.savickas@pm.me>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: Not enough space during conversion to raid5, filesystem fails to mount as RW
Date: Sat, 21 Nov 2020 09:02:51 +0300
Message-ID: <2a8c046c-78c3-9f62-e5b4-7a5c9909da5d@gmail.com>
In-Reply-To: <bzvIMgcJHYGZvBm4xa7bCl_20ql_b3sZtJ6zxcAVyw7eZ8jQYpRFCukGBshxLFF4cRJ-vwdkZgj7GkbqF8o9tKt25RU3xiz_ikIaejDuH90=@pm.me>



On 20.11.2020 12:24, Kristupas Savickas wrote:
> Hello,
> 
> I tried to convert my (nearly full) file system to raid5. The fs originally contained 3 8TB disks.
> After adding the fourth disk I ran:
>     # btrfs balance start -dconvert=raid5,soft -mconvert=raid5,soft -sconvert=raid5,soft -b /mnt/
> 

Using raid5 for metadata is not recommended given the current state of
raid5 support in btrfs. See also

https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/

> However, it looks like the conversion failed and the fs was left in RO state, with only part of the data being converted to raid5:
> 
>     # btrfs fi show /dev/sdc
>     Label: 'array'  uuid: 6e95de0a-4b51-4aab-b935-469626c83036
>             Total devices 4 FS bytes used 21.52TiB
>             devid    1 size 7.28TiB used 7.28TiB path /dev/sdc
>             devid    2 size 7.28TiB used 7.28TiB path /dev/sdd
>             devid    3 size 7.28TiB used 7.28TiB path /dev/sda
>             devid    4 size 7.28TiB used 7.28TiB path /dev/sdb
> 
>     # mount /dev/sdc /mnt/
>     # mount | grep /mnt
>     /dev/sdc on /mnt type btrfs (ro,relatime,space_cache,subvolid=5,subvol=/)
> 
>     # btrfs fi df /mnt
>     Data, single: total=14.22TiB, used=14.06TiB
>     Data, RAID5: total=7.44TiB, used=7.43TiB
>     System, RAID1: total=32.00MiB, used=2.47MiB
>     System, RAID5: total=32.00MiB, used=128.00KiB
>     Metadata, RAID1: total=13.00GiB, used=11.67GiB
>     Metadata, RAID5: total=12.00GiB, used=11.90GiB
>     GlobalReserve, single: total=512.00MiB, used=0.00B
>     WARNING: Multiple block group profiles detected, see 'man btrfs(5)'.
>     WARNING:   Data: single, raid5
>     WARNING:   Metadata: raid1, raid5
>     WARNING:   System: raid1, raid5
> 

"btrfs filesystem usage -T" gives a better overview, but it appears there
is not enough space to allocate new raid5 chunks.
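
As a rough illustration of that point, the sketch below subtracts the
allocated ("used") figure from the device size for each devid line of the
"btrfs fi show" output quoted above. The variable names are only for this
example, and the displayed values are rounded to two decimals, so a result
of 0.00 means "little or nothing left", not necessarily exactly zero.

```shell
# Sample lines copied from the 'btrfs fi show' output quoted above.
fi_show='devid    1 size 7.28TiB used 7.28TiB path /dev/sdc
devid    2 size 7.28TiB used 7.28TiB path /dev/sdd
devid    3 size 7.28TiB used 7.28TiB path /dev/sda
devid    4 size 7.28TiB used 7.28TiB path /dev/sdb'

# Per device: size minus allocated space. Assumes both values are in TiB,
# which holds for the sample above but not for arbitrary output.
unallocated=$(printf '%s\n' "$fi_show" | awk '$1 == "devid" {
    size = $4; used = $6
    sub(/TiB$/, "", size); sub(/TiB$/, "", used)
    printf "%s %.2f TiB unallocated\n", $8, size - used
}')
printf '%s\n' "$unallocated"
```

Every device comes out fully allocated, including the newly added one, so
there is no room on any disk for a new raid5 chunk. On a mounted filesystem,
"btrfs filesystem usage -T /mnt" reports the unallocated space per device
directly.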

>     # dmesg
>     [65134.312783] BTRFS info (device sdc): disk space caching is enabled
>     [65134.312784] BTRFS info (device sdc): has skinny extents
>     [65136.207839] BTRFS info (device sdc): bdev /dev/sdd errs: wr 0, rd 4, flush 0, corrupt 0, gen 0
>     [65172.993492] BTRFS alert (device sdc): btrfs RAID5/6 is EXPERIMENTAL and has known data-loss bugs
>     [65204.813036] BTRFS info (device sdc): checking UUID tree
>     [65211.733565] ------------[ cut here ]------------
>     [65211.733567] BTRFS: Transaction aborted (error -28)
>     [65211.733583] BTRFS: error (device sdc) in __btrfs_free_extent:3069: errno=-28 No space left
>     [65211.733629] WARNING: CPU: 2 PID: 19980 at fs/btrfs/extent-tree.c:3069 __btrfs_free_extent.isra.0+0x57e/0x8f0 [btrfs]
>     [65211.735427] BTRFS info (device sdc): forced readonly
>     [65211.735427] Modules linked in: xt_recent fuse ufs
>     [65211.735430] BTRFS: error (device sdc) in btrfs_run_delayed_refs:2173: errno=-28 No space left
>     [65211.735431]  qnx4 hfsplus hfs cdrom minix msdos jfs xfs dm_mod uas usb_storage xt_nat veth xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_chain_nat nf_nat br_netfilter bridge stp llc overlay hid_logitech_hidpp joydev hid_logitech_dj hid_generic usbhid hid amdgpu edac_mce_amd kvm_amd kvm irqbypass ip6t_REJECT nf_reject_ipv6 xt_hl ip6_tables ghash_clmulni_intel nls_ascii nls_cp437 ppdev snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg ip6t_rt snd_hda_codec vfat fat gpu_sched snd_hda_core ttm snd_hwdep ipt_REJECT nf_reject_ipv4 efi_pstore aesni_intel xt_comment wmi_bmof libaes crypto_simd cryptd snd_pcm xt_multiport glue_helper rapl drm_kms_helper ccp snd_timer sp5100_tco efivars pcspkr k10temp watchdog nft_limit snd cec rng_core soundcore i2c_algo_bit sg parport_pc parport evdev xt_limit xt_addrtype xt_tcpudp acpi_cpufreq button xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nft_counter nf_tables
>     [65211.735467]  nfnetlink drm efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c crc32c_generic sd_mod nvme nvme_core crc32_pclmul t10_pi ahci libahci r8169 realtek mdio_devres crc_t10dif libphy i2c_piix4 crct10dif_generic libata crc32c_intel xhci_pci xhci_hcd usbcore scsi_mod crct10dif_pclmul crct10dif_common usb_common wmi video gpio_amdpt gpio_generic
>     [65211.737278] CPU: 2 PID: 19980 Comm: btrfs-transacti Tainted: G        W      X  5.9.0-1-amd64 #1 Debian 5.9.1-1
>     [65211.737280] BTRFS info (device sdc): balance: resume -dconvert=raid5,soft -mconvert=raid5,soft -sconvert=raid5,soft
>     [65211.737281] Hardware name: Gigabyte Technology Co., Ltd. AB350M-D3H/AB350M-D3H-CF, BIOS F51c 07/02/2020
>     [65211.737300] RIP: 0010:__btrfs_free_extent.isra.0+0x57e/0x8f0 [btrfs]
>     [65211.737302] Code: 24 0c ba 5b 0c 00 00 48 c7 c6 40 01 5d c0 4c 89 f7 e8 12 c2 0a 00 e9 b1 fe ff ff 44 89 e6 48 c7 c7 80 a5 5d c0 e8 98 c5 16 d1 <0f> 0b 44 89 e1 ba fd 0b 00 00 48 c7 c6 40 01 5d c0 4c 89 f7 e8 e5
>     [65211.737303] RSP: 0018:ffffa398c192bc50 EFLAGS: 00010282
>     [65211.737305] RAX: 0000000000000000 RBX: 000026d0f54f8000 RCX: ffff969590898ac8
>     [65211.737305] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff969590898ac0
>     [65211.737306] RBP: 0000259405cbd000 R08: 0000000000000995 R09: 0000000000000004
>     [65211.737307] R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffffe4
>     [65211.737308] R13: ffff96955d4f0310 R14: ffff9692ab751d00 R15: 0000000000000005
>     [65211.737309] FS:  0000000000000000(0000) GS:ffff969590880000(0000) knlGS:0000000000000000
>     [65211.737310] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     [65211.737311] CR2: 000055ff8d180cb0 CR3: 00000003b621a000 CR4: 00000000003506e0
>     [65211.737312] Call Trace:
>     [65211.737333]  ? __btrfs_run_delayed_refs+0xfd3/0x1060 [btrfs]
>     [65211.737351]  __btrfs_run_delayed_refs+0x27a/0x1060 [btrfs]
>     [65211.737371]  btrfs_run_delayed_refs+0x73/0x200 [btrfs]
>     [65211.737391]  btrfs_commit_transaction+0x57/0xa30 [btrfs]
>     [65211.737411]  ? start_transaction+0xd2/0x540 [btrfs]
>     [65211.737415]  ? try_to_wake_up+0x130/0x5e0
>     [65211.737434]  transaction_kthread+0x14c/0x170 [btrfs]
>     [65211.737453]  ? btrfs_cleanup_transaction.isra.0+0x5a0/0x5a0 [btrfs]
>     [65211.737455]  kthread+0x11b/0x140
>     [65211.737457]  ? __kthread_bind_mask+0x60/0x60
>     [65211.737460]  ret_from_fork+0x22/0x30
>     [65211.737462] ---[ end trace 2f4a1b25242944de ]---
>     [65211.737463] BTRFS: error (device sdc) in __btrfs_free_extent:3069: errno=-28 No space left
>     [65211.738722] BTRFS: error (device sdc) in btrfs_run_delayed_refs:2173: errno=-28 No space left
>     [65211.748466] BTRFS info (device sdc): balance: ended with status: -30
> 
> Looking at the dmesg output, I see:
>     [65211.737280] BTRFS info (device sdc): balance: resume -dconvert=raid5,soft -mconvert=raid5,soft -sconvert=raid5,soft
> 
> This indicates that the balance is run again after remounting. The last line says:
>     [65211.748466] BTRFS info (device sdc): balance: ended with status: -30
> 
> I suppose this is what causes the fs to be mounted read only.
> 
> Trying to remount the system as RW results in:
>     [68268.865313] BTRFS info (device sdc): disk space caching is enabled
>     [68268.865316] BTRFS error (device sdc): Remounting read-write after error is not allowed
> 
> My question is how can I recover from a situation like this?
> I could definitely reduce the file system usage if I could mount it as RW, but
> the balance is run immediately after mounting and fails, which results in a RO mount.
> Would it be possible to cancel the balance, so it doesn't run on mounting?
> 

Mount with the skip_balance mount option; it stops the interrupted balance
from resuming automatically at mount time.
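
A minimal sketch of how that could look, assuming the device and mount point
from the report. The run helper and the DRY_RUN variable are hypothetical
conveniences of this example, not btrfs features; as written the script only
prints the commands. Whether the filesystem then stays read-write depends on
how much space the pending transactions need, so treat this as a starting
point and not a guaranteed fix.

```shell
# Device and mount point as used in the report above.
DEV=/dev/sdc
MNT=/mnt
# 1 = only print the commands; change to 0 to execute them (needs root).
DRY_RUN=1

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

recover_steps() {
    run umount "$MNT"                        # drop the forced read-only mount
    run mount -o skip_balance "$DEV" "$MNT"  # mount without resuming the balance
    run btrfs balance cancel "$MNT"          # discard the paused balance state
    run btrfs filesystem usage -T "$MNT"     # check per-device unallocated space
}

plan=$(recover_steps)
printf '%s\n' "$plan"
```

A fresh mount is used because, as the log above shows, remounting read-write
after the error is refused. Once the filesystem is writable and the balance
is cancelled, space can be freed before attempting the conversion again.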

> Additional information:
>     # uname -a
>     Linux s 5.9.0-1-amd64 #1 SMP Debian 5.9.1-1 (2020-10-17) x86_64 GNU/Linux
> 
>     # btrfs --version
>     btrfs-progs v5.9
> 


