* Change BTRFS filesystem back to R/W from R/O
From: Spencer Collyer @ 2022-11-15 12:27 UTC
To: linux-btrfs
Hi,
I've hit a problem with a BTRFS filesystem I have on my main machine. It's gone into read-only mode, I suspect because it ran out of disk space (it's got very little space left).
(I've put the info that is requested on the mailing list page at the end of this message.)
Ironically I was going to shift a large chunk of data off this filesystem onto an external backup disk but hadn't gotten around to doing it.
Digging around on the web I've seen suggestions involving unmounting the filesystem, remounting it R/W, removing any unwanted data, then continuing the rebalance. They generally seem to want to add another disk temporarily as well.
I don't really want to go to the trouble of temporarily adding a secondary disk only to then remove it again. As I was already planning on moving a bunch of old data off to external backup, I'm wondering if the following would be safe to do:
1) Unmount the filesystem.
2) Remount it as R/W
3) Move data to the external disk
4) Resume the rebalance operation (is this required?)
Does that look reasonable? Is there likely to be any problem with moving files off the filesystem when the rebalance operation failed?
Thanks for your attention,
Spencer
Command output:
uname -a:
Linux selket 6.0.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 29 Oct 2022 14:08:39 +0000 x86_64 GNU/Linux
btrfs --version:
btrfs-progs v6.0
btrfs fi show: (empty output)
btrfs fi df /data
Data, RAID0: total=10.87TiB, used=10.86TiB
System, single: total=4.00MiB, used=0.00B
System, RAID1: total=8.00MiB, used=784.00KiB
Metadata, single: total=8.00MiB, used=0.00B
Metadata, RAID1: total=23.00GiB, used=21.39GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
WARNING: Multiple block group profiles detected, see 'man btrfs(5)'
WARNING: Metadata: single, raid1
WARNING: System: single, raid1
I'm using systemd, so rather than dmesg the following is output from 'journalctl -b', looking for the first BTRFS error:
Nov 15 09:09:35 selket kernel: ------------[ cut here ]------------
Nov 15 09:09:35 selket kernel: BTRFS: Transaction aborted (error -28)
Nov 15 09:09:35 selket kernel: BTRFS: error (device dm-1: state A) in btrfs_finish_ordered_io:3315: errno=-28 No space left
Nov 15 09:09:35 selket kernel: BTRFS info (device dm-1: state EA): forced readonly
Nov 15 09:09:35 selket kernel: WARNING: CPU: 6 PID: 81435 at fs/btrfs/inode.c:3315 btrfs_finish_ordered_io+0x894/0x960 [btrfs]
Nov 15 09:09:35 selket kernel: Modules linked in: btrfs blake2b_generic xor raid6_pq libcrc32c intel_rapl_msr iTCO_wdt intel_pmc_bxt eeepc_wmi asus_wmi iTCO_vendor_support>
Nov 15 09:09:35 selket kernel: usbhid crc32c_intel sr_mod xhci_pci cdrom xhci_pci_renesas
Nov 15 09:09:35 selket kernel: CPU: 6 PID: 81435 Comm: kworker/u32:16 Not tainted 6.0.6-arch1-1 #1 a46cc4b882cfc11c3bbb09d6a0fab3dcad53b5c2
Nov 15 09:09:35 selket kernel: Hardware name: ASUS All Series/X99-E WS, BIOS 1003 03/04/2015
Nov 15 09:09:35 selket kernel: Workqueue: btrfs-endio-write btrfs_work_helper [btrfs]
Nov 15 09:09:35 selket kernel: RIP: 0010:btrfs_finish_ordered_io+0x894/0x960 [btrfs]
Nov 15 09:09:35 selket kernel: Code: 49 8b 46 50 48 05 28 0a 00 00 f0 48 0f ba 28 03 72 1a 83 fd fb 74 30 83 fd e2 74 2b 89 ee 48 c7 c7 e8 9f 48 c1 e8 ac ec 5d e6 <0f> 0b >
Nov 15 09:09:35 selket kernel: RSP: 0018:ffff99624757fd68 EFLAGS: 00010286
Nov 15 09:09:35 selket kernel: RAX: 0000000000000000 RBX: ffff8c532a95a940 RCX: 0000000000000027
Nov 15 09:09:35 selket kernel: RDX: ffff8c57ffba1668 RSI: 0000000000000001 RDI: ffff8c57ffba1660
Nov 15 09:09:35 selket kernel: RBP: 00000000ffffffe4 R08: 0000000000000000 R09: ffff99624757fbf0
Nov 15 09:09:35 selket kernel: R10: 0000000000000003 R11: ffffffffa8acb508 R12: 0000000024906000
Nov 15 09:09:35 selket kernel: R13: ffff8c490501b000 R14: ffff8c49088fee38 R15: ffff8c532a27b4e0
Nov 15 09:09:35 selket kernel: FS: 0000000000000000(0000) GS:ffff8c57ffb80000(0000) knlGS:0000000000000000
Nov 15 09:09:35 selket kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 15 09:09:35 selket kernel: CR2: 00007ff124cfd000 CR3: 0000000173610001 CR4: 00000000001706e0
Nov 15 09:09:35 selket kernel: Call Trace:
Nov 15 09:09:35 selket kernel: <TASK>
Nov 15 09:09:35 selket kernel: btrfs_work_helper+0xe8/0x380 [btrfs bea3ab37602bd115354fd14d10316f0d593c6d2f]
Nov 15 09:09:35 selket kernel: process_one_work+0x1c7/0x380
Nov 15 09:09:35 selket kernel: worker_thread+0x51/0x390
Nov 15 09:09:35 selket kernel: ? rescuer_thread+0x3b0/0x3b0
Nov 15 09:09:35 selket kernel: kthread+0xde/0x110
Nov 15 09:09:35 selket kernel: ? kthread_complete_and_exit+0x20/0x20
Nov 15 09:09:35 selket kernel: ret_from_fork+0x22/0x30
Nov 15 09:09:35 selket kernel: </TASK>
Nov 15 09:09:35 selket kernel: ---[ end trace 0000000000000000 ]---
* Re: Change BTRFS filesystem back to R/W from R/O
From: Qu Wenruo @ 2022-11-15 12:41 UTC
To: Spencer Collyer, linux-btrfs
On 2022/11/15 20:27, Spencer Collyer wrote:
> Hi,
>
> I've hit a problem with a BTRFS filesystem I have on my main machine. It's gone into read-only mode, I suspect because it ran out of disk space (it's got very little space left).
>
> (I've put the info that is requested on the mailing list page at the end of this message.)
>
> Ironically I was going to shift a large chunk of data off this filesystem onto an external backup disk but hadn't gotten around to doing it.
>
> Digging around on the web I've seen suggestions involving unmounting the filesystem, remounting it R/W, removing any unwanted data, then continuing the rebalance. They generally seem to want to add another disk temporarily as well.
>
> I don't really want to go to the trouble of temporarily adding a secondary disk only to then remove it again. As I was already planning on moving a bunch of old data off to external backup, I'm wondering if the following would be safe to do:
>
> 1) Unmount the filesystem.
> 2) Remount it as R/W
> 3) Move data to the external disk
> 4) Resume the rebalance operation (is this required?)
I don't think balance can even start.
>
> Does that look reasonable? Is there likely to be any problem with moving files off the filesystem when the rebalance operation failed?
Moving files off doesn't look like it would cause a problem.
>
> Thanks for your attention,
>
> Spencer
>
> Command output:
>
> uname -a:
> Linux selket 6.0.6-arch1-1 #1 SMP PREEMPT_DYNAMIC Sat, 29 Oct 2022 14:08:39 +0000 x86_64 GNU/Linux
>
> btrfs --version:
> btrfs-progs v6.0
>
> btrfs fi show: (empty output)
>
> btrfs fi df /data
> Data, RAID0: total=10.87TiB, used=10.86TiB
> System, single: total=4.00MiB, used=0.00B
> System, RAID1: total=8.00MiB, used=784.00KiB
> Metadata, single: total=8.00MiB, used=0.00B
> Metadata, RAID1: total=23.00GiB, used=21.39GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> WARNING: Multiple block group profiles detected, see 'man btrfs(5)'
> WARNING: Metadata: single, raid1
> WARNING: System: single, raid1
In fact, you have around 12 MiB of space taken by empty SINGLE chunks.
Thus you can go with something like "btrfs balance start -musage=0
-susage=0 <mnt>" to free up some space.
But that's only recommended after you have moved some data off (i.e.,
deleted some data from this filesystem).
Otherwise I guess you may run out of space again.
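As a rough sketch of that step (not a tested recipe): the mount point `/data` is taken from the report, while the `run` helper and the `RUN` variable are made up here so that the command is only printed unless explicitly enabled. If I read btrfs-balance(8) correctly, btrfs-progs refuses an explicit `-s` filter unless `-f` is also given, and otherwise applies the `-m` filter to system chunks too, so the sketch passes only `-musage=0`. Treat that as an assumption to check against your btrfs-progs version.

```shell
#!/bin/sh
# Sketch: drop completely empty metadata (and, implicitly, system) chunks.
# MNT is assumed from the report; run/RUN are illustrative helpers.
MNT="${MNT:-/data}"

# Print the command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

reclaim_empty_chunks() {
    # usage=0 selects only chunks with nothing in them, so no data is rewritten.
    run btrfs balance start -musage=0 "$MNT"
}

reclaim_empty_chunks
```

The filesystem has to be mounted read-write for the balance to start, and it is better done after some data has been moved off.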
Another thing not shown in your report is the output of "btrfs fi usage <mnt>".
There is a known bug, especially with multi-device profiles like RAID1:
if one device runs out of space while the other device still has some
space left, btrfs will wrongly assume it still has enough space.
I believe that's exactly why you hit ENOSPC at such a critical path,
rather than hitting ENOSPC when reserving space before doing the real
work.
>
> I'm using systemd, so rather than dmesg the following is output from 'journalctl -b', looking for the first BTRFS error:
> Nov 15 09:09:35 selket kernel: ------------[ cut here ]------------
> Nov 15 09:09:35 selket kernel: BTRFS: Transaction aborted (error -28)
> Nov 15 09:09:35 selket kernel: BTRFS: error (device dm-1: state A) in btrfs_finish_ordered_io:3315: errno=-28 No space left
This function is responsible for inserting the file extent items and
checksum items.
Hitting ENOSPC here mostly means the above RAID1 situation, where btrfs
wrongly assumes it can continue by over-committing its metadata space.
Considering you have some metadata space left, I believe you can free
enough space by deleting files (i.e., moving them to other filesystems).
Thanks,
Qu
* Re: Change BTRFS filesystem back to R/W from R/O
From: Spencer Collyer @ 2022-11-15 14:08 UTC
Cc: linux-btrfs
(Resending to the list as I accidentally sent it just to Qu.)
On Tue, 15 Nov 2022 20:41:54 +0800, Qu Wenruo wrote:
> Considering you have some metadata space left, I believe you can free
> enough space by deleting files (i.e., moving them to other filesystems).
>
> Thanks,
> Qu
Hi Qu,
Thanks for that. You say I should move some files to other filesystems, but that's really the nub of my problem - the filesystem is marked as read-only. Am I OK to do what I mentioned previously:
> 1) Unmount the filesystem.
> 2) Remount it as R/W
> 3) Move data to the external disk
If that is all good, would I need to do anything else or would the BTRFS system sort itself out correctly?
Thanks for your attention,
Spencer
PS. The output from the 'btrfs fi usage /data' command you requested is as follows (run as root to get everything):
Overall:
Device size: 10.92TiB
Device allocated: 10.92TiB
Device unallocated: 1.00MiB
Device missing: 0.00B
Device slack: 0.00B
Used: 10.90TiB
Free (estimated): 15.26GiB (min: 15.26GiB)
Free (statfs, df): 15.26GiB
Data ratio: 1.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Multiple profiles: yes (metadata, system)
Data,RAID0: Size:10.87TiB, Used:10.86TiB (99.86%)
/dev/mapper/data1 5.44TiB
/dev/mapper/data2 5.44TiB
Metadata,single: Size:8.00MiB, Used:0.00B (0.00%)
/dev/mapper/data1 8.00MiB
Metadata,RAID1: Size:23.00GiB, Used:21.39GiB (93.00%)
/dev/mapper/data1 23.00GiB
/dev/mapper/data2 23.00GiB
System,single: Size:4.00MiB, Used:0.00B (0.00%)
/dev/mapper/data1 4.00MiB
System,RAID1: Size:8.00MiB, Used:784.00KiB (9.57%)
/dev/mapper/data1 8.00MiB
/dev/mapper/data2 8.00MiB
Unallocated:
/dev/mapper/data1 0.00B
/dev/mapper/data2 1.00MiB
* Re: Change BTRFS filesystem back to R/W from R/O
From: Spencer Collyer @ 2022-11-15 15:21 UTC
To: linux-btrfs
On Tue, 15 Nov 2022 14:08:47 +0000, Spencer Collyer wrote:
>
In the end I just did a umount / mount and it came back fine. So now I'm busy moving stuff off the filesystem to backup :)
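A sketch of what that sequence might look like, under stated assumptions: `/data` and `/dev/mapper/data1` are the names that appear in the command output earlier in this thread, while the backup path, the `old-data` directory, and the `run`/`RUN` dry-run helper are hypothetical placeholders. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only. MNT and DEV match names seen in this thread's output;
# BACKUP and old-data are hypothetical placeholders.
MNT="${MNT:-/data}"
DEV="${DEV:-/dev/mapper/data1}"
BACKUP="${BACKUP:-/mnt/backup}"

# Print each command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

remount_and_move() {
    run umount "$MNT" || return 1
    run mount "$DEV" "$MNT" || return 1              # plain mount, no ro option
    run findmnt -n -o OPTIONS "$MNT" || return 1     # check for "rw" here
    run mv "$MNT/old-data" "$BACKUP/" || return 1    # frees data space on $MNT
}

remount_and_move
```

Stopping at the first failing step is deliberate: if the unmount fails because something still holds the filesystem open, none of the later commands should run.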
Spencer
* Re: Change BTRFS filesystem back to R/W from R/O
From: Qu Wenruo @ 2022-11-15 22:46 UTC
To: Spencer Collyer, linux-btrfs@vger.kernel.org
On 2022/11/15 20:52, Spencer Collyer wrote:
> On Tue, 15 Nov 2022 20:41:54 +0800, Qu Wenruo wrote:
>
>> Considering you have some metadata space left, I believe you can free
>> enough space by deleting files (i.e., moving them to other filesystems).
>>
>> Thanks,
>> Qu
>
> Hi Qu,
>
> Thanks for that. You say I should move some files to other filesystems, but that's really the nub of my problem - the filesystem is marked as read-only. Am I OK to do what I mentioned previously:
>
>> 1) Unmount the filesystem.
>> 2) Remount it as R/W
>> 3) Move data to the external disk
>
> If that is all good, would I need to do anything else or would the BTRFS system sort itself out correctly?
With enough data removed (or maybe with a balance to remove those empty
block groups?) btrfs should be able to handle everything.
Just don't try to write any new data into the fs, as that will push
btrfs back to RO again due to the very limited metadata space.
Thanks,
Qu
* Re: Change BTRFS filesystem back to R/W from R/O
From: Forza @ 2022-11-16 6:40 UTC
To: Spencer Collyer, linux-btrfs Mailinglist, Qu Wenruo; +Cc: linux-btrfs
---- From: Spencer Collyer <spencer@spencercollyer.plus.com> -- Sent: 2022-11-15 - 15:08 ----
> (Resending to the list as I accidentally sent it just to Qu.)
>
> On Tue, 15 Nov 2022 20:41:54 +0800, Qu Wenruo wrote:
>
>> Considering you have some metadata space left, I believe you can free
>> enough space by deleting files (i.e., moving them to other filesystems).
>>
>> Thanks,
>> Qu
>
> Hi Qu,
>
> Thanks for that. You say I should move some files to other filesystems, but that's really the nub of my problem - the filesystem is marked as read-only. Am I OK to do what I mentioned previously:
>
>> 1) Unmount the filesystem.
>> 2) Remount it as R/W
>> 3) Move data to the external disk
>
> If that is all good, would I need to do anything else or would the BTRFS system sort itself out correctly?
With enough `unallocated` space, Btrfs will be OK. Read below for more details on why/how...
>
> Thanks for your attention,
>
> Spencer
>
> PS. The output from the 'btrfs fi usage /data' command you requested is as follows (run as root to get everything):
>
> Overall:
> Device size: 10.92TiB
> Device allocated: 10.92TiB
> Device unallocated: 1.00MiB
> Device missing: 0.00B
> Device slack: 0.00B
> Used: 10.90TiB
> Free (estimated): 15.26GiB (min: 15.26GiB)
> Free (statfs, df): 15.26GiB
> Data ratio: 1.00
> Metadata ratio: 2.00
> Global reserve: 512.00MiB (used: 0.00B)
> Multiple profiles: yes (metadata, system)
>
> Data,RAID0: Size:10.87TiB, Used:10.86TiB (99.86%)
> /dev/mapper/data1 5.44TiB
> /dev/mapper/data2 5.44TiB
>
> Metadata,single: Size:8.00MiB, Used:0.00B (0.00%)
> /dev/mapper/data1 8.00MiB
>
> Metadata,RAID1: Size:23.00GiB, Used:21.39GiB (93.00%)
> /dev/mapper/data1 23.00GiB
> /dev/mapper/data2 23.00GiB
>
> System,single: Size:4.00MiB, Used:0.00B (0.00%)
> /dev/mapper/data1 4.00MiB
>
> System,RAID1: Size:8.00MiB, Used:784.00KiB (9.57%)
> /dev/mapper/data1 8.00MiB
> /dev/mapper/data2 8.00MiB
>
> Unallocated:
> /dev/mapper/data1 0.00B
> /dev/mapper/data2 1.00MiB
Btrfs uses a two-stage allocator. The first stage allocates large regions of space known as chunks for specific types of data; the second stage then allocates blocks within those chunks, as a regular (old-fashioned) filesystem would.
In your case, btrfs needed to allocate another metadata chunk, but as you see, there is no `unallocated` space available. Btrfs went read-only to protect itself from damage.
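To check the per-device numbers rather than the overall total, something like the following sketch can pull the `Unallocated:` section out of `btrfs filesystem usage` output. The function name is made up for illustration, and the parsing assumes the section layout seen in this thread (btrfs-progs v6.0); other versions may format it differently.

```shell
#!/bin/sh
# List "device unallocated" pairs from `btrfs filesystem usage` text on stdin.
# Assumes the section layout seen in this thread (btrfs-progs v6.0).
per_device_unallocated() {
    awk '
        /^Unallocated:/       { in_section = 1; next }   # section header
        in_section && NF == 2 { print $1, $2; next }     # "<device> <size>"
        in_section            { in_section = 0 }         # anything else ends it
    '
}

# Demo with the figures quoted above; on a live system one would instead run
#   btrfs filesystem usage /data | per_device_unallocated
per_device_unallocated <<'EOF'
Unallocated:
   /dev/mapper/data1       0.00B
   /dev/mapper/data2     1.00MiB
EOF
```

With the figures from this thread it lists data1 at 0.00B and data2 at 1.00MiB. A new RAID1 metadata chunk needs room on both devices, so neither figure leaves space for one.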
Balancing means Btrfs moves data between chunks so that it can free some of them. For example, if there are two chunks with 50% usage, Btrfs can compact the data into one chunk and free the other, increasing the unallocated space that can be used for new allocations as needed.
It might be good to schedule a limited data balance at regular intervals to ensure there are always a few unallocated gigabytes.
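A minimal sketch of such a limited balance, suitable for calling from cron or a systemd timer. The thresholds (`usage=50`, `limit=10`) are arbitrary example values rather than recommendations, `/data` is the mount point from this thread, and the `run`/`RUN` dry-run helper is made up here so that nothing is executed by default.

```shell
#!/bin/sh
# Sketch: compact a bounded number of partly filled data chunks.
# USAGE and LIMIT are example values, not recommendations.
MNT="${MNT:-/data}"
USAGE="${USAGE:-50}"   # only data chunks under this percentage of usage
LIMIT="${LIMIT:-10}"   # process at most this many chunks per run

# Print the command; execute it only when RUN=1 is set.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

limited_balance() {
    # Filters for one chunk type go in a single comma-separated -d option.
    run btrfs balance start "-dusage=${USAGE},limit=${LIMIT}" "$MNT"
}

limited_balance
```

The `limit` filter keeps each run short, so the job can be scheduled often without rewriting large amounts of data every time.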
I wrote a little about it on my wiki https://wiki.tnonline.net/w/Btrfs/ENOSPC and https://wiki.tnonline.net/w/Btrfs/Balance
Regards,
Forza