Linux Btrfs filesystem development
From: Alban Browaeys <alban.browaeys@gmail.com>
To: Chris Murphy <lists@colorremedies.com>,
	Btrfs BTRFS <linux-btrfs@vger.kernel.org>,
	Qu WenRuo <wqu@suse.com>
Subject: Re: Btrfs ENOSPC / Stuck in RO with "exclusive operation balance paused in progress"
Date: Mon, 15 Jun 2026 13:06:19 +0200
Message-ID: <a1f5967d-604a-4a4f-aa35-a67ce9cfe3a7@gmail.com>
In-Reply-To: <791bf200-b3b8-4811-bdf7-3ceaadb9a4f3@app.fastmail.com>

Hi Chris, Hi All,


 There is still one thing I wonder about regarding the size of the used
 metadata. Knowing that small files are stored inline in the metadata
 space, would it be possible to force btrfs to move these small data
 items to the data space, which has 16GB free?
 I doubt I have 1.5GB of pure metadata for around 50GB of data, even if
 there are a lot of small files.

 On 15/06/2026 at 04:07, Chris Murphy wrote:
 > 
 > 
 > On Sun, Jun 14, 2026, at 6:25 PM, Alban Browaeys wrote:
 >> Hi,
 >>
 >> I had space after the sdb3 btrfs /var partition, a 9GB swap partition. I
 >> deleted this swap partition and expanded the sdb3 btrfs size with parted
 >> But I don't seem to be able to expand the btrfs filesystem inside sdb3
 >> if I cannot mount it read write, is it so? Or is it only due to my
 >> pending operation?
 > 
 > If the file system goes read-only it cannot be modified at all, including resize.
 > 
 > 
 >> Can I get this btrfs partition back?
 > 
 > I think there's a bug. I do not know if the bug wrote confusion to disk, but either way the current kernel code can't handle it so it's a bug. One of your dmesg logs contained this:
 > 
 >> [ 1084.779326] BTRFS info (device sdb3 state A): space_info METADATA (sub-group id 0) has -190382080 free, is full
 > 
 > Can you run 'btrfs check --readonly' on this file system?
 > 


 # btrfs check --readonly /dev/sdb3
 Opening filesystem to check...
 Checking filesystem on /dev/sdb3
 UUID: 13af326c-631f-482b-9c34-b59b4f100608
 [1/8] checking log skipped (none written)
 [2/8] checking root items
 [3/8] checking extents
 [4/8] checking free space tree
 [5/8] checking fs roots
 [6/8] checking only csums items (without verifying data)
 [7/8] checking root refs
 [8/8] checking quota groups skipped (not enabled on this FS)
 found 54982475776 bytes used, no error found
 total csum bytes: 50964248
 total tree bytes: 2142486528
 total fs tree bytes: 1951760384
 total extent tree bytes: 97910784
 btree space waste bytes: 569359506
 file data blocks allocated: 105614794752
  referenced 100030590976
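
 For reading the byte counts above against the 1.5GB metadata figure, here is a minimal sketch that converts them to GiB. The `to_gib` helper and the short labels are hypothetical names used only for illustration; the numbers are copied from the btrfs check output. By this conversion "total tree bytes" comes to roughly 2 GiB, most of it in the fs trees.

```shell
#!/bin/sh
# Convert a byte count to GiB with two decimals.
# to_gib is a hypothetical helper, not part of btrfs-progs.
to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Byte counts copied from the "btrfs check --readonly" output above.
for entry in \
    "bytes_used:54982475776" \
    "tree_bytes:2142486528" \
    "fs_tree_bytes:1951760384" \
    "extent_tree_bytes:97910784" \
    "btree_waste_bytes:569359506"
do
    name=${entry%%:*}    # label before the colon
    bytes=${entry##*:}   # byte count after the colon
    printf '%-20s %s GiB\n' "$name" "$(to_gib "$bytes")"
done
```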



 > My recommendation is to stop mounting it rw. Only mount it ro. The more it is changed the worse the problem is likely to get.


 So I did. I don't know if the previous attempt to mount it read-write
 with default options, which failed with ENOSPC, did any damage. But I have
 since refrained from attempting to mount it rw with any rescue option;
 I only mounted it ro with the rescue options.

 > 
 > 
 >> Is there a way to copy the content of the partition mounted in ro in its
 >> current state, create a new partition and copy the content back to it?
 > 
 > mount -o ro
 > 
 > Then use rsync -a or cp -a
 > 
 > I think that's safest.
 > 


 Thank you. The doubt I had with rsync was about it flattening
 subvolumes. I don't know of a tool that does what rsync does without
 flattening subvolumes. Thankfully the only subvolumes on this
 partition are under /var/lib/docker/btrfs, which Gemini told me would be
 recreated at startup by the docker daemon, so it told me to exclude
 them from the rsync. For now the only backup I am confident is
 complete is the dd raw image... but it suffers from the same issue as the
 raw device partition, i.e. it is likely in a possibly
 recoverable corrupt state (extents halfway through migration). Gemini told
 me to try btrfs-restore but it gave me thousands of errors like:
 "
 ERROR: zstd frame incomplete
 ERROR: copying data for /mnt/4/hermes-var-20260610-enospc-vanilla/restored_var/@var/backups/dpkg.arch.5.gz failed
 "


 > The other option I was suggesting might make it possible to fix, but I've never been able to try it in a case like what you're experiencing so it's entirely untested and therefore risky.
 > 
 > btrfstune -S1
 > 
 > This *changes the file system* therefore it's a risk. But I think it's a minimal change to just the superblock to make the file system a read-only seed device. This is the only exception to the rule that you cannot add a device when a Btrfs is read-only. It is possible to add a device to a seed device. But the unintuitive part is how to make it read-write.
 > 
 > btrfstune -S1 $device1
 > mount $device1 /mnt
 > btrfs device add $device2 /mnt
 > mount -o remount,rw /mnt
 > 

 Sadly, I tried it and reported on it in my previous email (my bad, I
 should have shortened the quoted bug block in the middle, which
 probably led you to conclude there was only quoted text until the end of
 the email). The issue is that I cannot mark the filesystem as seed while an
 operation is pending (and, as seen before, it seems I cannot cancel this
 operation if I cannot mount the filesystem read-write: a deadlock).
 Here it is:

 So I tried what I understood, using a copy of the image I made of the
 partition:
 /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
# btrfstune -S 1 /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
ERROR: please finish/cancel the running replace/balance before running this command

 I tried:
# btrfs check --clear-space-cache v2 /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
Opening filesystem to check...
Checking filesystem on /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
UUID: 13af326c-631f-482b-9c34-b59b4f100608
WARNING: --clear-space-cache option is deprecated, please use "btrfs rescue clear-space-cache" instead
ERROR: please finish/cancel the running replace/balance before running this command

# btrfs balance cancel /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
ERROR: not a directory: /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img

# btrfs rescue clear-uuid-tree /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
ERROR: please finish/cancel the running replace/balance before running this command



 and without the seed flag:
# dd if=/dev/zero of=/mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var_target.img bs=1M count=90000
90000+0 records in
90000+0 records out
94371840000 bytes (94 GB, 88 GiB) copied, 1340.52 s, 70.4 MB/s
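
 As a side note on the target image: the quoted advice below asks that the second device be at least as large as the used amount on the first (54982475776 bytes according to btrfs check). A minimal sketch of that check follows, assuming GNU coreutils and a backing filesystem that supports sparse files. `has_room` is a hypothetical helper. With `truncate -s 90000M` the image would get the same 94371840000-byte length as the dd run above without writing the zeros first.

```shell
#!/bin/sh
# has_room USED_BYTES FILE: succeed if FILE is at least USED_BYTES long.
# Hypothetical helper; "stat -c %s" is the GNU coreutils form.
has_room() {
    size=$(stat -c %s "$2") || return 2
    [ "$size" -ge "$1" ]
}

# Demonstration in a scratch directory with a small sparse file.
# For the real target one would run something like:
#   truncate -s 90000M sdb3_var_target.img
#   has_room 54982475776 sdb3_var_target.img
demo_dir=$(mktemp -d)
truncate -s 64M "$demo_dir/target.img"   # sparse: no data blocks written up front

if has_room 1048576 "$demo_dir/target.img"; then
    echo "target is large enough"
else
    echo "target is too small"
fi
rm -rf "$demo_dir"
```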


# losetup -f --show /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var.img
[475088.702700] loop0: detected capacity change from 0 to 166119424
/dev/loop0

 # mkdir -p /mnt/seed_workspace

# mount -o ro,rescue=all /dev/loop0 /mnt/seed_workspace
[475287.536430] BTRFS info: device /dev/loop0 (7:0) using temp-fsid 7d3aa685-b9cd-4cbc-87ed-c2d0288fdeba
[475287.546680] BTRFS: device label SSDHOME devid 1 transid 44655875 /dev/loop0 (7:0) scanned by mount (8576)
[475287.556933] BTRFS info (device loop0 state S): first mount of filesystem 13af326c-631f-482b-9c34-b59b4f100608
[475287.567325] BTRFS info (device loop0 state S): using crc32c checksum algorithm
[475288.554301] BTRFS info (device loop0 state ECS): enabling ssd optimizations
[475288.562226] BTRFS info (device loop0 state ECS): disabling log replay at mount time
[475288.570394] BTRFS info (device loop0 state ECS): turning on async discard
[475288.577648] BTRFS info (device loop0 state ECS): enabling free space tree
[475288.584915] BTRFS info (device loop0 state ECS): ignoring bad roots
[475288.591652] BTRFS info (device loop0 state ECS): ignoring data csums
[475288.598482] BTRFS info (device loop0 state ECS): ignoring meta csums
[475288.605308] BTRFS info (device loop0 state ECS): ignoring unknown super block flags


# losetup -f --show /mnt/4/hermes-var-20260610-enospc-vanilla/tests/sdb3_var_target.img
[475310.285452] loop1: detected capacity change from 0 to 184320000
/dev/loop1

# btrfs device add /dev/loop1 /mnt/seed_workspace
Performing full device TRIM /dev/loop1 (87.89GiB) ...
[475344.936908] BTRFS error (device loop0 state ECS): device add not supported on cloned temp-fsid mount
ERROR: error adding device '/dev/loop1': Invalid argument



 > This is now a two device Btrfs file system. The seed device remains read-only. The second device receives all the writes/changes as a feature of COW. In effect it's a kind of overlay.
 > 
 > What I'm suggesting is if you now do:
 > 
 > btrfs device remove $device1 /mnt
 > 
 > This tells the btrfs kernel code to replicate the contents of device1 onto device2. This command will not complete until the replication is complete, which could take hours (no idea, depends on how full the file system is).
 > 
 > $device2 size needs to be at least as much as the used amount for $device1.
 > 
 >> There are docker subvolumes on it which it seems can be recreated by
 >> docker later on (I hope so), so rsync might be an option, but if the
 >> docker subvolumes need to be backed up, rsync is not able to back them up
 >> correctly.
 > 
 > I do not know to what degree the docker btrfs graph driver uses reflinks. Using rsync or cp might dramatically explode the amount of storage needed. It could be docker usage is part of the problem if the problem is at all related to bookend extents. I seem to recall that docker graph driver does result in substantial amounts of bookend extents.
 > 
 > 
 >> And btrfs restore gives me thousands of
 > 
 > Btrfs restore may not help. It's a scraping tool. It's designed to get data out at all costs, even permitting extraction of corrupt files. It won't prefer reflinks or snapshots.


 OK. Though if btrfs-restore finds incomplete zstd frames, might rsync
 also be unable to copy everything from the ro, half-converted btrfs
 partition?
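
 For reference, the read-only copy discussed in this thread (mount -o ro, then rsync -a, leaving out the docker subvolumes) could be sketched as below. This is only a sketch under assumptions: `copy_without_docker` is a hypothetical helper, the mount points are placeholders, and the exclude path assumes the docker subvolumes sit at lib/docker/btrfs relative to the root of the copied tree. rsync should exit with a non-zero status if some files cannot be read, so an incomplete copy would at least be visible.

```shell
#!/bin/sh
# copy_without_docker SRC DST: copy SRC into DST with rsync -a, skipping the
# docker btrfs subvolumes. Hypothetical helper; the exclude path is an
# assumption and must be adapted to the actual layout.
copy_without_docker() {
    src=$1
    dst=$2
    # The leading "/" anchors the pattern at the root of the transfer (SRC/).
    rsync -a --exclude='/lib/docker/btrfs/' "$src"/ "$dst"/
}

# Intended use (placeholder paths, run as root):
#   mount -o ro /dev/sdb3 /mnt/var_ro
#   copy_without_docker /mnt/var_ro /mnt/backup/var
#   echo "rsync exit status: $?"
```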


 >>
 >> Also should I expect this behavior to happen every time I do a btrfs
 >> filesystem conversion without checking I have enough metadata space
 >> available beforehand or will this be prevented by some new code one day?
 > 
 > This is always the hard part about file systems. Computers are ordinarily deterministic. But file systems are increasingly non-deterministic as they age. And once you hit the bug, the state of the file system has already changed making it important to stop making all changes to the file system in order to preserve that state for file system developers. The more we hammer on trying to fix it, it's like a crime scene being cleaned up or tampered with. It makes it impossible for the developers to understand what happened and therefore how to fix it so it doesn't happen again.

 Sure. Apart from the plain "mount /dev/sdb3 /var", I haven't made any rw
 mount attempts. I hope those attempts did not make the issue worse.


 Alban



