* Strange behavior with scrub, quotas, and snapshots
@ 2026-04-26 23:52 brainchild
2026-04-27 2:05 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-26 23:52 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 2671 bytes --]
Hello.
I am struggling with a poorly behaved BTRFS volume.
It is a simple volume, consuming only one partition, with no redundancy
for data. The underlying media, which is NVME, appears to be completely
healthy, according to SMART.
Several weeks ago, the volume became unwriteable, falsely reporting
that no space was available whenever the creation of a new file was
attempted.
After a chaotic intermix of balance and scrub operations, as well as
the deletion of many snapshots and a few large files, a message
eventually appeared in the kernel log reporting that corruption had
been discovered in the space cache and since resolved.
Afterward, the volume was usable again, with no problems writing files.
Incidentally, I later upgraded to space cache version 2.
Although the episode of complete dysfunction has ended, problems
remain. Two observations stand out in particular.
First, scrub operations complete almost immediately, reporting a status
of "finished", with no errors found.
However, as also seen in the following console capture, the reported
amount of total data scanned is only a fraction of the total used
space:
---
$ btrfs --version
btrfs-progs v6.6.3
$ btrfs fi show
Label: none uuid: bbac86e5-eaba-45bf-bbaa-c2494e11831a
Total devices 1 FS bytes used 671.85GiB
devid 1 size 831.26GiB used 831.26GiB path /dev/nvme0n1p5
$ sudo btrfs filesystem df /
Data, single: total=813.13GiB, used=662.49GiB
System, DUP: total=32.00MiB, used=144.00KiB
Metadata, DUP: total=9.03GiB, used=6.26GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
$ sudo btrfs scrub start -B /
Starting scrub on devid 1
scrub done for bbac86e5-eaba-45bf-bbaa-c2494e11831a
Scrub started: Sun Apr 26 19:24:02 2026
Status: finished
Duration: 0:00:11
Total to scrub: 23.37GiB
Rate: 2.12GiB/s
Error summary: no errors found
---
No errors are reported in the kernel log, only warnings about skipping
the swap file during scrub.
Second, within the logs generated for Timeshift, a concerning pattern
recurs, as in the attached example. Further, during the periods in
which logs such as the one attached are generated, the entire system
lags considerably. It is clear that the volume is not healthy.
I was using a recent 6.x kernel, I believe one of 6.18.x, when the
problem emerged. I upgraded to 7.0, finding no improvement in the
operation of the volume.
Also, I tried initiating the scrub through the most recent static build
of the user-space utility (i.e. btrfs-progs), with no improvement.
I would like some suggestions for restoring the volume to health, to
avoid the need to provision a new volume from scratch.
Thank you.
[-- Attachment #2: brainchild_timeshift_log.txt --]
[-- Type: text/plain, Size: 22799 bytes --]
[21:00:06] Removing snapshot: 2026-04-23_20-00-02
[21:00:06] Deleting subvolume: @ (Id:60338)
[21:00:06] btrfs subvolume delete --commit-after '/run/timeshift/105175/backup/timeshift-btrfs/snapshots/2026-04-23_20-00-02/@'
[21:00:06] Waiting on btrfs to finish deleting...
[21:04:55] Deleted subvolume: @ (Id:60338)
[21:04:55] Rescanning quotas...
[21:04:55] Destroying qgroup: 0/60338
[21:04:55] btrfs qgroup destroy 0/60338 '/run/timeshift/105175/backup'
[21:04:55] E: Failed to destroy qgroup: '0/60338'
[21:04:55] E: Failed to remove snapshot: 2026-04-23_20-00-02
[21:04:55] ------------------------------------------------------------------------------
[21:04:55] ------------------------------------------------------------------------------
[21:04:55] Removing snapshot: 2026-04-23_21-58-52
[21:04:55] Deleting subvolume: @ (Id:60350)
[21:04:55] btrfs subvolume delete --commit-after '/run/timeshift/105175/backup/timeshift-btrfs/snapshots/2026-04-23_21-58-52/@'
[21:04:55] Waiting on btrfs to finish deleting...
[21:08:45] Deleted subvolume: @ (Id:60350)
[21:08:45] Rescanning quotas...
[21:08:45] Still rescanning quotas... ERROR: quota rescan failed: Operation now in progress
[21:08:46] Still rescanning quotas... ERROR: quota rescan failed: Operation now in progress
[... "Still rescanning quotas... ERROR: quota rescan failed: Operation now in progress" repeated every second until 21:12:37 ...]
[21:12:38] Destroying qgroup: 0/60350
[21:12:38] btrfs qgroup destroy 0/60350 '/run/timeshift/105175/backup'
[21:12:38] E: Failed to destroy qgroup: '0/60350'
[21:12:38] E: Failed to remove snapshot: 2026-04-23_21-58-52
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-26 23:52 Strange behavior with scrub, quotas, and snapshots brainchild
@ 2026-04-27 2:05 ` Qu Wenruo
2026-04-27 20:32 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-27 2:05 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/27 09:22, brainchild@mailbox.org wrote:
> Hello.
>
> I am struggling with a poorly behaved BTRFS volume.
>
[...]
> ---
>
> No errors are reported in the kernel log, only warnings about skipping
> the swap file during scrub.
If you assume the fs has some corruption, none of the above is really
useful.
A full "btrfs check" is strongly recommended.
>
> Second, within the logs generated for Timeshift, a concerning pattern
> recurs, as in the attached example. Further, during the periods in
> which logs such as the one attached are generated, the entire system
> lags considerably. It is clear that the volume is not healthy.
The lag is mostly caused by qgroup.
You have a lot of snapshots (shown by the super large snapshot ids);
every time a large snapshot/subvolume is deleted, btrfs will try to
disable qgroup to avoid such lag, but if whatever script/tool decides
to rescan qgroup while the snapshot/subvolume deletion is still
ongoing, the lag will be re-introduced.
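If that is what is happening here, the workaround on the tooling side
is to wait for the deletion to be fully committed before touching
quotas. A minimal sketch using stock btrfs-progs commands (the snapshot
path is taken from the attached log; the backup mount point is whatever
the tool uses at the time):

```shell
# Delete the snapshot and commit the transaction before returning:
sudo btrfs subvolume delete --commit-after \
    '/run/timeshift/105175/backup/timeshift-btrfs/snapshots/2026-04-23_20-00-02/@'
# Block until deleted subvolumes are actually cleaned from disk:
sudo btrfs subvolume sync /run/timeshift/105175/backup
# Only then rescan quotas, waiting (-w) for the rescan to finish:
sudo btrfs quota rescan -w /run/timeshift/105175/backup
```

This is a sketch of the ordering, not of Timeshift's internals; the
point is that "quota rescan" returns EINPROGRESS while cleanup is
still running, so the sync step must come first.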
>
> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
> problem emerged. I upgraded to 7.0, finding no improvement in the
> operation of the volume.
>
> Also, I tried initiating the scrub through the most recent static build
> of the user-space utility (i.e. btrfs-progs), with no improvement.
>
> I would like some suggestions for restoring the volume to health, to
> avoid the need to provision a new volume from scratch.
Run "btrfs check" first; if it finds no errors, disable qgroup if you
have frequent snapshot creation/deletion.
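As a concrete sketch of that sequence (the device path is taken from
the "btrfs fi show" output earlier in the thread; "btrfs check" must
run against an unmounted filesystem, e.g. from a live environment):

```shell
# Read-only check first; never start with --repair.
sudo btrfs check --readonly /dev/nvme0n1p5
# If the check is clean and snapshots churn frequently, drop qgroup
# accounting on the mounted filesystem:
sudo btrfs quota disable /
```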
Thanks,
Qu
>
> Thank you.
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-27 2:05 ` Qu Wenruo
@ 2026-04-27 20:32 ` brainchild
2026-04-27 22:10 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-27 20:32 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 2009 bytes --]
I have run the check command, which reported a variety of errors.
The output is attached.
Are any recommendations available to attempt restoring the volume?
Thanks.
On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
<quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
>> Hello.
>>
>> I am struggling with a poorly behaved BTRFS volume.
>>
> [...]
>> ---
>>
>> No errors are reported in the kernel log, only warnings about
>> skipping the swap file during scrub.
>
> If you assume the fs has some corruption, none of the above is really
> useful.
> A full "btrfs check" is strongly recommended.
>
>>
>> Second, within the logs generated for Timeshift, a concerning
>> pattern recurs, as in the attached example. Further, during the
>> periods in which logs such as the one attached are generated, the
>> entire system lags considerably. It is clear that the volume is not
>> healthy.
>
> The lag is mostly caused by qgroup.
> You have a lot of snapshots (shown by the super large snapshot id),
> every time a large snapshot/subvolume is deleted, btrfs will try to
> disable qgroup to avoid such lag, but if whatever script/tool decides
> to rescan qgroup while the snapshot/subvolume deletion is still
> ongoing, the lag will be re-introduced.
>
>>
>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
>> problem emerged. I upgraded to 7.0, finding no improvement in the
>> operation of the volume.
>>
>> Also, I tried initiating the scrub through the most recent static
>> build of the user-space utility (i.e. btrfs-progs), with no
>> improvement.
>>
>> I would like some suggestions for restoring the volume to health, to
>> avoid the need to provision a new volume from scratch.
>
> "btrfs check" first, if no error, disable qgroup if you have frequent
> snapshot creation/deletion.
>
> Thanks,
> Qu
>
>>
>> Thank you.
>>
>
[-- Attachment #2: brainchild_btrfs-check_log.txt --]
[-- Type: text/plain, Size: 1511 bytes --]
Opening filesystem to check...
Checking filesystem on /dev/nvme0n1p5
UUID: bbac86e5-eaba-45bf-bbaa-c2494e11831a
[1/8] checking log skipped (none written)
[2/8] checking root items
[3/8] checking extents
super bytes used 743892983808 mismatches actual used 740558856192
ERROR: errors found in extent allocation tree or chunk allocation
[4/8] checking free space tree
[5/8] checking fs roots
root 6855 inode 16523863 errors 400, nbytes wrong
root 58570 inode 16523863 errors 400, nbytes wrong
root 59486 inode 16523863 errors 400, nbytes wrong
root 60333 inode 16523863 errors 400, nbytes wrong
root 60367 inode 16523863 errors 400, nbytes wrong
root 60377 inode 16523863 errors 400, nbytes wrong
root 60383 inode 16523863 errors 400, nbytes wrong
root 60475 inode 16523863 errors 400, nbytes wrong
root 60713 inode 16523863 errors 400, nbytes wrong
root 61351 inode 16523863 errors 400, nbytes wrong
root 61421 inode 16523863 errors 400, nbytes wrong
root 61423 inode 16523863 errors 400, nbytes wrong
root 61425 inode 16523863 errors 400, nbytes wrong
root 61427 inode 16523863 errors 400, nbytes wrong
root 61429 inode 16523863 errors 400, nbytes wrong
root 61431 inode 16523863 errors 400, nbytes wrong
ERROR: errors found in fs roots
found 740558647296 bytes used, error(s) found
total csum bytes: 473596236
total tree bytes: 6834847744
total fs tree bytes: 5660327936
total extent tree bytes: 529825792
btree space waste bytes: 1514685920
file data blocks allocated: 11027378925568
referenced 886704037888
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-27 20:32 ` brainchild
@ 2026-04-27 22:10 ` Qu Wenruo
[not found] ` <SNC6ET.5NSSU3PO7MKD2@mailbox.org>
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-27 22:10 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 06:02, brainchild@mailbox.org wrote:
> I have run the check command, which reported a variety of errors.
> The output is attached.
>
> Are any recommendations available to attempt restoring the volume?
The super block bytes mismatch is a minor one, which shouldn't affect
normal operations.
But still you can use "btrfs rescue fix-device-size" to fix the problem.
The nbytes wrong can be minor too, but it's affecting all snapshots
containing the inode 16523863.
You can either fix it by copying the file to another location (which can
be inside the same btrfs), removing the original file, and moving the new
copy back. This will need to be done for every snapshot.
Or you can try "btrfs check --repair", which will do an in-place fix,
but will still break the shared blocks of every snapshot.
Overall, I'd strongly recommend removing all unused snapshots first
before attempting either fix.
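The per-snapshot "copy out, remove, move back" fix described above can be sketched as a dry run. The file path and snapshot locations below are placeholders, not taken from this thread; the real path behind inode 16523863 can be resolved with "btrfs inspect-internal inode-resolve". The script only prints the commands it would run:

```shell
#!/bin/sh
# Hypothetical sketch of the per-snapshot nbytes fix. Prints the commands
# instead of executing them; paths are assumptions.
plan_nbytes_fix() {
    # $1 = damaged file's path relative to each snapshot root,
    # remaining arguments = snapshot roots to fix
    file=$1; shift
    for snap in "$@"; do
        # --reflink=never forces fresh extents, so the copy gets correct nbytes
        echo "cp --reflink=never $snap/$file $snap/$file.fixed"
        echo "rm $snap/$file"
        echo "mv $snap/$file.fixed $snap/$file"
    done
}

plan_nbytes_fix "home/user/big.file" /snapshots/2026-04-01 /snapshots/2026-04-02
```

Read-only snapshots would need to be made writable (or replaced) before the real commands can touch them.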
>
> Thanks.
>
> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
> <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
>>> Hello.
>>>
>>> I am struggling with a poorly behaved BTRFS volume.
>>>
>> [...]
>>> ---
>>>
>>> No errors are reported in the kernel log, only warnings about
>>> skipping the swap file during scrub.
>>
>> If you assume the fs has some corruption, none of the above is really
>> useful.
>> A full "btrfs check" is strongly recommended.
>>
>>>
>>> Second, within the logs generated for Timeshift, a concerning pattern
>>> recurs, as in the attached example. Further, during the periods in
>>> which such logs are generated, the entire system
>>> lags considerably. It is clear that the volume is not healthy.
>>
>> The lag is mostly caused by qgroup.
>> You have a lot of snapshots (shown by the super large snapshot id),
>> every time a large snapshot/subvolume is deleted, btrfs will try to
>> disable qgroup to avoid such lag, but if whatever script/tool decides
>> to rescan qgroup when the snapshot/subvolume deletion is still
>> ongoing, the lag will be re-introduced.
>>
>>>
>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
>>> problem emerged. I upgraded to 7.0, finding no improvement in the
>>> operation of the volume.
>>>
>>> Also, I tried initiating the scrub through the most recent static
>>> build of the user-space utility (i.e. btrfs-progs), with no
>>> improvement.
>>>
>>> I would like some suggestions for restoring the volume to health, to
>>> avoid the need to provision a new volume from scratch.
>>
>> "btrfs check" first, if no error, disable qgroup if you have frequent
>> snapshot creation/deletion.
So your fsck is mostly fine; the lag is very likely
caused by qgroup.
If you do not need it, just disable it for good.
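As a reference for "disable it for good", a dry-run sketch (not part of the original mail); the mount point is an assumption:

```shell
#!/bin/sh
# Prints, rather than runs, the commands to verify and then permanently
# disable qgroup accounting on a btrfs mount.
plan_qgroup_disable() {
    mnt=$1
    echo "btrfs qgroup show $mnt    # confirm quotas are actually enabled"
    echo "btrfs quota disable $mnt  # stop qgroup accounting for good"
}

plan_qgroup_disable /
```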
Thanks,
Qu
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thank you.
>>>
>>
>
* Re: Strange behavior with scrub, quotas, and snapshots
[not found] ` <SNC6ET.5NSSU3PO7MKD2@mailbox.org>
@ 2026-04-27 22:58 ` Qu Wenruo
2026-04-28 0:22 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-27 22:58 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 08:17, brainchild@mailbox.org wrote:
> Currently, scrub operations are not completing properly, so I certainly
> think it is important to try to repair the volume.
>
> Which error do you expect is related to the particular problem,
> concerning scrub?
Can you provide the full "btrfs scrub start -BR" output?
>
> Is there any evidence that data may have been lost,
Not yet.
> or concern that
> 'fix-device-size' could cause further loss?
That should be mostly safe, but if you're concerned about it, please
update to a newer version of btrfs-progs.
Ubuntu is pretty bad at backporting fixes for btrfs-progs.
> Should the operation be done
> while the device is mounted, or only while not mounted?
Must be unmounted for btrfs check and btrfs rescue.
Thanks,
Qu
>
> Thanks.
>
> On Tue, Apr 28 2026 at 07:40:31 AM +09:30:00, Qu Wenruo <wqu@suse.com>
> wrote:
>>
>>
>> On 2026/4/28 06:02, brainchild@mailbox.org wrote:
>>> I have run the check command, which reported a variety of errors.
>>> The output is attached.
>>>
>>> Are any recommendations available to attempt restoring the volume?
>>
>> The super block bytes mismatch is a minor one, which shouldn't affect
>> normal operations.
>>
>> But still you can use "btrfs rescue fix-device-size" to fix the problem.
>>
>>
>> The nbytes wrong can be minor too, but it's affecting all snapshots
>> containing the inode 16523863.
>>
>> You can either fix it by copying the inode to another location (can be
>> inside the same btrfs), remove the original file, mv back the new copy.
>> This will need to be done for every snapshot.
>>
>> Or you can try "btrfs check --repair", which will do an in-place fix,
>> but will still break the shared blocks of every snapshot.
>>
>> Overall, I'd strongly recommend to remove all unused snapshots first
>> before doing either fix.
>>
>>>
>>> Thanks.
>>>
>>> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
>>> \x7f<quwenruo.btrfs@gmx.com> wrote:
>>>>
>>>>
>>>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
>>>>> Hello.
>>>>>
>>>>> I am struggling with a poorly behaved BTRFS volume.
>>>>>
>>>> [...]
>>>>> ---
>>>>>
>>>>> No errors are reported in the kernel log, only warnings about
>>>>> skipping the swap file during scrub.
>>>>
>>>> If you assume the fs has some corruption, none of the above is
>>>> really useful.
>>>> A full "btrfs check" is strongly recommended.
>>>>
>>>>>
>>>>> Second, within the logs generated for Timeshift, a concerning
>>>>> pattern recurs, as in the attached example. Further, during the
>>>>> periods in which such logs are generated,
>>>>> the entire system lags considerably. It is clear that the
>>>>> volume is not healthy.
>>>>
>>>> The lag is mostly caused by qgroup.
>>>> You have a lot of snapshots (shown by the super large snapshot id),
>>>> every time a large snapshot/subvolume is deleted, btrfs will try
>>>> to disable qgroup to avoid such lag, but if whatever script/tool
>>>> decides to rescan qgroup when the snapshot/subvolume deletion is
>>>> still ongoing, the lag will be re-introduced.
>>>>
>>>>>
>>>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
>>>>> problem emerged. I upgraded to 7.0, finding no improvement
>>>>> in the operation of the volume.
>>>>>
>>>>> Also, I tried initiating the scrub through the most recent static
>>>>> build of the user-space utility (i.e. btrfs-progs), with no
>>>>> improvement.
>>>>>
>>>>> I would like some suggestions for restoring the volume to health,
>>>>> to avoid the need to provision a new volume from scratch.
>>>>
>>>> "btrfs check" first, if no error, disable qgroup if you have
>>>> frequent snapshot creation/deletion.
>>
>> So your fsck is mostly fine, the lag part is highly possible to be
>> caused by qgroup.
>>
>> If you do not need it, just disable it for good.
>>
>> Thanks,
>> Qu
>>
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>
>>>>> Thank you.
>>>>>
>>>>
>>>
>>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-27 22:58 ` Qu Wenruo
@ 2026-04-28 0:22 ` brainchild
2026-04-28 1:16 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 0:22 UTC (permalink / raw)
To: linux-btrfs
[-- Attachment #1: Type: text/plain, Size: 4621 bytes --]
Just as I was searching for the filename corresponding to the particular inode, the volume again began insisting that no free space was available.
I was able to capture the log for a new scrub, as requested (started after the volume again became unable to create new files).
Thanks.
"Qu Wenruo" wqu@suse.com – April 27, 2026 6:58 PM
> On 2026/4/28 08:17, brainchild@mailbox.org wrote:
> > Currently, scrub operations are not completing properly, so I certainly
> > think it is important to try to repair the volume.
> >
> > Which error do you expect is related to the particular problem,
> > concerning scrub?
>
> Can you provide the full "btrfs scrub start -BR" output?
>
> >
> > Is there any evidence that data may have been lost,
>
> Not yet.
>
> > or concern that
> > 'fix-device-size' could cause further loss?
>
> That should be mostly safe, but if you're concerned about it, please
> update to a newer version of btrfs-progs.
>
> Ubuntu is pretty bad at backporting fixes for btrfs-progs.
>
> > Should the operation be done
> > while the device is mounted, or only while not mounted?
>
> Must be unmounted for btrfs check and btrfs rescue.
>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> > On Tue, Apr 28 2026 at 07:40:31 AM +09:30:00, Qu Wenruo <wqu@suse.com>
> > wrote:
> >>
> >>
> >> On 2026/4/28 06:02, brainchild@mailbox.org wrote:
> >>> I have run the check command, which reported a variety of errors.
> >>> The output is attached.
> >>>
> >>> Are any recommendations available to attempt restoring the volume?
> >>
> >> The super block bytes mismatch is a minor one, which shouldn't affect
> >> normal operations.
> >>
> >> But still you can use "btrfs rescue fix-device-size" to fix the problem.
> >>
> >>
> >> The nbytes wrong can be minor too, but it's affecting all snapshots
> >> containing the inode 16523863.
> >>
> >> You can either fix it by copying the inode to another location (can be
> >> inside the same btrfs), remove the original file, mv back the new copy.
> >> This will need to be done for every snapshot.
> >>
> >> Or you can try "btrfs check --repair", which will do an in-place fix,
> >> but will still break the shared blocks of every snapshot.
> >>
> >> Overall, I'd strongly recommend to remove all unused snapshots first
> >> before doing either fix.
> >>
> >>>
> >>> Thanks.
> >>>
> >>> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
> >>> <quwenruo.btrfs@gmx.com> wrote:
> >>>>
> >>>>
> >>>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
> >>>>> Hello.
> >>>>>
> >>>>> I am struggling with a poorly behaved BTRFS volume.
> >>>>>
> >>>> [...]
> >>>>> ---
> >>>>>
> >>>>> No errors are reported in the kernel log, only warnings about
> >>>>> skipping the swap file during scrub.
> >>>>
> >>>> If you assume the fs has some corruption, none of the above is
> >>>> really useful.
> >>>> A full "btrfs check" is strongly recommended.
> >>>>
> >>>>>
> >>>>> Second, within the logs generated for Timeshift, a concerning
> >>>>> pattern recurs, as in the attached example. Further, during the
> >>>>> periods in which are generated logs such as the one attached,
> >>>>> the entire system lags considerably. It is clear that the
> >>>>> volume is not healthy.
> >>>>
> >>>> The lag is mostly caused by qgroup.
> >>>> You have a lot of snapshots (shown by the super large snapshot id),
> >>>> every time a large snapshot/subvolume is deleted, btrfs will try
> >>>> to disable qgroup to avoid such lag, but if whatever script/tool
> >>>> decides to rescan qgroup when the snapshot/subvolume deletion is
> >>>> still ongoing, the lag will be re-introduced.
> >>>>
> >>>>>
> >>>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
> >>>>> problem emerged. I upgraded to 7.0, finding no improvement
> >>>>> in the operation of the volume.
> >>>>>
> >>>>> Also, I tried initiating the scrub through the most recent static
> >>>>> build of the user-space utility (i.e. btrfs-progs), with no
> >>>>> improvement.
> >>>>>
> >>>>> I would like some suggestions for restoring the volume to health,
> >>>>> to avoid the need to provision a new volume from scratch.
> >>>>
> >>>> "btrfs check" first, if no error, disable qgroup if you have
> >>>> frequent snapshot creation/deletion.
> >>
> >> So your fsck is mostly fine, the lag part is highly possible to be
> >> caused by qgroup.
> >>
> >> If you do not need it, just disable it for good.
> >>
> >> Thanks,
> >> Qu
> >>
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>>
> >>>>> Thank you.
> >>>>>
> >>>>
> >>>
> >>
> >
> >
>
>
[-- Attachment #2: brainchild_scub_log.txt --]
[-- Type: text/plain, Size: 799 bytes --]
WARNING: failed to open the progress status socket at /var/lib/btrfs/scrub.progress.bbac86e5-eaba-45bf-bbaa-c2494e11831a: No space left on device. Progress cannot be queried
WARNING: failed to write the progress status file: No space left on device. Status recording disabled
Starting scrub on devid 1
scrub done for bbac86e5-eaba-45bf-bbaa-c2494e11831a
Scrub started: Mon Apr 27 19:25:47 2026
Status: finished
Duration: 0:03:34
data_extents_scrubbed: 174742
tree_extents_scrubbed: 1117169
data_bytes_scrubbed: 11444801536
tree_bytes_scrubbed: 18303696896
read_errors: 0
csum_errors: 0
verify_errors: 0
no_csum: 1621176
csum_discards: 0
super_errors: 0
malloc_errors: 0
uncorrectable_errors: 0
unverified_errors: 0
corrected_errors: 0
last_physical: 844727582720
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 0:22 ` brainchild
@ 2026-04-28 1:16 ` Qu Wenruo
2026-04-28 1:21 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 1:16 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 09:52, brainchild wrote:
> Just as I was searching for the filename corresponding to the particular inode, the volume again began insisting that no free space was available.
>
> I was able to capture the log for a new scrub, as requested (started after the volume again became unable to create new files).
Only ~10GiB of data was scrubbed, while your data should be 600GiB+.
You mentioned that there is a swap file, how large is that file, and
have you tried to disable the swap file before scrubbing?
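The swap test suggested above can be sketched as a dry run; the swap file path below is an assumption (the real one is listed in /proc/swaps):

```shell
#!/bin/sh
# Prints, rather than runs, the sequence for scrubbing with the swap
# file deactivated so its extents no longer block the scrub.
plan_scrub_without_swap() {
    swapfile=$1 mnt=$2
    echo "swapoff $swapfile          # release the swap extents"
    echo "btrfs scrub start -B $mnt  # scrub can now reach all data"
    echo "swapon $swapfile           # re-enable swap afterwards"
}

plan_scrub_without_swap /swapfile /
```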
Thanks,
Qu
>
> Thanks.
>
> "Qu Wenruo" wqu@suse.com – April 27, 2026 6:58 PM
>> On 2026/4/28 08:17, brainchild@mailbox.org wrote:
>>> Currently, scrub operations are not completing properly, so I certainly
>>> think it is important to try to repair the volume.
>>>
>>> Which error do you expect is related to the particular problem,
>>> concerning scrub?
>>
>> Can you provide the full "btrfs scrub start -BR" output?
>>
>>>
>>> Is there any evidence that data may have been lost,
>>
>> Not yet.
>>
>>> or concern that
>>> 'fix-device-size' could cause further loss?
>>
>> That should be mostly safe, but if you're concerned about it, please
>> update to a newer version of btrfs-progs.
>>
>> Ubuntu is pretty bad at backporting fixes for btrfs-progs.
>>
>>> Should the operation be done
>>> while the device is mounted, or only while not mounted?
>>
>> Must be unmounted for btrfs check and btrfs rescue.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Thanks.
>>>
>>> On Tue, Apr 28 2026 at 07:40:31 AM +09:30:00, Qu Wenruo <wqu@suse.com>
>>> wrote:
>>>>
>>>>
>>>> On 2026/4/28 06:02, brainchild@mailbox.org wrote:
>>>>> I have run the check command, which reported a variety of errors.
>>>>> The output is attached.
>>>>>
>>>>> Are any recommendations available to attempt restoring the volume?
>>>>
>>>> The super block bytes mismatch is a minor one, which shouldn't affect
>>>> normal operations.
>>>>
>>>> But still you can use "btrfs rescue fix-device-size" to fix the problem.
>>>>
>>>>
>>>> The nbytes wrong can be minor too, but it's affecting all snapshots
>>>> containing the inode 16523863.
>>>>
>>>> You can either fix it by copying the inode to another location (can be
>>>> inside the same btrfs), remove the original file, mv back the new copy.
>>>> This will need to be done for every snapshot.
>>>>
>>>> Or you can try "btrfs check --repair", which will do an in-place fix,
>>>> but will still break the shared blocks of every snapshot.
>>>>
>>>> Overall, I'd strongly recommend to remove all unused snapshots first
>>>> before doing either fix.
>>>>
>>>>>
>>>>> Thanks.
>>>>>
>>>>> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
>>>>> <quwenruo.btrfs@gmx.com> wrote:
>>>>>>
>>>>>>
>>>>>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
>>>>>>> Hello.
>>>>>>>
>>>>>>> I am struggling with a poorly behaved BTRFS volume.
>>>>>>>
>>>>>> [...]
>>>>>>> ---
>>>>>>>
>>>>>>> No errors are reported in the kernel log, only warnings about
>>>>>>> skipping the swap file during scrub.
>>>>>>
>>>>>> If you assume the fs has some corruption, none of the above is
>>>>>> really useful.
>>>>>> A full "btrfs check" is strongly recommended.
>>>>>>
>>>>>>>
>>>>>>> Second, within the logs generated for Timeshift, a concerning
>>>>>>> pattern recurs, as in the attached example. Further, during the
>>>>>>> periods in which are generated logs such as the one attached,
>>>>>>> the entire system lags considerably. It is clear that the
>>>>>>> volume is not healthy.
>>>>>>
>>>>>> The lag is mostly caused by qgroup.
>>>>>> You have a lot of snapshots (shown by the super large snapshot id),
>>>>>> every time a large snapshot/subvolume is deleted, btrfs will try
>>>>>> to disable qgroup to avoid such lag, but if whatever script/tool
> >>>>>> decides to rescan qgroup when the snapshot/subvolume deletion is
> >>>>>> still ongoing, the lag will be re-introduced.
>>>>>>
>>>>>>>
>>>>>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
> >>>>>>> problem emerged. I upgraded to 7.0, finding no improvement
>>>>>>> in the operation of the volume.
>>>>>>>
>>>>>>> Also, I tried initiating the scrub through the most recent static
>>>>>>> build of the user-space utility (i.e. btrfs-progs), with no
>>>>>>> improvement.
>>>>>>>
>>>>>>> I would like some suggestions for restoring the volume to health,
>>>>>>> to avoid the need to provision a new volume from scratch.
>>>>>>
>>>>>> "btrfs check" first, if no error, disable qgroup if you have
>>>>>> frequent snapshot creation/deletion.
>>>>
>>>> So your fsck is mostly fine, the lag part is highly possible to be
>>>> caused by qgroup.
>>>>
>>>> If you do not need it, just disable it for good.
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>
>>>>>>>
>>>>>>> Thank you.
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 1:16 ` Qu Wenruo
@ 2026-04-28 1:21 ` brainchild
2026-04-28 2:33 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 1:21 UTC (permalink / raw)
To: linux-btrfs
I have not tried scrubbing without the swap file being in use.
I could try, but as the file is only 32GB, I would be very surprised if it made any difference.
"Qu Wenruo" wqu@suse.com – April 27, 2026 9:16 PM
> On 2026/4/28 09:52, brainchild wrote:
> > Just as I was scanning for the name of the particular inode, the volume reverted to again insist on being without any free space.
> >
> > I was able to extract the log for a new scrub, as requested (begun after the switch to being unable to create new files).
>
> Only 10GiB of data is scrubbed, meanwhile your data should be 600GiB+.
>
> You mentioned that there is a swap file, how large is that file, and
> have you tried to disable the swap file before scrubbing?
>
> Thanks,
> Qu
>
> >
> > Thanks.
> >
> > "Qu Wenruo" wqu@suse.com – April 27, 2026 6:58 PM
> >> On 2026/4/28 08:17, brainchild@mailbox.org wrote:
> >>> Currently, scrub operations are not completing properly, so I certainly
> >>> think it is important to try to repair the volume.
> >>>
> >>> Which error do you expect is related to the particular problem,
> >>> concerning scrub?
> >>
> >> Can you provide the full "btrfs scrub start -BR" output?
> >>
> >>>
> >>> Is there any evidence that data may have been lost,
> >>
> >> Not yet.
> >>
> >>> or concern that
> >>> 'fix-device-size' could cause further loss?
> >>
> >> That should be mostly safe, but if you're concerned about it, please
> >> update to a newer version of btrfs-progs.
> >>
> >> Ubuntu is pretty bad at backporting fixes for btrfs-progs.
> >>
> >>> Should the operation be done
> >>> while the device is mounted, or only while not mounted?
> >>
> >> Must be unmounted for btrfs check and btrfs rescue.
> >>
> >> Thanks,
> >> Qu
> >>
> >>>
> >>> Thanks.
> >>>
> >>> On Tue, Apr 28 2026 at 07:40:31 AM +09:30:00, Qu Wenruo <wqu@suse.com>
> >>> wrote:
> >>>>
> >>>>
> >>>> On 2026/4/28 06:02, brainchild@mailbox.org wrote:
> >>>>> I have run the check command, which reported a variety of errors.
> >>>>> The output is attached.
> >>>>>
> >>>>> Are any recommendations available to attempt restoring the volume?
> >>>>
> >>>> The super block bytes mismatch is a minor one, which shouldn't affect
> >>>> normal operations.
> >>>>
> >>>> But still you can use "btrfs rescue fix-device-size" to fix the problem.
> >>>>
> >>>>
> >>>> The nbytes wrong can be minor too, but it's affecting all snapshots
> >>>> containing the inode 16523863.
> >>>>
> >>>> You can either fix it by copying the inode to another location (can be
> >>>> inside the same btrfs), remove the original file, mv back the new copy.
> >>>> This will need to be done for every snapshot.
> >>>>
> >>>> Or you can try "btrfs check --repair", which will do an in-place fix,
> >>>> but will still break the shared blocks of every snapshot.
> >>>>
> >>>> Overall, I'd strongly recommend to remove all unused snapshots first
> >>>> before doing either fix.
> >>>>
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
> >>>>> <quwenruo.btrfs@gmx.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
> >>>>>>> Hello.
> >>>>>>>
> >>>>>>> I am struggling with a poorly behaved BTRFS volume.
> >>>>>>>
> >>>>>> [...]
> >>>>>>> ---
> >>>>>>>
> >>>>>>> No errors are reported in the kernel log, only warnings about
> >>>>>>> skipping the swap file during scrub.
> >>>>>>
> >>>>>> If you assume the fs has some corruption, none of the above is
> >>>>>> really useful.
> >>>>>> A full "btrfs check" is strongly recommended.
> >>>>>>
> >>>>>>>
> >>>>>>> Second, within the logs generated for Timeshift, a concerning
> >>>>>>> pattern recurs, as in the attached example. Further, during the
> >>>>>>> periods in which are generated logs such as the one attached,
> >>>>>>> the entire system lags considerably. It is clear that the
> >>>>>>> volume is not healthy.
> >>>>>>
> >>>>>> The lag is mostly caused by qgroup.
> >>>>>> You have a lot of snapshots (shown by the super large snapshot id),
> >>>>>> every time a large snapshot/subvolume is deleted, btrfs will try
> >>>>>> to disable qgroup to avoid such lag, but if whatever script/tool
> >>>>>> decides to rescan qgroup when the snapshot/subvolume deletion is
> >>>>>> still ongoing, the lag will be re-introduced.
> >>>>>>
> >>>>>>>
> >>>>>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
> >>>>>>> problem emerged. I upgraded to 7.0, finding no improvement
> >>>>>>> in the operation of the volume.
> >>>>>>>
> >>>>>>> Also, I tried initiating the scrub through the most recent static
> >>>>>>> build of the user-space utility (i.e. btrfs-progs), with no
> >>>>>>> improvement.
> >>>>>>>
> >>>>>>> I would like some suggestions for restoring the volume to health,
> >>>>>>> to avoid the need to provision a new volume from scratch.
> >>>>>>
> >>>>>> "btrfs check" first, if no error, disable qgroup if you have
> >>>>>> frequent snapshot creation/deletion.
> >>>>
> >>>> So your fsck is mostly fine, the lag part is highly possible to be
> >>>> caused by qgroup.
> >>>>
> >>>> If you do not need it, just disable it for good.
> >>>>
> >>>> Thanks,
> >>>> Qu
> >>>>
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Qu
> >>>>>>
> >>>>>>>
> >>>>>>> Thank you.
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 1:21 ` brainchild
@ 2026-04-28 2:33 ` brainchild
2026-04-28 3:13 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 2:33 UTC (permalink / raw)
To: linux-btrfs
With swap disabled, the scrub operation seems to reach completion, with no errors found.
However, the check operation still discovers the same errors.
The output of `rescue fix-device-size' is "No device size related problem found".
Afterward, check still reports the same errors, including the super mismatch.
"brainchild" brainchild@mailbox.org – April 28, 2026 1:21 AM
> I have not tried scrubbing without the swap file being in use.
>
> I could try, but as the file is only 32Gb, I would be very surprised if it made any difference.
>
> "Qu Wenruo" wqu@suse.com – April 27, 2026 9:16 PM
> > On 2026/4/28 09:52, brainchild wrote:
> > > Just as I was scanning for the name of the particular inode, the volume reverted to again insist on being without any free space.
> > >
> > > I was able to extract the log for a new scrub, as requested (begun after the switch to being unable to create new files).
> >
> > Only 10GiB of data is scrubbed, meanwhile your data should be 600GiB+.
> >
> > You mentioned that there is a swap file, how large is that file, and
> > have you tried to disable the swap file before scrubbing?
> >
> > Thanks,
> > Qu
> >
> > >
> > > Thanks.
> > >
> > > "Qu Wenruo" wqu@suse.com – April 27, 2026 6:58 PM
> > >> On 2026/4/28 08:17, brainchild@mailbox.org wrote:
> > >>> Currently, scrub operations are not completing properly, so I certainly
> > >>> think it is important to try to repair the volume.
> > >>>
> > >>> Which error do you expect is related to the particular problem,
> > >>> concerning scrub?
> > >>
> > >> Can you provide the full "btrfs scrub start -BR" output?
> > >>
> > >>>
> > >>> Is there any evidence that data may have been lost,
> > >>
> > >> Not yet.
> > >>
> > >>> or concern that
> > >>> 'fix-device-size' could cause further loss?
> > >>
> > >> That should be mostly safe, but if you're concerned about it, please
> > >> update to a newer version of btrfs-progs.
> > >>
> > >> Ubuntu is pretty bad at backporting fixes for btrfs-progs.
> > >>
> > >>> Should the operation be done
> > >>> while the device is mounted, or only while not mounted?
> > >>
> > >> Must be unmounted for btrfs check and btrfs rescue.
> > >>
> > >> Thanks,
> > >> Qu
> > >>
> > >>>
> > >>> Thanks.
> > >>>
> > >>> On Tue, Apr 28 2026 at 07:40:31 AM +09:30:00, Qu Wenruo <wqu@suse.com>
> > >>> wrote:
> > >>>>
> > >>>>
> > >>>> On 2026/4/28 06:02, brainchild@mailbox.org wrote:
> > >>>>> I have run the check command, which reported a variety of errors.
> > >>>>> The output is attached.
> > >>>>>
> > >>>>> Are any recommendations available to attempt restoring the volume?
> > >>>>
> > >>>> The super block bytes mismatch is a minor one, which shouldn't affect
> > >>>> normal operations.
> > >>>>
> > >>>> But still you can use "btrfs rescue fix-device-size" to fix the problem.
> > >>>>
> > >>>>
> > >>>> The nbytes wrong can be minor too, but it's affecting all snapshots
> > >>>> containing the inode 16523863.
> > >>>>
> > >>>> You can either fix it by copying the inode to another location (can be
> > >>>> inside the same btrfs), remove the original file, mv back the new copy.
> > >>>> This will need to be done for every snapshot.
> > >>>>
> > >>>> Or you can try "btrfs check --repair", which will do an in-place fix,
> > >>>> but will still break the shared blocks of every snapshot.
> > >>>>
> > >>>> Overall, I'd strongly recommend to remove all unused snapshots first
> > >>>> before doing either fix.
> > >>>>
> > >>>>>
> > >>>>> Thanks.
> > >>>>>
> > >>>>> On Mon, Apr 27 2026 at 11:35:28 AM +09:30:00, Qu Wenruo
> > >>>>> <quwenruo.btrfs@gmx.com> wrote:
> > >>>>>>
> > >>>>>>
> > >>>>>> On 2026/4/27 09:22, brainchild@mailbox.org wrote:
> > >>>>>>> Hello.
> > >>>>>>>
> > >>>>>>> I am struggling with a poorly behaved BTRFS volume.
> > >>>>>>>
> > >>>>>> [...]
> > >>>>>>> ---
> > >>>>>>>
> > >>>>>>> No errors are reported in the kernel log, only warnings about
> > >>>>>>> skipping the swap file during scrub.
> > >>>>>>
> > >>>>>> If you assume the fs has some corruption, none of the above is
> > >>>>>> really useful.
> > >>>>>> A full "btrfs check" is strongly recommended.
> > >>>>>>
> > >>>>>>>
> > >>>>>>> Second, within the logs generated for Timeshift, a concerning
> > >>>>>>> pattern recurs, as in the attached example. Further, during the
> > >>>>>>> periods in which are generated logs such as the one attached,
> > >>>>>>> the entire system lags considerably. It is clear that the
> > >>>>>>> volume is not healthy.
> > >>>>>>
> > >>>>>> The lag is mostly caused by qgroup.
> > >>>>>> You have a lot of snapshots (shown by the super large snapshot id),
> > >>>>>> every time a large snapshot/subvolume is deleted, btrfs will try
> > >>>>>> to disable qgroup to avoid such lag, but if whatever script/tool
> > >>>>>> decides to rescan qgroup when the snapshot/subvolume deletion is
> > >>>>>> still ongoing, the lag will be re-introduced.
> > >>>>>>
> > >>>>>>>
> > >>>>>>> I was using a recent 6.x kernel, I believe one of 6.18.x, when the
> > >>>>>>> problem emerged. I upgraded to 7.0, finding no improvement
> > >>>>>>> in the operation of the volume.
> > >>>>>>>
> > >>>>>>> Also, I tried initiating the scrub through the most recent static
> > >>>>>>> build of the user-space utility (i.e. btrfs-progs), with no
> > >>>>>>> improvement.
> > >>>>>>>
> > >>>>>>> I would like some suggestions for restoring the volume to health,
> > >>>>>>> to avoid the need to provision a new volume from scratch.
> > >>>>>>
> > >>>>>> "btrfs check" first, if no error, disable qgroup if you have
> > >>>>>> frequent snapshot creation/deletion.
> > >>>>
> > >>>> So your fsck is mostly fine, the lag part is highly possible to be
> > >>>> caused by qgroup.
> > >>>>
> > >>>> If you do not need it, just disable it for good.
> > >>>>
> > >>>> Thanks,
> > >>>> Qu
> > >>>>
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Qu
> > >>>>>>
> > >>>>>>>
> > >>>>>>> Thank you.
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >>
> >
> >
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 2:33 ` brainchild
@ 2026-04-28 3:13 ` Qu Wenruo
2026-04-28 4:03 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 3:13 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 12:03, brainchild wrote:
> With swap disabled, the scrub operation seems to reach completion, with no errors found.
>
> However, the check operation still discovers the same errors.
Scrub is not a fsck, from man page of btrfs-scrub:
Note:
Scrub is not a filesystem checker (fsck, btrfs-check(8)). It
can only detect filesystem damage using the checksum validation, and it
can only repair filesystem damage by copying from other known good replicas.
btrfs-check(8) performs more exhaustive checking and can
sometimes be used, with expert guidance, to rebuild certain corrupted
filesystem structures in the absence of any good replica.
>
> The output of `rescue fix-device-size' is "No device size related problem found".
>
> After, check still reports the same errors, including the super mismatch.
If you do not want to manually fix the nbytes mismatch, go "btrfs check
--repair", which may also help to fix the super block size mismatch.
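The repair sequence implied above can be sketched as a dry run; the device path comes from earlier in the thread, and --repair should only be run for real on an unmounted filesystem, after a backup:

```shell
#!/bin/sh
# Prints, rather than runs, the in-place repair sequence: read-only
# check, repair, recheck.
plan_check_repair() {
    dev=$1
    echo "btrfs check $dev           # read-only pass, fs must be unmounted"
    echo "btrfs check --repair $dev  # in-place fix; unshares snapshot blocks"
    echo "btrfs check $dev           # verify the errors are gone"
}

plan_check_repair /dev/nvme0n1p5
```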
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 3:13 ` Qu Wenruo
@ 2026-04-28 4:03 ` brainchild
2026-04-28 5:13 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 4:03 UTC (permalink / raw)
To: linux-btrfs
I am simply reporting on my observations.
Deleting all instances of the file corresponding to the identified
inode, across all subvolumes, has resolved the problem with nbytes, but
fix-device-size reported no action, and the super mismatch remains.
In case you have any further thoughts on finding a resolution, I am
eager for any suggestions. I would like the volume to be fully healthy.
On Tue, Apr 28 2026 at 12:43:51 PM +09:30:00, Qu Wenruo <wqu@suse.com>
wrote:
>
>
> On 2026/4/28 12:03, brainchild wrote:
>> With swap disabled, the scrub operation seems to reach completion,
>> with no errors found.
>> However, the check operation still discovers the same errors.
>
> Scrub is not a fsck, from man page of btrfs-scrub:
>
> Note:
> Scrub is not a filesystem checker (fsck, btrfs-check(8)). It
> can only detect filesystem damage using the checksum validation, and
> it can only repair filesystem damage by copying from other known good
> replicas.
>
> btrfs-check(8) performs more exhaustive checking and can
> sometimes be used, with expert guidance, to rebuild certain corrupted
> filesystem structures in the absence of any good replica.
>
>
>> The output of `rescue fix-device-size' is "No device size related
>> problem found".
>> After, check still reports the same errors, including the super
>> mismatch.
>
> If you do not want to manually fix the nbytes mismatch, go "btrfs
> check --repair", which may also help to fix the super block size
> mismatch.
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 4:03 ` brainchild
@ 2026-04-28 5:13 ` Qu Wenruo
2026-04-28 5:29 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 5:13 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 13:33, brainchild@mailbox.org wrote:
> I am simply reporting on my observations.
>
> Deleting all instances of the file corresponding to the identified
> inode, across all subvolumes, has resolved the problem with nbytes, but
> fix-device-size reported no action, and the super mismatch remains.
>
> In case you have any further thoughts on finding a resolution, I am
> eager for any suggestions. I would like the volume to be fully healthy.
In this case, your fs has no problem and can be used as usual.
That super block bytes mismatch can be ignored.
I'll enhance the rescue command to handle the case you reported.
Thanks,
Qu
>
> On Tue, Apr 28 2026 at 12:43:51 PM +09:30:00, Qu Wenruo <wqu@suse.com>
> wrote:
>>
>>
>> On 2026/4/28 12:03, brainchild wrote:
>>> With swap disabled, the scrub operation seems to reach completion,
>>> with no errors found.
>>> However, the check operation still discovers the same errors.
>>
>> Scrub is not a fsck, from man page of btrfs-scrub:
>>
>> Note:
>> Scrub is not a filesystem checker (fsck, btrfs-check(8)). It
>> can only detect filesystem damage using the checksum validation, and
>> it can only repair filesystem damage by copying from other known good
>> replicas.
>>
>> btrfs-check(8) performs more exhaustive checking and can
>> sometimes be used, with expert guidance, to rebuild certain corrupted
>> filesystem structures in the absence of any good replica.
>>
>>
>>> The output of `rescue fix-device-size' is "No device size related
>>> problem found".
>>> After, check still reports the same errors, including the super
>>> mismatch.
>>
>> If you do not want to manually fix the nbytes mismatch, go "btrfs
>> check --repair", which may also help to fix the super block size
>> mismatch.
>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 5:13 ` Qu Wenruo
@ 2026-04-28 5:29 ` brainchild
2026-04-28 6:41 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 5:29 UTC (permalink / raw)
To: linux-btrfs
The volume seems to suffer still from at least two problems:
- Scrub completes, if at all, only when the swap file is not in use. Disabling swap is not always feasible during normal operation of the system.
- Periodically, the filesystem returns to a state in which new files cannot be created. Naturally, a system cannot be used normally if its root filesystem is effectively read-only.
As such, I feel less optimistic than you about the usability of the volume.
Apr 28, 2026 01:13:47 Qu Wenruo <wqu@suse.com>:
>
>
> On 2026/4/28 13:33, brainchild@mailbox.org wrote:
>> I am simply reporting on my observations.
>> Deleting all instances of the file corresponding to the identified inode, across all subvolumes, has resolved the problem with nbytes, but fix-device-size reported no action, and the super mismatch remains.
>> In case you have any further thoughts on finding a resolution, I am eager for any suggestions. I would like the volume to be fully healthy.
>
> In this case, your fs has no problem and can be used as usual.
> That super block bytes mismatch can be ignored.
>
> I'll enhance the rescue command to handle the case you reported.
>
> Thanks,
> Qu
>
>> On Tue, Apr 28 2026 at 12:43:51 PM +09:30:00, Qu Wenruo <wqu@suse.com> wrote:
>>>
>>>
>>> On 2026/4/28 12:03, brainchild wrote:
>>>> With swap disabled, the scrub operation seems to reach completion, with no errors found.
>>>> However, the check operation still discovers the same errors.
>>>
>>> Scrub is not a fsck, from man page of btrfs-scrub:
>>>
>>> Note:
>>> Scrub is not a filesystem checker (fsck, btrfs-check(8)). It can only detect filesystem damage using the checksum validation, and it can only repair filesystem damage by copying from other known good replicas.
>>>
>>> btrfs-check(8) performs more exhaustive checking and can sometimes be used, with expert guidance, to rebuild certain corrupted filesystem structures in the absence of any good replica.
>>>
>>>
>>>> The output of `rescue fix-device-size' is "No device size related problem found".
>>>> After, check still reports the same errors, including the super mismatch.
>>>
>>> If you do not want to manually fix the nbytes mismatch, go "btrfs check --repair", which may also help to fix the super block size mismatch.
>>
>>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 5:29 ` brainchild
@ 2026-04-28 6:41 ` Qu Wenruo
2026-04-28 19:30 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 6:41 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/28 14:59, brainchild@mailbox.org wrote:
> The volume seems to suffer still from at least two problems:
>
> - If scrub actually completes under any circumstances, it is required that the swap file is unused. Disabling swap is not always feasible during normal operation of the system.
By design it is not easy to support swap files on a COW filesystem;
that is why btrfs has so many limits on their use.
I strongly recommend against using swap files on btrfs. You have
already experienced the limit on scrub, and I believe a lot of end users
are not aware of all the restrictions; please check the long list of
limitations in "SWAPFILE SUPPORT" of btrfs(5).
>
> - Periodically, the filesystem reverts to a state in which creation of new files is prevented. Naturally, it is impossible to use a system normally if the root filesystem is effectively read only.
Any dmesg of those RO flips? That would indicate the fs flipped
read-only, which is a huge problem by itself.
Especially given your initial info, there should be enough data space;
metadata space is less ideal but should still be enough.
$ sudo btrfs filesystem df /
Data, single: total=813.13GiB, used=662.49GiB
System, DUP: total=32.00MiB, used=144.00KiB
Metadata, DUP: total=9.03GiB, used=6.26GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
Considering how many snapshots you have (triggering qgroup lag), I
strongly recommend removing unused snapshots to free up space.
After freeing up enough space, try balancing data block groups to
make room for future metadata usage.
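As a toy illustration of why this helps (a simplified Python model, not btrfs code; the group counts are invented to roughly match the 'fi df' numbers earlier in the thread):

```python
# Toy model (invented numbers; not btrfs code) of why balancing data
# block groups helps metadata: the device is fully allocated, so a new
# metadata chunk can only be carved from block groups that balance
# manages to empty out and return to the unallocated pool.

def balance_data(groups, usage_limit):
    """groups: used fraction of each 1 GiB data block group. Relocate
    every group at or below usage_limit into the fuller survivors;
    return (surviving_groups, count_of_groups_freed_to_unallocated)."""
    keep = [u for u in groups if u > usage_limit]
    moved = sum(u for u in groups if u <= usage_limit)
    for i in range(len(keep)):          # pour relocated extents into free room
        take = min(1.0 - keep[i], moved)
        keep[i] += take
        moved -= take
    assert moved < 1e-9                 # everything fit in surviving groups
    return keep, len(groups) - len(keep)

data = [0.90] * 700 + [0.30] * 113     # 813 groups allocated, ~664 GiB used
after, freed = balance_data(data, usage_limit=0.50)
print(freed)                           # 113 -> ~113 GiB become unallocated
```

The relocated extents still exist afterwards; only the block groups that end up empty are returned to the unallocated pool, which is the only place a new metadata chunk can come from.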
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 6:41 ` Qu Wenruo
@ 2026-04-28 19:30 ` brainchild
2026-04-28 22:19 ` brainchild
2026-04-28 22:23 ` Qu Wenruo
0 siblings, 2 replies; 25+ messages in thread
From: brainchild @ 2026-04-28 19:30 UTC (permalink / raw)
To: linux-btrfs
On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo <wqu@suse.com>
wrote:
>
> I strongly don't recommend to use swap files on btrfs, as you have
> already experienced the limit on scrub, and I believe a lot of end
> users are not aware of all the limits when using swap file on btrfs,
> please check the long long list of limitations in "SWAPFILE SUPPORT"
> of btrfs(5).
Is it expected that the scrub operation cannot function properly if the
volume has a swap file? I have never observed such a problem before,
nor found any mention of it in the documentation.
The documented restrictions specific to swap files seem completely
compatible with my use: a single partition with no data duplication. I
have no need to span devices or duplicate data on this particular
system.
> Any dmesg of that RO flips? That indicates the fs flipped read-only,
> which is a huge problem by itself.
No. There are no kernel messages reporting errors for the filesystem
or switches to read-only.
> Especially with your initial info, there should be enough data space,
> metadata space is less ideal but should be enough.
I have read that the space allocated for metadata is expanded as
needed. Why would problems follow from too little space being allocated?
> Considering how many snapshots you have (triggering qgroup lag), I
> strongly recommended to remove unused snapshots to free up space.
>
> After freeing up enough space, then try to balance data block groups
> to make space for future metadata usages.
The situation with balance is quite confusing.
The problem with the reported lack of free space first occurred several
weeks ago. At that time, I deleted snapshots, and ran balance
operations with incrementally higher usage values for data and
metadata. By the end, I had run the operation, without reported
failure, with values as high as 95%. Normally, such an operation would
be very long, but in my case it finished in less than a minute. Also by
the end, only about ten blocks in total had actually been reported as
moved.
Perhaps my installation of btrfsd has been successfully maintaining the
balance for the volume. The system logs are not extensive enough for me
to know when it last performed any operations.
Regardless, it seems that the general problems are not becoming
resolved by invocations of balance.
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 19:30 ` brainchild
@ 2026-04-28 22:19 ` brainchild
2026-04-28 22:26 ` Qu Wenruo
2026-04-28 22:23 ` Qu Wenruo
1 sibling, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-28 22:19 UTC (permalink / raw)
To: linux-btrfs
Further to the comments in the previous message, I have also found some
messages in the kernel log, from the balance operations, which may be
relevant.
The console output of the command is "ERROR: error during balancing
'/': No space left on device"; the corresponding kernel messages are
shown below.
---
balance: start -musage=50 -susage=50
BTRFS info (device nvme0n1p5): relocating block group 4126692868096
flags metadata|dup
BTRFS info (device nvme0n1p5): found 9279 extents, stage: move data
extents
BTRFS info (device nvme0n1p5): relocating block group 4126155997184
flags metadata|dup
BTRFS info (device nvme0n1p5): found 6365 extents, stage: move data
extents
BTRFS info (device nvme0n1p5): relocating block group 4125149364224
flags metadata|dup
BTRFS info (device nvme0n1p5): found 10145 extents, stage: move data
extents
BTRFS info (device nvme0n1p5): relocating block group 4124612493312
flags metadata|dup
BTRFS info (device nvme0n1p5): found 11487 extents, stage: move data
extents
BTRFS info (device nvme0n1p5): 1 enospc errors during balance
BTRFS info (device nvme0n1p5): balance: ended with status: -28
On Tue, Apr 28 2026 at 03:30:00 PM -04:00:00, brainchild@mailbox.org
wrote:
>
> On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo
> <wqu@suse.com> wrote:
>>
>> I strongly don't recommend to use swap files on btrfs, as you have
>> already experienced the limit on scrub, and I believe a lot of end
>> users are not aware of all the limits when using swap file on
>> btrfs, please check the long long list of limitations in "SWAPFILE
>> SUPPORT" of btrfs(5).
>
> Is it expected that the scrub operation cannot function properly if
> the volume has a swap file? I never before observed such a problem,
> nor find any mention in the documentation.
>
> The specific restrictions, as documented, for the swap file, seem
> completely compatible with my use, a single partition with no data
> duplication. I have no need for spanning devices or duplicating data,
> on the particular system.
>
>> Any dmesg of that RO flips? That indicates the fs flipped read-only,
>> which is a huge problem by itself.
>
> No. There are no kernel messages that are errors for the file system,
> or switches to read-only.
>
>> Especially with your initial info, there should be enough data
>> space, metadata space is less ideal but should be enough.
>
> I have read that the space allocated for metadata is expanded as
> needed. Why would problems follow from too little space being
> allocated?
>
>> Considering how many snapshots you have (triggering qgroup lag), I
>> strongly recommended to remove unused snapshots to free up space.
>>
>> After freeing up enough space, then try to balance data block groups
>> to make space for future metadata usages.
>
> The situation with balance is quite confused.
>
> The problem with the reported lack of free space first occurred
> several weeks ago. At that time, I deleted snapshots, and ran balance
> operations with incrementally higher usage values for data and
> metadata. By the end, I had run the operation, without reported
> failure, with values as high as 95%. Normally, such an operation
> would be very long, but in my case it finished in less than a minute.
> Also by the end, only about ten blocks in total had actually been
> reported as moved.
>
> Perhaps my installation of btrfsd has been successfully maintaining
> the balance for the volume. The system logs are not extensive enough
> for me to know when it last performed any operations.
>
> Regardless, it seems that the general problems are not becoming
> resolved by invocations of balance.
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 19:30 ` brainchild
2026-04-28 22:19 ` brainchild
@ 2026-04-28 22:23 ` Qu Wenruo
2026-04-28 22:34 ` Qu Wenruo
2026-04-29 0:57 ` brainchild
1 sibling, 2 replies; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 22:23 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 05:00, brainchild@mailbox.org wrote:
>
> On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo <wqu@suse.com>
> wrote:
>>
>> I strongly don't recommend to use swap files on btrfs, as you have
>> already experienced the limit on scrub, and I believe a lot of end
>> users are not aware of all the limits when using swap file on btrfs,
>> please check the long long list of limitations in "SWAPFILE SUPPORT"
>> of btrfs(5).
>
> Is it expected that the scrub operation cannot function properly if the
> volume has a swap file?
Yes, scrub works by iterating over each block group, and to avoid
concurrent modification, scrub will mark the block group RO.
But if there is a scrub covering even one block of the block group, the
block group cannot be marked read-only, thus scrub will skip the whole
block group.
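Roughly, the skip logic amounts to something like this (a simplified Python sketch, not the kernel code; the block-group layout and extent sizes are invented):

```python
# Simplified model (not kernel code) of why a swapfile blocks scrub:
# scrub flips each block group read-only before verifying checksums; a
# group holding any active swapfile extent cannot be flipped, so the
# whole group is skipped rather than scanned.

GIB = 1024 ** 3

def scrub(block_groups, swapfile_extents):
    """block_groups and swapfile_extents are (start, length) tuples.
    Returns (bytes_scanned, starts_of_skipped_groups)."""
    scanned, skipped = 0, []
    for bg_start, bg_len in block_groups:
        pinned = any(s < bg_start + bg_len and bg_start < s + length
                     for s, length in swapfile_extents)
        if pinned:
            skipped.append(bg_start)   # cannot set read-only -> skip it all
        else:
            scanned += bg_len          # checksum-verify the whole group
    return scanned, skipped

bgs = [(i * GIB, GIB) for i in range(4)]       # four 1 GiB block groups
swap = [(2 * GIB + GIB // 2, 128 * 1024**2)]   # one swapfile extent in group 2
scanned, skipped = scrub(bgs, swap)
print(scanned // GIB, len(skipped))            # 3 1 -> only 3 of 4 GiB scanned
```

This also matches your original symptom: the "total data scanned" figure only counts the groups that could actually be flipped read-only.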
> I never before observed such a problem, nor find
> any mention in the documentation.
OK, btrfs(5) only mentions dev-replace, not scrub itself.
Something we need to update.
>
> The specific restrictions, as documented, for the swap file, seem
> completely compatible with my use, a single partition with no data
> duplication. I have no need for spanning devices or duplicating data, on
> the particular system.
>
>> Any dmesg of that RO flips? That indicates the fs flipped read-only,
>> which is a huge problem by itself.
>
> No. There are no kernel messages that are errors for the file system, or
> switches to read-only.
Then how did the problem of failing to create new files happen?
Any extra output when that happened?
Just returning -ENOSPC error messages but the fs is still read-write?
Then it may be that there is not enough meta/data space left.
>
>> Especially with your initial info, there should be enough data space,
>> metadata space is less ideal but should be enough.
>
> I have read that the space allocated for metadata is expanded as needed.
> Why would problems follow from too little space being allocated?
Because you have no unallocated space, metadata cannot be expanded
anymore.
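You can check this against the 'fi show'/'fi df' output from earlier in the thread (a rough back-of-the-envelope; remember DUP chunks occupy twice their logical size in raw bytes):

```python
# Back-of-the-envelope from the 'btrfs fi show' / 'btrfs fi df' output
# earlier in the thread. DUP chunks keep two copies on the one device,
# so their raw footprint is double the logical size that 'fi df' reports.

device_size  = 831.26            # GiB, devid 1 size from 'btrfs fi show'
data_raw     = 813.13            # Data, single: raw == logical
metadata_raw = 9.03 * 2          # Metadata, DUP
system_raw   = (32 / 1024) * 2   # System, DUP: 32 MiB logical

unallocated = device_size - (data_raw + metadata_raw + system_raw)
print(round(unallocated, 2))     # ~0.01 GiB left: no room for a new chunk
```

Since a metadata chunk is typically allocated in 256 MiB (512 MiB raw for DUP) increments, that leftover is nowhere near enough.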
>
>> Considering how many snapshots you have (triggering qgroup lag), I
>> strongly recommended to remove unused snapshots to free up space.
>>
>> After freeing up enough space, then try to balance data block groups
>> to make space for future metadata usages.
>
> The situation with balance is quite confused.
>
> The problem with the reported lack of free space first occurred several
> weeks ago. At that time, I deleted snapshots, and ran balance operations
> with incrementally higher usage values for data and metadata. By the
> end, I had run the operation, without reported failure, with values as
> high as 95%. Normally, such an operation would be very long, but in my
> case it finished in less than a minute. Also by the end, only about ten
> blocks in total had actually been reported as moved.
But your 'btrfs fi df' shows the opposite: a lot of free data space,
not much for metadata, and still no unallocated space.
>
> Perhaps my installation of btrfsd has been successfully maintaining the
> balance for the volume. The system logs are not extensive enough for me
> to know when it last performed any operations.
>
> Regardless, it seems that the general problems are not becoming resolved
> by invocations of balance.
>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 22:19 ` brainchild
@ 2026-04-28 22:26 ` Qu Wenruo
2026-04-28 22:50 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 22:26 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 07:49, brainchild@mailbox.org wrote:
> Further to the comments in the previous message, I have also found some
> messages in the kernel log, from the balance operations, which may be
> relevant.
>
> As the console output of the command is "ERROR: error during balancing
> '/': No space left on device", the kernel messages are as shown below.
>
> ---
>
> balance: start -musage=50 -susage=50
> BTRFS info (device nvme0n1p5): relocating block group 4126692868096
> flags metadata|dup
> BTRFS info (device nvme0n1p5): found 9279 extents, stage: move data extents
> BTRFS info (device nvme0n1p5): relocating block group 4126155997184
> flags metadata|dup
> BTRFS info (device nvme0n1p5): found 6365 extents, stage: move data extents
> BTRFS info (device nvme0n1p5): relocating block group 4125149364224
> flags metadata|dup
> BTRFS info (device nvme0n1p5): found 10145 extents, stage: move data
> extents
> BTRFS info (device nvme0n1p5): relocating block group 4124612493312
> flags metadata|dup
> BTRFS info (device nvme0n1p5): found 11487 extents, stage: move data
> extents
> BTRFS info (device nvme0n1p5): 1 enospc errors during balance
> BTRFS info (device nvme0n1p5): balance: ended with status: -28
From the dmesg, you're relocating only metadata block groups.
Meanwhile all the free space is inside data block groups; you need to
balance *only* data block groups to free up space for metadata.
Not the opposite.
>
>
> On Tue, Apr 28 2026 at 03:30:00 PM -04:00:00, brainchild@mailbox.org wrote:
>>
>> On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo <wqu@suse.com>
>> wrote:
>>>
>>> I strongly don't recommend to use swap files on btrfs, as you have
>>> already experienced the limit on scrub, and I believe a lot of end
>>> users are not aware of all the limits when using swap file on btrfs,
>>> please check the long long list of limitations in "SWAPFILE SUPPORT"
>>> of btrfs(5).
>>
>> Is it expected that the scrub operation cannot function properly if
>> the volume has a swap file? I never before observed such a problem,
>> nor find any mention in the documentation.
>>
>> The specific restrictions, as documented, for the swap file, seem
>> completely compatible with my use, a single partition with no data
>> duplication. I have no need for spanning devices or duplicating data,
>> on the particular system.
>>
>>> Any dmesg of that RO flips? That indicates the fs flipped read-only,
>>> which is a huge problem by itself.
>>
>> No. There are no kernel messages that are errors for the file system,
>> or switches to read-only.
>>
>>> Especially with your initial info, there should be enough data space,
>>> metadata space is less ideal but should be enough.
>>
>> I have read that the space allocated for metadata is expanded as
>> needed. Why would problems follow from too little space being allocated?
>>
>>> Considering how many snapshots you have (triggering qgroup lag), I
>>> strongly recommended to remove unused snapshots to free up space.
>>>
>>> After freeing up enough space, then try to balance data block groups
>>> to make space for future metadata usages.
>>
>> The situation with balance is quite confused.
>>
>> The problem with the reported lack of free space first occurred
>> several weeks ago. At that time, I deleted snapshots, and ran balance
>> operations with incrementally higher usage values for data and
>> metadata. By the end, I had run the operation, without reported
>> failure, with values as high as 95%. Normally, such an operation would
>> be very long, but in my case it finished in less than a minute. Also
>> by the end, only about ten blocks in total had actually been reported
>> as moved.
>>
>> Perhaps my installation of btrfsd has been successfully maintaining
>> the balance for the volume. The system logs are not extensive enough
>> for me to know when it last performed any operations.
>>
>> Regardless, it seems that the general problems are not becoming
>> resolved by invocations of balance.
>>
>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 22:23 ` Qu Wenruo
@ 2026-04-28 22:34 ` Qu Wenruo
2026-04-29 0:57 ` brainchild
1 sibling, 0 replies; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 22:34 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 07:53, Qu Wenruo wrote:
>
>
> On 2026/4/29 05:00, brainchild@mailbox.org wrote:
>>
>> On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo <wqu@suse.com>
>> wrote:
>>>
>>> I strongly don't recommend to use swap files on btrfs, as you have
>>> already experienced the limit on scrub, and I believe a lot of end
>>> users are not aware of all the limits when using swap file on btrfs,
>>> please check the long long list of limitations in "SWAPFILE SUPPORT"
>>> of btrfs(5).
>>
>> Is it expected that the scrub operation cannot function properly if
>> the volume has a swap file?
>
> Yes, scrub works by iterating each block group, and to avoid concurrency
> modification, scrub will mark the block group RO.
>
> But if there is a scrub covering even one block of the block group,
Typo, "scrub" -> "swap file".
> the
> block group cannot be marked read-only, thus scrub will skip the whole
> block group.
>
>> I never before observed such a problem, nor find any mention in the
>> documentation.
>
> OK, btrfs(5) only mentions dev-replace, not scrub itself.
>
> Something we need to update it.
>
>>
>> The specific restrictions, as documented, for the swap file, seem
>> completely compatible with my use, a single partition with no data
>> duplication. I have no need for spanning devices or duplicating data,
>> on the particular system.
>>
>>> Any dmesg of that RO flips? That indicates the fs flipped read-only,
>>> which is a huge problem by itself.
>>
>> No. There are no kernel messages that are errors for the file system,
>> or switches to read-only.
>
> Then how did the problem of failing to create new files happen?
>
> Any extra output when that happened?
> Just returning -ENOSPC error messages but the fs is still read-write?
>
> Then it may be not enough meta/data space left.
>
>>
>>> Especially with your initial info, there should be enough data space,
>>> metadata space is less ideal but should be enough.
>>
>> I have read that the space allocated for metadata is expanded as
>> needed. Why would problems follow from too little space being allocated?
>
> Because you have no unallocated space so metadata can not be expanded
> anymore.
>
>>
>>> Considering how many snapshots you have (triggering qgroup lag), I
>>> strongly recommended to remove unused snapshots to free up space.
>>>
>>> After freeing up enough space, then try to balance data block groups
>>> to make space for future metadata usages.
>>
>> The situation with balance is quite confused.
>>
>> The problem with the reported lack of free space first occurred
>> several weeks ago. At that time, I deleted snapshots, and ran balance
>> operations with incrementally higher usage values for data and
>> metadata. By the end, I had run the operation, without reported
>> failure, with values as high as 95%. Normally, such an operation would
>> be very long, but in my case it finished in less than a minute. Also
>> by the end, only about ten blocks in total had actually been reported
>> as moved.
>
> But your 'btrfs fi df' shows the other way, a lot of free data space,
> but not much for metadata, and still no unallocated space.
>
>>
>> Perhaps my installation of btrfsd has been successfully maintaining
>> the balance for the volume. The system logs are not extensive enough
>> for me to know when it last performed any operations.
>>
>> Regardless, it seems that the general problems are not becoming
>> resolved by invocations of balance.
>>
>>
>>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 22:26 ` Qu Wenruo
@ 2026-04-28 22:50 ` Qu Wenruo
0 siblings, 0 replies; 25+ messages in thread
From: Qu Wenruo @ 2026-04-28 22:50 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 07:56, Qu Wenruo wrote:
>
>
> On 2026/4/29 07:49, brainchild@mailbox.org wrote:
>> Further to the comments in the previous message, I have also found
>> some messages in the kernel log, from the balance operations, which
>> may be relevant.
>>
>> As the console output of the command is "ERROR: error during balancing
>> '/': No space left on device", the kernel messages are as shown below.
>>
>> ---
>>
>> balance: start -musage=50 -susage=50
>> BTRFS info (device nvme0n1p5): relocating block group 4126692868096
>> flags metadata|dup
>> BTRFS info (device nvme0n1p5): found 9279 extents, stage: move data
>> extents
>> BTRFS info (device nvme0n1p5): relocating block group 4126155997184
>> flags metadata|dup
>> BTRFS info (device nvme0n1p5): found 6365 extents, stage: move data
>> extents
>> BTRFS info (device nvme0n1p5): relocating block group 4125149364224
>> flags metadata|dup
>> BTRFS info (device nvme0n1p5): found 10145 extents, stage: move data
>> extents
>> BTRFS info (device nvme0n1p5): relocating block group 4124612493312
>> flags metadata|dup
>> BTRFS info (device nvme0n1p5): found 11487 extents, stage: move data
>> extents
>> BTRFS info (device nvme0n1p5): 1 enospc errors during balance
>> BTRFS info (device nvme0n1p5): balance: ended with status: -28
>
> From the dmesg, you're relocating only metadata block groups.
>
> Meanwhile all the free space is inside data block groups, you need to
> balance *only* data block groups to free up space for metadata.
>
> Not the opposite.
I forgot to mention that balance is also affected by swapfiles.
The same as scrub: a block group (1GiB) will be skipped entirely if it
contains any extent of an active swapfile.
So your previous balance runs may also have been thrown off by your
swapfile.
>
>>
>>
>> On Tue, Apr 28 2026 at 03:30:00 PM -04:00:00, brainchild@mailbox.org
>> wrote:
>>>
>>> On Tue, Apr 28 2026 at 04:11:05 PM +09:30:00, Qu Wenruo
>>> <wqu@suse.com> wrote:
>>>>
>>>> I strongly don't recommend to use swap files on btrfs, as you have
>>>> already experienced the limit on scrub, and I believe a lot of end
>>>> users are not aware of all the limits when using swap file on
>>>> btrfs, please check the long long list of limitations in "SWAPFILE
>>>> SUPPORT" of btrfs(5).
>>>
>>> Is it expected that the scrub operation cannot function properly if
>>> the volume has a swap file? I never before observed such a problem,
>>> nor find any mention in the documentation.
>>>
>>> The specific restrictions, as documented, for the swap file, seem
>>> completely compatible with my use, a single partition with no data
>>> duplication. I have no need for spanning devices or duplicating data,
>>> on the particular system.
>>>
>>>> Any dmesg of that RO flips? That indicates the fs flipped read-only,
>>>> which is a huge problem by itself.
>>>
>>> No. There are no kernel messages that are errors for the file system,
>>> or switches to read-only.
>>>
>>>> Especially with your initial info, there should be enough data
>>>> space, metadata space is less ideal but should be enough.
>>>
>>> I have read that the space allocated for metadata is expanded as
>>> needed. Why would problems follow from too little space being allocated?
>>>
>>>> Considering how many snapshots you have (triggering qgroup lag), I
>>>> strongly recommended to remove unused snapshots to free up space.
>>>>
>>>> After freeing up enough space, then try to balance data block groups
>>>> to make space for future metadata usages.
>>>
>>> The situation with balance is quite confused.
>>>
>>> The problem with the reported lack of free space first occurred
>>> several weeks ago. At that time, I deleted snapshots, and ran balance
>>> operations with incrementally higher usage values for data and
>>> metadata. By the end, I had run the operation, without reported
>>> failure, with values as high as 95%. Normally, such an operation
>>> would be very long, but in my case it finished in less than a minute.
>>> Also by the end, only about ten blocks in total had actually been
>>> reported as moved.
>>>
>>> Perhaps my installation of btrfsd has been successfully maintaining
>>> the balance for the volume. The system logs are not extensive enough
>>> for me to know when it last performed any operations.
>>>
>>> Regardless, it seems that the general problems are not becoming
>>> resolved by invocations of balance.
>>>
>>
>>
>>
>
>
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-28 22:23 ` Qu Wenruo
2026-04-28 22:34 ` Qu Wenruo
@ 2026-04-29 0:57 ` brainchild
2026-04-29 1:11 ` Qu Wenruo
1 sibling, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-29 0:57 UTC (permalink / raw)
To: linux-btrfs
On Wed, 2026-04-29 at 07:53 +0930, Qu Wenruo wrote:
>
> >Then how did the problem of failing to create new files happen?
It began suddenly, with no clear cause.
The Timeshift snapshots had been causing serious lag since a few weeks
earlier, but investigating had not yet become a top priority.
One similarity for the two times the problem has occurred is that I was
running file searches across the whole system using 'find'. Perhaps
something about these searches triggered the more immediate problem.
> >Any extra output when that happened?
> >Just returning -ENOSPC error messages but the fs is still read-write?
Yes, the FS is still RW.
> >Because you have no unallocated space so metadata can not be expanded
> >anymore.
> >
>>
> >Meanwhile all the free space is inside data block groups, you need to
> >balance *only* data block groups to free up space for metadata.
> >
> >Not the opposite.
I already tried balancing just data, but there is no longer any
benefit. It seems all of the data is already balanced. Otherwise,
balance is failing to do its job.
Attempts to balance data now instantly complete, reporting nothing to
do:
---
$ time sudo btrfs balance start -dusage=95 /
Done, had to relocate 0 out of 833 chunks
real 0m0.059s
user 0m0.000s
sys 0m0.013s
---
The only hint of any problem is from the kernel log:
---
_btrfs_printk: 788 callbacks suppressed
---
The other messages relate only to reporting blocks as skipped due to
the swap file.
Earlier, the more aggressive balance operations were failing, reporting
no available space. Starting from that point, I began with -dusage=5,
and then reached 95, in increments of 5. Each of the calls lasted no
more than a few minutes, and in total only about ten blocks were
reported as moved.
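For reference, the incremental approach described above can be sketched as a small shell loop. This is shown as a dry run that only prints the commands rather than executing them (remove the leading "echo" to run for real); the mount point / is taken from the earlier captures:

```shell
# Dry run of the incremental rebalance: print one balance command per
# -dusage step from 5% to 95%, in increments of 5.
# Remove the leading "echo" to actually execute the commands.
for usage in $(seq 5 5 95); do
    echo sudo btrfs balance start -dusage="$usage" /
done
```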
I would consider removing the swap file, but it would require shrinking
the Btrfs partition, to make room for another partition dedicated to
swap. I wonder whether it is safe to resize the volume, considering its
unstable condition.
>
Also, since the swap file is so small compared to the size of the
volume, I doubt that it is causing the serious problems.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-29 0:57 ` brainchild
@ 2026-04-29 1:11 ` Qu Wenruo
2026-04-29 1:16 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-29 1:11 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 10:27, brainchild@mailbox.org wrote:
> On Wed, 2026-04-29 at 07:53 +0930, Qu Wenruo wrote:
>>
>
>> >Then how did the problem of failing to create new files happen?
>
> It began suddenly, with no clear cause.
>
> The Timeshift snapshots had been causing serious lag since a few weeks
> earlier, but investigating had not yet become a top priority.
>
> One similarity between the two times the problem has occurred is that I
> was running file searches across the whole system using 'find'. Perhaps
> something about these searches triggered the more immediate problem.
>
>> >Any extra output when that happened?
>> >Just returning -ENOSPC error messages but the fs is still read-write?
>
> Yes, the FS is still RW.
>
>> >Because you have no unallocated space so metadata can not be expanded
>> >anymore.
>> >
>>>
>> >Meanwhile all the free space is inside data block groups, you need to
>> >balance *only* data block groups to free up space for metadata.
>> >
>> >Not the opposite.
>
> I already tried balancing just data, but there is no longer any benefit.
> It seems all of the data is already balanced. Otherwise, balance is
> failing to do its job.
>
> Attempts to balance data now instantly complete, reporting nothing to do:
>
> ---
> $ time sudo btrfs balance start -dusage=95 /
> Done, had to relocate 0 out of 833 chunks
>
> real 0m0.059s
> user 0m0.000s
> sys 0m0.013s
> ---
>
> The only hint of any problem is from the kernel log:
>
> ---
> _btrfs_printk: 788 callbacks suppressed
> ---
>
> The other messages relate only to reporting blocks as skipped due to the
> swap file.
>
> Earlier, the more aggressive balance operations were failing, reporting
> no available space. Starting from that point, I began with -dusage=5,
> and then reached 95, in increments of 5. Each of the calls lasted no
> more than a few minutes, and in total only about ten blocks were
> reported as moved.
>
> I would consider removing the swap file, but it would require shrinking
> the Btrfs partition, to make room for another partition dedicated to
> swap. I wonder whether it is safe to resize the volume, considering its
> unstable condition.
>>
>
> Also, since the swap file is so small compared to the size of the
> volume, I doubt that it is causing the serious problems.
>
Nope, just disable the swap file.
32GiB seems small, but you have no control over how large the real file
extents are.
Each extent can be 128MiB (the normal one), and you still have 256 extents.
If each extent is on a different block group, you can have 256GiB of data
blocks that cannot be balanced.
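The arithmetic behind that worst case can be checked quickly. The 32 GiB swap size and the 128 MiB maximum extent size come from the discussion above; treating each extent as pinning a separate data block group is the worst-case assumption being made:

```shell
# A 32 GiB swap file built from 128 MiB extents spans 256 extents.
swap_mib=$(( 32 * 1024 ))
extent_mib=128
extents=$(( swap_mib / extent_mib ))
echo "swap file extents (worst case): $extents"   # 256
# If each extent lands in a different data block group, roughly that
# many block groups become unmovable by balance or shrink.
```

On the live system, the real extent count of the swap file can be inspected with filefrag(8), e.g. `sudo filefrag /swapfile` (the path is an assumption).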
Furthermore, your scrub is already showing that only 11GiB of data can be
properly scrubbed; the remaining hundreds of GiB are all skipped.
Now you tell me whether this is causing serious problems.
And this is the new docs update to explicitly warn end users about the
problems with swap files on btrfs:
https://lore.kernel.org/linux-btrfs/357262c371343c6d6919b7827803194cb46a5e40.1777420050.git.wqu@suse.com/T/#u
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-29 1:11 ` Qu Wenruo
@ 2026-04-29 1:16 ` brainchild
2026-04-29 1:27 ` Qu Wenruo
0 siblings, 1 reply; 25+ messages in thread
From: brainchild @ 2026-04-29 1:16 UTC (permalink / raw)
To: linux-btrfs
On Wed, Apr 29 2026 at 10:41:51 AM +09:30:00, Qu Wenruo <wqu@suse.com>
wrote:
> Nope, just disable the swap file.
Is shrinking the volume safe, or even possible, in the current
condition?
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-29 1:16 ` brainchild
@ 2026-04-29 1:27 ` Qu Wenruo
2026-04-29 2:11 ` brainchild
0 siblings, 1 reply; 25+ messages in thread
From: Qu Wenruo @ 2026-04-29 1:27 UTC (permalink / raw)
To: brainchild, linux-btrfs
On 2026/4/29 10:46, brainchild@mailbox.org wrote:
>
> On Wed, Apr 29 2026 at 10:41:51 AM +09:30:00, Qu Wenruo <wqu@suse.com>
> wrote:
>
>> Nope, just disable the swap file.
>
> Is shrinking the volume safe, or even possible, in the current condition?
Shrink is going to relocate some block groups, and if any block group is
covered by a swap extent, the shrink will fail.
So I'm just asking to disable swap file, with all the explanation
provided, then balance data.
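The suggested sequence can be sketched as follows, again as a dry run that only prints the commands (remove the leading "echo" to execute as root); /swapfile is an assumed path, and the mount point / matches the earlier captures:

```shell
# Dry run of: disable the swap file, balance data, re-enable swap.
for cmd in 'swapoff /swapfile' \
           'btrfs balance start -dusage=95 /' \
           'swapon /swapfile'; do
    echo "sudo $cmd"
done
```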
If you do not want to follow, do whatever you want, you are on your own.
All I can do is to enhance the docs and hope there will be no more end
users sticking with stupid swapfile idea without really understanding
all the limitations.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Strange behavior with scrub, quotas, and snapshots
2026-04-29 1:27 ` Qu Wenruo
@ 2026-04-29 2:11 ` brainchild
0 siblings, 0 replies; 25+ messages in thread
From: brainchild @ 2026-04-29 2:11 UTC (permalink / raw)
To: linux-btrfs
On Wed, Apr 29 2026 at 10:57:02 AM +09:30:00, Qu Wenruo <wqu@suse.com>
wrote:
>
> All I can do is to enhance the docs and hope there will be no more
> end users sticking with stupid swapfile idea without really
> understanding all the limitations.
I think we misunderstood each other. Running balance with swap disabled
is not a problem. I have just done it, and as you predicted, many more
blocks were identified for balancing.
However, I cannot disable swap permanently. If I am planning to migrate
away from a swap file on the Btrfs volume, then I need to resize it to
make space for a separate swap partition. I was concerned that the
superblock mismatch discovered earlier might make it unsafe to resize
the partition.
^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2026-04-29 2:11 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-04-26 23:52 Strange behavior with scrub, quotas, and snapshots brainchild
2026-04-27 2:05 ` Qu Wenruo
2026-04-27 20:32 ` brainchild
2026-04-27 22:10 ` Qu Wenruo
[not found] ` <SNC6ET.5NSSU3PO7MKD2@mailbox.org>
2026-04-27 22:58 ` Qu Wenruo
2026-04-28 0:22 ` brainchild
2026-04-28 1:16 ` Qu Wenruo
2026-04-28 1:21 ` brainchild
2026-04-28 2:33 ` brainchild
2026-04-28 3:13 ` Qu Wenruo
2026-04-28 4:03 ` brainchild
2026-04-28 5:13 ` Qu Wenruo
2026-04-28 5:29 ` brainchild
2026-04-28 6:41 ` Qu Wenruo
2026-04-28 19:30 ` brainchild
2026-04-28 22:19 ` brainchild
2026-04-28 22:26 ` Qu Wenruo
2026-04-28 22:50 ` Qu Wenruo
2026-04-28 22:23 ` Qu Wenruo
2026-04-28 22:34 ` Qu Wenruo
2026-04-29 0:57 ` brainchild
2026-04-29 1:11 ` Qu Wenruo
2026-04-29 1:16 ` brainchild
2026-04-29 1:27 ` Qu Wenruo
2026-04-29 2:11 ` brainchild
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox