public inbox for linux-btrfs@vger.kernel.org
* BTRFS w/ quotas hangs on read-write mount using all available RAM - rev2
@ 2022-10-09 11:03 admiral
  2022-10-09 11:13 ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: admiral @ 2022-10-09 11:03 UTC (permalink / raw)
  To: linux-btrfs

Dear btrfs team,
thanks for all your great work!
I have been running btrfs now for several years and really like the
robustness and ease of use!

Last week I experienced almost exactly the same thing as described here by
Loren M. Lang:
https://www.spinics.net/lists/linux-btrfs/msg81173.html
The only difference: this is not my / but a 40TB storage volume mounted at
/media/btrfs1/

Quick summary of what happened:
- enabled quotas to better understand where all my space has gone
- started balancing
- the system got completely stuck, for reasons that are by now well understood
- pushed the reset button

I can mount my btrfs filesystem read-only without problems and access the
data. As soon as I try to mount it rw, the system slows down extremely and
memory fills up until I finally end up with a panicking kernel.

So the system boots without problems as long as the fstab entries are set to
ro or commented out.
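
A minimal sketch for confirming that the volume really is mounted read-only
before poking at it further. The helper name `is_readonly` is hypothetical; it
only parses `/proc/mounts`-style lines and does not call any btrfs tooling.

```shell
# is_readonly MOUNTPOINT [MOUNTS_FILE]
# Exit status: 0 if MOUNTPOINT is mounted with the "ro" option,
#              1 if it is mounted without it,
#              2 if it is not listed at all.
# MOUNTS_FILE defaults to /proc/mounts (fields: device, mountpoint,
# fstype, options, dump, pass).
is_readonly() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    awk -v mp="$mountpoint" '
        # Remember the options of the last entry for this mountpoint.
        $2 == mp { opts = $4; found = 1 }
        END {
            if (!found) exit 2
            n = split(opts, a, ",")
            for (i = 1; i <= n; i++)
                if (a[i] == "ro") exit 0
            exit 1
        }' "$mounts_file"
}
```

For example, `is_readonly /media/btrfs1 && echo "still read-only"` prints the
message only while the volume is mounted ro.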

   admiral@server:/$ uname -a
   Linux server.domain.loc 4.19.0-21-amd64 #1 SMP Debian 4.19.249-2 (2022-06-30) x86_64 GNU/Linux

   admiral@server:/$ btrfs --version
   btrfs-progs v5.10.1

Here is my question:
I am looking for the option to disable quota on an unmounted btrfs, as
described here:
https://patchwork.kernel.org/project/linux-btrfs/patch/20180812013358.16431-1-wqu@suse.com/

All my trials and checks were performed with btrfs-progs v4.20.1-2, the
latest version in Debian buster:
https://packages.debian.org/de/buster/btrfs-progs

I have since upgraded btrfs-progs to the Debian backport v5.10.1, but I still
cannot find any option to disable quota offline:
https://packages.debian.org/buster-backports/btrfs-progs

Can you point me in some direction on how to recover the filesystem?

Thanks a lot,

admiralbulli


* Re: BTRFS w/ quotas hangs on read-write mount using all available RAM - rev2
@ 2024-05-05  3:55 O'Brien Dave
  2024-05-05  6:09 ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: O'Brien Dave @ 2024-05-05  3:55 UTC (permalink / raw)
  To: linux-btrfs

Dear BTRFS team, 

I’ve had a weird hanging situation as described by the previous email in this chain: https://lore.kernel.org/linux-btrfs/133101d8dbce$c666a030$5333e090$@admiralbulli.de/
The situation:
I have /home on 2x6TB HDDs in btrfs, with RAID0 for data and RAID1 for metadata. A cron job makes a daily snapshot overnight, so there are about 1000 snapshots on it. (/ is on a separate SSD.)

To see where all the space was going, I enabled quotas: `btrfs quota enable /home`, and it started doing its thing. When it was nearly complete, I deleted one of the subvols with `btrfs subvol delete /home/BACKUP....` (one of the earlier backups, about 117MB exclusive, according to the qgroup), and realised it would take a while to complete, so I left it alone.

Later that same day, there was a power outage, and when I restarted the box, everything came up as normal, but a `btrfs-cleaner` process started that eventually took all of memory (32GB) and then eventually made the machine non-responsive. I tried to disable the quotas with `btrfs quota disable /home` while this was happening, but the command didn't return.

I rebooted in single user with `/home` unmounted, set up 128GB of swap using a USB 3.0 flashdrive, then ran `btrfs check -p -Q /home`. It took 75 hours to run, and used a max of about 80GB of RAM+Swap, and reported no errors.  I tried to mount the drive as normal again, and once more `btrfs-cleaner` spins up, takes all memory and makes everything unresponsive, with constant `OOM` killings of all processes, until eventually the system crashed. It didn't use the swap much, which might be relevant.  All through this, `btrfs-orphan-cleanup-progress` reports that there is one orphan to be deleted, corresponding to the snapshot I deleted, and it doesn't go away.
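
The sequence above (temporary swap, then an offline check with a qgroup report) could be scripted roughly as follows. This is only a sketch: the device names `/dev/sdX1` and `/dev/sdY` are placeholders, the `run` and `offline_check` helpers are hypothetical, and it assumes `btrfs check` is pointed at the underlying block device of the unmounted filesystem. With `DRY_RUN=1` (the default) it only prints what it would do.

```shell
# Placeholder devices (assumptions, not taken from the report above).
SWAP_DEV=${SWAP_DEV:-/dev/sdX1}   # partition on the USB flash drive
BTRFS_DEV=${BTRFS_DEV:-/dev/sdY}  # one member device of the unmounted btrfs
DRY_RUN=${DRY_RUN:-1}             # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Add temporary swap, run the check with progress (-p) and a qgroup
# report (-Q), then release the swap again.
offline_check() {
    run mkswap "$SWAP_DEV"
    run swapon "$SWAP_DEV"
    run btrfs check -p -Q "$BTRFS_DEV"
    run swapoff "$SWAP_DEV"
}
```

Calling `offline_check` prints the four commands; `DRY_RUN=0 offline_check` as root would actually execute them, and `mkswap` destroys whatever is on the swap device.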

`btrfs qgroup show /home` shows the deleted subvol as <stale>.

I can mount the volume read-only, and with `ro,rescue=all`, without any trouble, and nothing dramatic appears in the system logs, but mounting with the default options causes the eventual crash of the machine as described above.

I cannot run `btrfs quota disable /home` as the command doesn't return, and the system eventually locks up when mounted RW.

My current kernel is 6.8.7-fc200, which should have all of the optimisations discussed in previous emails in this thread. The filesystem is about 3 years old (created 2021/04). I don’t remember which kernel was running then, but it should have been at least 5.8 according to https://en.wikipedia.org/wiki/Fedora_Linux_release_history.

Is there a way to disable the quotas with the device unmounted? (I don’t really need that info, and I can always rescan later.) I made a start at patching the `disable-quota` command into btrfs-progs, but it reports an open transaction when run.

Any advice on how to proceed?  (Apart from backup everything, of course)

thanks and regards,
dave

* Re: BTRFS w/ quotas hangs on read-write mount using all available RAM - rev2
@ 2024-05-07 13:43 O'Brien Dave
  2024-05-07 20:44 ` Qu Wenruo
  0 siblings, 1 reply; 8+ messages in thread
From: O'Brien Dave @ 2024-05-07 13:43 UTC (permalink / raw)
  To: quwenruo.btrfs; +Cc: linux-btrfs

> So the only way to get rid of the situation is using the newer sysfs
> interface "/sys/fs/btrfs/<uuid>/qgroups/drop_subtree_threshold".
>
> Some lower value like 2 or 3 would be good enough to address the
> situation, which would automatically change qgroup to inconsistent if a
> large enough subtree is dropped.

Setting the threshold to 2 or 3 didn't work - the machine ran until OOM failure in both cases - but what did work was setting it to 1 or 0. (I’m not sure which fixed it, as I set it to 1, then 0, there was a flurry of disk activity and the qgroups were immediately marked as inconsistent.)

So, after rebooting into single user mode with /home mounted ro:

$ vim /etc/fstab # to change /home back to the defaults
$ mount /home
$ echo "0" >/sys/fs/btrfs/<UUID>/qgroups/drop_subtree_threshold
$ cat /sys/fs/btrfs/<UUID>/qgroups/drop_subtree_threshold  # to check
$ btrfs qgroup show -pcre /home
$ btrfs quota disable /home
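
The threshold step above can be wrapped in a small helper, sketched here under a few assumptions: the function names `threshold_path` and `set_threshold` are made up for illustration, the only validation is that the value is a non-negative integer, and nothing is written unless `DRY_RUN=0` is set.

```shell
DRY_RUN=${DRY_RUN:-1}   # 1 = only print what would be written

# Print the sysfs path of the knob for a given filesystem UUID.
threshold_path() {
    printf '/sys/fs/btrfs/%s/qgroups/drop_subtree_threshold\n' "$1"
}

# set_threshold UUID VALUE
# Write VALUE to the knob and read it back (or just report the
# intended write in dry-run mode).
set_threshold() {
    uuid=$1
    value=$2
    case $value in
        ''|*[!0-9]*)
            echo "threshold must be a non-negative integer" >&2
            return 1
            ;;
    esac
    path=$(threshold_path "$uuid")
    if [ "$DRY_RUN" = 1 ]; then
        echo "would write $value to $path"
    else
        echo "$value" > "$path" && cat "$path"
    fi
}
```

With /home mounted rw, `DRY_RUN=0 set_threshold "$(findmnt -no UUID /home)" 0` as root would perform the write, after which `btrfs quota disable /home` can be attempted as in the steps above.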

Thanks for your help!


end of thread, other threads:[~2024-05-07 20:44 UTC | newest]

Thread overview: 8+ messages
2022-10-09 11:03 BTRFS w/ quotas hangs on read-write mount using all available RAM - rev2 admiral
2022-10-09 11:13 ` Qu Wenruo
2022-10-09 11:37   ` Qu Wenruo
2022-10-10 21:55     ` admiral
  -- strict thread matches above, loose matches on Subject: below --
2024-05-05  3:55 O'Brien Dave
2024-05-05  6:09 ` Qu Wenruo
2024-05-07 13:43 O'Brien Dave
2024-05-07 20:44 ` Qu Wenruo
