From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: O'Brien Dave <odaiwai@gmail.com>, linux-btrfs@vger.kernel.org
Subject: Re: BTRFS w/ quotas hangs on read-write mount using all available RAM - rev2
Date: Sun, 5 May 2024 15:39:37 +0930
Message-ID: <618f7bf7-0ffd-407d-a42c-bf86199bb1e0@gmx.com>
In-Reply-To: <57CAA156-27B5-453F-8A83-1C8812E49B98@gmail.com>
On 2024/5/5 13:25, O'Brien Dave wrote:
> Dear BTRFS team,
>
> I’ve had a weird hanging situation as described by the previous email in this chain: https://lore.kernel.org/linux-btrfs/133101d8dbce$c666a030$5333e090$@admiralbulli.de/
> The situation:
> I have /home on 2x6TB HDDs in BTRFS, with RAID0 for data and RAID1 for metadata. I make a daily snapshot by cronjob overnight, so there are about 1000 snapshots on it. (/ is on a separate SSD.)
>
> To see where all the space was going, I enabled quotas: `btrfs quota enable /home`, and it started doing its thing. When it was nearly complete, I deleted one of the subvols with `btrfs subvol delete /home/BACKUP....` (one of the earlier backups, about 117MB exclusive, according to the qgroup), and realised it would take a while to complete, so I left it alone.
Deleting a snapshot is very qgroup heavy: it needs to re-mark all
involved data extents for qgroup to rescan, and furthermore the rescan
has to be done in a single transaction, which will most likely hang the
whole system.
>
> Later that same day, there was a power outage, and when I restarted the box, everything came up as normal, but a `btrfs-cleaner` process started that eventually took all of memory (32GB) and then eventually made the machine non-responsive. I tried to disable the quotas with `btrfs quota disable /home` while this was happening, but the command didn't return.
That's the same thing: it is doing the same subvolume drop.
And unfortunately there is no proper way to handle it without marking
qgroup inconsistent.
So the only way to get out of the situation is to use the newer sysfs
interface "/sys/fs/btrfs/<uuid>/qgroups/drop_subtree_threshold".
A low value like 2 or 3 should be good enough to address the
situation; it automatically marks qgroup inconsistent if a
large enough subtree is dropped.
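
As a rough sketch of what writing to that sysfs file could look like
(the file is spelled "drop_subtree_threshold" in the kernel, as far as I
know): the helper name set_drop_subtree_threshold and its optional
third argument are hypothetical, added only for illustration, and the
qgroups directory is assumed to exist because quotas are enabled on a
mounted filesystem.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs): write a value to the
# qgroup drop_subtree_threshold sysfs file of one btrfs filesystem.
#   $1 = filesystem UUID
#   $2 = threshold value, e.g. 2 or 3
#   $3 = sysfs root (assumption: defaults to /sys/fs/btrfs)
set_drop_subtree_threshold() {
    uuid="$1"
    value="$2"
    sysfs_root="${3:-/sys/fs/btrfs}"
    knob="$sysfs_root/$uuid/qgroups/drop_subtree_threshold"

    # The file is only expected to exist when quotas are enabled and
    # the kernel is new enough to provide this interface.
    if [ ! -e "$knob" ]; then
        echo "no such file: $knob" >&2
        return 1
    fi

    printf '%s\n' "$value" > "$knob"
}

# Example call (needs root; replace <uuid> with the filesystem UUID):
#   set_drop_subtree_threshold <uuid> 3
```

The UUID can usually be looked up with something like
`findmnt -no UUID /home`.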
Thanks,
Qu
>
> I rebooted in single user with `/home` unmounted, set up 128GB of swap using a USB 3.0 flashdrive, then ran `btrfs check -p -Q /home`. It took 75 hours to run, and used a max of about 80GB of RAM+Swap, and reported no errors. I tried to mount the drive as normal again, and once more `btrfs-cleaner` spins up, takes all memory and makes everything unresponsive, with constant `OOM` killings of all processes, until eventually the system crashed. It didn't use the swap much, which might be relevant. All through this, `btrfs-orphan-cleanup-progress` reports that there is one orphan to be deleted, corresponding to the snapshot I deleted, and it doesn't go away.
>
> `btrfs qgroup show /home` shows the deleted subvol as <stale>.
>
> I can mount the volume read-only and with `ro,rescue=all` with no drama, and nothing dramatic appears in the system logs, but mounting with the default options causes the eventual crash of the machine as described above.
>
> I cannot run `btrfs quota disable /home` as the command doesn't return, and the system eventually locks up when mounted RW.
>
> My current kernel is 6.8.7-fc200, which should have all of the optimisations discussed in previous emails in this thread. The filesystem is about 3 years old (2021/04). I don’t remember which kernel was running then, but it should have been at least 5.8 according to https://en.wikipedia.org/wiki/Fedora_Linux_release_history.
>
> Is there a way to disable the quotas with the device unmounted? (I don’t really need that info, and I can always rescan later.) I made a start at patching the `disable-quota` command into btrfs-progs, but it reports an open transaction when run.
>
> Any advice on how to proceed? (Apart from backing up everything, of course.)
>
> thanks and regards,
> dave