From: Qu Wenruo <wqu@suse.com>
To: linux-btrfs@vger.kernel.org
Cc: safinaskar@gmail.com
Subject: [PATCH v3 0/3] btrfs: scrub: enhance freezing and signal handling
Date: Sun, 19 Oct 2025 11:15:25 +1030
Message-ID: <cover.1760834294.git.wqu@suse.com>
[CHANGELOG]
v3:
- Update the commit message and cover letter to indicate a behavior
change
Btrfs-progs updates will follow soon after the merge into for-next.
This will include doc updates, and maybe an automatic resume behavior
(with new options to toggle it).
- Update the commit message of the last patch to explain why v2 cgroup
freezing depends on signals
Unlike the legacy freezer of v1 cgroup, the v2 cgroup freezer is fully based
on signal handling, thus freezing through v2 cgroup looks exactly like
a pending non-fatal signal, and freezing() will not return true in
this case.
- Add the reviewed-by tag from Filipe
v2:
- Remove copy-pasted comments that are too obvious in the first place
Also remove a stale comment in the old code.
- Add extra explanation on why both fs and process freezing need to be
checked
Mostly due to the configurable behavior of pm suspension/hibernation,
thus we have to handle both cases.
It's a long-known bug that when scrub/dev-replace is running, power
management suspension will time out and fail.
After more debugging and help from Askar Safin, it turns out there are
at least 3 components involved:
- Process freezing
This happens during the preparation for suspension, which requires all user
space processes (and some kthreads) to be frozen, which in turn requires
each process to return to user space.
Thus if the process (normally the btrfs command) is stuck in a long
running ioctl (like scrub/dev-replace), it will not be frozen, thus
breaking the pm suspension.
This means pausing the scrub is not feasible, as a paused scrub will still
keep the process executing the ioctl trapped inside kernel space.
- Filesystem freezing
It's an optional behavior during pm suspension. Previously I submitted
a patch detecting such a situation, and so far it works as expected.
But this fs freezing is only optional, not yet the default behavior of pm
suspension.
- Systemd slice freezing
Systemd slice freezing utilizes the cgroup freezer, and the freezing()
checks will not return true until the whole cgroup is marked frozen.
But before that a wakeup signal is sent to the user process, and
during the kernel signal handling that process will be frozen.
So until the process returns to user space, it will not be marked
frozen.
Thus we have to do regular pending signal checks to prevent cgroup
freezing from timing out.
To address all those problems, the series will:
- Add extra cancel/pause/removed bg checks for raid56 parity stripes
Mostly to reduce delay for RAID56 cases, and make the behavior more
consistent.
- Cancel the scrub if the fs or process is being frozen
Note that we have to check both fs and process freezing here; please
refer to the changelog and comments of the second patch for the
reason.
- Cancel the scrub if there is a pending signal
This allows regular signals to interrupt scrub/dev-replace, without the
extra signal handling hack in btrfs-progs (but that existing handling
won't hurt either).
This also addresses the timeout during systemd slice freezing. Unlike
pm freezing, v2 cgroup freezing is fully based on signal handling, thus
the freezing() function will not return true during v2 cgroup freezing.
So when v2 cgroup is freezing the process running scrub/replace, we
will properly detect a pending signal and abort the scrub/replace so
that freezing can be done when the process returns to user space.
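To illustrate the combined checks listed above, here is a minimal user
space model, not the actual code from the patches. All names
(model_state, should_cancel, model_scrub) are hypothetical; the three
flags only stand in for kernel-side state such as freezing() on the
current process, the fs freeze state, and a pending signal. The loop
checks the flags before each stripe and bails out with -EINTR as soon
as any of them is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for kernel state. In the kernel these would be
 * real checks (process freezing, fs freezing, pending signal), not flags.
 */
struct model_state {
	bool process_freezing;
	bool fs_freezing;
	bool signal_pending;
};

/* The scrub should stop if any of the three conditions is hit. */
static bool should_cancel(const struct model_state *st)
{
	return st->process_freezing || st->fs_freezing || st->signal_pending;
}

/*
 * Model of a long running scrub over nr_stripes stripes.
 *
 * interrupt_at simulates a signal (e.g. a v2 cgroup freeze request)
 * arriving right before that stripe index; pass -1 for "never".
 * Returns 0 if all stripes were scrubbed, -EINTR if interrupted.
 * *done receives the number of stripes finished before returning.
 */
static int model_scrub(struct model_state *st, int nr_stripes,
		       int interrupt_at, int *done)
{
	assert(st != NULL && done != NULL);

	*done = 0;
	for (int i = 0; i < nr_stripes; i++) {
		if (i == interrupt_at)
			st->signal_pending = true;
		/* Check before each stripe so the delay stays bounded. */
		if (should_cancel(st))
			return -EINTR;
		(*done)++;
	}
	return 0;
}
```

In this model a signal arriving before stripe 3 of 8 makes model_scrub()
return -EINTR with 3 stripes done, so the caller gets back to user space
quickly instead of blocking the freezer until the whole run finishes.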
[BEHAVIOR CHANGE]
Unfortunately this will bring a behavior change, which will mostly
affect dev-replace:
Dev-replace will be interrupted by pm, and can not be resumed; it can
only be restarted from the beginning.
The same dev-replace will now fail due to pm, breaking the old
expectation. End users will need extra steps to prevent
suspension/hibernation instead.
And btrfs-progs also needs to be updated to handle the new -EINTR
error (maybe restart the operation if possible).
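As a rough idea of what "restart the operation if possible" could look
like on the user space side, here is a small sketch. It is not
btrfs-progs code: restartable_op, run_with_restart and fake_replace are
hypothetical names, and the callback only models an operation that
returns 0 or a negative errno. The wrapper retries only on -EINTR and
gives up after a bounded number of attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: returns 0 on success or a negative errno. */
typedef int (*restartable_op)(void *ctx);

/*
 * Run op, restarting it from scratch whenever it returns -EINTR.
 * Stops after max_attempts tries and returns the last result.
 * *attempts receives the number of times op was actually called.
 */
static int run_with_restart(restartable_op op, void *ctx,
			    int max_attempts, int *attempts)
{
	int ret = -EINVAL;

	assert(op != NULL && attempts != NULL);

	*attempts = 0;
	while (*attempts < max_attempts) {
		(*attempts)++;
		ret = op(ctx);
		if (ret != -EINTR)
			break;
	}
	return ret;
}

/*
 * Fake operation for demonstration only: it is "interrupted" as long as
 * the counter pointed to by ctx is positive, then succeeds.
 */
static int fake_replace(void *ctx)
{
	int *interruptions_left = ctx;

	if (*interruptions_left > 0) {
		(*interruptions_left)--;
		return -EINTR;
	}
	return 0;
}
```

Bounding the number of attempts matters here, since a restarted
dev-replace starts from the beginning and an unbounded retry loop could
keep redoing the same work across repeated suspend cycles.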
Qu Wenruo (3):
btrfs: scrub: add cancel/pause/removed bg checks for raid56 parity
stripes
btrfs: scrub: cancel the run if the process or fs is being frozen
btrfs: scrub: cancel the run if there is a pending signal
fs/btrfs/scrub.c | 70 +++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 60 insertions(+), 10 deletions(-)
--
2.51.0