* [PATCH v2 0/3] btrfs: scrub: enhance freezing and signal handling
@ 2025-10-16  9:42 Qu Wenruo
  2025-10-16  9:42 ` [PATCH v2 1/3] btrfs: scrub: add cancel/pause/removed bg checks for raid56 parity stripes Qu Wenruo
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Qu Wenruo @ 2025-10-16  9:42 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v2:
- Remove copy-pasted comments that are too obvious in the first place
  Also remove a stale comment in the old code.

- Add extra explanation on why both fs and process freezing need to be
  checked
  Mostly due to the configurable behavior of pm suspension/hibernation,
  thus we have to handle both cases.

It's a long-known bug that when scrub/dev-replace is running, power
management suspension will time out and fail.

After more debugging and help from Askar Safin, it turns out there are
at least 3 components involved:

- Process freezing
  This happens during the preparation for suspension, which requires all
  user space processes (and some kthreads) to be frozen. To be frozen, a
  process has to return to user space.

  Thus if the process (normally the btrfs command) is stuck in a long
  running ioctl (like scrub/dev-replace), it will not be frozen, thus
  breaking the pm suspension.

  This means pausing the scrub is not feasible, as a paused scrub still
  keeps the process executing the ioctl trapped inside kernel space.

- Filesystem freezing
  It's an optional behavior during pm suspension. Previously I submitted
  one patch detecting such a situation, and so far it works as expected.
  But fs freezing is only optional, not yet the default behavior of pm
  suspension.

- Systemd slice freezing
  This is the most complex part that I have not yet fully pinned down,
  but during the tests it looks like systemd is sending some signals to
  the processes under the user slice.

  Thus if the process stays inside the kernel for a long time, it will
  not return to user space and has no chance to handle the signal.
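
As a rough user space analogy of the signal problem above (not the kernel
code itself), the sketch below shows a long-running loop that polls a flag
set by a signal handler and bails out early. All names here
(got_signal, on_signal, long_running_job) are hypothetical, and the
self-sent SIGUSR1 only imitates an external sender such as a service
manager. Inside the kernel the analogous check would be signal_pending().

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/* Set by the handler; stands in for the per-task "signal pending" state. */
static volatile sig_atomic_t got_signal;

static void on_signal(int signo)
{
	(void)signo;
	got_signal = 1;
}

/*
 * Hypothetical long-running job made of total_steps units of work.
 * If raise_at >= 0, the job sends SIGUSR1 to itself right before that
 * step, imitating a signal coming from outside.
 * Returns 0 when every step finished, or -EINTR when it bailed out
 * early. *done receives the number of completed steps.
 */
static int long_running_job(int total_steps, int raise_at, int *done)
{
	got_signal = 0;
	*done = 0;
	if (signal(SIGUSR1, on_signal) == SIG_ERR)
		return -errno;

	for (int i = 0; i < total_steps; i++) {
		if (i == raise_at)
			raise(SIGUSR1);
		/* Poll between units of work so the caller regains control. */
		if (got_signal)
			return -EINTR;
		(*done)++;
	}
	return 0;
}
```

Without the poll inside the loop, the job would only notice the signal
after all steps are done, which is the kind of delay described above.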

To address all those problems, the series will:

- Add extra cancel/pause/removed bg checks for raid56 parity stripes
  Mostly to reduce delay for RAID56 cases, and make the behavior more
  consistent.

- Cancel the scrub if the fs or process is being frozen
  Please note that here we have to check both fs and process freezing.
  Please refer to the changelog and comment of the second patch for the
  reason.

- Cancel the scrub if there is a pending signal
  This is mostly for the systemd slice handling, which affects users
  running the scrub inside a user slice. This can cause an obvious
  delay during pm suspension/hibernation and power off/restart.
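
The three points above boil down to one check done between stripes. Below
is a minimal user space simulation of that idea, not the actual patch:
struct scrub_sim, scrub_should_stop() and scrub_stripes() are hypothetical
names, and the choice of -ECANCELED for cancel/freeze and -EINTR for a
pending signal is an assumption made for illustration only. In the kernel,
the process freezing and signal conditions would map to helpers like
freezing() and signal_pending().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stripe index at which each event becomes visible, or -1 for never. */
struct scrub_sim {
	int cancel_at;		/* user requested cancel */
	int fs_freeze_at;	/* filesystem is being frozen */
	int task_freeze_at;	/* calling process is being frozen */
	int signal_at;		/* a signal is pending for the caller */
};

static bool reached(int at, int stripe)
{
	return at >= 0 && stripe >= at;
}

/* Return 0 to continue, or a negative errno to end the run early. */
static int scrub_should_stop(const struct scrub_sim *sim, int stripe)
{
	/* Both fs and process freezing are checked, either one may be in use. */
	if (reached(sim->cancel_at, stripe) ||
	    reached(sim->fs_freeze_at, stripe) ||
	    reached(sim->task_freeze_at, stripe))
		return -ECANCELED;
	if (reached(sim->signal_at, stripe))
		return -EINTR;
	return 0;
}

/* Scrub nr_stripes stripes, checking for a stop condition before each. */
static int scrub_stripes(const struct scrub_sim *sim, int nr_stripes,
			 int *scrubbed)
{
	*scrubbed = 0;
	for (int i = 0; i < nr_stripes; i++) {
		int ret = scrub_should_stop(sim, i);

		if (ret < 0)
			return ret;
		(*scrubbed)++;	/* stand-in for the real per-stripe work */
	}
	return 0;
}
```

The point of checking per stripe is that the worst-case delay before the
ioctl returns is bounded by one stripe of work rather than the whole run.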

Qu Wenruo (3):
  btrfs: scrub: add cancel/pause/removed bg checks for raid56 parity
    stripes
  btrfs: scrub: cancel the run if the process or fs is being frozen
  btrfs: scrub: cancel the run if there is a pending signal

 fs/btrfs/scrub.c | 64 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 8 deletions(-)

-- 
2.51.0



Thread overview: 13+ messages
2025-10-16  9:42 [PATCH v2 0/3] btrfs: scrub: enhance freezing and signal handling Qu Wenruo
2025-10-16  9:42 ` [PATCH v2 1/3] btrfs: scrub: add cancel/pause/removed bg checks for raid56 parity stripes Qu Wenruo
2025-10-16  9:42 ` [PATCH v2 2/3] btrfs: scrub: cancel the run if the process or fs is being frozen Qu Wenruo
2025-10-16 10:02   ` Filipe Manana
2025-10-16  9:42 ` [PATCH v2 3/3] btrfs: scrub: cancel the run if there is a pending signal Qu Wenruo
2025-10-17  9:47   ` Askar Safin
2025-10-17 10:34     ` Qu Wenruo
2025-10-17  9:39 ` [PATCH v2 0/3] btrfs: scrub: enhance freezing and signal handling Askar Safin
2025-10-17 13:03   ` Askar Safin
2025-10-18  4:16 ` Askar Safin
2025-10-18  4:26   ` Qu Wenruo
2025-10-18  6:43     ` Askar Safin
2025-10-18  7:14       ` Qu Wenruo
