linux-mm.kvack.org archive mirror
* [RFC PATCH v1 0/9] freezer: Introduce freeze priority model to address process dependency issues
@ 2025-08-07 12:14 Zihuan Zhang
From: Zihuan Zhang @ 2025-08-07 12:14 UTC
  To: Rafael J . Wysocki, Peter Zijlstra, Oleg Nesterov,
	David Hildenbrand, Michal Hocko, Jonathan Corbet
  Cc: Ingo Molnar, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	len brown, pavel machek, Kees Cook, Andrew Morton,
	Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Catalin Marinas, Nico Pache, xu xin,
	wangfushuai, Andrii Nakryiko, Christian Brauner, Thomas Gleixner,
	Jeff Layton, Al Viro, Adrian Ratiu, linux-pm, linux-mm,
	linux-fsdevel, linux-doc, linux-kernel, Zihuan Zhang

The Linux task freezer was designed in a much earlier era, when userspace was relatively simple and flat.
Over the years, as modern desktop and mobile systems have become increasingly complex—with intricate IPC,
asynchronous I/O, and deep event loops—the original freezer model has shown its age.

## Background

Currently, the freezer traverses the task list linearly and attempts to freeze all tasks equally.
It sends a signal and waits for `freezing()` to become true. While this model works well in many cases, it has several inherent limitations:

- Signal-based logic cannot freeze uninterruptible (D-state) tasks
- Dependencies between processes can cause freeze retries 
- Retry-based recovery introduces unpredictable suspend latency
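
To make the retry behaviour concrete, here is a minimal user-space sketch of a flat, retry-based freeze loop. It is a toy model, not the kernel code: `struct flat_task`, `busy_passes`, and `flat_freeze()` are hypothetical names, and "a task is unfreezable for N passes" is an assumed simplification of a task sitting in D-state.

```c
#include <assert.h>
#include <stddef.h>

/* Toy task: hypothetical stand-in for a task the freezer visits. */
struct flat_task {
	int pid;
	int busy_passes;	/* passes during which the task cannot freeze */
	int frozen;
};

/*
 * Walk the list linearly and treat every task the same way.
 * Returns the number of passes needed to freeze everything,
 * or -1 if max_passes is exhausted first.
 */
int flat_freeze(struct flat_task *tasks, size_t n, int max_passes)
{
	assert(tasks != NULL && max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		size_t todo = 0;

		for (size_t i = 0; i < n; i++) {
			if (tasks[i].frozen)
				continue;
			if (tasks[i].busy_passes > 0) {
				/* Still blocked: try again on the next pass. */
				tasks[i].busy_passes--;
				todo++;
			} else {
				tasks[i].frozen = 1;
			}
		}
		if (todo == 0)
			return pass;
	}
	return -1;
}
```

In this model a single blocked task sets the pass count for the whole list, which is the source of the unpredictable latency described above.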

## Real-world problem illustration

Consider the following scenario during suspend:

Freeze Window Begins

    [process A] - epoll_wait()
        │
        ▼
    [process B] - event source (already frozen)

→ A enters D-state while waiting for B
→ Cannot respond to freezing signal
→ Freezer retries in a loop
→ Suspend latency spikes

In such cases, we observed that a normal 1–2ms freezer cycle could balloon to **tens of milliseconds**. 
Worse, the kernel has no insight into the root cause and simply retries blindly.

## Proposed solution: Freeze priority model

To address this, we propose a **layered freeze model** based on per-task freeze priorities.

### Design

We introduce 4 levels of freeze priority:


| Priority | Level             | Description                       |
|----------|-------------------|-----------------------------------|
| 0        | HIGH              | D-state tasks                     |
| 1        | NORMAL            | regular user space tasks          |
| 2        | LOW               | not yet used                      |
| 4        | NEVER_FREEZE      | zombie tasks, PF_SUSPEND_TASK     |


The kernel will freeze processes **in priority order**, so that higher-priority tasks are frozen first.
This provides a deterministic path forward for tricky cases.
By freezing control or event-source threads first, we prevent dependent tasks from entering D-state prematurely, which avoids dependency inversion.
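
The layered ordering can be sketched in user space as follows. This is an illustration under stated assumptions, not the code from this series: the `FREEZE_PRIO_*` identifiers, `struct toy_task`, and `layered_freeze()` are hypothetical, and only the numeric values are taken from the table above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers; the numeric values mirror the table above. */
enum freeze_prio {
	FREEZE_PRIO_HIGH   = 0,	/* D-state tasks */
	FREEZE_PRIO_NORMAL = 1,	/* regular user space tasks */
	FREEZE_PRIO_LOW    = 2,	/* not yet used */
	FREEZE_PRIO_NEVER  = 4,	/* zombies, PF_SUSPEND_TASK: skipped */
};

struct toy_task {
	int pid;
	enum freeze_prio prio;
	int frozen;
};

/*
 * Freeze one priority layer at a time, HIGH first. Tasks marked
 * NEVER are not part of any layer and stay untouched. The pids are
 * recorded in order[] in the order they were frozen; the return
 * value is the number of tasks frozen.
 */
size_t layered_freeze(struct toy_task *tasks, size_t n, int *order)
{
	static const enum freeze_prio layers[] = {
		FREEZE_PRIO_HIGH, FREEZE_PRIO_NORMAL, FREEZE_PRIO_LOW,
	};
	size_t count = 0;

	assert(tasks != NULL && order != NULL);

	for (size_t l = 0; l < sizeof(layers) / sizeof(layers[0]); l++) {
		for (size_t i = 0; i < n; i++) {
			if (tasks[i].prio != layers[l] || tasks[i].frozen)
				continue;
			tasks[i].frozen = 1;
			order[count++] = tasks[i].pid;
		}
	}
	return count;
}
```

Within a layer the sketch keeps list order, so the result is deterministic for a given task list. It walks the list once per layer, which is the traversal overhead discussed in the next paragraph.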

Although introducing more fine-grained freeze_priority levels improves extensibility and allows better modeling of task dependencies, 
it may also introduce additional overhead during task traversal, potentially affecting freezer performance.

In our test environment, increasing the maximum freeze retries to 16 only added ~4ms of overhead to the total suspend latency,
suggesting the added robustness comes at a relatively low cost. However, for latency-critical systems, this trade-off should be carefully evaluated.

## Benefits

- Solves D-state process freeze stalls caused by premature freezing of dependencies
- Enables more robust and reliable suspend/resume on complex userspace systems
- Introduces extensibility: tasks can be categorized by role, urgency, or dependency
- Reduces race conditions by introducing deterministic freezing order

## Previous Discussion
Link: https://lore.kernel.org/all/20250606062502.19607-1-zhangzihuan@kylinos.cn/
Link: https://lore.kernel.org/all/1ca889fd-6ead-4d4f-a3c7-361ea05bb659@kylinos.cn/

## Future directions

This framework opens up several promising areas for further development:

1. Adaptive behavior based on runtime statistics or retry feedback
The freezer adapts dynamically during suspend/hibernate based on the number of retries and which tasks failed to freeze. 
Tasks that failed in previous rounds will be assigned a higher freeze priority, improving convergence speed and reducing unnecessary retries.

2. cgroup-aware hierarchical freezing for containerized systems
The design supports cgroup-aware task traversal and freezing. 
This ensures compatibility with containerized environments, allowing for better control and visibility when freezing processes in different cgroups.

3. Unified freezing of userspace processes and kernel threads
Based on extensive testing, we found that freezing userspace tasks and kernel threads together works reliably in practice. 
Separating them does not resolve dependency issues between user and kernel context. Moreover, most kernel threads are marked as non-freezable,
so including them in the same freeze pass does not impact correctness and simplifies the logic.
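
The retry feedback in item 1 can be sketched as below. This is a user-space illustration only: `promote_failed()`, `struct toy_task`, and the `FREEZE_PRIO_*` names are hypothetical, and raising the priority by one level per failed round is an assumption, not necessarily what the series implements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical identifiers; values follow the priority table. */
enum freeze_prio {
	FREEZE_PRIO_HIGH   = 0,
	FREEZE_PRIO_NORMAL = 1,
	FREEZE_PRIO_LOW    = 2,
	FREEZE_PRIO_NEVER  = 4,
};

struct toy_task {
	int pid;
	enum freeze_prio prio;
	int frozen;
};

/*
 * After a failed round, raise every task that is still not frozen
 * by one level (LOW -> NORMAL -> HIGH) so it is handled earlier in
 * the next round. NEVER and HIGH tasks are left alone.
 * Returns the number of tasks promoted.
 */
size_t promote_failed(struct toy_task *tasks, size_t n)
{
	size_t promoted = 0;

	assert(tasks != NULL);

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].frozen)
			continue;
		if (tasks[i].prio == FREEZE_PRIO_NEVER ||
		    tasks[i].prio == FREEZE_PRIO_HIGH)
			continue;
		/* A smaller number means a higher priority. */
		tasks[i].prio = (enum freeze_prio)(tasks[i].prio - 1);
		promoted++;
	}
	return promoted;
}
```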

Although the current implementation is relatively simple, it already helps alleviate some suspend failures caused by tasks stuck in D state.
In our testing, we observed that certain D-state tasks are triggered by filesystem sync operations during the freezing phase.
At this stage, we don't yet have a comprehensive solution for that class of problems.
This patchset represents a testable version of our design. We plan to further investigate and address such filesystem-related D-state issues in future revisions.

Patch summary:
 - Patch 1-3: Core infrastructure: field, API, layered freeze logic
 - Patch 4-7: Default priorities and dynamic adjustments
 - Patch 8: Statistics: freeze pass retry count
 - Patch 9: Procfs interface for userspace access
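
A user-space caller of the procfs file from patch 9 might look like the sketch below. Only the path `/proc/<pid>/freeze_priority` comes from this series; the helper names are hypothetical, and writing the priority as a decimal integer followed by a newline is an assumption about the file format.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Build "/proc/<pid>/freeze_priority"; returns 0, or -1 if buf is too small. */
int freeze_priority_path(char *buf, size_t len, pid_t pid)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len, "/proc/%ld/freeze_priority", (long)pid);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Write a priority to the file. ASSUMPTION: the file accepts a decimal
 * integer; the accepted format is not stated in this cover letter.
 * Returns 0 on success, -1 on any failure.
 */
int write_freeze_priority(pid_t pid, int prio)
{
	char path[64];
	FILE *f;
	int ok;

	if (freeze_priority_path(path, sizeof(path), pid) != 0)
		return -1;
	f = fopen(path, "w");
	if (f == NULL)
		return -1;	/* e.g. kernel without this series, or no permission */
	ok = fprintf(f, "%d\n", prio) > 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

On a kernel without this series the file does not exist, so `write_freeze_priority()` simply returns -1.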

Zihuan Zhang (9):
  freezer: Introduce freeze_priority field in task_struct
  freezer: Introduce API to set per-task freeze priority
  freezer: Add per-priority layered freeze logic
  freezer: Set default freeze priority for userspace tasks
  freezer: set default freeze priority for PF_SUSPEND_TASK processes
  freezer: Set default freeze priority for zombie tasks
  freezer: raise freeze priority of tasks failed to freeze last time
  freezer: Add retry count statistics for freeze pass iterations
  proc: Add /proc/<pid>/freeze_priority interface

 Documentation/filesystems/proc.rst | 14 ++++++-
 fs/proc/base.c                     | 64 ++++++++++++++++++++++++++++++
 include/linux/freezer.h            | 20 ++++++++++
 include/linux/sched.h              |  3 ++
 kernel/fork.c                      |  1 +
 kernel/power/process.c             | 23 ++++++++++-
 kernel/sched/core.c                |  2 +
 7 files changed, 124 insertions(+), 3 deletions(-)

-- 
2.25.1



