* [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce
@ 2025-07-16 6:26 Zihuan Zhang
2025-07-16 6:26 ` [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency Zihuan Zhang
2025-07-16 12:26 ` [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Rafael J. Wysocki
0 siblings, 2 replies; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-16 6:26 UTC (permalink / raw)
To: rafael J . wysocki, Peter Zijlstra
Cc: Oleg Nesterov, len brown, pavel machek, linux-pm, linux-kernel,
Zihuan Zhang
Hi all,
This patch series improves the performance of the process freezer by
skipping zombie tasks during freezing.
In the suspend and hibernation paths, the freezer traverses all tasks
and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
PF_EXITING) are already dead — they are not schedulable and cannot enter
the refrigerator. Attempting to freeze such tasks is redundant and
unnecessarily increases freezing time.
In particular, on systems under fork storm conditions (e.g., many
short-lived processes quickly becoming zombies), the number of zombie tasks
can spike into the thousands or more. We observed that this causes the
freezer loop to waste significant time processing tasks that are guaranteed
to not need freezing.
Testing and Results
Platform:
- Architecture: x86_64
- CPU: ZHAOXIN KaiXian KX-7000
- RAM: 16 GB
- Kernel: v6.6.93
Results without the patch:
dmesg | grep elap
[ 219.608992] Freezing user space processes completed (elapsed 0.010 seconds)
[ 219.617355] Freezing remaining freezable tasks completed (elapsed 0.008 seconds)
[ 228.029119] Freezing user space processes completed (elapsed 0.013 seconds)
[ 228.040672] Freezing remaining freezable tasks completed (elapsed 0.011 seconds)
[ 236.879065] Freezing user space processes completed (elapsed 0.020 seconds)
[ 236.897976] Freezing remaining freezable tasks completed (elapsed 0.018 seconds)
[ 246.276679] Freezing user space processes completed (elapsed 0.026 seconds)
[ 246.298636] Freezing remaining freezable tasks completed (elapsed 0.021 seconds)
[ 256.221504] Freezing user space processes completed (elapsed 0.030 seconds)
[ 256.248955] Freezing remaining freezable tasks completed (elapsed 0.027 seconds)
[ 266.674987] Freezing user space processes completed (elapsed 0.040 seconds)
[ 266.709811] Freezing remaining freezable tasks completed (elapsed 0.034 seconds)
[ 277.701679] Freezing user space processes completed (elapsed 0.046 seconds)
[ 277.742048] Freezing remaining freezable tasks completed (elapsed 0.040 seconds)
[ 289.246611] Freezing user space processes completed (elapsed 0.046 seconds)
[ 289.290753] Freezing remaining freezable tasks completed (elapsed 0.044 seconds)
[ 301.516854] Freezing user space processes completed (elapsed 0.041 seconds)
[ 301.576287] Freezing remaining freezable tasks completed (elapsed 0.059 seconds)
[ 314.422499] Freezing user space processes completed (elapsed 0.043 seconds)
[ 314.465804] Freezing remaining freezable tasks completed (elapsed 0.043 seconds)
Results with the patch:
dmesg | grep elap
[ 54.161674] Freezing user space processes completed (elapsed 0.007 seconds)
[ 54.171536] Freezing remaining freezable tasks completed (elapsed 0.009 seconds)
[ 62.556462] Freezing user space processes completed (elapsed 0.006 seconds)
[ 62.566496] Freezing remaining freezable tasks completed (elapsed 0.010 seconds)
[ 71.395421] Freezing user space processes completed (elapsed 0.009 seconds)
[ 71.402820] Freezing remaining freezable tasks completed (elapsed 0.007 seconds)
[ 80.785463] Freezing user space processes completed (elapsed 0.010 seconds)
[ 80.793914] Freezing remaining freezable tasks completed (elapsed 0.008 seconds)
[ 90.962659] Freezing user space processes completed (elapsed 0.012 seconds)
[ 90.973519] Freezing remaining freezable tasks completed (elapsed 0.010 seconds)
[ 101.435638] Freezing user space processes completed (elapsed 0.013 seconds)
[ 101.449023] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
[ 112.669786] Freezing user space processes completed (elapsed 0.015 seconds)
[ 112.683540] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
[ 124.585681] Freezing user space processes completed (elapsed 0.017 seconds)
[ 124.599553] Freezing remaining freezable tasks completed (elapsed 0.013 seconds)
[ 136.826635] Freezing user space processes completed (elapsed 0.016 seconds)
[ 136.841840] Freezing remaining freezable tasks completed (elapsed 0.015 seconds)
[ 149.686575] Freezing user space processes completed (elapsed 0.016 seconds)
[ 149.701549] Freezing remaining freezable tasks completed (elapsed 0.014 seconds)
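To make the comparison easier to read, the "Freezing user space processes" times above can be summarized with a few lines of C. This is only a sketch: the arrays are transcribed by hand from the two logs (one entry per round, in milliseconds), and the names unpatched_ms, patched_ms, mean_ms() and reduction_pct() are made up for this illustration.

```c
#include <stddef.h>

/* "Freezing user space processes" elapsed times in ms, transcribed from
 * the two dmesg logs above, one entry per test round. */
const int unpatched_ms[] = {10, 13, 20, 26, 30, 40, 46, 46, 41, 43};
const int patched_ms[]   = { 7,  6,  9, 10, 12, 13, 15, 17, 16, 16};

/* Arithmetic mean of n samples; returns 0 for an empty set. */
double mean_ms(const int *samples, size_t n)
{
    long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return n ? (double)sum / (double)n : 0.0;
}

/* Relative reduction from 'before' to 'after', in percent. */
double reduction_pct(double before, double after)
{
    return before > 0.0 ? 100.0 * (before - after) / before : 0.0;
}
```

With these inputs, mean_ms() gives 31.5 ms without the patch and 12.1 ms with it across the ten rounds (about a 61.6% reduction), and reduction_pct(43, 16) for the final round is about 62.8%. The final-round values, 43 ms and 16 ms, are the same numbers as the ~43 ms and ~16 ms quoted in the patch changelog.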
Here is the user-space fork storm simulator used for testing:
```c
// create_zombie.c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

void usage(const char *prog) {
    fprintf(stderr, "Usage: %s <number_of_zombies>\n", prog);
    exit(EXIT_FAILURE);
}

int main(int argc, char *argv[]) {
    if (argc != 2) {
        usage(argv[0]);
    }

    long num_zombies = strtol(argv[1], NULL, 10);
    if (num_zombies <= 0 || num_zombies > 1000000) {
        fprintf(stderr, "Invalid number of zombies: %ld\n", num_zombies);
        exit(EXIT_FAILURE);
    }

    printf("Creating %ld zombie processes...\n", num_zombies);

    for (long i = 0; i < num_zombies; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork failed");
            exit(EXIT_FAILURE);
        } else if (pid == 0) {
            // Child exits immediately; _exit() avoids flushing the
            // stdio buffer it inherited from the parent
            _exit(0);
        }
        // Parent does NOT wait, leaving zombie
    }

    printf("All child processes created. Sleeping for 60 seconds...\n");
    sleep(60);

    printf("Parent exiting, zombies will be reaped by init.\n");
    return 0;
}
```
And we used a shell loop to suspend repeatedly:
```bash
LOOPS=10

echo none > /sys/power/pm_test
echo 5 > /sys/module/suspend/parameters/pm_test_delay
for ((i=1; i<=LOOPS; i++)); do
    echo "===== Test round $i/$LOOPS ====="
    ./create_zombie $((i * 3000)) &
    sleep 5
    echo mem > /sys/power/state

    pkill create_zombie
    echo "Round $i complete. Waiting 5s..."
    sleep 5
done
echo "==== All $LOOPS rounds complete ===="
```
Zihuan Zhang (1):
PM / Freezer: Skip zombie/dead processes to reduce freeze latency
kernel/power/process.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
--
2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-16 6:26 [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Zihuan Zhang
@ 2025-07-16 6:26 ` Zihuan Zhang
2025-07-16 16:38 ` Oleg Nesterov
2025-07-16 12:26 ` [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Rafael J. Wysocki
1 sibling, 1 reply; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-16 6:26 UTC (permalink / raw)
To: rafael J . wysocki, Peter Zijlstra
Cc: Oleg Nesterov, len brown, pavel machek, linux-pm, linux-kernel,
Zihuan Zhang
When freezing user space during suspend or hibernation, the freezer
iterates over all tasks and attempts to freeze them via
try_to_freeze_tasks().
However, zombie processes (i.e., tasks in EXIT_ZOMBIE state) are no
longer running and will never enter the refrigerator. Trying to freeze
them is meaningless and causes extra overhead, especially when there are
thousands of zombies created during stress conditions such as fork
storms.
This patch skips zombie processes during the freezing phase.
In our testing with ~30,000 user processes (including many zombies), the
average freeze time during suspend (S3) dropped from ~43 ms to ~16 ms:
- Without the patch: ~43 ms average freeze latency
- With the patch: ~16 ms average freeze latency
- Improvement: ~62%
This confirms that skipping zombies significantly speeds up the freezing
process when the system is under heavy load with many short-lived tasks.
Signed-off-by: Zihuan Zhang <zhangzihuan@kylinos.cn>
Changes in v4:
- Fix incomplete patch title
- Add a comment: exit_state is better than PF_NOFREEZE if we only intend
to skip user processes. TODO added for possible future replacement.
Changes in v3:
- Added performance test
Changes in v2:
- Simplified the code and added a check for dead processes
- Rewrote the changelog
---
kernel/power/process.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/power/process.c b/kernel/power/process.c
index dc0dfc349f22..c1d6c5150033 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -51,7 +51,15 @@ static int try_to_freeze_tasks(bool user_only)
todo = 0;
read_lock(&tasklist_lock);
for_each_process_thread(g, p) {
- if (p == current || !freeze_task(p))
+ /*
+ * Zombie and dead tasks are not running anymore and cannot enter
+ * the __refrigerator(). Skipping them avoids unnecessary freeze attempts.
+ *
+ * TODO: Consider using PF_NOFREEZE instead, which may provide
+ * a more generic exclusion mechanism for other non-freezable tasks.
+ * However, for now, exit_state is sufficient to skip user processes.
+ */
+ if (p == current || p->exit_state || !freeze_task(p))
continue;
todo++;
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce
2025-07-16 6:26 [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Zihuan Zhang
2025-07-16 6:26 ` [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency Zihuan Zhang
@ 2025-07-16 12:26 ` Rafael J. Wysocki
2025-07-17 1:02 ` Zihuan Zhang
1 sibling, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2025-07-16 12:26 UTC (permalink / raw)
To: Zihuan Zhang
Cc: rafael J . wysocki, Peter Zijlstra, Oleg Nesterov, len brown,
pavel machek, linux-pm, linux-kernel
Hi,
On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
>
> Hi all,
>
> This patch series improves the performance of the process freezer by
> skipping zombie tasks during freezing.
>
> In the suspend and hibernation paths, the freezer traverses all tasks
> and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
> PF_EXITING) are already dead — they are not schedulable and cannot enter
> the refrigerator. Attempting to freeze such tasks is redundant and
> unnecessarily increases freezing time.
>
> In particular, on systems under fork storm conditions (e.g., many
> short-lived processes quickly becoming zombies), the number of zombie tasks
> can spike into the thousands or more. We observed that this causes the
> freezer loop to waste significant time processing tasks that are guaranteed
> to not need freezing.
I think that the discussion with Peter regarding this has not been concluded.
I thought that there was an alternative patch proposed during that
discussion. If I'm not mistaken about this, what happened to that
patch?
Thanks!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-16 6:26 ` [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency Zihuan Zhang
@ 2025-07-16 16:38 ` Oleg Nesterov
2025-07-16 18:36 ` Peter Zijlstra
2025-07-17 1:16 ` Zihuan Zhang
0 siblings, 2 replies; 13+ messages in thread
From: Oleg Nesterov @ 2025-07-16 16:38 UTC (permalink / raw)
To: Zihuan Zhang
Cc: rafael J . wysocki, Peter Zijlstra, len brown, pavel machek,
linux-pm, linux-kernel
On 07/16, Zihuan Zhang wrote:
>
> @@ -51,7 +51,15 @@ static int try_to_freeze_tasks(bool user_only)
> todo = 0;
> read_lock(&tasklist_lock);
> for_each_process_thread(g, p) {
> - if (p == current || !freeze_task(p))
> + /*
> + * Zombie and dead tasks are not running anymore and cannot enter
> + * the __refrigerator(). Skipping them avoids unnecessary freeze attempts.
> + *
> + * TODO: Consider using PF_NOFREEZE instead, which may provide
> + * a more generic exclusion mechanism for other non-freezable tasks.
> + * However, for now, exit_state is sufficient to skip user processes.
I don't really understand the comment... The freeze_task() paths already
consider PF_NOFREEZE, although we can check it earlier as Peter suggests.
> + */
> + if (p == current || p->exit_state || !freeze_task(p))
> continue;
I leave this to you and Rafael, but this change doesn't look safe to me.
What if the exiting task does some IO after exit_notify() ?
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-16 16:38 ` Oleg Nesterov
@ 2025-07-16 18:36 ` Peter Zijlstra
2025-07-17 1:30 ` Zihuan Zhang
2025-07-17 1:16 ` Zihuan Zhang
1 sibling, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2025-07-16 18:36 UTC (permalink / raw)
To: Oleg Nesterov
Cc: Zihuan Zhang, rafael J . wysocki, len brown, pavel machek,
linux-pm, linux-kernel
On Wed, Jul 16, 2025 at 06:38:55PM +0200, Oleg Nesterov wrote:
> On 07/16, Zihuan Zhang wrote:
> >
> > @@ -51,7 +51,15 @@ static int try_to_freeze_tasks(bool user_only)
> > todo = 0;
> > read_lock(&tasklist_lock);
> > for_each_process_thread(g, p) {
> > - if (p == current || !freeze_task(p))
> > + /*
> > + * Zombie and dead tasks are not running anymore and cannot enter
> > + * the __refrigerator(). Skipping them avoids unnecessary freeze attempts.
> > + *
> > + * TODO: Consider using PF_NOFREEZE instead, which may provide
> > + * a more generic exclusion mechanism for other non-freezable tasks.
> > + * However, for now, exit_state is sufficient to skip user processes.
>
> I don't really understand the comment... The freeze_task() paths already
> consider PF_NOFREEZE, although we can check it earlier as Peter suggests.
Right; I really don't understand why we should special case
->exit_state. Why not DTRT and optimize NOFREEZE if all this really
matters (small gains from what ISTR from the previous discussion).
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce
2025-07-16 12:26 ` [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Rafael J. Wysocki
@ 2025-07-17 1:02 ` Zihuan Zhang
2025-07-17 9:50 ` Rafael J. Wysocki
0 siblings, 1 reply; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-17 1:02 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Peter Zijlstra, Oleg Nesterov, len brown, pavel machek, linux-pm,
linux-kernel
Hi Rafael,
On 2025/7/16 20:26, Rafael J. Wysocki wrote:
> Hi,
>
> On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
>> Hi all,
>>
>> This patch series improves the performance of the process freezer by
>> skipping zombie tasks during freezing.
>>
>> In the suspend and hibernation paths, the freezer traverses all tasks
>> and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
>> PF_EXITING) are already dead — they are not schedulable and cannot enter
>> the refrigerator. Attempting to freeze such tasks is redundant and
>> unnecessarily increases freezing time.
>>
>> In particular, on systems under fork storm conditions (e.g., many
>> short-lived processes quickly becoming zombies), the number of zombie tasks
>> can spike into the thousands or more. We observed that this causes the
>> freezer loop to waste significant time processing tasks that are guaranteed
>> to not need freezing.
> I think that the discussion with Peter regarding this has not been concluded.
>
> I thought that there was an alternative patch proposed during that
> discussion. If I'm not mistaken about this, what happened to that
> patch?
>
> Thanks!
>
Currently, the general consensus from the discussion is that skipping
zombie or dead tasks can help reduce locking overhead during freezing.
The remaining question is how best to implement that.
Peter suggested skipping all tasks with PF_NOFREEZE, which would make
the logic more general and cover all cases. However, as Oleg pointed
out, the current implementation based on PF_NOFREEZE might be problematic.
My current thought is that exit_state already reliably covers all
exiting user processes, and it’s a good fit for skipping user-space
tasks. For the kernel side, we may safely skip a few kernel threads like
kthreadd that set PF_NOFREEZE and never change it — we can consider
refining this further in the future.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-16 16:38 ` Oleg Nesterov
2025-07-16 18:36 ` Peter Zijlstra
@ 2025-07-17 1:16 ` Zihuan Zhang
2025-07-17 1:31 ` Oleg Nesterov
1 sibling, 1 reply; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-17 1:16 UTC (permalink / raw)
To: Oleg Nesterov
Cc: rafael J . wysocki, Peter Zijlstra, len brown, pavel machek,
linux-pm, linux-kernel
Hi Oleg,
On 2025/7/17 00:38, Oleg Nesterov wrote:
> On 07/16, Zihuan Zhang wrote:
>> @@ -51,7 +51,15 @@ static int try_to_freeze_tasks(bool user_only)
>> todo = 0;
>> read_lock(&tasklist_lock);
>> for_each_process_thread(g, p) {
>> - if (p == current || !freeze_task(p))
>> + /*
>> + * Zombie and dead tasks are not running anymore and cannot enter
>> + * the __refrigerator(). Skipping them avoids unnecessary freeze attempts.
>> + *
>> + * TODO: Consider using PF_NOFREEZE instead, which may provide
>> + * a more generic exclusion mechanism for other non-freezable tasks.
>> + * However, for now, exit_state is sufficient to skip user processes.
> I don't really understand the comment... The freeze_task() paths already
> consider PF_NOFREEZE, although we can check it earlier as Peter suggests.
You’re right — freeze_task() already takes PF_NOFREEZE into account.
Our intention here is to skip zombie and dead tasks earlier to avoid
calling freeze_task() unnecessarily, especially when the number of such
tasks is large.
The comment is meant to highlight a possible future direction: while
exit_state already allows us to skip all exiting user-space tasks
safely, we may later extend the logic to skip certain kernel threads
that set PF_NOFREEZE and never clear it (e.g., kthreadd), as suggested
by Peter.
>> + */
>> + if (p == current || p->exit_state || !freeze_task(p))
>> continue;
> I leave this to you and Rafael, but this change doesn't look safe to me.
> What if the exiting task does some IO after exit_notify() ?
Tasks that have passed exit_notify() and entered EXIT_ZOMBIE are no
longer schedulable, so they cannot do I/O anymore. Skipping them during
freezing should be safe.
> Oleg.
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-16 18:36 ` Peter Zijlstra
@ 2025-07-17 1:30 ` Zihuan Zhang
2025-07-17 1:43 ` Oleg Nesterov
0 siblings, 1 reply; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-17 1:30 UTC (permalink / raw)
To: Peter Zijlstra, Oleg Nesterov
Cc: rafael J . wysocki, len brown, pavel machek, linux-pm,
linux-kernel
Hi Peter,
On 2025/7/17 02:36, Peter Zijlstra wrote:
> On Wed, Jul 16, 2025 at 06:38:55PM +0200, Oleg Nesterov wrote:
>> On 07/16, Zihuan Zhang wrote:
>>> @@ -51,7 +51,15 @@ static int try_to_freeze_tasks(bool user_only)
>>> todo = 0;
>>> read_lock(&tasklist_lock);
>>> for_each_process_thread(g, p) {
>>> - if (p == current || !freeze_task(p))
>>> + /*
>>> + * Zombie and dead tasks are not running anymore and cannot enter
>>> + * the __refrigerator(). Skipping them avoids unnecessary freeze attempts.
>>> + *
>>> + * TODO: Consider using PF_NOFREEZE instead, which may provide
>>> + * a more generic exclusion mechanism for other non-freezable tasks.
>>> + * However, for now, exit_state is sufficient to skip user processes.
>> I don't really understand the comment... The freeze_task() paths already
>> consider PF_NOFREEZE, although we can check it earlier as Peter suggests.
> Right; I really don't understand why we should special case
> ->exit_state. Why not DTRT and optimize NOFREEZE if all this really
> matters (small gains from what ISTR from the previous discussion).
The main reason we didn’t rely directly on PF_NOFREEZE is that it’s a
mutable flag — in some cases, it can be cleared later, which makes early
skipping potentially unsafe.
In contrast, exit_state is stable and skipping tasks based on it is safe.
Also, the previous version of the patch you shared might allow some
paths to bypass lock_system_sleep(), which could break the intended
protection.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-17 1:16 ` Zihuan Zhang
@ 2025-07-17 1:31 ` Oleg Nesterov
2025-07-17 8:45 ` Zihuan Zhang
0 siblings, 1 reply; 13+ messages in thread
From: Oleg Nesterov @ 2025-07-17 1:31 UTC (permalink / raw)
To: Zihuan Zhang
Cc: rafael J . wysocki, Peter Zijlstra, len brown, pavel machek,
linux-pm, linux-kernel
Hi Zihuan,
On 07/17, Zihuan Zhang wrote:
>
> >>+ */
> >>+ if (p == current || p->exit_state || !freeze_task(p))
> >> continue;
> >I leave this to you and Rafael, but this change doesn't look safe to me.
> >What if the exiting task does some IO after exit_notify() ?
>
> Tasks that have passed exit_notify() and entered EXIT_ZOMBIE are no longer
> schedulable,
How so? please look at do_exit(). The exiting task is still running
until it does its last __schedule() in do_task_dead().
> so they cannot do I/O anymore. Skipping them during freezing
> should be safe
Oleg.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-17 1:30 ` Zihuan Zhang
@ 2025-07-17 1:43 ` Oleg Nesterov
0 siblings, 0 replies; 13+ messages in thread
From: Oleg Nesterov @ 2025-07-17 1:43 UTC (permalink / raw)
To: Zihuan Zhang
Cc: Peter Zijlstra, rafael J . wysocki, len brown, pavel machek,
linux-pm, linux-kernel
On 07/17, Zihuan Zhang wrote:
>
> The main reason we didn’t rely directly on PF_NOFREEZE is that it’s a
> mutable flag — in some cases, it can be cleared later, which makes early
> skipping potentially unsafe.
Afaics userspace tasks can only set PF_NOFREEZE in do_task_dead() and never
clear it.
Apart from lock_system_sleep(). That is why (I think) Peter rightly suggests
to take system_transition_mutex in this function earlier.
> In contrast, exit_state is stable and skipping tasks based on it is safe.
I don't think it is really safe...
Oleg.
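
To make the difference between the two skip criteria concrete, here is a minimal userspace sketch. It is a model under stated assumptions, not the kernel implementation: `struct model_task`, the `MODEL_*` constants (arbitrary values) and the counting helpers are all hypothetical, and the only behaviour assumed is what the thread states, namely that exit_state is set before the task stops running and that PF_NOFREEZE is set for an exiting userspace task only in do_task_dead().

```c
#include <stddef.h>

/* Arbitrary model values; these are NOT the kernel's definitions. */
#define MODEL_EXIT_ZOMBIE	0x2
#define MODEL_PF_NOFREEZE	0x1u

/* Hypothetical stand-in for the two task_struct fields under discussion. */
struct model_task {
	int exit_state;		/* nonzero once exit_notify() has run */
	unsigned int flags;	/* MODEL_PF_NOFREEZE once the task cannot freeze */
};

/* Tasks the "skip if p->exit_state" check would pass over. */
static size_t model_count_skipped_by_exit_state(const struct model_task *t,
						 size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].exit_state)
			skipped++;
	return skipped;
}

/* Tasks a "skip if PF_NOFREEZE" check would pass over. */
static size_t model_count_skipped_by_nofreeze(const struct model_task *t,
					       size_t n)
{
	size_t skipped = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].flags & MODEL_PF_NOFREEZE)
			skipped++;
	return skipped;
}

/* Tasks skipped by exit_state alone: exiting, but not yet marked NOFREEZE. */
static size_t model_count_exit_state_only(const struct model_task *t, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].exit_state && !(t[i].flags & MODEL_PF_NOFREEZE))
			count++;
	return count;
}
```

For a task list holding a running task, an exiting task that has not yet reached do_task_dead(), a fully dead task and a NOFREEZE kernel thread, both checks skip two tasks, but only the exit_state check skips the one that is still executing.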
* Re: [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency
2025-07-17 1:31 ` Oleg Nesterov
@ 2025-07-17 8:45 ` Zihuan Zhang
0 siblings, 0 replies; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-17 8:45 UTC (permalink / raw)
To: Oleg Nesterov
Cc: rafael J . wysocki, Peter Zijlstra, len brown, pavel machek,
linux-pm, linux-kernel
Hi Oleg,
On 2025/7/17 09:31, Oleg Nesterov wrote:
> Hi Zihuan,
>
> On 07/17, Zihuan Zhang wrote:
>>>> + */
>>>> + if (p == current || p->exit_state || !freeze_task(p))
>>>> continue;
>>> I leave this to you and Rafael, but this change doesn't look safe to me.
>>> What if the exiting task does some IO after exit_notify() ?
>> Tasks that have passed exit_notify() and entered EXIT_ZOMBIE are no longer
>> schedulable,
> How so? please look at do_exit(). The exiting task is still running
> until it does its last __schedule() in do_task_dead().
>
To verify the potential presence of EXIT_DEAD tasks during the freezing
stage, I added some logging in try_to_freeze_tasks() to print out any
task with exit_state == EXIT_DEAD. Then I created a fork storm scenario
to ensure a large number of tasks are exiting during the freeze window.
In practice, even after running hundreds of iterations under heavy load,
I wasn’t able to capture any such task being printed. Since the exit
phase is very fast, it seems unlikely that an EXIT_DEAD task stays in
the process list long enough to be observed during the freeze loop.
So I believe it's safe to skip tasks with exit_state in this context.
diff --git a/kernel/power/process.c b/kernel/power/process.c
index c1d6c5150033..054fad43ed31 100644
--- a/kernel/power/process.c
+++ b/kernel/power/process.c
@@ -59,6 +59,8 @@ static int try_to_freeze_tasks(bool user_only)
 			 * a more generic exclusion mechanism for other non-freezable tasks.
 			 * However, for now, exit_state is sufficient to skip user processes.
 			 */
+			if (p->exit_state == EXIT_DEAD)
+				pr_info("current process is going to dead name:%s pid:%d \n", p->comm, p->pid);
 			if (p == current || p->exit_state || !freeze_task(p))
 				continue;
>> so they cannot do I/O anymore. Skipping them during freezing
>> should be safe
> Oleg.
>
* Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce
2025-07-17 1:02 ` Zihuan Zhang
@ 2025-07-17 9:50 ` Rafael J. Wysocki
2025-07-21 10:39 ` Zihuan Zhang
0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2025-07-17 9:50 UTC (permalink / raw)
To: Zihuan Zhang
Cc: Rafael J. Wysocki, Peter Zijlstra, Oleg Nesterov, len brown,
pavel machek, linux-pm, linux-kernel
On Thu, Jul 17, 2025 at 3:02 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
>
> Hi Rafael,
>
> On 2025/7/16 20:26, Rafael J. Wysocki wrote:
> > Hi,
> >
> > On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
> >> Hi all,
> >>
> >> This patch series improves the performance of the process freezer by
> >> skipping zombie tasks during freezing.
> >>
> >> In the suspend and hibernation paths, the freezer traverses all tasks
> >> and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
> >> PF_EXITING) are already dead — they are not schedulable and cannot enter
> >> the refrigerator. Attempting to freeze such tasks is redundant and
> >> unnecessarily increases freezing time.
> >>
> >> In particular, on systems under fork storm conditions (e.g., many
> >> short-lived processes quickly becoming zombies), the number of zombie tasks
> >> can spike into the thousands or more. We observed that this causes the
> >> freezer loop to waste significant time processing tasks that are guaranteed
> >> to not need freezing.
> > I think that the discussion with Peter regarding this has not been concluded.
> >
> > I thought that there was an alternative patch proposed during that
> > discussion. If I'm not mistaken about this, what happened to that
> > patch?
> >
> > Thanks!
> >
>
> Currently, the general consensus from the discussion is that skipping
> zombie or dead tasks can help reduce locking overhead during freezing.
Peter doesn't seem to be convinced that this is the case.
> The remaining question is how best to implement that.
>
> Peter suggested skipping all tasks with PF_NOFREEZE, which would make
> the logic more general and cover all cases. However, as Oleg pointed
> out, the current implementation based on PF_NOFREEZE might be problematic.
>
> My current thought is that exit_state already reliably covers all
> exiting user processes, and it’s a good fit for skipping user-space
> tasks. For the kernel side, we may safely skip a few kernel threads like
> kthreadd that set PF_NOFREEZE and never change it — we can consider
> refining this further in the future.
There is the counter argument of special-casing of p->exit_state and
the relatively weak justification for it.
You have created a synthetic workload where it matters, but how likely
is it to be the case in practice?
* Re: [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce
2025-07-17 9:50 ` Rafael J. Wysocki
@ 2025-07-21 10:39 ` Zihuan Zhang
0 siblings, 0 replies; 13+ messages in thread
From: Zihuan Zhang @ 2025-07-21 10:39 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Peter Zijlstra, Oleg Nesterov, len brown, pavel machek, linux-pm,
linux-kernel
On 2025/7/17 17:50, Rafael J. Wysocki wrote:
> On Thu, Jul 17, 2025 at 3:02 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
>> Hi Rafael,
>>
>> On 2025/7/16 20:26, Rafael J. Wysocki wrote:
>>> Hi,
>>>
>>> On Wed, Jul 16, 2025 at 8:26 AM Zihuan Zhang <zhangzihuan@kylinos.cn> wrote:
>>>> Hi all,
>>>>
>>>> This patch series improves the performance of the process freezer by
>>>> skipping zombie tasks during freezing.
>>>>
>>>> In the suspend and hibernation paths, the freezer traverses all tasks
>>>> and attempts to freeze them. However, zombie tasks (EXIT_ZOMBIE with
>>>> PF_EXITING) are already dead — they are not schedulable and cannot enter
>>>> the refrigerator. Attempting to freeze such tasks is redundant and
>>>> unnecessarily increases freezing time.
>>>>
>>>> In particular, on systems under fork storm conditions (e.g., many
>>>> short-lived processes quickly becoming zombies), the number of zombie tasks
>>>> can spike into the thousands or more. We observed that this causes the
>>>> freezer loop to waste significant time processing tasks that are guaranteed
>>>> to not need freezing.
>>> I think that the discussion with Peter regarding this has not been concluded.
>>>
>>> I thought that there was an alternative patch proposed during that
>>> discussion. If I'm not mistaken about this, what happened to that
>>> patch?
>>>
>>> Thanks!
>>>
>> Currently, the general consensus from the discussion is that skipping
>> zombie or dead tasks can help reduce locking overhead during freezing.
> Peter doesn't seem to be convinced that this is the case.
>
Yeah.
>> The remaining question is how best to implement that.
>>
>> Peter suggested skipping all tasks with PF_NOFREEZE, which would make
>> the logic more general and cover all cases. However, as Oleg pointed
>> out, the current implementation based on PF_NOFREEZE might be problematic.
>>
>> My current thought is that exit_state already reliably covers all
>> exiting user processes, and it’s a good fit for skipping user-space
>> tasks. For the kernel side, we may safely skip a few kernel threads like
>> kthreadd that set PF_NOFREEZE and never change it — we can consider
>> refining this further in the future.
> There is the counter argument of special-casing of p->exit_state and
> the relatively weak justification for it.
>
> You have created a synthetic workload where it matters, but how likely
> is it to be the case in practice?
Our initial thought was that the freezer should primarily focus on tasks
that can be frozen. If a task is not freezable and its state will not
change (such as kernel threads that have PF_NOFREEZE set permanently),
it should be safe to skip it during the iteration. This helps to
reduce unnecessary overhead when handling a large number of such tasks.
We do not insist that this is the only correct way to implement the
optimization — if there’s a better approach that is equally safe and
more general, we are happy to adopt it.
In practice, the improvement becomes noticeable only when there are a
lot of tasks present. So the benefit is scenario-dependent, and we agree
that real-world relevance should be considered carefully.
Thanks again for the discussion.
Thread overview: 13+ messages
2025-07-16 6:26 [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Zihuan Zhang
2025-07-16 6:26 ` [PATCH v4] PM / Freezer: Skip zombie/dead processes to reduce freeze latency Zihuan Zhang
2025-07-16 16:38 ` Oleg Nesterov
2025-07-16 18:36 ` Peter Zijlstra
2025-07-17 1:30 ` Zihuan Zhang
2025-07-17 1:43 ` Oleg Nesterov
2025-07-17 1:16 ` Zihuan Zhang
2025-07-17 1:31 ` Oleg Nesterov
2025-07-17 8:45 ` Zihuan Zhang
2025-07-16 12:26 ` [PATCH v4 0/1] PM / Freezer: Skip zombie/dead processes to reduce Rafael J. Wysocki
2025-07-17 1:02 ` Zihuan Zhang
2025-07-17 9:50 ` Rafael J. Wysocki
2025-07-21 10:39 ` Zihuan Zhang