From: Zihuan Zhang <zhangzihuan@kylinos.cn>
To: Michal Hocko <mhocko@suse.com>, Theodore Ts'o <tytso@mit.edu>,
Jan Kara <jack@suse.com>
Cc: "Rafael J . Wysocki" <rafael@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Oleg Nesterov <oleg@redhat.com>,
David Hildenbrand <david@redhat.com>,
Jonathan Corbet <corbet@lwn.net>, Ingo Molnar <mingo@redhat.com>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
len brown <len.brown@intel.com>, pavel machek <pavel@kernel.org>,
Kees Cook <kees@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Vlastimil Babka <vbabka@suse.cz>, Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Nico Pache <npache@redhat.com>, xu xin <xu.xin16@zte.com.cn>,
wangfushuai <wangfushuai@baidu.com>,
Andrii Nakryiko <andrii@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Jeff Layton <jlayton@kernel.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Adrian Ratiu <adrian.ratiu@collabora.com>,
linux-pm@vger.kernel.org, linux-mm@kvack.org,
linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/9] freezer: Introduce freeze priority model to address process dependency issues
Date: Tue, 12 Aug 2025 13:57:49 +0800
Message-ID: <d86a9883-9d2e-4bb2-a93d-0d95b4a60e5f@kylinos.cn>
In-Reply-To: <aJnM32xKq0FOWBzw@tiehlicka>
Hi all,
We encountered an issue where the number of freeze retries increased due
to processes stuck in D state. The logs point to jbd2-related activity.
log1:
[ 6616.650482] task:ThreadPoolForeg state:D stack:0 pid:262026 tgid:4065 ppid:2490 task_flags:0x400040 flags:0x00004004
[ 6616.650485] Call Trace:
[ 6616.650486] <TASK>
[ 6616.650489] __schedule+0x532/0xea0
[ 6616.650494] schedule+0x27/0x80
[ 6616.650496] jbd2_log_wait_commit+0xa6/0x120
[ 6616.650499] ? __pfx_autoremove_wake_function+0x10/0x10
[ 6616.650502] ext4_sync_file+0x1ba/0x380
[ 6616.650505] do_fsync+0x3b/0x80
log2:
[ 631.206315] jdb2_log_wait_log_commit completed (elapsed 0.002 seconds)
[ 631.215325] jdb2_log_wait_log_commit completed (elapsed 0.001 seconds)
[ 631.240704] jdb2_log_wait_log_commit completed (elapsed 0.386 seconds)
[ 631.262167] Filesystems sync: 0.424 seconds
[ 631.262821] Freezing user space processes
[ 631.263839] freeze round: 1, task to freeze: 852
[ 631.265128] freeze round: 2, task to freeze: 2
[ 631.267039] freeze round: 3, task to freeze: 2
[ 631.271176] freeze round: 4, task to freeze: 2
[ 631.279160] freeze round: 5, task to freeze: 2
[ 631.287152] freeze round: 6, task to freeze: 2
[ 631.295346] freeze round: 7, task to freeze: 2
[ 631.301747] freeze round: 8, task to freeze: 2
[ 631.309346] freeze round: 9, task to freeze: 2
[ 631.317353] freeze round: 10, task to freeze: 2
[ 631.325348] freeze round: 11, task to freeze: 2
[ 631.333353] freeze round: 12, task to freeze: 2
[ 631.341358] freeze round: 13, task to freeze: 2
[ 631.349357] freeze round: 14, task to freeze: 2
[ 631.357363] freeze round: 15, task to freeze: 2
[ 631.365361] freeze round: 16, task to freeze: 2
[ 631.373379] freeze round: 17, task to freeze: 2
[ 631.381366] freeze round: 18, task to freeze: 2
[ 631.389365] freeze round: 19, task to freeze: 2
[ 631.397371] freeze round: 20, task to freeze: 2
[ 631.405373] freeze round: 21, task to freeze: 2
[ 631.413373] freeze round: 22, task to freeze: 2
[ 631.421392] freeze round: 23, task to freeze: 1
[ 631.429948] freeze round: 24, task to freeze: 1
[ 631.438295] freeze round: 25, task to freeze: 1
[ 631.444546] jdb2_log_wait_log_commit completed (elapsed 0.249 seconds)
[ 631.446387] freeze round: 26, task to freeze: 0
[ 631.446390] Freezing user space processes completed (elapsed 0.183 seconds)
[ 631.446392] OOM killer disabled.
[ 631.446393] Freezing remaining freezable tasks
[ 631.446656] freeze round: 1, task to freeze: 4
[ 631.447976] freeze round: 2, task to freeze: 0
[ 631.447978] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[ 631.447980] PM: suspend debug: Waiting for 1 second(s).
[ 632.450858] OOM killer enabled.
[ 632.450859] Restarting tasks: Starting
[ 632.453140] Restarting tasks: Done
[ 632.453173] random: crng reseeded on system resumption
[ 632.453370] PM: suspend exit
[ 632.462799] jdb2_log_wait_log_commit completed (elapsed 0.000 seconds)
[ 632.466114] jdb2_log_wait_log_commit completed (elapsed 0.001 seconds)
The line below shows the cause: one commit wait took 0.249 seconds and
completed just before the final freeze round:
[ 631.444546] jdb2_log_wait_log_commit completed (elapsed 0.249 seconds)
During freezing, user processes executing jbd2_log_wait_commit enter D
state because this function calls wait_event and can take tens of
milliseconds to complete. This long execution time, coupled with
possible competition with the freezer, causes repeated freeze retries.
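To make the retry count concrete, here is a rough user-space model of the
freezer's retry loop. It is only a sketch, not kernel code. It assumes the
loop sleeps 1 ms after the first pass and doubles the sleep up to 8 ms, which
is what the round timestamps in log2 suggest, and that a blocked task freezes
as soon as its wait ends. The name freeze_rounds and its parameters are made
up for this illustration.

```python
def freeze_rounds(blocked_ms, first_sleep_ms=1.0, max_sleep_ms=8.0):
    """Model how many freezer passes run before a blocked task is frozen.

    blocked_ms: how long (from the first pass) the task stays in
    uninterruptible sleep, e.g. inside jbd2_log_wait_commit.
    Returns (number of passes, time of the last pass in ms).
    Assumption: every sleep lasts its full length, with no early wakeup.
    """
    now = 0.0               # the first pass happens at t = 0
    sleep = first_sleep_ms
    rounds = 1
    while now < blocked_ms:
        # The task is still blocked on this pass, so sleep and retry.
        now += sleep
        if sleep < max_sleep_ms:
            sleep *= 2      # exponential backoff: 1, 2, 4, 8, 8, ...
        rounds += 1
    return rounds, now


if __name__ == "__main__":
    # In log2 the last jbd2 wait ends about 180.7 ms after round 1
    # (631.444546 - 631.263839).
    print(freeze_rounds(180.7))  # (26, 183.0)
```

With these assumptions the model gives 26 passes ending at about 183 ms,
which lines up with "freeze round: 26" and the 0.183 second elapsed time in
log2. Almost all of those passes are spent waiting on the one or two tasks
blocked in the journal commit wait.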
While we understand that jbd2 is a freezable kernel thread, we would
like to know if there is a way to freeze it earlier or freeze some
critical processes proactively to reduce this contention.
Thanks for your input and suggestions.
On 2025/8/11 18:58, Michal Hocko wrote:
> On Mon 11-08-25 17:13:43, Zihuan Zhang wrote:
>> On 2025/8/8 16:58, Michal Hocko wrote:
> [...]
>>> Also the interface seems to be really coarse grained and it can easily
>>> turn out insufficient for other usecases while it is not entirely clear
>>> to me how this could be extended for those.
>> We recognize that the current interface is relatively coarse-grained and
>> may not be sufficient for all scenarios. The present implementation is a
>> basic version.
>>
>> Our plan is to introduce a classification-based mechanism that assigns
>> different freeze priorities according to process categories. For example,
>> filesystem and graphics-related processes will be given higher default
>> freeze priority, as they are critical in the freezing workflow. This
>> classification approach helps target important processes more precisely.
>>
>> However, this requires further testing and refinement before full
>> deployment. We believe this incremental, category-based design will make the
>> mechanism more effective and adaptable over time while keeping it
>> manageable.
> Unless there is a clear path for a more extendable interface then
> introducing this one is a no-go. We do not want to grow different ways
> to establish freezing policies.
>
> But much more fundamentally. So far I haven't really seen any argument
> why different priorities help with the underlying problem other than the
> timing might be slightly different if you change the order of freezing.
> This to me sounds like the proposed scheme mostly works around the
> problem you are seeing and as such is not a really good candidate to be
> merged as a long term solution. Not to mention with a user API that
> needs to be maintained for ever.
>
> So NAK from me on the interface.
>
Thanks for the feedback. I understand your concern that changing the
freezer priority order looks like working around the symptom rather than
solving the root cause.
Since the last discussion, we have analyzed the D-state processes
further and identified that the long wait time is caused by
jbd2_log_wait_commit. This wait happens because user tasks call into
this function during fsync/fdatasync and it can take tens of
milliseconds to complete. When this coincides with the freezer
operation, the tasks are stuck in D state and retried multiple times,
increasing the total freeze time.
Although we know that jbd2 is a freezable kernel thread, we are
exploring whether freezing it earlier, or freezing certain key
processes first, could reduce this contention and improve freeze
completion time.
>>> I believe it would be more useful to find sources of those freezer
>>> blockers and try to address those. Making more blocked tasks
>>> __set_task_frozen compatible sounds like a general improvement in
>>> itself.
>> we have already identified some causes of D-state tasks, many of which are
>> related to the filesystem. On some systems, certain processes frequently
>> execute ext4_sync_file, and under contention this can lead to D-state tasks.
> Please work with maintainers of those subsystems to find proper
> solutions.
We’ve pulled in the jbd2 maintainer to get feedback on whether changing
the freeze ordering for jbd2 is safe or if there’s a better approach to
avoid the repeated retries caused by this wait.