* [PATCH 0/2] cgroup-v2/freezer: small improvements
@ 2025-12-23 10:20 Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads Pavel Tikhomirov
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-23 10:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
	Pavel Tikhomirov

This patch series contains two small improvements for the cgroup-v2
freezer controller.

The first patch allows freezing cgroups that contain kthreads. We still
don't freeze the kthreads themselves and keep ignoring them, but the
cgroup can now report frozen once all non-kthread tasks are frozen.

The second patch prints information to dmesg that identifies processes
which prevent a cgroup from being frozen, or which keep it from freezing
fast enough.

I hope these changes will be generally useful for users of the freezer
cgroup controller.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>

Pavel Tikhomirov (2):
  cgroup-v2/freezer: Allow freezing with kthreads
  cgroup-v2/freezer: Print information about unfreezable process

 include/linux/cgroup-defs.h     |   5 ++
 kernel/cgroup/cgroup-internal.h |   5 ++
 kernel/cgroup/cgroup.c          |   2 +
 kernel/cgroup/freezer.c         | 155 ++++++++++++++++++++++++++++++--
 4 files changed, 161 insertions(+), 6 deletions(-)

-- 
2.52.0



* [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads
  2025-12-23 10:20 [PATCH 0/2] cgroup-v2/freezer: small improvements Pavel Tikhomirov
@ 2025-12-23 10:20 ` Pavel Tikhomirov
  2025-12-23 10:25   ` Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: Allow " Pavel Tikhomirov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-23 10:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
	Pavel Tikhomirov

The cgroup-v2 freezer ignores kernel threads, but they are still part of
the task count that nr_frozen_tasks is compared against. So a cgroup with
a kthread inside will never report frozen.

It can be generally beneficial to put kthreads into cgroups. One example
is the vhost-xxx kthreads used for qemu virtual machines, which are
already put into the cgroup of their virtual machine. This way they are
restricted by the same limits as the instance they belong to.

To make cgroups with kthreads freezable, let's count the number of
kthreads in each cgroup while it is freezing, and offset the
nr_frozen_tasks check by it. This way we can ignore kthreads completely
and report the cgroup frozen when all non-kthread tasks are frozen.

Note: nr_kthreads_ignore is protected by css_set_lock, and it is zero
unless the cgroup is freezing.
Note2: This restores parity with cgroup-v1 freezer behavior, which
already ignored kthreads when counting frozen tasks.

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
---
 include/linux/cgroup-defs.h |  5 +++++
 kernel/cgroup/freezer.c     | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index b760a3c470a5..949f80dc33c5 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -451,6 +451,11 @@ struct cgroup_freezer_state {
 	 */
 	int nr_frozen_tasks;
 
+	/*
+	 * Number of kernel threads to ignore while freezing
+	 */
+	int nr_kthreads_ignore;
+
 	/* Freeze time data consistency protection */
 	seqcount_spinlock_t freeze_seq;
 
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index 6c18854bff34..02a1db180b70 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -73,7 +73,8 @@ void cgroup_update_frozen(struct cgroup *cgrp)
 	 * the cgroup frozen. Otherwise it's not frozen.
 	 */
 	frozen = test_bit(CGRP_FREEZE, &cgrp->flags) &&
-		cgrp->freezer.nr_frozen_tasks == __cgroup_task_count(cgrp);
+		 (cgrp->freezer.nr_frozen_tasks +
+		  cgrp->freezer.nr_kthreads_ignore == __cgroup_task_count(cgrp));
 
 	/* If flags is updated, update the state of ancestor cgroups. */
 	if (cgroup_update_frozen_flag(cgrp, frozen))
@@ -145,6 +146,17 @@ void cgroup_leave_frozen(bool always_leave)
 	spin_unlock_irq(&css_set_lock);
 }
 
+static inline void cgroup_inc_kthread_ignore_cnt(struct cgroup *cgrp)
+{
+	cgrp->freezer.nr_kthreads_ignore++;
+}
+
+static inline void cgroup_dec_kthread_ignore_cnt(struct cgroup *cgrp)
+{
+	cgrp->freezer.nr_kthreads_ignore--;
+	WARN_ON_ONCE(cgrp->freezer.nr_kthreads_ignore < 0);
+}
+
 /*
  * Freeze or unfreeze the task by setting or clearing the JOBCTL_TRAP_FREEZE
  * jobctl bit.
@@ -199,11 +211,15 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze, u64 ts_nsec)
 	css_task_iter_start(&cgrp->self, 0, &it);
 	while ((task = css_task_iter_next(&it))) {
 		/*
-		 * Ignore kernel threads here. Freezing cgroups containing
-		 * kthreads isn't supported.
+		 * Count kernel threads to ignore them during freezing.
 		 */
-		if (task->flags & PF_KTHREAD)
+		if (task->flags & PF_KTHREAD) {
+			if (freeze)
+				cgroup_inc_kthread_ignore_cnt(cgrp);
+			else
+				cgroup_dec_kthread_ignore_cnt(cgrp);
 			continue;
+		}
 		cgroup_freeze_task(task, freeze);
 	}
 	css_task_iter_end(&it);
@@ -228,10 +244,19 @@ void cgroup_freezer_migrate_task(struct task_struct *task,
 	lockdep_assert_held(&css_set_lock);
 
 	/*
-	 * Kernel threads are not supposed to be frozen at all.
+	 * Kernel threads are not supposed to be frozen at all, but we need to
+	 * count them in order to properly ignore.
 	 */
-	if (task->flags & PF_KTHREAD)
+	if (task->flags & PF_KTHREAD) {
+		if (test_bit(CGRP_FREEZE, &dst->flags))
+			cgroup_inc_kthread_ignore_cnt(dst);
+		if (test_bit(CGRP_FREEZE, &src->flags))
+			cgroup_dec_kthread_ignore_cnt(src);
+
+		cgroup_update_frozen(dst);
+		cgroup_update_frozen(src);
 		return;
+	}
 
 	/*
 	 * It's not necessary to do changes if both of the src and dst cgroups
-- 
2.52.0



* [PATCH 1/2] cgroup-v2/freezer: Allow freezing with kthreads
  2025-12-23 10:20 [PATCH 0/2] cgroup-v2/freezer: small improvements Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads Pavel Tikhomirov
@ 2025-12-23 10:20 ` Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
  2025-12-23 17:29 ` [PATCH 0/2] cgroup-v2/freezer: small improvements Michal Koutný
  3 siblings, 0 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-23 10:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
	Pavel Tikhomirov

The cgroup-v2 freezer ignores kernel threads, but they are still part of
the task count that nr_frozen_tasks is compared against. So a cgroup with
a kthread inside will never report frozen.

It can be generally beneficial to put kthreads into cgroups. One example
is the vhost-xxx kthreads used for qemu virtual machines, which are put
into the cgroup of their virtual machine. This way they are restricted
by the same limits as the instance they belong to.

To make cgroups with kthreads freezable, let's count the number of
kthreads in each cgroup while it is freezing, and offset the
nr_frozen_tasks check by it. This way we can ignore kthreads completely
and report the cgroup frozen when all non-kthread tasks are frozen.

Note: nr_kthreads_ignore is protected by css_set_lock, and it is zero
unless the cgroup is freezing.
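
As a rough illustration of the accounting described above, here is a
minimal userspace sketch (not kernel code). The struct and the model_*
function names are hypothetical and exist only for this example; the
model also moves the task count itself on migration, which in the kernel
happens elsewhere. It only mirrors the comparison in
cgroup_update_frozen() and the counter updates for a migrating kthread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the per-cgroup freezer counters. */
struct freezer_model {
	bool freeze;            /* models the CGRP_FREEZE flag */
	int nr_tasks;           /* models __cgroup_task_count() */
	int nr_frozen_tasks;    /* non-kthread tasks that are frozen */
	int nr_kthreads_ignore; /* kthreads counted while freezing, else 0 */
};

/* Check before the patch: kthreads keep the counts from ever matching. */
static bool model_is_frozen_old(const struct freezer_model *m)
{
	return m->freeze && m->nr_frozen_tasks == m->nr_tasks;
}

/* Check with the patch: ignored kthreads offset the comparison. */
static bool model_is_frozen(const struct freezer_model *m)
{
	return m->freeze &&
	       m->nr_frozen_tasks + m->nr_kthreads_ignore == m->nr_tasks;
}

/* Models moving one kthread from src to dst. */
static void model_migrate_kthread(struct freezer_model *src,
				  struct freezer_model *dst)
{
	/* Assumption of this model: the task count moves here as well. */
	src->nr_tasks--;
	dst->nr_tasks++;

	if (dst->freeze)
		dst->nr_kthreads_ignore++;
	if (src->freeze) {
		src->nr_kthreads_ignore--;
		/* Stands in for the WARN_ON_ONCE() in the patch. */
		assert(src->nr_kthreads_ignore >= 0);
	}
}
```

With two frozen user tasks and one kthread in a freezing cgroup, the old
check never reports frozen, while the new one does. Moving the kthread
out keeps the source cgroup frozen because both the task count and the
ignore counter drop by one.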

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
---
 include/linux/cgroup-defs.h |  5 +++++
 kernel/cgroup/freezer.c     | 37 +++++++++++++++++++++++++++++++------
 2 files changed, 36 insertions(+), 6 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index b760a3c470a5..949f80dc33c5 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -451,6 +451,11 @@ struct cgroup_freezer_state {
 	 */
 	int nr_frozen_tasks;
 
+	/*
+	 * Number of kernel threads to ignore while freezing
+	 */
+	int nr_kthreads_ignore;
+
 	/* Freeze time data consistency protection */
 	seqcount_spinlock_t freeze_seq;
 
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index 6c18854bff34..02a1db180b70 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -73,7 +73,8 @@ void cgroup_update_frozen(struct cgroup *cgrp)
 	 * the cgroup frozen. Otherwise it's not frozen.
 	 */
 	frozen = test_bit(CGRP_FREEZE, &cgrp->flags) &&
-		cgrp->freezer.nr_frozen_tasks == __cgroup_task_count(cgrp);
+		 (cgrp->freezer.nr_frozen_tasks +
+		  cgrp->freezer.nr_kthreads_ignore == __cgroup_task_count(cgrp));
 
 	/* If flags is updated, update the state of ancestor cgroups. */
 	if (cgroup_update_frozen_flag(cgrp, frozen))
@@ -145,6 +146,17 @@ void cgroup_leave_frozen(bool always_leave)
 	spin_unlock_irq(&css_set_lock);
 }
 
+static inline void cgroup_inc_kthread_ignore_cnt(struct cgroup *cgrp)
+{
+	cgrp->freezer.nr_kthreads_ignore++;
+}
+
+static inline void cgroup_dec_kthread_ignore_cnt(struct cgroup *cgrp)
+{
+	cgrp->freezer.nr_kthreads_ignore--;
+	WARN_ON_ONCE(cgrp->freezer.nr_kthreads_ignore < 0);
+}
+
 /*
  * Freeze or unfreeze the task by setting or clearing the JOBCTL_TRAP_FREEZE
  * jobctl bit.
@@ -199,11 +211,15 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze, u64 ts_nsec)
 	css_task_iter_start(&cgrp->self, 0, &it);
 	while ((task = css_task_iter_next(&it))) {
 		/*
-		 * Ignore kernel threads here. Freezing cgroups containing
-		 * kthreads isn't supported.
+		 * Count kernel threads to ignore them during freezing.
 		 */
-		if (task->flags & PF_KTHREAD)
+		if (task->flags & PF_KTHREAD) {
+			if (freeze)
+				cgroup_inc_kthread_ignore_cnt(cgrp);
+			else
+				cgroup_dec_kthread_ignore_cnt(cgrp);
 			continue;
+		}
 		cgroup_freeze_task(task, freeze);
 	}
 	css_task_iter_end(&it);
@@ -228,10 +244,19 @@ void cgroup_freezer_migrate_task(struct task_struct *task,
 	lockdep_assert_held(&css_set_lock);
 
 	/*
-	 * Kernel threads are not supposed to be frozen at all.
+	 * Kernel threads are not supposed to be frozen at all, but we need to
+	 * count them in order to properly ignore.
 	 */
-	if (task->flags & PF_KTHREAD)
+	if (task->flags & PF_KTHREAD) {
+		if (test_bit(CGRP_FREEZE, &dst->flags))
+			cgroup_inc_kthread_ignore_cnt(dst);
+		if (test_bit(CGRP_FREEZE, &src->flags))
+			cgroup_dec_kthread_ignore_cnt(src);
+
+		cgroup_update_frozen(dst);
+		cgroup_update_frozen(src);
 		return;
+	}
 
 	/*
 	 * It's not necessary to do changes if both of the src and dst cgroups
-- 
2.52.0



* [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 [PATCH 0/2] cgroup-v2/freezer: small improvements Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads Pavel Tikhomirov
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: Allow " Pavel Tikhomirov
@ 2025-12-23 10:20 ` Pavel Tikhomirov
  2025-12-23 20:58   ` kernel test robot
                     ` (5 more replies)
  2025-12-23 17:29 ` [PATCH 0/2] cgroup-v2/freezer: small improvements Michal Koutný
  3 siblings, 6 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-23 10:20 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel,
	Pavel Tikhomirov

A freezer cgroup can sometimes fail to freeze for a long time. For
example, we saw NFS-related hangs (due to a lost connection) where stop
and suspend (CRIU) of containers, both of which use the freezer cgroup,
failed with a timeout while waiting for the frozen status of the cgroup.

So we came up with this debugging infrastructure for the freezer cgroup,
which prints the stack of the unfreezable task so that one can later
identify the problem location in dmesg.

When one reads the cgroup.events file while a freeze is in progress and
the time since the freeze started exceeds the timeout, we trigger the
warning. It walks over all tasks in the freezing cgroup sub-tree and
reports the first task which is not frozen.
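
Since the check runs on a read of cgroup.events, a userspace poller that
waits for the frozen state would also be what triggers the warning. A
minimal sketch of parsing that file's content, assuming the
"populated %d" / "frozen %d" line format printed by cgroup_events_show();
parse_frozen() is a hypothetical helper, not an existing API:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the value of the "frozen" key found in the text of a
 * cgroup.events file, or -1 if the key is missing or malformed.
 */
static int parse_frozen(const char *events)
{
	const char *line;
	int val;

	assert(events != NULL);

	for (line = events; line && *line; ) {
		/* Only matches when the current line starts with the key. */
		if (sscanf(line, "frozen %d", &val) == 1)
			return val;

		/* Advance to the start of the next line, if there is one. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read the whole cgroup.events file of the cgroup being
frozen into a buffer, pass it to parse_frozen(), and repeat until it
returns 1 or the caller's own deadline expires.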

This patch also adds the kernel.freeze_timeout_us sysctl to control the
timeout for reporting unfreezable tasks. The default is 0, which means
reporting is disabled.
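
The timeout decision itself is simple arithmetic. Below is a userspace
sketch of it under stated assumptions: model_freeze_timed_out() and
MODEL_NSEC_PER_USEC are hypothetical names, and a start time of zero
stands for "no freeze in progress", as in check_freeze_timeout(). The
seqcount read and the rate limiting of the warning are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NSEC_PER_USEC 1000ULL

/*
 * Decide whether a freeze that started at start_ns and is still in
 * progress at now_ns has reached timeout_us. A zero timeout disables
 * the check; a zero start time means no freeze is in progress.
 */
static bool model_freeze_timed_out(uint64_t start_ns, uint64_t now_ns,
				   int timeout_us)
{
	uint64_t elapsed_us;

	/* The sysctl has a minimum of zero, so negatives are not expected. */
	assert(timeout_us >= 0);

	if (!timeout_us || !start_ns)
		return false;

	assert(now_ns >= start_ns);

	/* Convert elapsed nanoseconds to microseconds before comparing. */
	elapsed_us = (now_ns - start_ns) / MODEL_NSEC_PER_USEC;
	return elapsed_us >= (uint64_t)timeout_us;
}
```

For the example output in this message, a 10 second threshold would
correspond to kernel.freeze_timeout_us = 10000000.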

Example output:

I used the test module at https://github.com/Snorch/proc-hang-module,
which adds a proc file whose read hangs in the kernel, to emulate an
unfreezable process. It produces this stack:

[  220.994136] Freeze of /test took 10 sec, due to unfreezable process 6192:cat.
[  220.994326] Call Trace:
[  220.994418]  <TASK>
[  220.994507]  ? proc_hang_read+0x35/0x60 [proc_hang]
[  220.994680]  ? proc_hang_read+0x3a/0x60 [proc_hang]
[  220.994861]  ? proc_reg_read+0x5a/0xa0
[  220.995021]  ? vfs_read+0xc1/0x370
[  220.995176]  ? auditd_test_task+0x3d/0x50
[  220.995344]  ? __audit_syscall_entry+0xf1/0x140
[  220.995514]  ? ksys_read+0x6b/0xe0
[  220.995667]  ? do_syscall_64+0x7f/0x6d0
...
[  220.998999]  </TASK>

Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
---
 kernel/cgroup/cgroup-internal.h |   5 ++
 kernel/cgroup/cgroup.c          |   2 +
 kernel/cgroup/freezer.c         | 118 ++++++++++++++++++++++++++++++++
 3 files changed, 125 insertions(+)

diff --git a/kernel/cgroup/cgroup-internal.h b/kernel/cgroup/cgroup-internal.h
index 22051b4f1ccb..7e2f729996c8 100644
--- a/kernel/cgroup/cgroup-internal.h
+++ b/kernel/cgroup/cgroup-internal.h
@@ -283,6 +283,11 @@ void cgroup_base_stat_cputime_show(struct seq_file *seq);
  */
 extern const struct proc_ns_operations cgroupns_operations;
 
+/*
+ * freezer.c
+ */
+void check_freeze_timeout(struct cgroup *cgrp);
+
 /*
  * cgroup-v1.c
  */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index e717208cfb18..097cebbeed1b 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3822,6 +3822,8 @@ static int cgroup_events_show(struct seq_file *seq, void *v)
 	seq_printf(seq, "populated %d\n", cgroup_is_populated(cgrp));
 	seq_printf(seq, "frozen %d\n", test_bit(CGRP_FROZEN, &cgrp->flags));
 
+	check_freeze_timeout(cgrp);
+
 	return 0;
 }
 
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index 02a1db180b70..3880ed400879 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -1,8 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/cgroup.h>
+#include <linux/ratelimit.h>
+#include <linux/sysctl.h>
 #include <linux/sched.h>
 #include <linux/sched/task.h>
 #include <linux/sched/signal.h>
+#include <linux/sched/debug.h>
 
 #include "cgroup-internal.h"
 
@@ -349,3 +352,118 @@ void cgroup_freeze(struct cgroup *cgrp, bool freeze)
 		cgroup_file_notify(&cgrp->events_file);
 	}
 }
+
+#define MAX_STACK_TRACE_DEPTH 64
+
+static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
+				     struct task_struct *task)
+{
+	char *buf __free(kfree) = NULL;
+	pid_t tgid;
+
+	buf = kmalloc(PATH_MAX, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
+		return;
+
+	tgid = task_pid_nr_ns(task, &init_pid_ns);
+	pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
+		buf, timeout / USEC_PER_SEC, tgid, task->comm);
+	if (!try_get_task_stack(task))
+		return;
+	show_stack(task, NULL, KERN_WARNING);
+	put_task_stack(task);
+}
+
+static void warn_freeze_timeout(struct cgroup *cgrp, int timeout)
+{
+	char *buf __free(kfree) = NULL;
+	struct cgroup_subsys_state *css;
+
+	guard(rcu)();
+	css_for_each_descendant_post(css, &cgrp->self) {
+		struct task_struct *task;
+		struct css_task_iter it;
+
+		css_task_iter_start(css, 0, &it);
+		while ((task = css_task_iter_next(&it))) {
+			if (task->flags & PF_KTHREAD)
+				continue;
+			if (task->frozen)
+				continue;
+
+			warn_freeze_timeout_task(cgrp, timeout, task);
+			css_task_iter_end(&it);
+			return;
+		}
+		css_task_iter_end(&it);
+	}
+
+	buf = kmalloc(PATH_MAX, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
+		return;
+
+	pr_warn("Freeze of %s took %ld sec, but no unfreezable process detected.\n",
+		buf, timeout / USEC_PER_SEC);
+}
+
+#define DEFAULT_FREEZE_RATELIMIT (30 * HZ)
+int sysctl_freeze_timeout_us;
+
+void check_freeze_timeout(struct cgroup *cgrp)
+{
+	static DEFINE_RATELIMIT_STATE(freeze_timeout_rs,
+				      DEFAULT_FREEZE_RATELIMIT, 1);
+	unsigned int sequence;
+	u64 last_freeze_start = 0;
+	u64 last_freeze_time;
+	int timeout;
+
+	timeout = READ_ONCE(sysctl_freeze_timeout_us);
+	if (!timeout)
+		return;
+
+	do {
+		sequence = read_seqcount_begin(&cgrp->freezer.freeze_seq);
+		if (test_bit(CGRP_FREEZE, &cgrp->flags) &&
+		    !test_bit(CGRP_FROZEN, &cgrp->flags))
+			last_freeze_start = cgrp->freezer.freeze_start_nsec;
+	} while (read_seqcount_retry(&cgrp->freezer.freeze_seq, sequence));
+
+	if (!last_freeze_start)
+		return;
+
+	last_freeze_time = ktime_get_ns() - last_freeze_start;
+	do_div(last_freeze_time, NSEC_PER_USEC);
+
+	if (last_freeze_time < timeout)
+		return;
+
+	if (!__ratelimit(&freeze_timeout_rs))
+		return;
+
+	warn_freeze_timeout(cgrp, timeout);
+}
+
+static const struct ctl_table freezer_sysctls[] = {
+	{
+		.procname	= "freeze_timeout_us",
+		.data		= &sysctl_freeze_timeout_us,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+	},
+};
+
+static int __init freezer_sysctls_init(void)
+{
+	register_sysctl_init("kernel", freezer_sysctls);
+	return 0;
+}
+late_initcall(freezer_sysctls_init);
-- 
2.52.0



* Re: [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads
  2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads Pavel Tikhomirov
@ 2025-12-23 10:25   ` Pavel Tikhomirov
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-23 10:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel

This one is accidental, please ignore.

On 12/23/25 18:20, Pavel Tikhomirov wrote:
> Cgroup-v2 implementation of freezer ignores kernel threads, but still
> counts them against nr_frozen_tasks. So the cgroup with kthread inside
> will never report frozen.
> 
> It might be generally beneficial to put kthreads into cgroups. One
> example is vhost-xxx kthreads used for qemu virtual machines, those are
> already put into cgroups of their virtual machine. This way they can be
> restricted by the same limits the instance they belong to is.
> 
> To make the cgroups with kthreads freezable, let's count the number of
> kthreads in each cgroup when it is freezing, and offset nr_frozen_tasks
> checks with it. This way we can ignore kthreads completely and report
> cgroup frozen when all non-kthread tasks are frozen.
> 
> Note: The nr_kthreads_ignore is protected with css_set_lock. And it is
> zero unless cgroup is freezing.
> Note2: This restores parity with cgroup-v1 freezer behavior, which
> already ignored kthreads when counting frozen tasks.
> 
> Signed-off-by: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
> ---
>  include/linux/cgroup-defs.h |  5 +++++
>  kernel/cgroup/freezer.c     | 37 +++++++++++++++++++++++++++++++------
>  2 files changed, 36 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
> index b760a3c470a5..949f80dc33c5 100644
> --- a/include/linux/cgroup-defs.h
> +++ b/include/linux/cgroup-defs.h
> @@ -451,6 +451,11 @@ struct cgroup_freezer_state {
>  	 */
>  	int nr_frozen_tasks;
>  
> +	/*
> +	 * Number of kernel threads to ignore while freezing
> +	 */
> +	int nr_kthreads_ignore;
> +
>  	/* Freeze time data consistency protection */
>  	seqcount_spinlock_t freeze_seq;
>  
> diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
> index 6c18854bff34..02a1db180b70 100644
> --- a/kernel/cgroup/freezer.c
> +++ b/kernel/cgroup/freezer.c
> @@ -73,7 +73,8 @@ void cgroup_update_frozen(struct cgroup *cgrp)
>  	 * the cgroup frozen. Otherwise it's not frozen.
>  	 */
>  	frozen = test_bit(CGRP_FREEZE, &cgrp->flags) &&
> -		cgrp->freezer.nr_frozen_tasks == __cgroup_task_count(cgrp);
> +		 (cgrp->freezer.nr_frozen_tasks +
> +		  cgrp->freezer.nr_kthreads_ignore == __cgroup_task_count(cgrp));
>  
>  	/* If flags is updated, update the state of ancestor cgroups. */
>  	if (cgroup_update_frozen_flag(cgrp, frozen))
> @@ -145,6 +146,17 @@ void cgroup_leave_frozen(bool always_leave)
>  	spin_unlock_irq(&css_set_lock);
>  }
>  
> +static inline void cgroup_inc_kthread_ignore_cnt(struct cgroup *cgrp)
> +{
> +	cgrp->freezer.nr_kthreads_ignore++;
> +}
> +
> +static inline void cgroup_dec_kthread_ignore_cnt(struct cgroup *cgrp)
> +{
> +	cgrp->freezer.nr_kthreads_ignore--;
> +	WARN_ON_ONCE(cgrp->freezer.nr_kthreads_ignore < 0);
> +}
> +
>  /*
>   * Freeze or unfreeze the task by setting or clearing the JOBCTL_TRAP_FREEZE
>   * jobctl bit.
> @@ -199,11 +211,15 @@ static void cgroup_do_freeze(struct cgroup *cgrp, bool freeze, u64 ts_nsec)
>  	css_task_iter_start(&cgrp->self, 0, &it);
>  	while ((task = css_task_iter_next(&it))) {
>  		/*
> -		 * Ignore kernel threads here. Freezing cgroups containing
> -		 * kthreads isn't supported.
> +		 * Count kernel threads to ignore them during freezing.
>  		 */
> -		if (task->flags & PF_KTHREAD)
> +		if (task->flags & PF_KTHREAD) {
> +			if (freeze)
> +				cgroup_inc_kthread_ignore_cnt(cgrp);
> +			else
> +				cgroup_dec_kthread_ignore_cnt(cgrp);
>  			continue;
> +		}
>  		cgroup_freeze_task(task, freeze);
>  	}
>  	css_task_iter_end(&it);
> @@ -228,10 +244,19 @@ void cgroup_freezer_migrate_task(struct task_struct *task,
>  	lockdep_assert_held(&css_set_lock);
>  
>  	/*
> -	 * Kernel threads are not supposed to be frozen at all.
> +	 * Kernel threads are not supposed to be frozen at all, but we need to
> +	 * count them in order to properly ignore.
>  	 */
> -	if (task->flags & PF_KTHREAD)
> +	if (task->flags & PF_KTHREAD) {
> +		if (test_bit(CGRP_FREEZE, &dst->flags))
> +			cgroup_inc_kthread_ignore_cnt(dst);
> +		if (test_bit(CGRP_FREEZE, &src->flags))
> +			cgroup_dec_kthread_ignore_cnt(src);
> +
> +		cgroup_update_frozen(dst);
> +		cgroup_update_frozen(src);
>  		return;
> +	}
>  
>  	/*
>  	 * It's not necessary to do changes if both of the src and dst cgroups

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.



* Re: [PATCH 0/2] cgroup-v2/freezer: small improvements
  2025-12-23 10:20 [PATCH 0/2] cgroup-v2/freezer: small improvements Pavel Tikhomirov
                   ` (2 preceding siblings ...)
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
@ 2025-12-23 17:29 ` Michal Koutný
  2025-12-24  3:06   ` Pavel Tikhomirov
  3 siblings, 1 reply; 17+ messages in thread
From: Michal Koutný @ 2025-12-23 17:29 UTC (permalink / raw)
  To: Pavel Tikhomirov; +Cc: Tejun Heo, Johannes Weiner, cgroups, linux-kernel

On Tue, Dec 23, 2025 at 06:20:06PM +0800, Pavel Tikhomirov <ptikhomirov@virtuozzo.com> wrote:
> First allows freezing cgroups with kthreads inside, we still won't
> freeze kthreads, we still ignore them, but at the same time we allow
> cgroup to report frozen when all other non-kthread tasks are frozen.

kthreads in non-root cgroups are kind of an antipattern.
For which kthreads would you like this change? (See for instance the
commit d96c77bd4eeba ("KVM: x86: switch hugepage recovery thread to
vhost_task") as a possible refactoring of such threads.)

> Second patch adds information into dmesg to identify processes which
> prevent cgroup from being frozen or just don't allow it to freeze fast
> enough.

I can see how this can be useful for debugging, however, it resembles
the existing CONFIG_DETECT_HUNG_TASK and its
kernel.hung_task_timeout_secs. Could that be used instead?

Thanks,
Michal


* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
@ 2025-12-23 20:58   ` kernel test robot
  2025-12-24  4:43     ` Pavel Tikhomirov
  2025-12-24  1:30   ` kernel test robot
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: kernel test robot @ 2025-12-23 20:58 UTC (permalink / raw)
  To: Pavel Tikhomirov, Tejun Heo
  Cc: oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Pavel Tikhomirov

Hi Pavel,

kernel test robot noticed the following build errors:

[auto build test ERROR on tj-cgroup/for-next]
[also build test ERROR on linus/master v6.19-rc2 next-20251219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
config: s390-randconfig-r071-20251224 (https://download.01.org/0day-ci/archive/20251224/202512240409.06R0khaZ-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 8.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512240409.06R0khaZ-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512240409.06R0khaZ-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/cgroup/freezer.c: In function 'warn_freeze_timeout_task':
>> kernel/cgroup/freezer.c:374:7: error: implicit declaration of function 'try_get_task_stack'; did you mean 'tryget_task_struct'? [-Werror=implicit-function-declaration]
     if (!try_get_task_stack(task))
          ^~~~~~~~~~~~~~~~~~
          tryget_task_struct
>> kernel/cgroup/freezer.c:377:2: error: implicit declaration of function 'put_task_stack'; did you mean 'put_task_struct'? [-Werror=implicit-function-declaration]
     put_task_stack(task);
     ^~~~~~~~~~~~~~
     put_task_struct
   cc1: some warnings being treated as errors

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for CAN_DEV
   Depends on [n]: NETDEVICES [=n] && CAN [=y]
   Selected by [y]:
   - CAN [=y] && NET [=y]


vim +374 kernel/cgroup/freezer.c

   357	
   358	static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
   359					     struct task_struct *task)
   360	{
   361		char *buf __free(kfree) = NULL;
   362		pid_t tgid;
   363	
   364		buf = kmalloc(PATH_MAX, GFP_KERNEL);
   365		if (!buf)
   366			return;
   367	
   368		if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
   369			return;
   370	
   371		tgid = task_pid_nr_ns(task, &init_pid_ns);
   372		pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
   373			buf, timeout / USEC_PER_SEC, tgid, task->comm);
 > 374		if (!try_get_task_stack(task))
   375			return;
   376		show_stack(task, NULL, KERN_WARNING);
 > 377		put_task_stack(task);
   378	}
   379	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
  2025-12-23 20:58   ` kernel test robot
@ 2025-12-24  1:30   ` kernel test robot
  2025-12-24  3:26   ` kernel test robot
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-12-24  1:30 UTC (permalink / raw)
  To: Pavel Tikhomirov, Tejun Heo
  Cc: llvm, oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Pavel Tikhomirov

Hi Pavel,

kernel test robot noticed the following build errors:

[auto build test ERROR on tj-cgroup/for-next]
[also build test ERROR on linus/master v6.19-rc2 next-20251219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20251224/202512240240.TCvafX5S-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512240240.TCvafX5S-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512240240.TCvafX5S-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/cgroup/freezer.c:374:7: error: call to undeclared function 'try_get_task_stack'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     374 |         if (!try_get_task_stack(task))
         |              ^
   kernel/cgroup/freezer.c:374:7: note: did you mean 'tryget_task_struct'?
   include/linux/sched/task.h:120:35: note: 'tryget_task_struct' declared here
     120 | static inline struct task_struct *tryget_task_struct(struct task_struct *t)
         |                                   ^
>> kernel/cgroup/freezer.c:377:2: error: call to undeclared function 'put_task_stack'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     377 |         put_task_stack(task);
         |         ^
   kernel/cgroup/freezer.c:377:2: note: did you mean 'put_task_struct'?
   include/linux/sched/task.h:128:20: note: 'put_task_struct' declared here
     128 | static inline void put_task_struct(struct task_struct *t)
         |                    ^
   2 errors generated.


vim +/try_get_task_stack +374 kernel/cgroup/freezer.c

   357	
   358	static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
   359					     struct task_struct *task)
   360	{
   361		char *buf __free(kfree) = NULL;
   362		pid_t tgid;
   363	
   364		buf = kmalloc(PATH_MAX, GFP_KERNEL);
   365		if (!buf)
   366			return;
   367	
   368		if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
   369			return;
   370	
   371		tgid = task_pid_nr_ns(task, &init_pid_ns);
   372		pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
   373			buf, timeout / USEC_PER_SEC, tgid, task->comm);
 > 374		if (!try_get_task_stack(task))
   375			return;
   376		show_stack(task, NULL, KERN_WARNING);
 > 377		put_task_stack(task);
   378	}
   379	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH 0/2] cgroup-v2/freezer: small improvements
  2025-12-23 17:29 ` [PATCH 0/2] cgroup-v2/freezer: small improvements Michal Koutný
@ 2025-12-24  3:06   ` Pavel Tikhomirov
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-24  3:06 UTC (permalink / raw)
  To: Michal Koutný; +Cc: Tejun Heo, Johannes Weiner, cgroups, linux-kernel

On 12/24/25 01:29, Michal Koutný wrote:
> On Tue, Dec 23, 2025 at 06:20:06PM +0800, Pavel Tikhomirov <ptikhomirov@virtuozzo.com> wrote:
>> First allows freezing cgroups with kthreads inside, we still won't
>> freeze kthreads, we still ignore them, but at the same time we allow
>> cgroup to report frozen when all other non-kthread tasks are frozen.
> 
> kthreads in non-root cgroups are kind of an antipattern.
> For which kthreads you would like this change? (See for instance the
> commit d96c77bd4eeba ("KVM: x86: switch hugepage recovery thread to
> vhost_task") as a possible refactoring of such threads.)

To explain our use case, I would need to dive a bit into how Virtuozzo containers (OpenVZ) work on our custom Virtuozzo kernel.

In our case we have two custom kernel threads for each container: "kthreadd" and "umh".

https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/kernel/ve/ve.c#lines-481
https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/kernel/ve/ve.c#lines-581

The "kthreadd" allows creating per-container kthreads: through it we create the kthreads for "umh" (explained below) and the sunrpc svc kthreads in the container.

https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/net/sunrpc/svc.c#lines-815

The "umh" lets the kernel run userspace commands inside the container, e.g. we use it to virtualize (run in the container) coredump collection, the nfs upcall, and the cgroup-v1 release-agent.

https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/fs/coredump.c#lines-640
https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/fs/nfsd/nfs4recover.c#lines-1849
https://bitbucket.org/openvz/vzkernel/src/662c0172a9d4aecf52dbaea4f903ccc801b569b2/kernel/cgroup/cgroup-v1.c#lines-930

And we really want those threads to be restricted by the same cgroups as the container.

The commit you've mentioned is an interesting one, we can try to switch our custom kthreads to "vhost_task" similar to what kvm did. It's not obvious if it will fly until we try =)

> 
>> Second patch adds information into dmesg to identify processes which
>> prevent cgroup from being frozen or just don't allow it to freeze fast
>> enough.
> 
> I can see how this can be useful for debugging, however, it resembles
> the existing CONFIG_DETECT_HUNG_TASK and its
> kernel.hung_task_timeout_secs. Could that be used instead?

The hung_task_timeout_secs mechanism detects a hung task only if it is in D state and did not schedule, but that is not the only case. For instance, the module I used to test the problem (https://github.com/Snorch/proc-hang-module) is not detected by this mechanism, as it schedules.

Previously we saw tasks sleeping in nfs, presumably waiting for a reply from the server, which prevented the freeze, and though hardlockup/softlockup/hung-task warnings were enabled, they didn't trigger. Also, one might want a timeout for the freezer that is separate from the one for general hang cases (general hang timeouts should probably be higher).

> 
> Thanks,
> Michal

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
  2025-12-23 20:58   ` kernel test robot
  2025-12-24  1:30   ` kernel test robot
@ 2025-12-24  3:26   ` kernel test robot
  2025-12-24  4:28   ` kernel test robot
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-12-24  3:26 UTC (permalink / raw)
  To: Pavel Tikhomirov, Tejun Heo
  Cc: llvm, oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Pavel Tikhomirov

Hi Pavel,

kernel test robot noticed the following build errors:

[auto build test ERROR on tj-cgroup/for-next]
[also build test ERROR on linus/master v6.19-rc2 next-20251219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
config: sparc64-randconfig-002-20251224 (https://download.01.org/0day-ci/archive/20251224/202512241151.JDZuy1z3-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512241151.JDZuy1z3-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512241151.JDZuy1z3-lkp@intel.com/

All errors (new ones prefixed by >>):

>> kernel/cgroup/freezer.c:374:7: error: call to undeclared function 'try_get_task_stack'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     374 |         if (!try_get_task_stack(task))
         |              ^
   kernel/cgroup/freezer.c:374:7: note: did you mean 'tryget_task_struct'?
   include/linux/sched/task.h:120:35: note: 'tryget_task_struct' declared here
     120 | static inline struct task_struct *tryget_task_struct(struct task_struct *t)
         |                                   ^
>> kernel/cgroup/freezer.c:377:2: error: call to undeclared function 'put_task_stack'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     377 |         put_task_stack(task);
         |         ^
   kernel/cgroup/freezer.c:377:2: note: did you mean 'put_task_struct'?
   include/linux/sched/task.h:128:20: note: 'put_task_struct' declared here
     128 | static inline void put_task_struct(struct task_struct *t)
         |                    ^
   2 errors generated.


vim +/try_get_task_stack +374 kernel/cgroup/freezer.c

   357	
   358	static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
   359					     struct task_struct *task)
   360	{
   361		char *buf __free(kfree) = NULL;
   362		pid_t tgid;
   363	
   364		buf = kmalloc(PATH_MAX, GFP_KERNEL);
   365		if (!buf)
   366			return;
   367	
   368		if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
   369			return;
   370	
   371		tgid = task_pid_nr_ns(task, &init_pid_ns);
   372		pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
   373			buf, timeout / USEC_PER_SEC, tgid, task->comm);
 > 374		if (!try_get_task_stack(task))
   375			return;
   376		show_stack(task, NULL, KERN_WARNING);
 > 377		put_task_stack(task);
   378	}
   379	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
                     ` (2 preceding siblings ...)
  2025-12-24  3:26   ` kernel test robot
@ 2025-12-24  4:28   ` kernel test robot
  2025-12-24 11:03   ` kernel test robot
  2025-12-27 22:51   ` Tejun Heo
  5 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-12-24  4:28 UTC (permalink / raw)
  To: Pavel Tikhomirov, Tejun Heo
  Cc: oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Pavel Tikhomirov

Hi Pavel,

kernel test robot noticed the following build errors:

[auto build test ERROR on tj-cgroup/for-next]
[also build test ERROR on next-20251219]
[cannot apply to linus/master v6.16-rc1]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
config: x86_64-rhel-9.4-ltp (https://download.01.org/0day-ci/archive/20251224/202512240512.qCx1CzKN-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512240512.qCx1CzKN-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512240512.qCx1CzKN-lkp@intel.com/

All errors (new ones prefixed by >>):

   kernel/cgroup/freezer.c: In function 'warn_freeze_timeout_task':
>> kernel/cgroup/freezer.c:374:14: error: implicit declaration of function 'try_get_task_stack'; did you mean 'tryget_task_struct'? [-Wimplicit-function-declaration]
     374 |         if (!try_get_task_stack(task))
         |              ^~~~~~~~~~~~~~~~~~
         |              tryget_task_struct
>> kernel/cgroup/freezer.c:377:9: error: implicit declaration of function 'put_task_stack'; did you mean 'put_task_struct'? [-Wimplicit-function-declaration]
     377 |         put_task_stack(task);
         |         ^~~~~~~~~~~~~~
         |         put_task_struct


vim +374 kernel/cgroup/freezer.c

   357	
   358	static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
   359					     struct task_struct *task)
   360	{
   361		char *buf __free(kfree) = NULL;
   362		pid_t tgid;
   363	
   364		buf = kmalloc(PATH_MAX, GFP_KERNEL);
   365		if (!buf)
   366			return;
   367	
   368		if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
   369			return;
   370	
   371		tgid = task_pid_nr_ns(task, &init_pid_ns);
   372		pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
   373			buf, timeout / USEC_PER_SEC, tgid, task->comm);
 > 374		if (!try_get_task_stack(task))
   375			return;
   376		show_stack(task, NULL, KERN_WARNING);
 > 377		put_task_stack(task);
   378	}
   379	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 20:58   ` kernel test robot
@ 2025-12-24  4:43     ` Pavel Tikhomirov
  0 siblings, 0 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-24  4:43 UTC (permalink / raw)
  To: kernel test robot, Tejun Heo
  Cc: oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel

linux$ git diff
diff --git a/kernel/cgroup/freezer.c b/kernel/cgroup/freezer.c
index 3880ed400879..21a0edc4a97d 100644
--- a/kernel/cgroup/freezer.c
+++ b/kernel/cgroup/freezer.c
@@ -4,6 +4,7 @@
 #include <linux/sysctl.h>
 #include <linux/sched.h>
 #include <linux/sched/task.h>
+#include <linux/sched/task_stack.h>
 #include <linux/sched/signal.h>
 #include <linux/sched/debug.h>
 
the above should fix that.

On 12/24/25 04:58, kernel test robot wrote:
> Hi Pavel,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on tj-cgroup/for-next]
> [also build test ERROR on linus/master v6.19-rc2 next-20251219]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
> patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
> patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
> config: s390-randconfig-r071-20251224 (https://download.01.org/0day-ci/archive/20251224/202512240409.06R0khaZ-lkp@intel.com/config)
> compiler: s390-linux-gcc (GCC) 8.5.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512240409.06R0khaZ-lkp@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202512240409.06R0khaZ-lkp@intel.com/
> 
> All errors (new ones prefixed by >>):
> 
>    kernel/cgroup/freezer.c: In function 'warn_freeze_timeout_task':
>>> kernel/cgroup/freezer.c:374:7: error: implicit declaration of function 'try_get_task_stack'; did you mean 'tryget_task_struct'? [-Werror=implicit-function-declaration]
>      if (!try_get_task_stack(task))
>           ^~~~~~~~~~~~~~~~~~
>           tryget_task_struct
>>> kernel/cgroup/freezer.c:377:2: error: implicit declaration of function 'put_task_stack'; did you mean 'put_task_struct'? [-Werror=implicit-function-declaration]
>      put_task_stack(task);
>      ^~~~~~~~~~~~~~
>      put_task_struct
>    cc1: some warnings being treated as errors
> 
> Kconfig warnings: (for reference only)
>    WARNING: unmet direct dependencies detected for CAN_DEV
>    Depends on [n]: NETDEVICES [=n] && CAN [=y]
>    Selected by [y]:
>    - CAN [=y] && NET [=y]
> 
> 
> vim +374 kernel/cgroup/freezer.c
> 
>    357	
>    358	static void warn_freeze_timeout_task(struct cgroup *cgrp, int timeout,
>    359					     struct task_struct *task)
>    360	{
>    361		char *buf __free(kfree) = NULL;
>    362		pid_t tgid;
>    363	
>    364		buf = kmalloc(PATH_MAX, GFP_KERNEL);
>    365		if (!buf)
>    366			return;
>    367	
>    368		if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
>    369			return;
>    370	
>    371		tgid = task_pid_nr_ns(task, &init_pid_ns);
>    372		pr_warn("Freeze of %s took %ld sec, due to unfreezable process %d:%s.\n",
>    373			buf, timeout / USEC_PER_SEC, tgid, task->comm);
>  > 374		if (!try_get_task_stack(task))
>    375			return;
>    376		show_stack(task, NULL, KERN_WARNING);
>  > 377		put_task_stack(task);
>    378	}
>    379	
> 

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
                     ` (3 preceding siblings ...)
  2025-12-24  4:28   ` kernel test robot
@ 2025-12-24 11:03   ` kernel test robot
  2025-12-27 22:51   ` Tejun Heo
  5 siblings, 0 replies; 17+ messages in thread
From: kernel test robot @ 2025-12-24 11:03 UTC (permalink / raw)
  To: Pavel Tikhomirov, Tejun Heo
  Cc: oe-kbuild-all, Johannes Weiner, Michal Koutný, cgroups,
	linux-kernel, Pavel Tikhomirov

Hi Pavel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on tj-cgroup/for-next]
[also build test WARNING on linus/master v6.19-rc2 next-20251219]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Pavel-Tikhomirov/cgroup-v2-freezer-allow-freezing-with-kthreads/20251223-182826
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-next
patch link:    https://lore.kernel.org/r/20251223102124.738818-4-ptikhomirov%40virtuozzo.com
patch subject: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
config: arm64-randconfig-r134-20251224 (https://download.01.org/0day-ci/archive/20251224/202512241804.JAJl5a7k-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 4ef602d446057dabf5f61fb221669ecbeda49279)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251224/202512241804.JAJl5a7k-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202512241804.JAJl5a7k-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
   kernel/cgroup/freezer.c:144:35: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/cgroup/freezer.c:144:35: sparse:     expected struct spinlock [usertype] *lock
   kernel/cgroup/freezer.c:144:35: sparse:     got struct spinlock [noderef] __rcu *
   kernel/cgroup/freezer.c:147:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   kernel/cgroup/freezer.c:147:37: sparse:     expected struct spinlock [usertype] *lock
   kernel/cgroup/freezer.c:147:37: sparse:     got struct spinlock [noderef] __rcu *
>> kernel/cgroup/freezer.c:416:5: sparse: sparse: symbol 'sysctl_freeze_timeout_us' was not declared. Should it be static?
   kernel/cgroup/freezer.c: note: in included file (through include/linux/rcuwait.h, include/linux/percpu-rwsem.h, include/linux/fs/super_types.h, ...):
   include/linux/sched/signal.h:756:37: sparse: sparse: incorrect type in argument 1 (different address spaces) @@     expected struct spinlock [usertype] *lock @@     got struct spinlock [noderef] __rcu * @@
   include/linux/sched/signal.h:756:37: sparse:     expected struct spinlock [usertype] *lock
   include/linux/sched/signal.h:756:37: sparse:     got struct spinlock [noderef] __rcu *

vim +/sysctl_freeze_timeout_us +416 kernel/cgroup/freezer.c

   414	
   415	#define DEFAULT_FREEZE_RATELIMIT (30 * HZ)
 > 416	int sysctl_freeze_timeout_us;
   417	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
                     ` (4 preceding siblings ...)
  2025-12-24 11:03   ` kernel test robot
@ 2025-12-27 22:51   ` Tejun Heo
  2025-12-29  5:32     ` Pavel Tikhomirov
  2025-12-29  7:05     ` Pavel Tikhomirov
  5 siblings, 2 replies; 17+ messages in thread
From: Tejun Heo @ 2025-12-27 22:51 UTC (permalink / raw)
  To: Pavel Tikhomirov
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel

Hello,

On Tue, Dec 23, 2025 at 06:20:09PM +0800, Pavel Tikhomirov wrote:
> +static void warn_freeze_timeout(struct cgroup *cgrp, int timeout)
> +{
> +	char *buf __free(kfree) = NULL;
> +	struct cgroup_subsys_state *css;
> +
> +	guard(rcu)();
> +	css_for_each_descendant_post(css, &cgrp->self) {
> +		struct task_struct *task;
> +		struct css_task_iter it;
> +
> +		css_task_iter_start(css, 0, &it);
> +		while ((task = css_task_iter_next(&it))) {
> +			if (task->flags & PF_KTHREAD)
> +				continue;
> +			if (task->frozen)
> +				continue;
> +
> +			warn_freeze_timeout_task(cgrp, timeout, task);
> +			css_task_iter_end(&it);
> +			return;
> +		}
> +		css_task_iter_end(&it);
> +	}
> +
> +	buf = kmalloc(PATH_MAX, GFP_KERNEL);
> +	if (!buf)
> +		return;
> +
> +	if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
> +		return;
> +
> +	pr_warn("Freeze of %s took %ld sec, but no unfreezable process detected.\n",
> +		buf, timeout / USEC_PER_SEC);
> +}

This is only suitable for debugging, and, for that, this can be done from
userspace by walking the tasks and checking /proc/PID/wchan. It should be
do_freezer_trap for everything frozen. If something is not, read and dump
its /proc/PID/stack. Wouldn't that work?
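
A minimal userspace sketch of the check described above, not taken from the
patch series. It rests on several assumptions: the helper names
(read_first_line, wchan_is_frozen, dump_file, report_unfrozen) are
hypothetical, only the processes listed in cgroup.procs of one cgroup
directory are visited (no descent into child cgroups, no per-thread walk, no
skipping of kthreads), and reading /proc/PID/stack usually needs root.

```c
#include <stdio.h>
#include <string.h>

/* Read the first line of `path` into buf; 0 on success, -1 if it cannot be opened. */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f))
		buf[0] = '\0';
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Per the thread, a frozen task is expected to wait in do_freezer_trap. */
static int wchan_is_frozen(const char *wchan)
{
	return strcmp(wchan, "do_freezer_trap") == 0;
}

/* Copy a file (e.g. /proc/PID/stack) to `out`; silently skip it if unreadable. */
static void dump_file(const char *path, FILE *out)
{
	char line[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return;
	while (fgets(line, sizeof(line), f))
		fputs(line, out);
	fclose(f);
}

/*
 * Walk the PIDs in <cgroup_dir>/cgroup.procs and report each one whose
 * wchan is not do_freezer_trap. Returns the number of such processes,
 * or -1 if cgroup.procs cannot be opened.
 */
static int report_unfrozen(const char *cgroup_dir, FILE *out)
{
	char path[4096], wchan[128];
	int pid, unfrozen = 0;
	FILE *procs;

	snprintf(path, sizeof(path), "%s/cgroup.procs", cgroup_dir);
	procs = fopen(path, "r");
	if (!procs)
		return -1;

	while (fscanf(procs, "%d", &pid) == 1) {
		snprintf(path, sizeof(path), "/proc/%d/wchan", pid);
		if (read_first_line(path, wchan, sizeof(wchan)) < 0)
			continue;	/* the process exited in the meantime */
		if (wchan_is_frozen(wchan))
			continue;
		unfrozen++;
		fprintf(out, "pid %d not frozen, wchan=%s\n", pid, wchan);
		snprintf(path, sizeof(path), "/proc/%d/stack", pid);
		dump_file(path, out);
	}
	fclose(procs);
	return unfrozen;
}
```

One possible use is to call report_unfrozen() on the cgroup directory (for
example a path under /sys/fs/cgroup, if cgroup2 is mounted there) from a small
main() some time after writing 1 to its cgroup.freeze file.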

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-27 22:51   ` Tejun Heo
@ 2025-12-29  5:32     ` Pavel Tikhomirov
  2025-12-29 17:39       ` Tejun Heo
  2025-12-29  7:05     ` Pavel Tikhomirov
  1 sibling, 1 reply; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-29  5:32 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel



On 12/28/25 06:51, Tejun Heo wrote:
> Hello,
> 
> On Tue, Dec 23, 2025 at 06:20:09PM +0800, Pavel Tikhomirov wrote:
>> +static void warn_freeze_timeout(struct cgroup *cgrp, int timeout)
>> +{
>> +	char *buf __free(kfree) = NULL;
>> +	struct cgroup_subsys_state *css;
>> +
>> +	guard(rcu)();
>> +	css_for_each_descendant_post(css, &cgrp->self) {
>> +		struct task_struct *task;
>> +		struct css_task_iter it;
>> +
>> +		css_task_iter_start(css, 0, &it);
>> +		while ((task = css_task_iter_next(&it))) {
>> +			if (task->flags & PF_KTHREAD)
>> +				continue;
>> +			if (task->frozen)
>> +				continue;
>> +
>> +			warn_freeze_timeout_task(cgrp, timeout, task);
>> +			css_task_iter_end(&it);
>> +			return;
>> +		}
>> +		css_task_iter_end(&it);
>> +	}
>> +
>> +	buf = kmalloc(PATH_MAX, GFP_KERNEL);
>> +	if (!buf)
>> +		return;
>> +
>> +	if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
>> +		return;
>> +
>> +	pr_warn("Freeze of %s took %ld sec, but no unfreezable process detected.\n",
>> +		buf, timeout / USEC_PER_SEC);
>> +}
> 
> This is only suitable for debugging, and, for that, this can be done from
> userspace by walking the tasks and check /proc/PID/wchan. Should be
> do_freezer_trap for everything frozen. If something is not, read and dump
> its /proc/PID/stack. Wouldn't that work?

Yes, I think that will do, Thanks.

Though the trace printing in /proc/PID/stack is a bit less informative than show_stack(), e.g. for my test module (https://github.com/Snorch/proc-hang-module) the stack in /proc/PID/stack will be just empty. (I guess it has to do with the fact that you need fewer privileges to read /proc/PID/stack than dmesg, so you can do a more informative thing in dmesg.)

> 
> Thanks.
> 

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-27 22:51   ` Tejun Heo
  2025-12-29  5:32     ` Pavel Tikhomirov
@ 2025-12-29  7:05     ` Pavel Tikhomirov
  1 sibling, 0 replies; 17+ messages in thread
From: Pavel Tikhomirov @ 2025-12-29  7:05 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel



On 12/28/25 06:51, Tejun Heo wrote:
> Hello,
> 
> On Tue, Dec 23, 2025 at 06:20:09PM +0800, Pavel Tikhomirov wrote:
>> +static void warn_freeze_timeout(struct cgroup *cgrp, int timeout)
>> +{
>> +	char *buf __free(kfree) = NULL;
>> +	struct cgroup_subsys_state *css;
>> +
>> +	guard(rcu)();
>> +	css_for_each_descendant_post(css, &cgrp->self) {
>> +		struct task_struct *task;
>> +		struct css_task_iter it;
>> +
>> +		css_task_iter_start(css, 0, &it);
>> +		while ((task = css_task_iter_next(&it))) {
>> +			if (task->flags & PF_KTHREAD)
>> +				continue;
>> +			if (task->frozen)
>> +				continue;
>> +
>> +			warn_freeze_timeout_task(cgrp, timeout, task);
>> +			css_task_iter_end(&it);
>> +			return;
>> +		}
>> +		css_task_iter_end(&it);
>> +	}
>> +
>> +	buf = kmalloc(PATH_MAX, GFP_KERNEL);
>> +	if (!buf)
>> +		return;
>> +
>> +	if (cgroup_path(cgrp, buf, PATH_MAX) < 0)
>> +		return;
>> +
>> +	pr_warn("Freeze of %s took %ld sec, but no unfreezable process detected.\n",
>> +		buf, timeout / USEC_PER_SEC);
>> +}
> 
> This is only suitable for debugging, and, for that, this can be done from
> userspace by walking the tasks and check /proc/PID/wchan. Should be
> do_freezer_trap for everything frozen. If something is not, read and dump
> its /proc/PID/stack. Wouldn't that work?

Yes, that will do. I just hoped it might be a little bit more robust to detect it in the kernel. Thanks.

Note the trace printing in /proc/PID/stack is a bit less informative than show_stack(), e.g. for my test module (https://github.com/Snorch/proc-hang-module) the stack in /proc/PID/stack will be just empty. (I guess it has to do with the fact that you need fewer privileges to read /proc/PID/stack than dmesg, so you can do a more informative thing in dmesg.)

> 
> Thanks.
> 

-- 
Best regards, Pavel Tikhomirov
Senior Software Developer, Virtuozzo.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process
  2025-12-29  5:32     ` Pavel Tikhomirov
@ 2025-12-29 17:39       ` Tejun Heo
  0 siblings, 0 replies; 17+ messages in thread
From: Tejun Heo @ 2025-12-29 17:39 UTC (permalink / raw)
  To: Pavel Tikhomirov
  Cc: Johannes Weiner, Michal Koutný, cgroups, linux-kernel

Hello,

On Mon, Dec 29, 2025 at 01:32:11PM +0800, Pavel Tikhomirov wrote:
> > This is only suitable for debugging, and, for that, this can be done from
> > userspace by walking the tasks and check /proc/PID/wchan. Should be
> > do_freezer_trap for everything frozen. If something is not, read and dump
> > its /proc/PID/stack. Wouldn't that work?
> 
> Yes, I think that will do, Thanks.
> 
> Though the trace printing in /proc/PID/stack is a bit less informative
> than show_stack(), e.g. for my test module
> (https://github.com/Snorch/proc-hang-module) the stack in /proc/PID/stack
> will be just empty. (I guess it has to do with the fact that you need less
> privileges to read /proc/PID/stack than dmesg, so you can do a more
> informative thing in dmesg).

You can use drgn, bpftrace or any other bpf tools to read way more detailed
backtraces.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-12-29 17:39 UTC | newest]

Thread overview: 17+ messages
2025-12-23 10:20 [PATCH 0/2] cgroup-v2/freezer: small improvements Pavel Tikhomirov
2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: allow freezing with kthreads Pavel Tikhomirov
2025-12-23 10:25   ` Pavel Tikhomirov
2025-12-23 10:20 ` [PATCH 1/2] cgroup-v2/freezer: Allow " Pavel Tikhomirov
2025-12-23 10:20 ` [PATCH 2/2] cgroup-v2/freezer: Print information about unfreezable process Pavel Tikhomirov
2025-12-23 20:58   ` kernel test robot
2025-12-24  4:43     ` Pavel Tikhomirov
2025-12-24  1:30   ` kernel test robot
2025-12-24  3:26   ` kernel test robot
2025-12-24  4:28   ` kernel test robot
2025-12-24 11:03   ` kernel test robot
2025-12-27 22:51   ` Tejun Heo
2025-12-29  5:32     ` Pavel Tikhomirov
2025-12-29 17:39       ` Tejun Heo
2025-12-29  7:05     ` Pavel Tikhomirov
2025-12-23 17:29 ` [PATCH 0/2] cgroup-v2/freezer: small improvements Michal Koutný
2025-12-24  3:06   ` Pavel Tikhomirov
