linux-doc.vger.kernel.org archive mirror
From: Waiman Long <longman@redhat.com>
To: "Tejun Heo" <tj@kernel.org>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Frederic Weisbecker" <frederic@kernel.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	"Neeraj Upadhyay" <neeraj.upadhyay@kernel.org>,
	"Joel Fernandes" <joelagnelf@nvidia.com>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Uladzislau Rezki" <urezki@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	Zqiang <qiang.zhang@linux.dev>,
	"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
	"Ingo Molnar" <mingo@kernel.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Juri Lelli" <juri.lelli@redhat.com>,
	"Vincent Guittot" <vincent.guittot@linaro.org>,
	"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
	"Ben Segall" <bsegall@google.com>, "Mel Gorman" <mgorman@suse.de>,
	"Valentin Schneider" <vschneid@redhat.com>,
	"Shuah Khan" <shuah@kernel.org>
Cc: cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
	linux-kselftest@vger.kernel.org, Phil Auld <pauld@redhat.com>,
	Costa Shulyupin <costa.shul@redhat.com>,
	Gabriele Monaco <gmonaco@redhat.com>,
	Cestmir Kalina <ckalina@redhat.com>,
	Waiman Long <longman@redhat.com>
Subject: [RFC PATCH 13/18] tick/nohz: Allow runtime changes in full dynticks CPUs
Date: Fri,  8 Aug 2025 11:19:56 -0400
Message-ID: <20250808152001.20245-4-longman@redhat.com>
In-Reply-To: <20250808151053.19777-1-longman@redhat.com>

Full dynticks can only be enabled if the "nohz_full" boot option has
been specified, with or without a parameter. Any change in the list of
nohz_full CPUs has to be reflected in tick_nohz_full_mask, so the newly
introduced tick_nohz_full_update_cpus() is called to update the
mask.

We also need to enable CPU context tracking for those CPUs that
are in tick_nohz_full_mask. So remove __init from tick_nohz_init()
and ct_cpu_track_user() so that they can be called later when an isolated
cpuset partition is being created. The __ro_after_init attribute is
taken away from context_tracking_key as well.

Also add a new ct_cpu_untrack_user() function to reverse the action of
ct_cpu_track_user() in case we need to disable the nohz_full mode of
a CPU.

With nohz_full enabled, the boot CPU (typically CPU 0) will be the
tick CPU, which cannot be shut down easily. So the boot CPU should not
be used in an isolated cpuset partition.

With runtime modification of nohz_full CPUs, tick_do_timer_cpu can become
TICK_DO_TIMER_NONE. So remove the two TICK_DO_TIMER_NONE WARN_ON_ONCE()
calls in tick-sched.c to avoid unnecessary warnings.

Signed-off-by: Waiman Long <longman@redhat.com>
---
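Note (not part of the commit message): the TICK_DO_TIMER_NONE case that
used to trigger WARN_ON_ONCE() is already handled by tick_sched_do_timer(),
which lets the CPU that observes an unowned tick duty take it over. Below is
a minimal userspace sketch of that handoff, not the kernel code. All
model_* names and MODEL_TICK_NONE are hypothetical stand-ins, and the sketch
uses C11 compare-and-exchange where the kernel uses READ_ONCE()/WRITE_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the kernel's TICK_DO_TIMER_NONE. */
#define MODEL_TICK_NONE (-1)

/* Models tick_do_timer_cpu: the CPU that currently owns the tick duty. */
static _Atomic int model_tick_cpu = MODEL_TICK_NONE;

/* Give up the duty, but only if @cpu is the current owner. */
static void model_tick_release(int cpu)
{
	int expected = cpu;

	assert(cpu >= 0);
	atomic_compare_exchange_strong(&model_tick_cpu, &expected,
				       MODEL_TICK_NONE);
}

/*
 * Called from a CPU's tick. If nobody owns the duty, this CPU takes it
 * silently instead of warning. Returns 1 if @cpu owns the duty afterwards.
 */
static int model_tick_sched_do_timer(int cpu)
{
	int expected = MODEL_TICK_NONE;

	assert(cpu >= 0);
	atomic_compare_exchange_strong(&model_tick_cpu, &expected, cpu);
	return atomic_load(&model_tick_cpu) == cpu;
}
```

In this model the first CPU that ticks while the duty is unowned becomes the
owner, and a later CPU only takes over after the owner has released it.
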
 include/linux/context_tracking.h |  1 +
 kernel/cgroup/cpuset.c           | 23 ++++++++++++++++++++++-
 kernel/context_tracking.c        | 17 ++++++++++++++---
 kernel/time/tick-sched.c         |  7 -------
 4 files changed, 37 insertions(+), 11 deletions(-)
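
Note (not part of the commit message): the intended pairing of
ct_cpu_track_user() and ct_cpu_untrack_user() can be illustrated with a
small userspace model. This is a sketch only: the model_* names, the fixed
MODEL_NR_CPUS and the plain unsigned bitmaps are hypothetical, and a plain
counter stands in for the context_tracking_key static key. It assumes the
track side is guarded by the per-CPU "active" flag in the same way as the
untrack side added by this patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical CPU count for the model */

static bool model_ct_active[MODEL_NR_CPUS];	/* models context_tracking.active */
static int model_ct_key;			/* models context_tracking_key count */

/* Enable tracking on @cpu; a no-op if it is already tracked. */
static void model_ct_track(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_ct_active[cpu])
		return;
	model_ct_active[cpu] = true;
	model_ct_key++;
}

/* Disable tracking on @cpu; a no-op if it is not tracked. */
static void model_ct_untrack(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (!model_ct_active[cpu])
		return;
	model_ct_active[cpu] = false;
	model_ct_key--;
}

/*
 * Mirrors the loop added to do_housekeeping_exclude_cpumask(): walk the
 * CPUs whose state changed and track those that are now isolated,
 * untrack the rest. Both masks are bitmaps with bit N for CPU N.
 */
static void model_apply_isolation(unsigned int changed, unsigned int isolated)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(changed & (1u << cpu)))
			continue;
		if (isolated & (1u << cpu))
			model_ct_track(cpu);
		else
			model_ct_untrack(cpu);
	}
}
```

Because both helpers check the per-CPU flag first, repeating an update with
the same masks leaves the key count unchanged, and the count returns to zero
once every CPU has been untracked.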

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index a3fea7f9fef6..1a6b816f1ad6 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -17,6 +17,7 @@
 #define CONTEXT_TRACKING_FORCE_ENABLE	(-1)
 
 extern void ct_cpu_track_user(int cpu);
+extern void ct_cpu_untrack_user(int cpu);
 
 /* Called with interrupts disabled.  */
 extern void __ct_user_enter(enum ctx_state state);
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 6308bb14e018..45c82c18bec4 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -23,6 +23,7 @@
  */
 #include "cpuset-internal.h"
 
+#include <linux/context_tracking.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
 #include <linux/kernel.h>
@@ -1361,7 +1362,7 @@ static void partition_xcpus_del(int old_prs, struct cpuset *parent,
  */
 static int do_housekeeping_exclude_cpumask(void *arg __maybe_unused)
 {
-	int ret;
+	int cpu, ret;
 	struct cpumask *icpus = isolated_cpus;
 	unsigned long flags = BIT(HK_TYPE_DOMAIN) | BIT(HK_TYPE_KERNEL_NOISE);
 
@@ -1395,6 +1396,26 @@ static int do_housekeeping_exclude_cpumask(void *arg __maybe_unused)
 	ret = housekeeping_exclude_cpumask(icpus, flags);
 	WARN_ON_ONCE((ret < 0) && (ret != -EOPNOTSUPP));
 
+#ifdef CONFIG_NO_HZ_FULL
+	/*
+	 * To properly enable/disable nohz_full dynticks for the affected CPUs,
+	 * the new nohz_full CPUs have to be copied to tick_nohz_full_mask and
+	 * ct_cpu_track_user/ct_cpu_untrack_user() will have to be called
+	 * for those CPUs that have their states changed.
+	 */
+	if (tick_nohz_full_enabled()) {
+		tick_nohz_full_update_cpus(icpus);
+		for_each_cpu(cpu, isolcpus_update_state.cpus) {
+			if (cpumask_test_cpu(cpu, icpus))
+				ct_cpu_track_user(cpu);
+			else
+				ct_cpu_untrack_user(cpu);
+		}
+	} else {
+		pr_warn_once("Full dynticks cannot be enabled without the nohz_full kernel boot parameter!\n");
+	}
+#endif
+
 	if (icpus != isolated_cpus)
 		kfree(icpus);
 	return ret;
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 734354bbfdbb..ed5653a3d6f7 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -431,7 +431,7 @@ static __always_inline void ct_kernel_enter(bool user, int offset) { }
 #define CREATE_TRACE_POINTS
 #include <trace/events/context_tracking.h>
 
-DEFINE_STATIC_KEY_FALSE_RO(context_tracking_key);
+DEFINE_STATIC_KEY_FALSE(context_tracking_key);
 EXPORT_SYMBOL_GPL(context_tracking_key);
 
 static noinstr bool context_tracking_recursion_enter(void)
@@ -694,9 +694,9 @@ void user_exit_callable(void)
 }
 NOKPROBE_SYMBOL(user_exit_callable);
 
-void __init ct_cpu_track_user(int cpu)
+void ct_cpu_track_user(int cpu)
 {
-	static __initdata bool initialized = false;
+	static bool initialized;
 
 	if (cpu == CONTEXT_TRACKING_FORCE_ENABLE) {
 		static_branch_inc(&context_tracking_key);
@@ -720,6 +720,17 @@ void __init ct_cpu_track_user(int cpu)
 	initialized = true;
 }
 
+void ct_cpu_untrack_user(int cpu)
+{
+#ifndef CONFIG_CONTEXT_TRACKING_USER_FORCE
+	if (!per_cpu(context_tracking.active, cpu))
+		return;
+
+	per_cpu(context_tracking.active, cpu) = false;
+	static_branch_dec(&context_tracking_key);
+#endif
+}
+
 #ifdef CONFIG_CONTEXT_TRACKING_USER_FORCE
 void __init context_tracking_init(void)
 {
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9204808b7a55..c16250c6a79f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -220,9 +220,6 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
 	tick_cpu = READ_ONCE(tick_do_timer_cpu);
 
 	if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && unlikely(tick_cpu == TICK_DO_TIMER_NONE)) {
-#ifdef CONFIG_NO_HZ_FULL
-		WARN_ON_ONCE(tick_nohz_full_running);
-#endif
 		WRITE_ONCE(tick_do_timer_cpu, cpu);
 		tick_cpu = cpu;
 	}
@@ -1201,10 +1198,6 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 		 */
 		if (tick_cpu == cpu)
 			return false;
-
-		/* Should not happen for nohz-full */
-		if (WARN_ON_ONCE(tick_cpu == TICK_DO_TIMER_NONE))
-			return false;
 	}
 
 	return true;
-- 
2.50.0


