From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	Peng Liu <liupeng17@lenovo.com>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: [PATCH 14/15] tick: Shut down low-res tick from dying CPU
Date: Wed, 24 Jan 2024 18:04:58 +0100
Message-ID: <20240124170459.24850-15-frederic@kernel.org>
In-Reply-To: <20240124170459.24850-1-frederic@kernel.org>

The timekeeping duty is handed over from the outgoing CPU within stop
machine. This works well if CONFIG_NO_HZ_COMMON=n or the tick is in
high-res mode. However, in low-res dynticks mode, the tick isn't
cancelled until the clockevent is shut down, which can happen later. The
tick may therefore fire again between the point where IRQs are re-enabled
after stop machine and the point where IRQs are disabled for good upon
the last call to idle.

That window leaves plenty of opportunities for the timekeeper to go idle
and for the outgoing CPU to take over that duty. This is why
tick_nohz_idle_stop_tick() is called one last time on idle if the CPU
is seen offline: so that the timekeeping duty is handed over again in
case the CPU has retaken it.
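
As an illustration only, the sequence above can be sketched as a small
user-space model. This is not kernel code: the model_*() helpers are
hypothetical names, and the single tick_do_timer_cpu variable is a
simplified stand-in for the kernel's timekeeper bookkeeping.

```c
#include <assert.h>
#include <stdbool.h>

#define TICK_DO_TIMER_NONE (-1)

/* Toy stand-in for the id of the CPU currently owning timekeeping. */
static int tick_do_timer_cpu = TICK_DO_TIMER_NONE;

/* Handover from stop machine: give the duty to another online CPU. */
static void model_stop_machine_handover(int dying_cpu, int next_online_cpu)
{
	assert(dying_cpu != next_online_cpu);
	if (tick_do_timer_cpu == dying_cpu)
		tick_do_timer_cpu = next_online_cpu;
}

/* The timekeeper stops its tick in dynticks mode and drops the duty. */
static void model_timekeeper_goes_idle(int cpu)
{
	if (tick_do_timer_cpu == cpu)
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
}

/* A tick that still fires on @cpu picks up an unowned duty. */
static void model_tick_fires(int cpu)
{
	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
		tick_do_timer_cpu = cpu;
}

/* Last idle call on an offline CPU: drop the duty once more. */
static void model_last_idle_call(int cpu, bool cpu_offline)
{
	if (cpu_offline && tick_do_timer_cpu == cpu)
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
}
```

Calling the helpers in that order with CPU 1 as the dying CPU and CPU 0
as the new timekeeper shows CPU 1 getting the duty back when its low-res
tick fires, and releasing it only on the last idle call.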

This means there are two timekeeping handovers on CPU down hotplug with
different undocumented constraints and purposes:

1) A handover on stop machine for !dynticks || highres. All online CPUs
  are guaranteed to be non-idle and the timekeeping duty can be safely
  handed over. The hrtimer tick is cancelled, so it is guaranteed that in
  dynticks mode the outgoing CPU won't take the duty again.

2) A handover on the last idle call for dynticks && lowres. Setting the
  duty to TICK_DO_TIMER_NONE makes sure that another CPU will take over
  the timekeeping.
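
The split between the two cases can be written down as a predicate. The
sketch below is a user-space illustration under the assumption that the
mode is fully described by two booleans; enum handover_site and the
model_*() functions are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

enum handover_site {
	HANDOVER_STOP_MACHINE,	/* case 1: !dynticks || highres */
	HANDOVER_LAST_IDLE,	/* case 2: dynticks && lowres */
};

/* Which of the two handovers is the one that finally sticks. */
static enum handover_site model_effective_handover(bool dynticks, bool highres)
{
	if (!dynticks || highres)
		return HANDOVER_STOP_MACHINE;
	return HANDOVER_LAST_IDLE;
}

/* Enumerate all four mode combinations against the list above. */
static void model_check_all_modes(void)
{
	assert(model_effective_handover(false, false) == HANDOVER_STOP_MACHINE);
	assert(model_effective_handover(false, true) == HANDOVER_STOP_MACHINE);
	assert(model_effective_handover(true, true) == HANDOVER_STOP_MACHINE);
	assert(model_effective_handover(true, false) == HANDOVER_LAST_IDLE);
}
```

Only the dynticks && lowres combination depends on the second handover.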

Prepare for consolidating the handover to a single place (the first one)
by also shutting down the low-res tick from tick_cancel_sched_timer(),
which is renamed to tick_sched_timer_dying() accordingly. This will
simplify the handover and unify the tick cancellation between high-res
and low-res.
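
For readers who want to see the unified cancellation in isolation, here
is a user-space sketch of the same branching. It is a toy model: struct
model_tick_sched, the MODEL_* constants and model_tick_cancel() are
hypothetical, and plain field writes stand in for hrtimer_cancel() and
tick_program_event(KTIME_MAX, 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_KTIME_MAX INT64_MAX

enum {
	MODEL_FLAG_NOHZ    = 1 << 0,
	MODEL_FLAG_HIGHRES = 1 << 1,
};

struct model_tick_sched {
	unsigned int flags;
	bool hrtimer_queued;	/* high-res tick: is the hrtimer armed? */
	int64_t next_event;	/* low-res tick: next programmed expiry */
};

/* One cancel path for both tick flavours. */
static void model_tick_cancel(struct model_tick_sched *ts)
{
	assert(ts);
	if (ts->flags & MODEL_FLAG_HIGHRES)
		ts->hrtimer_queued = false;		/* hrtimer_cancel() */
	else if (ts->flags & MODEL_FLAG_NOHZ)
		ts->next_event = MODEL_KTIME_MAX;	/* program "never" */
	/* Neither flag set: periodic tick, nothing to cancel here. */
}
```

In this model a high-res tick is cancelled through its hrtimer, a
low-res dynticks tick is pushed out to the maximum expiry, and a purely
periodic tick is left untouched.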

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/time/tick-common.c |  3 ++-
 kernel/time/tick-sched.c  | 32 +++++++++++++++++++++++++-------
 kernel/time/tick-sched.h  |  4 ++--
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 522414089c0d..9cd09eea06d6 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -410,7 +410,8 @@ int tick_cpu_dying(unsigned int dying_cpu)
 	if (tick_do_timer_cpu == dying_cpu)
 		tick_do_timer_cpu = cpumask_first(cpu_online_mask);
 
-	tick_cancel_sched_timer(dying_cpu);
+	/* Make sure the CPU won't try to retake the timekeeping duty */
+	tick_sched_timer_dying(dying_cpu);
 
 	/* Remove CPU from timer broadcasting */
 	tick_offline_cpu(dying_cpu);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 274ac5941b16..5e7fe19b9977 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -309,6 +309,14 @@ static enum hrtimer_restart tick_nohz_highres_handler(struct hrtimer *timer)
 	return HRTIMER_RESTART;
 }
 
+static void tick_sched_timer_cancel(struct tick_sched *ts)
+{
+	if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
+		hrtimer_cancel(&ts->sched_timer);
+	else if (tick_sched_flag_test(ts, TS_FLAG_NOHZ))
+		tick_program_event(KTIME_MAX, 1);
+}
+
 #ifdef CONFIG_NO_HZ_FULL
 cpumask_var_t tick_nohz_full_mask;
 EXPORT_SYMBOL_GPL(tick_nohz_full_mask);
@@ -998,10 +1006,7 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
 	 * the tick timer.
 	 */
 	if (unlikely(expires == KTIME_MAX)) {
-		if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
-			hrtimer_cancel(&ts->sched_timer);
-		else
-			tick_program_event(KTIME_MAX, 1);
+		tick_sched_timer_cancel(ts);
 		return;
 	}
 
@@ -1563,13 +1568,26 @@ void tick_setup_sched_timer(bool hrtimer)
 	tick_nohz_activate(ts);
 }
 
-void tick_cancel_sched_timer(int cpu)
+/*
+ * Shut down the tick and make sure the CPU won't try to retake the timekeeping
+ * duty before disabling IRQs in idle for the last time.
+ */
+void tick_sched_timer_dying(int cpu)
 {
+	struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+	struct clock_event_device *dev = td->evtdev;
 	ktime_t idle_sleeptime, iowait_sleeptime;
 
-	if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
-		hrtimer_cancel(&ts->sched_timer);
+	/* This must happen before hrtimers are migrated! */
+	tick_sched_timer_cancel(ts);
+
+	/*
+	 * If the clockevents doesn't support CLOCK_EVT_STATE_ONESHOT_STOPPED,
+	 * make sure not to call low-res tick handler.
+	 */
+	if (tick_sched_flag_test(ts, TS_FLAG_NOHZ))
+		dev->event_handler = clockevents_handle_noop;
 
 	idle_sleeptime = ts->idle_sleeptime;
 	iowait_sleeptime = ts->iowait_sleeptime;
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index bbe72a078985..58d8d1c49dd3 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -106,9 +106,9 @@ extern struct tick_sched *tick_get_tick_sched(int cpu);
 
 extern void tick_setup_sched_timer(bool hrtimer);
 #if defined CONFIG_NO_HZ_COMMON || defined CONFIG_HIGH_RES_TIMERS
-extern void tick_cancel_sched_timer(int cpu);
+extern void tick_sched_timer_dying(int cpu);
 #else
-static inline void tick_cancel_sched_timer(int cpu) { }
+static inline void tick_sched_timer_dying(int cpu) { }
 #endif
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
-- 
2.43.0

