From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>,
	Anna-Maria Behnsen <anna-maria@linutronix.de>,
	Peng Liu <liupeng17@lenovo.com>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: [PATCH 14/15] tick: Shut down low-res tick from dying CPU
Date: Thu,  1 Feb 2024 00:11:19 +0100
Message-ID: <20240131231120.12006-15-frederic@kernel.org>
In-Reply-To: <20240131231120.12006-1-frederic@kernel.org>

The timekeeping duty is handed over from the outgoing CPU within stop
machine. This works well if CONFIG_NO_HZ_COMMON=n or the tick is in
high-res mode. However, in low-res dynticks mode the tick isn't
cancelled until the clockevent is shut down, which can happen later. The
tick may therefore fire again once IRQs are re-enabled after stop
machine, until IRQs are disabled for good upon the last call to idle.

That leaves many opportunities for a timekeeper to go idle and for the
outgoing CPU to take over that duty. This is why
tick_nohz_idle_stop_tick() is called one last time on idle if the CPU
is seen offline: so that the timekeeping duty is handed over again in
case the CPU has retaken it.

This means there are two timekeeping handovers on CPU down hotplug with
different undocumented constraints and purposes:

1) A handover on stop machine for !dynticks || highres. All online CPUs
  are guaranteed to be non-idle and the timekeeping duty can be safely
  handed over. The hrtimer tick is cancelled, so it is guaranteed that
  in dynticks mode the outgoing CPU won't take the duty again.

2) A handover on last idle call for dynticks && lowres.  Setting the
  duty to TICK_DO_TIMER_NONE makes sure that a CPU will take over the
  timekeeping.
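
The two handovers above can be illustrated with a small userspace model.
This is only a sketch under stated assumptions: the cpu_online[] array
and the model_*() helpers are hypothetical stand-ins, not kernel code,
and the model assumes the dying CPU has already left the online mask
when the first handover runs. Only tick_do_timer_cpu and
TICK_DO_TIMER_NONE are names taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4
#define TICK_DO_TIMER_NONE (-1)

/* Toy state: a stand-in for cpu_online_mask and the timekeeper variable. */
static bool cpu_online[NR_CPUS] = { true, true, true, true };
static int tick_do_timer_cpu = 0;

/* Models cpumask_first(cpu_online_mask): lowest online CPU, or NONE. */
static int first_online_cpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online[cpu])
			return cpu;
	return TICK_DO_TIMER_NONE;
}

/* Handover 1: runs on stop machine, hands the duty to an online CPU. */
static void model_cpu_dying(int dying_cpu)
{
	assert(dying_cpu >= 0 && dying_cpu < NR_CPUS);
	cpu_online[dying_cpu] = false;
	if (tick_do_timer_cpu == dying_cpu)
		tick_do_timer_cpu = first_online_cpu();
}

/*
 * Low-res dynticks: a tick firing while nobody owns the duty lets the
 * firing CPU take it, even if that CPU is on its way out.
 */
static void model_lowres_tick(int cpu)
{
	if (tick_do_timer_cpu == TICK_DO_TIMER_NONE)
		tick_do_timer_cpu = cpu;
}

/* Handover 2: last idle call on an offline CPU drops a retaken duty. */
static void model_last_idle_call(int cpu)
{
	if (!cpu_online[cpu] && tick_do_timer_cpu == cpu)
		tick_do_timer_cpu = TICK_DO_TIMER_NONE;
}
```

In this model, if CPU 0 dies, the new timekeeper then goes idle (duty set
to TICK_DO_TIMER_NONE) and a stale low-res tick fires on CPU 0, the
dying CPU retakes the duty. Only the second handover releases it again,
which is the window described above.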

Prepare for consolidating the handover to a single place (the first one)
by also shutting down the low-res tick from tick_cancel_sched_timer().
This will simplify the handover and unify the tick cancellation between
high-res and low-res.
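
As a rough illustration of the unified cancellation, the decision can be
modelled in userspace as below. This is a sketch, not the kernel
implementation: the MODEL_* flags, the model_tick_sched struct and both
helpers are hypothetical names, and each action only stands for the real
call named in its comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for TS_FLAG_NOHZ and TS_FLAG_HIGHRES. */
#define MODEL_FLAG_NOHZ    (1u << 0)
#define MODEL_FLAG_HIGHRES (1u << 1)

enum model_cancel_action {
	MODEL_CANCEL_NONE,       /* periodic low-res tick: nothing to cancel here */
	MODEL_CANCEL_HRTIMER,    /* stands for hrtimer_cancel(&ts->sched_timer) */
	MODEL_CANCEL_CLOCKEVENT, /* stands for tick_program_event(KTIME_MAX, 1) */
};

struct model_tick_sched {
	unsigned int flags;
	/* stands for dev->event_handler == clockevents_handle_noop */
	bool handler_is_noop;
};

/* One cancellation path shared by high-res and low-res dynticks. */
static enum model_cancel_action
model_tick_cancel(const struct model_tick_sched *ts)
{
	assert(ts != NULL);
	if (ts->flags & MODEL_FLAG_HIGHRES)
		return MODEL_CANCEL_HRTIMER;
	if (ts->flags & MODEL_FLAG_NOHZ)
		return MODEL_CANCEL_CLOCKEVENT;
	return MODEL_CANCEL_NONE;
}

/* Dying path: cancel the tick, then neutralize the handler in dynticks mode. */
static enum model_cancel_action model_tick_dying(struct model_tick_sched *ts)
{
	enum model_cancel_action action = model_tick_cancel(ts);

	if (ts->flags & MODEL_FLAG_NOHZ)
		ts->handler_is_noop = true;
	return action;
}
```

The high-res check comes first, so a CPU running high-res dynticks
cancels its hrtimer, while a low-res dynticks CPU reprograms the
clockevent to never expire. A purely periodic low-res tick is left
alone by this helper.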

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/time/tick-common.c |  3 ++-
 kernel/time/tick-sched.c  | 32 +++++++++++++++++++++++++-------
 kernel/time/tick-sched.h  |  4 ++--
 3 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 522414089c0d..9cd09eea06d6 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -410,7 +410,8 @@ int tick_cpu_dying(unsigned int dying_cpu)
 	if (tick_do_timer_cpu == dying_cpu)
 		tick_do_timer_cpu = cpumask_first(cpu_online_mask);
 
-	tick_cancel_sched_timer(dying_cpu);
+	/* Make sure the CPU won't try to retake the timekeeping duty */
+	tick_sched_timer_dying(dying_cpu);
 
 	/* Remove CPU from timer broadcasting */
 	tick_offline_cpu(dying_cpu);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 67759e7e025a..cb8e4a171288 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -308,6 +308,14 @@ static enum hrtimer_restart tick_nohz_handler(struct hrtimer *timer)
 	return HRTIMER_RESTART;
 }
 
+static void tick_sched_timer_cancel(struct tick_sched *ts)
+{
+	if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
+		hrtimer_cancel(&ts->sched_timer);
+	else if (tick_sched_flag_test(ts, TS_FLAG_NOHZ))
+		tick_program_event(KTIME_MAX, 1);
+}
+
 #ifdef CONFIG_NO_HZ_FULL
 cpumask_var_t tick_nohz_full_mask;
 EXPORT_SYMBOL_GPL(tick_nohz_full_mask);
@@ -997,10 +1005,7 @@ static void tick_nohz_stop_tick(struct tick_sched *ts, int cpu)
 	 * the tick timer.
 	 */
 	if (unlikely(expires == KTIME_MAX)) {
-		if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
-			hrtimer_cancel(&ts->sched_timer);
-		else
-			tick_program_event(KTIME_MAX, 1);
+		tick_sched_timer_cancel(ts);
 		return;
 	}
 
@@ -1560,14 +1565,27 @@ void tick_setup_sched_timer(bool hrtimer)
 	tick_nohz_activate(ts);
 }
 
-void tick_cancel_sched_timer(int cpu)
+/*
+ * Shut down the tick and make sure the CPU won't try to retake the timekeeping
+ * duty before disabling IRQs in idle for the last time.
+ */
+void tick_sched_timer_dying(int cpu)
 {
+	struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
 	struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
+	struct clock_event_device *dev = td->evtdev;
 	ktime_t idle_sleeptime, iowait_sleeptime;
 	unsigned long idle_calls, idle_sleeps;
 
-	if (tick_sched_flag_test(ts, TS_FLAG_HIGHRES))
-		hrtimer_cancel(&ts->sched_timer);
+	/* This must happen before hrtimers are migrated! */
+	tick_sched_timer_cancel(ts);
+
+	/*
+	 * If the clockevent doesn't support CLOCK_EVT_STATE_ONESHOT_STOPPED,
+	 * make sure not to call the low-res tick handler.
+	 */
+	if (tick_sched_flag_test(ts, TS_FLAG_NOHZ))
+		dev->event_handler = clockevents_handle_noop;
 
 	idle_sleeptime = ts->idle_sleeptime;
 	iowait_sleeptime = ts->iowait_sleeptime;
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index bbe72a078985..58d8d1c49dd3 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -106,9 +106,9 @@ extern struct tick_sched *tick_get_tick_sched(int cpu);
 
 extern void tick_setup_sched_timer(bool hrtimer);
 #if defined CONFIG_NO_HZ_COMMON || defined CONFIG_HIGH_RES_TIMERS
-extern void tick_cancel_sched_timer(int cpu);
+extern void tick_sched_timer_dying(int cpu);
 #else
-static inline void tick_cancel_sched_timer(int cpu) { }
+static inline void tick_sched_timer_dying(int cpu) { }
 #endif
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
-- 
2.43.0

