public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] sched/idle: Add a few cpuidle VS timers comments
@ 2023-11-14 19:38 Frederic Weisbecker
  2023-11-14 19:38 ` [PATCH 1/2] sched/cpuidle: Comment about timers requirements VS idle handler Frederic Weisbecker
  2023-11-14 19:38 ` [PATCH 2/2] sched/timers: Explain why idle task schedules out on remote timer enqueue Frederic Weisbecker
  0 siblings, 2 replies; 5+ messages in thread
From: Frederic Weisbecker @ 2023-11-14 19:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Daniel Lezcano,
	Thomas Gleixner, Anna-Maria Behnsen, Ingo Molnar

Hi,

Those are the scheduler specific bits extracted from a previous series
([PATCH 00/10] timers/cpuidle: Fixes and cleanups).

Thanks.

Frederic Weisbecker (2):
  sched/cpuidle: Comment about timers requirements VS idle handler
  sched/timers: Explain why idle task schedules out on remote timer
    enqueue

 kernel/sched/core.c | 22 ++++++++++++++++++++++
 kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

-- 
2.42.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/2] sched/cpuidle: Comment about timers requirements VS idle handler
  2023-11-14 19:38 [PATCH 0/2] sched/idle: Add a few cpuidle VS timers comments Frederic Weisbecker
@ 2023-11-14 19:38 ` Frederic Weisbecker
  2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Frederic Weisbecker
  2023-11-14 19:38 ` [PATCH 2/2] sched/timers: Explain why idle task schedules out on remote timer enqueue Frederic Weisbecker
  1 sibling, 1 reply; 5+ messages in thread
From: Frederic Weisbecker @ 2023-11-14 19:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Daniel Lezcano,
	Thomas Gleixner, Anna-Maria Behnsen, Ingo Molnar

Add missing explanation concerning IRQs re-enablement constraints in
the cpuidle path against timers.

Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f8374ddbb..31231925f1ec 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -258,6 +258,36 @@ static void do_idle(void)
 	while (!need_resched()) {
 		rmb();
 
+		/*
+		 * Interrupts shouldn't be re-enabled from that point on until
+		 * the CPU sleeping instruction is reached. Otherwise an interrupt
+		 * may fire and queue a timer that would be ignored until the CPU
+		 * wakes from the sleeping instruction. And testing need_resched()
+		 * doesn't tell about pending needed timer reprogram.
+		 *
+		 * Several cases to consider:
+		 *
+		 * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
+		 *   "wfi" or "mwait" are fine because they can be entered with
+		 *   interrupt disabled.
+		 *
+		 * - sti;mwait() couple is fine because the interrupts are
+		 *   re-enabled only upon the execution of mwait, leaving no gap
+		 *   in-between.
+		 *
+		 * - ROLLBACK based idle handlers with the sleeping instruction
+		 *   called with interrupts enabled are NOT fine. In this scheme
+		 *   when the interrupt detects it has interrupted an idle handler,
+		 *   it rolls back to its beginning which performs the
+		 *   need_resched() check before re-executing the sleeping
+		 *   instruction. This can leak a pending needed timer reprogram.
+		 *   If such a scheme is really mandatory due to the lack of an
+		 *   appropriate CPU sleeping instruction, then a FAST-FORWARD
+		 *   must instead be applied: when the interrupt detects it has
+		 *   interrupted an idle handler, it must resume to the end of
+		 *   this idle handler so that the generic idle loop is iterated
+		 *   again to reprogram the tick.
+		 */
 		local_irq_disable();
 
 		if (cpu_is_offline(cpu)) {
-- 
2.42.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/2] sched/timers: Explain why idle task schedules out on remote timer enqueue
  2023-11-14 19:38 [PATCH 0/2] sched/idle: Add a few cpuidle VS timers comments Frederic Weisbecker
  2023-11-14 19:38 ` [PATCH 1/2] sched/cpuidle: Comment about timers requirements VS idle handler Frederic Weisbecker
@ 2023-11-14 19:38 ` Frederic Weisbecker
  2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Frederic Weisbecker
  1 sibling, 1 reply; 5+ messages in thread
From: Frederic Weisbecker @ 2023-11-14 19:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Daniel Lezcano,
	Thomas Gleixner, Anna-Maria Behnsen, Ingo Molnar

Trying to avoid that didn't bring much value after testing, add comment
about this.

Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/sched/core.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a708d225c28e..50abc7eddb82 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1131,6 +1131,28 @@ static void wake_up_idle_cpu(int cpu)
 	if (cpu == smp_processor_id())
 		return;
 
+	/*
+	 * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
+	 * part of the idle loop. This forces an exit from the idle loop
+	 * and a round trip to schedule(). Now this could be optimized
+	 * because a simple new idle loop iteration is enough to
+	 * re-evaluate the next tick. Provided some re-ordering of tick
+	 * nohz functions that would need to follow TIF_NR_POLLING
+	 * clearing:
+	 *
+	 * - On most archs, a simple fetch_or on ti::flags with a
+	 *   "0" value would be enough to know if an IPI needs to be sent.
+	 *
+	 * - x86 needs to perform a last need_resched() check between
+	 *   monitor and mwait which doesn't take timers into account.
+	 *   There a dedicated TIF_TIMER flag would be required to
+	 *   fetch_or here and be checked along with TIF_NEED_RESCHED
+	 *   before mwait().
+	 *
+	 * However, remote timer enqueue is not such a frequent event
+	 * and testing of the above solutions didn't appear to report
+	 * much benefits.
+	 */
 	if (set_nr_and_not_polling(rq->idle))
 		smp_send_reschedule(cpu);
 	else
-- 
2.42.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [tip: sched/core] sched/timers: Explain why idle task schedules out on remote timer enqueue
  2023-11-14 19:38 ` [PATCH 2/2] sched/timers: Explain why idle task schedules out on remote timer enqueue Frederic Weisbecker
@ 2023-11-15  9:04   ` tip-bot2 for Frederic Weisbecker
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2023-11-15  9:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Frederic Weisbecker, Peter Zijlstra (Intel), Rafael J. Wysocki,
	x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     194600008d5c43b5a4ba98c4b81633397e34ffad
Gitweb:        https://git.kernel.org/tip/194600008d5c43b5a4ba98c4b81633397e34ffad
Author:        Frederic Weisbecker <frederic@kernel.org>
AuthorDate:    Tue, 14 Nov 2023 14:38:40 -05:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 15 Nov 2023 09:57:52 +01:00

sched/timers: Explain why idle task schedules out on remote timer enqueue

Trying to avoid that didn't bring much value after testing, add comment
about this.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lkml.kernel.org/r/20231114193840.4041-3-frederic@kernel.org
---
 kernel/sched/core.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f5f4495..2de77a6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1131,6 +1131,28 @@ static void wake_up_idle_cpu(int cpu)
 	if (cpu == smp_processor_id())
 		return;
 
+	/*
+	 * Set TIF_NEED_RESCHED and send an IPI if in the non-polling
+	 * part of the idle loop. This forces an exit from the idle loop
+	 * and a round trip to schedule(). Now this could be optimized
+	 * because a simple new idle loop iteration is enough to
+	 * re-evaluate the next tick. Provided some re-ordering of tick
+	 * nohz functions that would need to follow TIF_NR_POLLING
+	 * clearing:
+	 *
+	 * - On most archs, a simple fetch_or on ti::flags with a
+	 *   "0" value would be enough to know if an IPI needs to be sent.
+	 *
+	 * - x86 needs to perform a last need_resched() check between
+	 *   monitor and mwait which doesn't take timers into account.
+	 *   There a dedicated TIF_TIMER flag would be required to
+	 *   fetch_or here and be checked along with TIF_NEED_RESCHED
+	 *   before mwait().
+	 *
+	 * However, remote timer enqueue is not such a frequent event
+	 * and testing of the above solutions didn't appear to report
+	 * much benefits.
+	 */
 	if (set_nr_and_not_polling(rq->idle))
 		smp_send_reschedule(cpu);
 	else

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [tip: sched/core] sched/cpuidle: Comment about timers requirements VS idle handler
  2023-11-14 19:38 ` [PATCH 1/2] sched/cpuidle: Comment about timers requirements VS idle handler Frederic Weisbecker
@ 2023-11-15  9:04   ` tip-bot2 for Frederic Weisbecker
  0 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Frederic Weisbecker @ 2023-11-15  9:04 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Frederic Weisbecker, Peter Zijlstra (Intel), Rafael J. Wysocki,
	x86, linux-kernel

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     dd5403869a40595eb953f12e8cd2bb57bb88bb67
Gitweb:        https://git.kernel.org/tip/dd5403869a40595eb953f12e8cd2bb57bb88bb67
Author:        Frederic Weisbecker <frederic@kernel.org>
AuthorDate:    Tue, 14 Nov 2023 14:38:39 -05:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 15 Nov 2023 09:57:51 +01:00

sched/cpuidle: Comment about timers requirements VS idle handler

Add missing explanation concerning IRQs re-enablement constraints in
the cpuidle path against timers.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Link: https://lkml.kernel.org/r/20231114193840.4041-2-frederic@kernel.org
---
 kernel/sched/idle.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 565f837..3123192 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -258,6 +258,36 @@ static void do_idle(void)
 	while (!need_resched()) {
 		rmb();
 
+		/*
+		 * Interrupts shouldn't be re-enabled from that point on until
+		 * the CPU sleeping instruction is reached. Otherwise an interrupt
+		 * may fire and queue a timer that would be ignored until the CPU
+		 * wakes from the sleeping instruction. And testing need_resched()
+		 * doesn't tell about pending needed timer reprogram.
+		 *
+		 * Several cases to consider:
+		 *
+		 * - SLEEP-UNTIL-PENDING-INTERRUPT based instructions such as
+		 *   "wfi" or "mwait" are fine because they can be entered with
+		 *   interrupt disabled.
+		 *
+		 * - sti;mwait() couple is fine because the interrupts are
+		 *   re-enabled only upon the execution of mwait, leaving no gap
+		 *   in-between.
+		 *
+		 * - ROLLBACK based idle handlers with the sleeping instruction
+		 *   called with interrupts enabled are NOT fine. In this scheme
+		 *   when the interrupt detects it has interrupted an idle handler,
+		 *   it rolls back to its beginning which performs the
+		 *   need_resched() check before re-executing the sleeping
+		 *   instruction. This can leak a pending needed timer reprogram.
+		 *   If such a scheme is really mandatory due to the lack of an
+		 *   appropriate CPU sleeping instruction, then a FAST-FORWARD
+		 *   must instead be applied: when the interrupt detects it has
+		 *   interrupted an idle handler, it must resume to the end of
+		 *   this idle handler so that the generic idle loop is iterated
+		 *   again to reprogram the tick.
+		 */
 		local_irq_disable();
 
 		if (cpu_is_offline(cpu)) {

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-11-15  9:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-11-14 19:38 [PATCH 0/2] sched/idle: Add a few cpuidle VS timers comments Frederic Weisbecker
2023-11-14 19:38 ` [PATCH 1/2] sched/cpuidle: Comment about timers requirements VS idle handler Frederic Weisbecker
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Frederic Weisbecker
2023-11-14 19:38 ` [PATCH 2/2] sched/timers: Explain why idle task schedules out on remote timer enqueue Frederic Weisbecker
2023-11-15  9:04   ` [tip: sched/core] " tip-bot2 for Frederic Weisbecker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox