* [PATCH 0/3] sched/core: Fixes and enhancements around spurious need_resched() and idle load balancing
@ 2024-07-10  9:02 K Prateek Nayak
From: K Prateek Nayak @ 2024-07-10  9:02 UTC
  To: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	linux-kernel
  Cc: Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
	Daniel Bristot de Oliveira, Valentin Schneider, Paul E. McKenney,
	Imran Khan, Leonardo Bras, Guo Ren, Rik van Riel, Tejun Heo,
	Cruz Zhao, Lai Jiangshan, Joel Fernandes, Zqiang, Julia Lawall,
	Gautham R. Shenoy, K Prateek Nayak

Since commit b2a02fc43a1f ("smp: Optimize
send_call_function_single_ipi()"), an idle CPU in TIF_POLLING_NRFLAG mode
can be pulled out of idle by setting TIF_NEED_RESCHED instead of sending
an actual IPI. This affects at least three scenarios, described below:

o A need_resched() check within a call function does not necessarily
  indicate a task wakeup, since a CPU intending to send an IPI to an
  idle target in TIF_POLLING_NRFLAG mode can simply queue the
  SMP-call-function and set the TIF_NEED_RESCHED flag to pull the
  polling target out of idle. The SMP-call-function will be executed by
  flush_smp_call_function_queue() on the idle-exit path. On x86, where
  mwait_idle_with_hints() sets TIF_POLLING_NRFLAG for long idling,
  this leads to the idle load balancer bailing out early since the
  need_resched() check in nohz_csd_func() returns true in most
  instances.

o A TIF_POLLING_NRFLAG idling CPU woken up to process an IPI will end
  up calling schedule() even in cases where the call function does not
  wake up a new task on the idle CPU, thus delaying the idle re-entry.

o Julia Lawall reported a case where a softirq raised from an
  SMP-call-function on an idle CPU will wake up ksoftirqd since
  flush_smp_call_function_queue() executes in the idle thread's context.
  This can throw off the idle load balancer by making the idle CPU
  appear busy since ksoftirqd just woke on that CPU [1].
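
For illustration only, the first scenario can be modelled in user space
as below. This is a sketch, not kernel code: struct cpu_model,
send_call_function(), flush_call_queue() and the flag bit positions are
hypothetical stand-ins, and set_nr_if_polling() is only loosely modelled
on the kernel helper of that name. It shows that need_resched() can be
true on the target even though no task was woken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative bit positions; the real values are arch specific. */
#define TIF_NEED_RESCHED   (1u << 0)
#define TIF_POLLING_NRFLAG (1u << 1)

/* Hypothetical stand-in for the per-CPU state discussed above. */
struct cpu_model {
	atomic_uint flags;
	int pending_calls;	/* queued SMP-call-functions */
	int ipis_sent;		/* real IPIs that had to be sent */
	int tasks_woken;	/* task wakeups targeting this CPU */
};

static void cpu_model_init(struct cpu_model *cpu, unsigned int flags)
{
	atomic_init(&cpu->flags, flags);
	cpu->pending_calls = 0;
	cpu->ipis_sent = 0;
	cpu->tasks_woken = 0;
}

/* Set NEED_RESCHED only if the target is polling; true means no IPI needed. */
static bool set_nr_if_polling(struct cpu_model *cpu)
{
	unsigned int val = atomic_load(&cpu->flags);

	do {
		if (!(val & TIF_POLLING_NRFLAG))
			return false;
		if (val & TIF_NEED_RESCHED)
			return true;
	} while (!atomic_compare_exchange_weak(&cpu->flags, &val,
					       val | TIF_NEED_RESCHED));
	return true;
}

/* Queue a call function; fall back to an IPI if the target is not polling. */
static void send_call_function(struct cpu_model *cpu)
{
	cpu->pending_calls++;
	if (!set_nr_if_polling(cpu))
		cpu->ipis_sent++;
}

static bool need_resched(struct cpu_model *cpu)
{
	return (atomic_load(&cpu->flags) & TIF_NEED_RESCHED) != 0;
}

/* Idle-exit path: run everything that was queued, return how many ran. */
static int flush_call_queue(struct cpu_model *cpu)
{
	int ran = cpu->pending_calls;

	assert(ran >= 0);
	cpu->pending_calls = 0;
	return ran;
}
```

In this model a polling target ends up with need_resched() returning
true and tasks_woken still at zero, which is the "spurious"
need_resched() the series is concerned with.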

The three patches address each of the above issues individually: the
first by removing the need_resched() check in nohz_csd_func() with a
proper justification, the second by introducing a fast-path in
__schedule() to speed up idle re-entry in case TIF_NEED_RESCHED was set
simply to process an IPI that did not perform a wakeup, and the third by
notifying raise_softirq() that the softirq was raised from an
SMP-call-function executed by the idle or migration thread in
flush_smp_call_function_queue(), so waking ksoftirqd is unnecessary
since a call to do_softirq_post_smp_call_flush() will follow soon.
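
As a rough user-space sketch of the second and third ideas (not the
actual patches): struct rq_model, the *_model() functions, the
in_smp_flush field and the numeric values of the SM_* modes are all
assumptions made for illustration. Only the SM_IDLE name and the kernel
function names in the comments come from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* SM_IDLE is named in patch 2; the numeric values here are assumed. */
enum sched_mode { SM_IDLE = -1, SM_NONE = 0, SM_PREEMPT = 1 };

/* Hypothetical stand-in for the runqueue and softirq state of one CPU. */
struct rq_model {
	int nr_running;		/* runnable tasks queued on this CPU */
	int full_picks;		/* times the full task-selection path ran */
	bool in_smp_flush;	/* set while the call-function queue is flushed */
	bool softirq_pending;
	int ksoftirqd_wakeups;
};

/* Returns true if the CPU ends up running the idle task again. */
static bool schedule_model(struct rq_model *rq, enum sched_mode mode)
{
	assert(rq->nr_running >= 0);

	/* Fast path: called from idle and nothing became runnable. */
	if (mode == SM_IDLE && rq->nr_running == 0)
		return true;

	/* Slow path: stands in for the full pick-next-task work. */
	rq->full_picks++;
	return rq->nr_running == 0;
}

/* Skip the ksoftirqd wakeup when the flush path will handle the softirq. */
static void raise_softirq_model(struct rq_model *rq)
{
	rq->softirq_pending = true;
	if (!rq->in_smp_flush)
		rq->ksoftirqd_wakeups++;
}

/* Models flush_smp_call_function_queue() running one call function. */
static void flush_call_queue_model(struct rq_model *rq,
				   void (*fn)(struct rq_model *))
{
	rq->in_smp_flush = true;
	fn(rq);
	rq->in_smp_flush = false;

	/* Stands in for do_softirq_post_smp_call_flush(). */
	rq->softirq_pending = false;
}
```

With an empty runqueue, schedule_model(&rq, SM_IDLE) returns without
touching full_picks, and a softirq raised via flush_call_queue_model()
leaves ksoftirqd_wakeups unchanged.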

Previous attempts to solve these problems involved introducing a new
TIF_NOTIFY_IPI flag to notify a TIF_POLLING_NRFLAG CPU of a pending IPI
and skip calling __schedule() in such cases, but that approach relied on
atomic ops which could have performance implications [2]. Instead, Peter
suggested the approach outlined in the first two patches of the series.
The third one is an RFC that (hopefully) solves the problem Julia was
chasing down related to idle load balancing.

[1] https://lore.kernel.org/lkml/fcf823f-195e-6c9a-eac3-25f870cb35ac@inria.fr/
[2] https://lore.kernel.org/lkml/20240615014256.GQ8774@noisy.programming.kicks-ass.net/

This series is based on tip:sched/core at commit c793a62823d1
("sched/core: Drop spinlocks on contention iff kernel is preemptible")

--
K Prateek Nayak (2):
  sched/core: Remove the unnecessary need_resched() check in
    nohz_csd_func()
  softirq: Avoid waking up ksoftirqd from
    flush_smp_call_function_queue()

Peter Zijlstra (1):
  sched/core: Introduce SM_IDLE and an idle re-entry fast-path in
    __schedule()

 kernel/sched/core.c | 40 ++++++++++++++++++++--------------------
 kernel/sched/smp.h  |  2 ++
 kernel/smp.c        | 32 ++++++++++++++++++++++++++++++++
 kernel/softirq.c    | 10 +++++++++-
 4 files changed, 63 insertions(+), 21 deletions(-)


base-commit: c793a62823d1ce8f70d9cfc7803e3ea436277cda
-- 
2.34.1



Thread overview: 18+ messages
2024-07-10  9:02 [PATCH 0/3] sched/core: Fixes and enhancements around spurious need_resched() and idle load balancing K Prateek Nayak
2024-07-10  9:02 ` [PATCH 1/3] sched/core: Remove the unnecessary need_resched() check in nohz_csd_func() K Prateek Nayak
2024-07-10 14:53   ` Peter Zijlstra
2024-07-10 17:57     ` K Prateek Nayak
2024-07-23  6:46   ` K Prateek Nayak
2024-07-10  9:02 ` [PATCH 2/3] sched/core: Introduce SM_IDLE and an idle re-entry fast-path in __schedule() K Prateek Nayak
2024-07-11  8:00   ` Vincent Guittot
2024-07-11  9:19     ` Peter Zijlstra
2024-07-11 13:14       ` Vincent Guittot
2024-07-12  6:40         ` K Prateek Nayak
2024-07-30 16:13   ` Chen Yu
2024-08-04  4:05     ` Chen Yu
2024-08-05  4:03       ` K Prateek Nayak
2024-07-10  9:02 ` [RFC PATCH 3/3] softirq: Avoid waking up ksoftirqd from flush_smp_call_function_queue() K Prateek Nayak
2024-07-10 15:05   ` Peter Zijlstra
2024-07-10 18:20     ` K Prateek Nayak
2024-07-23  4:50       ` K Prateek Nayak
2024-07-29  2:42 ` [PATCH 0/3] sched/core: Fixes and enhancements around spurious need_resched() and idle load balancing Chen Yu
