From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org,
laijs@cn.fujitsu.com, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
josh@joshtriplett.org, tglx@linutronix.de, rostedt@goodmis.org,
dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com,
rafael@kernel.org
Subject: Re: [PATCH v5 tip/core/rcu 15/16] rcu: Make RCU-tasks wait for idle tasks
Date: Wed, 13 Aug 2014 11:20:30 -0700 [thread overview]
Message-ID: <20140813182030.GD4752@linux.vnet.ibm.com> (raw)
In-Reply-To: <20140813144219.GQ9918@twins.programming.kicks-ass.net>
On Wed, Aug 13, 2014 at 04:42:19PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 13, 2014 at 07:12:17AM -0700, Paul E. McKenney wrote:
> > > That's not an excuse for doing horrible things. And inventing new infra
> > > that needs to wake all CPUs is horrible.
> >
> > Does your patch even work?
>
> Haven't even tried compiling it, but making it work shouldn't be too
> hard.
>
> > Looks like it should, and yes, the idle loop
> > seems quite a bit simpler than it was a few years ago, but we really
> > don't need some strange thing that leaves a CPU idle but not visible as
> > such to RCU.
>
> There's slightly more to it though; things like the x86 mwait idle wait
> functions tend to do far too much; for instance look at:
>
> drivers/idle/intel_idle.c:intel_idle()
OK, let's see if I can follow the idly bouncing ball.
Looks like most arches call cpu_startup_entry(), which calls
arch_cpu_idle_prepare() followed by cpu_idle_loop().
arch_cpu_idle_prepare() is local_fiq_enable() on ARM and empty elsewhere.
cpu_idle_loop() does tick_nohz_idle_enter(), which does some nohz stuff.
It also checks for offline, and invokes arch_cpu_idle_enter(), which
is empty except on x86 and ARM. On ARM, it messes with LEDs, and on
x86 it appears to disable an NMI-based watchdog timer.
Interrupts are disabled, and we do either cpu_idle_poll() if specified
or cpuidle_idle_call(). cpu_idle_poll() is the old-time idle loop,
with rcu_idle_enter()/_exit() and enabling interrupts prior to spinning.
No sign of stop_critical_timings(), though. So let's not bury
rcu_idle_enter() in stop_critical_timings().
cpuidle_idle_call() does a fastpath irq-enable/exit if need-resched,
then does stop_critical_timings() and rcu_idle_enter(). Then we
have the buried complexity with cpuidle_select(), but a negative
return says to check need-resched and enable interrupts or to
invoke arch_cpu_idle(), which executes various sleep instructions
on various architectures. Some notable variants:
o ARM has an arm_pm_idle() function pointer so that different
SoCs can have different idle-power-down sequences. Alternatively,
some SoCs have functions named CPUNAME_do_idle(), for example,
imx5_cpu_do_idle(). These seem to invoke processor.do_idle().
Pushing rcu_idle_enter() and rcu_idle_exit() down below
arch_cpu_idle() on ARM looks to be asking for it.
o ARM64 does cpu_do_idle, which does a wait-for-interrupt instruction.
o AVR calls into assembly, hand-coding the need-resched check.
Not sure why that would still be needed.
o CRIS has one function that just enables interrupts and returns,
and anohter that enables interrupts and halts.
o UM times the duration of the idle time based on what appears to
be the time until the next event.
o Unicore does a string of what appear to be no-ops.
o x86 does a couple of levels of indirection, one being the
x86_idle() function pointer and another being the safe_halt()
function pointer.
amd_e400_idle() is a bit ornate, but still does default_idle()
which wrappers the safe_halt() pointer.
And various other architectures seem to work similarly, but lots of
hair here. So Steven, you OK with the underlying arch_cpu_idle()
functions being off-limits to tracing?
Now, if cpuidle_select() returns non-negative, we are dealing with
the CPU-idle governor, which is invoked at the later cpuidle_enter().
Hmmm... On the CPU-idle drivers...
o apm_idle_driver puts the idle loop into the ->enter() function,
apm_cpu_idle().
o ACPI puts the idle loop in acpi_idle_do_entry(), and does call
stop_critical_timings(), but not rcu_idle_enter().
So presumably stop_critical_timings() can nest? Not clear
from the code.
o The CPS driver is even stranger... Is cps_gen_entry_code()
really depositing assembly instructions into a buffer that is
passed back as a function?
o The intel_idle driver is the one with mwait_idle_with_hints(),
so you covered it below.
Your patch covers the cpuidle_enter() transition, which means
that functions like cpuidle_enter(), acpi_idle_enter_c1(), and
acpi_idle_do_entry() would be off-limits to trampolining. In the case
of CPS, quite a bit of code.
> We should push the rcu_idle_{enter,exit}() down to around
> mwait_idle_with_hints(), so we don't call half the word with RCU
> disabled.
That would be for the intel_idle.c CPU-idle driver. The other drivers
also need rcu_idle_{enter,exit}().
> > I have already said that I will be happy to rip out the wakeup code
> > when it is no longer needed, and I agree that it would be way better if
> > not needed.
>
> I'd prefer to dtrt now and not needing to fix it later.
Once it works, I might consider it "right" and adjust accordingly.
At the moment, speculation.
> Auditing all idle functions will be somewhat of a pain, but its entirely
> doable. Looking at this stuff, it appears we can clean it up massively;
> see how the generic cpuidle code already has the broadcast logic in, so
> we can remove that from the drivers by setting the right flags.
There is certainly quite a bit of hair in a number of these drivers,
no two ways about it.
> We can similarly pull out the leave_mm() call by adding a
> CPUIDLE_FLAG_TLB_FLUSH. At which point all we'd need to do is mark the
> intel_idle (and all other cpuidle_state::enter functions with __notrace.
That one seems to be specific to intel_idle. But yes, nice to avoid
waking an idle CPU for TLB flushes.
> > But I won't base a patch on hypotheticals. You have already
> > drawn way too much water from -that- well over the past years! ;-)
>
> not entirely sure what you're referring to there ;-)
Heh!
Thanx, Paul
next prev parent reply other threads:[~2014-08-13 18:20 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-08-11 22:48 [PATCH tip/core/rcu 0/16] RCU-tasks implementation Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 01/16] rcu: Add call_rcu_tasks() Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 02/16] rcu: Provide cond_resched_rcu_qs() to force quiescent states in long loops Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 03/16] rcu: Add synchronous grace-period waiting for RCU-tasks Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 04/16] rcu: Make TASKS_RCU handle tasks that are almost done exiting Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 05/16] rcu: Export RCU-tasks APIs to GPL modules Paul E. McKenney
2014-08-14 19:08 ` Pranith Kumar
2014-08-14 21:29 ` Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 06/16] rcutorture: Add torture tests for RCU-tasks Paul E. McKenney
2014-08-14 21:34 ` Pranith Kumar
2014-08-14 21:44 ` Paul E. McKenney
2014-08-14 21:49 ` Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 07/16] rcutorture: Add RCU-tasks test cases Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 08/16] rcu: Add stall-warning checks for RCU-tasks Paul E. McKenney
2014-08-14 21:39 ` Pranith Kumar
2014-08-14 21:59 ` Paul E. McKenney
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 09/16] rcu: Improve RCU-tasks energy efficiency Paul E. McKenney
2014-08-14 21:42 ` Pranith Kumar
2014-08-14 21:55 ` Paul E. McKenney
2014-08-14 22:00 ` Pranith Kumar
2014-08-11 22:48 ` [PATCH v5 tip/core/rcu 10/16] documentation: Add verbiage on RCU-tasks stall warning messages Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 11/16] rcu: Defer rcu_tasks_kthread() creation till first call_rcu_tasks() Paul E. McKenney
2014-08-14 22:28 ` Pranith Kumar
2014-08-14 22:53 ` Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 12/16] rcu: Make TASKS_RCU handle nohz_full= CPUs Paul E. McKenney
2014-08-14 22:55 ` Pranith Kumar
2014-08-14 23:16 ` Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 13/16] rcu: Make rcu_tasks_kthread()'s GP-wait loop allow preemption Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 14/16] rcu: Remove redundant preempt_disable() from rcu_note_voluntary_context_switch() Paul E. McKenney
2014-08-13 10:56 ` Peter Zijlstra
2014-08-13 14:07 ` Paul E. McKenney
2014-08-13 14:33 ` Peter Zijlstra
2014-08-13 20:06 ` Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 15/16] rcu: Make RCU-tasks wait for idle tasks Paul E. McKenney
2014-08-13 8:12 ` Peter Zijlstra
2014-08-13 12:48 ` Paul E. McKenney
2014-08-13 13:40 ` Peter Zijlstra
2014-08-13 13:51 ` Steven Rostedt
2014-08-13 14:07 ` Peter Zijlstra
2014-08-13 14:13 ` Steven Rostedt
2014-08-13 14:43 ` Paul E. McKenney
2014-08-13 16:30 ` Peter Zijlstra
2014-08-13 16:43 ` Jacob Pan
2014-08-13 18:24 ` Paul E. McKenney
2014-08-13 16:35 ` Peter Zijlstra
2014-08-13 18:25 ` Paul E. McKenney
2014-08-13 14:43 ` Peter Zijlstra
2014-08-13 20:56 ` Paul E. McKenney
2014-08-13 14:12 ` Paul E. McKenney
2014-08-13 14:42 ` Peter Zijlstra
2014-08-13 17:24 ` Peter Zijlstra
2014-08-13 17:30 ` Peter Zijlstra
2014-08-13 18:16 ` Peter Zijlstra
2014-08-13 18:20 ` Paul E. McKenney [this message]
2014-08-13 18:55 ` Peter Zijlstra
2014-08-13 19:54 ` Paul E. McKenney
2014-08-11 22:49 ` [PATCH v5 tip/core/rcu 16/16] rcu: Additional information on RCU-tasks stall-warning messages Paul E. McKenney
2014-08-14 20:46 ` [PATCH v5 tip/core/rcu 01/16] rcu: Add call_rcu_tasks() Pranith Kumar
2014-08-14 21:22 ` Paul E. McKenney
2014-08-12 23:57 ` [PATCH tip/core/rcu 0/16] RCU-tasks implementation Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140813182030.GD4752@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=bobby.prani@gmail.com \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=dvhart@linux.intel.com \
--cc=edumazet@google.com \
--cc=fweisbec@gmail.com \
--cc=josh@joshtriplett.org \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@kernel.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
--cc=rafael@kernel.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.