From: Frederic Weisbecker
Date: Sat, 22 Jan 2011 02:42:33 +0000
Subject: Re: [PATCH V2] tracing, perf : add cpu hotplug trace events
Message-Id: <20110122024229.GB2870@nowhere>
References: <20110120161101.GA17218@nowhere> <20110121164404.GA2520@nowhere>
To: Vincent Guittot
Cc: linux-kernel@vger.kernel.org, linux-hotplug@vger.kernel.org,
	rostedt@goodmis.org, amit.kucheria@linaro.org

On Fri, Jan 21, 2011 at 06:41:58PM +0100, Vincent Guittot wrote:
> On 21 January 2011 17:44, Frederic Weisbecker wrote:
> > On Fri, Jan 21, 2011 at 09:43:18AM +0100, Vincent Guittot wrote:
> >> On 20 January 2011 17:11, Frederic Weisbecker wrote:
> >> > On Thu, Jan 20, 2011 at 09:25:54AM +0100, Vincent Guittot wrote:
> >> >> Please find below a new proposal for adding trace events for cpu hotplug.
> >> >> The goal is to measure the latency of each part (kernel, architecture)
> >> >> and also to trace the cpu hotplug activity with other power events. I
> >> >> have tested these trace events on an arm platform.
> >> >>
> >> >> Changes since previous version:
> >> >> - Use cpu_hotplug for trace name
> >> >> - Define traces for kernel core and arch parts only
> >> >> - Use DECLARE_EVENT_CLASS and DEFINE_EVENT
> >> >> - Use proper indentation
> >> >>
> >> >> Subject: [PATCH] cpu hotplug tracepoint
> >> >>
> >> >> This patch adds new events for cpu hotplug tracing:
> >> >>  * plug/unplug sequence
> >> >>  * core and architecture latency measurements
> >> >>
> >> >> Signed-off-by: Vincent Guittot
> >> >> ---
> >> >>  include/trace/events/cpu_hotplug.h |  117 ++++++++++++++++++++++++++++
> >> >
> >> > Note we can't apply new tracepoints if they are not inserted in the code.
> >>
> >> I agree, I just want to have first feedback on the tracepoint interface
> >> before providing a patch which inserts the traces in the code.
> >>
> >> >
> >> >> +DEFINE_EVENT(cpu_hotplug, cpu_hotplug_arch_wait_die_start,
> >> >> +
> >> >> +	TP_PROTO(unsigned int cpuid),
> >> >> +
> >> >> +	TP_ARGS(cpuid)
> >> >> +);
> >> >> +
> >> >> +DEFINE_EVENT(cpu_hotplug, cpu_hotplug_arch_wait_die_end,
> >> >> +
> >> >> +	TP_PROTO(unsigned int cpuid),
> >> >> +
> >> >> +	TP_ARGS(cpuid)
> >> >> +);
> >> >
> >> > What is wait_die, compared to die for example?
> >> >
> >>
> >> The arch_wait_die is used to trace the process which waits for the cpu
> >> to die (__cpu_die) and the arch_die is used to trace when the cpu dies
> >> (cpu_die).
> >
> > I still can't see the difference.
> >
> > Having:
> >
> > trace_cpu_hotplug_arch_die_start(cpu)
> > __cpu_die();
> > trace_cpu_hotplug_arch_die_end(cpu)
> >
> > Isn't that enough to get both the information that a cpu dies
> > and the time it took to do so?
> >
>
> It's quite interesting to trace the cpu_die function because the cpu
> really dies in this one.

Note that in case of success, you get nearly the same time between die
and wait_die; the difference will mostly come down to some completion
wait/polling and noise. Probably unnoticeable and irrelevant most of
the time.

Plus, if you opt for this scheme, you need to put your die hook into
every architecture, whereas a simple

	trace_cpu_die_start()
	trace_cpu_die_stop()

pair around the __cpu_die() call in the generic code is enough.
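IOW, something like the below (just a sketch; I haven't checked the
exact spot in kernel/cpu.c and the hook names are made up):

	/* sketch: in the generic cpu_down() path, hook names made up */
	trace_cpu_hotplug_die_start(cpu);
	__cpu_die(cpu);
	trace_cpu_hotplug_die_end(cpu);

That way no per-arch changes are needed and the pair still bounds the
whole arch-side teardown.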
> The __cpu_die function can't return if the
> cpu fails to die in the very last step and then wakes up. But this
> could be detected with some cpu_die traces.
>
> For a normal use case we have something like:
>
> cpu 0 enters __cpu_die
> cpu 1 enters cpu_die
> cpu 1 acks that it is going to die
> cpu 0 returns from __cpu_die
>
> If cpu 1 fails to die at the very last step, we could have:
>
> cpu 0 enters __cpu_die
> cpu 1 enters cpu_idle --> cpu_die
> cpu 1 leaves cpu_die because of some issue and comes back into cpu_idle
> cpu 0 returns from __cpu_die after a timeout or an error ack

If it fails at the hardware level, you'll certainly notice it in your
power profiling, because a CPU is not supposed to take seconds to die.
Especially with such a visual tool as pytimechart, it will be obvious.

As for the details, that's something that must be found in the syslogs,
and that's it. I don't think it's a good idea to handle such a buggy and
unexpected case at the tracepoint level. You don't want to profile bugs,
you want to debug them. So it doesn't belong in this space IMHO.

> Then, the cpu_die traces can be used with power traces for profiling
> the cpu power state. Maybe the power.h trace file is a better place
> for the cpu_die traces?

Hmm, these should probably stay inside the cpu hotplug tracepoint
family, as this is where people will look for them in the first place.
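(Side note: whatever file they end up in, the event class itself would
presumably stay close to the below -- this is only my guess at the
TP_STRUCT__entry/TP_fast_assign/TP_printk parts, which weren't quoted:)

	DECLARE_EVENT_CLASS(cpu_hotplug,

		TP_PROTO(unsigned int cpuid),

		TP_ARGS(cpuid),

		TP_STRUCT__entry(
			__field(unsigned int, cpuid)
		),

		TP_fast_assign(
			__entry->cpuid = cpuid;
		),

		TP_printk("cpuid=%u", __entry->cpuid)
	);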