From: Thomas Renninger <trenn@suse.de>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
Frank Eigler <fche@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Kevin Hilman <khilman@deeprootsystems.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-omap@vger.kernel.org, rjw@sisk.pl,
linux-pm@lists.linux-foundation.org,
linux-trace-users@vger.kernel.org,
Jean Pihet <jean.pihet@newoldbits.com>,
Pierre Tardy <tardyp@gmail.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Tejun Heo <tj@kernel.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
Arjan van de Ven <arjan@linux.intel.com>
Subject: Re: [PATCH] PERF(kernel): Cleanup power events V2
Date: Tue, 26 Oct 2010 12:37:48 +0200 [thread overview]
Message-ID: <201010261237.49386.trenn@suse.de> (raw)
In-Reply-To: <20101026071020.GC13036@elte.hu>
On Tuesday 26 October 2010 09:10:20 Ingo Molnar wrote:
>
> * Thomas Renninger <trenn@suse.de> wrote:
>
> > Changes in V2:
> > - Introduce PWR_EVENT_EXIT instead of 0 to mark non-power state
> > - Use u32 instead of u64 for cpuid, state which is by far enough
> >
> > New power trace events:
> > power:processor_idle
> > power:processor_frequency
> > power:machine_suspend
> >
> >
> > C-state/idle accounting events:
> > power:power_start
> > power:power_end
> > are replaced with:
> > power:processor_idle
> >
> > and
> > power:power_frequency
> > is replaced with:
> > power:processor_frequency
>
> Could you please name it power:cpu_idle and power:cpu_frequency instead, for
> shortness? We generally use 'cpu' in the kernel and for events.
Sure.
>
> > power:machine_suspend
>
> How will future PCI (or other device) power saving tracepoints be called?
>
> Might be more consistent to use:
>
> power:cpu_idle
> power:machine_idle
> power:device_idle
device idle is not true. Those may be low power modes
like reduced network throughput, reduced wlan range, the device
needs not to be idle.
Device power states is probably the most complex area, if such
a thing gets introduced, it should makes sense to state
the interface experimental for some time until a wider range of
devices uses it (in contrast to these new ones
which should not change that soon anymore...).
Also machine_idle may be true, but machine_suspend sounds more
familiar and everyone immediately knows what the event is about.
-> *_idle convention is not really worth it.
> Where machine_idle is the suspend event.
Here you name it. You talk about machine_idle but you mean
the suspend event, better just name it what it is.
> > the type= field got removed from both, it was never
> > used and the type is differed by the event type itself.
> >
> > +#define PWR_EVENT_EXIT 0xFFFFFFFF
>
> Shouldnt this be part of the POWER_ enum? (and you can write -1 there)
No, below enum will vanish, but -1 is nicer.
...
> Plus:
>
> > +DECLARE_EVENT_CLASS(processor,
> > +
> > + TP_PROTO(unsigned int state, unsigned int cpu_id),
> > +
> > + TP_ARGS(state, cpu_id),
> > +
> > + TP_STRUCT__entry(
> > + __field( u32, state )
> > + __field( u32, cpu_id )
>
> Trace entries can carry a cpu_id of the current processor already. Can this cpu_id
> ever be different from that CPU id?
Yes. A core's frequency can depend on another one which
will get switched as well (by one command/MSR).
Compare with commit 6f4f2723d08534fd4e407e1e.
This can theoretically also be the case for sleep states.
Afaik such HW does not exist yet, but ACPI spec already
provides interfaces to pass these dependency from BIOS to OS.
-> We want a stable ABI and should be prepared for such stuff.
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->state = state;
> > + __entry->cpu_id = cpu_id;
> > + ),
> > +
> > + TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
> > + (unsigned long)__entry->cpu_id)
> > +);
> > +
> > +DEFINE_EVENT(processor, processor_idle,
> > +
> > + TP_PROTO(unsigned int state, unsigned int cpu_id),
> > +
> > + TP_ARGS(state, cpu_id)
> > +);
> > +
> > +#define PWR_EVENT_EXIT 0xFFFFFFFF
> > +
> > +DEFINE_EVENT(processor, processor_frequency,
> > +
> > + TP_PROTO(unsigned int frequency, unsigned int cpu_id),
> > +
> > + TP_ARGS(frequency, cpu_id)
> > +);
>
> So, we have a 'state' field in the class, which is used as 'state' by the
> power::cpu_idle event, and as 'frequency' by the power::cpu_freq event?
Yes, is this a problem?
Definitions are a bit shorter having one power processor class.
As "frequency" is stated in frequency event definition everything should
be obvious and this one looks like the more elegant way to me.
> Are there any architectures that track frequency in Hz, not in kHz? If yes, might
> there ever be a need for the frequency value to be larger than 4.29 GHz? If yes,
> then it wont fit into u32.
drivers/cpufreq subsystem is fixed to unsigned int (cmp. include/linux/cpufreq.h):
unsigned int min; /* in kHz */
unsigned int max; /* in kHz */
unsigned int cur; /* in kHz,
...
that should be fine.
> Also, might there be a future need to express different types of frequencies? For
> example, should we decide to track turbo frequencies in Intel CPUs, how would that
> be expressed via these events? Are there any architectures and CPUs that somehow
> have some extra attribute to the frequency value?
I wonder whether this ever can/will work in a sane way.
Userspace can compare with:
/sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
everything above is turbo. So I do not think it's ever needed.
But adding an additional value at the end does not violate
userspace compatibility. This has been done with the cpuid
as well.
> > +TRACE_EVENT(machine_suspend,
> > +
> > + TP_PROTO(unsigned int state),
> > +
> > + TP_ARGS(state),
> > +
> > + TP_STRUCT__entry(
> > + __field( u32, state )
> > + ),
>
> Hm, this event is not used anywhere in the submitted patches. Where is the patch
> that adds usage, and what are the possible values for 'state'?
Jean wants to make use of it on ARM.
I also had patch for x86, I can have another look at it, Rafael
already gave me a comment on it. But on X86 you typically realize
when you suspend the machine (could imagine this is more useful on
ARM driven mobile phones and similar), still I can add it..
Values probably should be (include/linux/suspend.h):
#define PM_SUSPEND_ON 0
#define PM_SUSPEND_STANDBY 1
#define PM_SUSPEND_MEM 3
#define PM_SUSPEND_MAX 4
How this strange state Arjan talked about is passed is up
to these guys. Instead of using 0 and above pre-defined such
arch specific special states better should get passed by:
#define X86_MOORESTOWN_STANDBY_S0 0x100
.. 0x101
#define ARM_WHATEVER_STRANGE_THING 0x200
...
Thomas
next prev parent reply other threads:[~2010-10-26 10:37 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <1287488171-25303-1-git-send-email-trenn@suse.de>
2010-10-19 11:36 ` [PATCH 1/3] PERF: Do not export power_frequency, but power_start event Thomas Renninger
2010-10-19 11:36 ` [PATCH 2/3] PERF(kernel): Cleanup power events Thomas Renninger
2010-10-25 6:54 ` Arjan van de Ven
2010-10-25 9:41 ` Thomas Renninger
2010-10-25 13:55 ` Arjan van de Ven
2010-10-25 14:36 ` Thomas Renninger
2010-10-25 14:45 ` Arjan van de Ven
2010-10-25 14:56 ` Ingo Molnar
2010-10-25 15:48 ` Thomas Renninger
2010-10-25 16:00 ` Arjan van de Ven
2010-10-25 23:32 ` Thomas Renninger
2010-10-25 6:58 ` Arjan van de Ven
2010-10-25 10:04 ` Ingo Molnar
2010-10-25 11:03 ` Thomas Renninger
2010-10-25 11:55 ` Ingo Molnar
2010-10-25 12:55 ` Thomas Renninger
2010-10-25 14:11 ` Arjan van de Ven
2010-10-25 14:51 ` Thomas Renninger
2010-10-25 12:58 ` Mathieu Desnoyers
2010-10-25 20:29 ` Rafael J. Wysocki
2010-10-25 13:58 ` Arjan van de Ven
2010-10-25 20:33 ` Rafael J. Wysocki
2010-10-25 23:33 ` [PATCH] PERF(kernel): Cleanup power events V2 Thomas Renninger
2010-10-26 1:09 ` Arjan van de Ven
2010-10-26 7:10 ` Ingo Molnar
2010-10-26 8:08 ` Jean Pihet
2010-10-26 11:21 ` Ingo Molnar
2010-10-26 11:48 ` Thomas Renninger
2010-10-26 11:54 ` Ingo Molnar
2010-10-26 13:17 ` Thomas Renninger
2010-10-26 13:35 ` Thomas Renninger
2010-10-26 18:57 ` Rafael J. Wysocki
2010-10-27 0:00 ` Thomas Renninger
2010-10-27 9:16 ` Rafael J. Wysocki
2010-10-26 9:58 ` Arjan van de Ven
2010-10-26 10:19 ` Ingo Molnar
2010-10-26 10:37 ` Thomas Renninger [this message]
2010-10-26 11:19 ` Ingo Molnar
2010-10-26 19:01 ` Rafael J. Wysocki
2010-10-26 15:32 ` Pierre Tardy
2010-10-26 16:04 ` Arjan van de Ven
2010-10-26 16:56 ` Pierre Tardy
2010-10-26 17:58 ` Peter Zijlstra
2010-10-26 18:14 ` Mathieu Desnoyers
2010-10-26 18:50 ` [linux-pm] " Alan Stern
2010-10-26 21:33 ` Mathieu Desnoyers
2010-10-26 22:20 ` Rafael J. Wysocki
2010-10-26 22:39 ` Rafael J. Wysocki
2010-10-27 0:46 ` Mathieu Desnoyers
2010-10-27 10:22 ` Rafael J. Wysocki
2010-10-27 12:21 ` Mathieu Desnoyers
2010-10-27 21:43 ` Rafael J. Wysocki
2010-10-26 19:04 ` Rafael J. Wysocki
2010-10-26 21:38 ` Mathieu Desnoyers
2010-10-26 22:22 ` Rafael J. Wysocki
2010-10-26 18:15 ` Pierre Tardy
2010-10-26 19:08 ` Rafael J. Wysocki
2010-10-26 20:23 ` Pierre Tardy
2010-10-26 20:38 ` Rafael J. Wysocki
2010-10-26 20:52 ` Arjan van de Ven
2010-10-26 21:17 ` Rafael J. Wysocki
2010-10-26 7:59 ` Jean Pihet
2010-10-26 18:52 ` Rafael J. Wysocki
2010-10-19 11:36 ` [PATCH 3/3] PERF(userspace): Adjust perf timechart to the new power events Thomas Renninger
2010-10-26 0:18 ` [PATCH] PERF(userspace): Adjust perf timechart to the new power events V2 Thomas Renninger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201010261237.49386.trenn@suse.de \
--to=trenn@suse.de \
--cc=akpm@linux-foundation.org \
--cc=arjan@linux.intel.com \
--cc=fche@redhat.com \
--cc=fweisbec@gmail.com \
--cc=jean.pihet@newoldbits.com \
--cc=khilman@deeprootsystems.com \
--cc=linux-omap@vger.kernel.org \
--cc=linux-pm@lists.linux-foundation.org \
--cc=linux-trace-users@vger.kernel.org \
--cc=masami.hiramatsu.pt@hitachi.com \
--cc=mathieu.desnoyers@efficios.com \
--cc=mingo@elte.hu \
--cc=peterz@infradead.org \
--cc=rjw@sisk.pl \
--cc=rostedt@goodmis.org \
--cc=tardyp@gmail.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox