* [PATCH 1/4] perf: cleanup power events API
2010-10-04 15:20 PATCH [0/4] perf: clean-up of power events API Jean Pihet
@ 2010-10-04 15:20 ` Jean Pihet
2010-10-04 15:20 ` [PATCH 2/4] perf: add OMAP support for the new power events Jean Pihet
` (3 subsequent siblings)
4 siblings, 0 replies; 47+ messages in thread
From: Jean Pihet @ 2010-10-04 15:20 UTC (permalink / raw)
To: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap
Cc: Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
mathieu.desnoyers, Thomas Renninger, Jean Pihet
From: Thomas Renninger <trenn@suse.de>
New power trace events:
power:processor_idle
power:processor_frequency
power:machine_suspend
C-state/idle accounting events:
power:power_start
power:power_end
are replaced with:
power:processor_idle
and
power:power_frequency
is replaced with:
power:processor_frequency
power:machine_suspend
is newly introduced, a first implementation
comes from the ARM side, but it's easy to add these events
in X86 as well if needed.
the type= field got removed from both, it was never
used and the type is differed by the event type itself.
perf timechart
userspace tool gets adjusted in a separate patch.
Signed-off-by: Thomas Renninger <trenn@suse.de>
CC: Jean Pihet <jean.pihet@newoldbits.com>
CC: Arjan van de Ven <arjan@linux.intel.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: linux-perf-users@vger.kernel.org
CC: linux-trace-users@vger.kernel.org
CC: linux-pm@lists.linux-foundation.org
CC: Kevin Hilman <khilman@deeprootsystems.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Frank Eigler <fche@redhat.com>
CC: Rafael Wysocki <rjw@sisk.pl>
---
arch/x86/kernel/process.c | 8 ++---
arch/x86/kernel/process_32.c | 2 +-
arch/x86/kernel/process_64.c | 2 +-
drivers/cpufreq/cpufreq.c | 2 +-
drivers/cpuidle/cpuidle.c | 2 +-
drivers/idle/intel_idle.c | 6 ++--
include/trace/events/power.h | 69 ++++++++++++++++++++++--------------------
kernel/trace/power-traces.c | 3 +-
8 files changed, 47 insertions(+), 47 deletions(-)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index e8a5ad1..284df73 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -372,7 +372,7 @@ static inline int hlt_use_halt(void)
void default_idle(void)
{
if (hlt_use_halt()) {
- trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_processor_idle(1, smp_processor_id());
current_thread_info()->status &= ~TS_POLLING;
/*
* TS_POLLING-cleared state must be visible before we
@@ -442,7 +442,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
*/
void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
{
- trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
+ trace_processor_idle((ax>>4)+1, smp_processor_id());
if (!need_resched()) {
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -458,7 +458,7 @@ void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
static void mwait_idle(void)
{
if (!need_resched()) {
- trace_power_start(POWER_CSTATE, 1, smp_processor_id());
+ trace_processor_idle(1, smp_processor_id());
if (cpu_has(¤t_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
clflush((void *)¤t_thread_info()->flags);
@@ -479,11 +479,9 @@ static void mwait_idle(void)
*/
static void poll_idle(void)
{
- trace_power_start(POWER_CSTATE, 0, smp_processor_id());
local_irq_enable();
while (!need_resched())
cpu_relax();
- trace_power_end(0);
}
/*
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 96586c3..8dee35c 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -114,7 +114,7 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
+ trace_processor_idle(0, smp_processor_id());
}
tick_nohz_restart_sched_tick();
preempt_enable_no_resched();
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index b3d7a3a..d30069a 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -141,7 +141,7 @@ void cpu_idle(void)
pm_idle();
start_critical_timings();
- trace_power_end(smp_processor_id());
+ trace_processor_idle(0, smp_processor_id());
/* In many cases the interrupt that ended idle
has already called exit_idle. But some idle
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..8c0dbe6 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -354,7 +354,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
(unsigned long)freqs->cpu);
- trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
+ trace_processor_frequency(freqs->new, freqs->cpu);
srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
CPUFREQ_POSTCHANGE, freqs);
if (likely(policy) && likely(policy->cpu == freqs->cpu))
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index a507108..7ce9065 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -106,7 +106,7 @@ static void cpuidle_idle_call(void)
/* give the governor an opportunity to reflect on the outcome */
if (cpuidle_curr_governor->reflect)
cpuidle_curr_governor->reflect(dev);
- trace_power_end(smp_processor_id());
+ trace_processor_idle(0, smp_processor_id());
}
/**
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index d0e42e0..fe08983 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -194,9 +194,9 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
kt_before = ktime_get_real();
stop_critical_timings();
-#ifndef MODULE
- trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
-#endif
+
+ trace_processor_idle((eax >> 4) + 1, cpu);
+
if (!need_resched()) {
__monitor((void *)¤t_thread_info()->flags, 0, 0);
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 286784d..00bf175 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -7,71 +7,62 @@
#include <linux/ktime.h>
#include <linux/tracepoint.h>
-#ifndef _TRACE_POWER_ENUM_
-#define _TRACE_POWER_ENUM_
-enum {
- POWER_NONE = 0,
- POWER_CSTATE = 1, /* C-State */
- POWER_PSTATE = 2, /* Fequency change or DVFS */
- POWER_SSTATE = 3, /* Suspend */
-};
-#endif
-
/*
- * The power events are used for cpuidle & suspend (power_start, power_end)
- * and for cpufreq (power_frequency)
+ * The processor events class is used for cpuidle (processor_idle) and
+ * for cpufreq (processor_frequency)
*/
-DECLARE_EVENT_CLASS(power,
+DECLARE_EVENT_CLASS(processor,
- TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
- TP_ARGS(type, state, cpu_id),
+ TP_ARGS(state, cpu_id),
TP_STRUCT__entry(
- __field( u64, type )
__field( u64, state )
__field( u64, cpu_id )
),
TP_fast_assign(
- __entry->type = type;
__entry->state = state;
__entry->cpu_id = cpu_id;
),
- TP_printk("type=%lu state=%lu cpu_id=%lu", (unsigned long)__entry->type,
- (unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
+ TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
+ (unsigned long)__entry->cpu_id)
);
-DEFINE_EVENT(power, power_start,
+DEFINE_EVENT(processor, processor_idle,
- TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+ TP_PROTO(unsigned int state, unsigned int cpu_id),
- TP_ARGS(type, state, cpu_id)
+ TP_ARGS(state, cpu_id)
);
-DEFINE_EVENT(power, power_frequency,
+DEFINE_EVENT(processor, processor_frequency,
- TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+ TP_PROTO(unsigned int frequency, unsigned int cpu_id),
- TP_ARGS(type, state, cpu_id)
+ TP_ARGS(frequency, cpu_id)
);
-TRACE_EVENT(power_end,
+/*
+ * The machine_suspend event is used for system suspend (machine_suspend)
+ */
+TRACE_EVENT(machine_suspend,
- TP_PROTO(unsigned int cpu_id),
+ TP_PROTO(unsigned int state),
- TP_ARGS(cpu_id),
+ TP_ARGS(state),
TP_STRUCT__entry(
- __field( u64, cpu_id )
+ __field( u64, state )
),
TP_fast_assign(
- __entry->cpu_id = cpu_id;
+ __entry->state = state;
),
- TP_printk("cpu_id=%lu", (unsigned long)__entry->cpu_id)
+ TP_printk("state=%lu", (unsigned long)__entry->state)
);
@@ -97,7 +88,7 @@ DECLARE_EVENT_CLASS(clock,
__entry->cpu_id = cpu_id;
),
- TP_printk("%s state=%lu cpu_id=%lu", __get_str(name),
+ TP_printk("name=%s state=%lu cpu_id=%lu", __get_str(name),
(unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
);
@@ -143,10 +134,15 @@ DECLARE_EVENT_CLASS(power_domain,
__entry->cpu_id = cpu_id;
),
- TP_printk("%s state=%lu cpu_id=%lu", __get_str(name),
+ TP_printk("name=%s state=%lu cpu_id=%lu", __get_str(name),
(unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
);
+/*
+ * The power domain events are used for power domains transitions
+ * (power_domain_target for target state, power_domain_failed for
+ * missed target state).
+ */
DEFINE_EVENT(power_domain, power_domain_target,
TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
@@ -154,6 +150,13 @@ DEFINE_EVENT(power_domain, power_domain_target,
TP_ARGS(name, state, cpu_id)
);
+DEFINE_EVENT(power_domain, power_domain_missed,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id)
+);
+
#endif /* _TRACE_POWER_H */
/* This part must be outside protection */
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index a22582a..a5555be 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,4 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
-EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
-
+EXPORT_TRACEPOINT_SYMBOL_GPL(processor_idle);
--
1.7.1
^ permalink raw reply related [flat|nested] 47+ messages in thread* [PATCH 2/4] perf: add OMAP support for the new power events
2010-10-04 15:20 PATCH [0/4] perf: clean-up of power events API Jean Pihet
2010-10-04 15:20 ` [PATCH 1/4] perf: cleanup " Jean Pihet
@ 2010-10-04 15:20 ` Jean Pihet
2010-10-04 19:45 ` Thomas Renninger
2010-10-04 15:21 ` [PATCH 3/4] perf: add calls to suspend trace point Jean Pihet
` (2 subsequent siblings)
4 siblings, 1 reply; 47+ messages in thread
From: Jean Pihet @ 2010-10-04 15:20 UTC (permalink / raw)
To: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap
Cc: Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
mathieu.desnoyers, Jean Pihet, Jean Pihet
The patch adds the new power management trace points for
the OMAP architecture.
The trace points are for cpuidle, cpufreq (DVFS), the clocks
changes (enable, disable, set_rate) and the power domains
transitions.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
---
arch/arm/mach-omap2/cpuidle34xx.c | 3 +++
arch/arm/mach-omap2/pm34xx.c | 5 +++++
arch/arm/mach-omap2/powerdomain.c | 3 +++
arch/arm/plat-omap/clock.c | 13 ++++++++++---
arch/arm/plat-omap/cpu-omap.c | 2 ++
5 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c
index 3d3d035..96a6f38 100644
--- a/arch/arm/mach-omap2/cpuidle34xx.c
+++ b/arch/arm/mach-omap2/cpuidle34xx.c
@@ -24,6 +24,7 @@
#include <linux/sched.h>
#include <linux/cpuidle.h>
+#include <trace/events/power.h>
#include <plat/prcm.h>
#include <plat/irqs.h>
@@ -130,6 +131,8 @@ static int omap3_enter_idle(struct cpuidle_device *dev,
local_irq_disable();
local_fiq_disable();
+ trace_processor_idle(cx->type, smp_processor_id());
+
pwrdm_set_next_pwrst(mpu_pd, mpu_state);
pwrdm_set_next_pwrst(core_pd, core_state);
diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index 7b03426..4b5383f 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -28,6 +28,7 @@
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/slab.h>
+#include <trace/events/power.h>
#include <plat/sram.h>
#include <plat/clockdomain.h>
@@ -557,6 +558,8 @@ static void omap3_pm_idle(void)
if (omap_irq_pending() || need_resched())
goto out;
+ trace_processor_idle(1, smp_processor_id());
+
omap_sram_idle();
out:
@@ -623,6 +626,8 @@ restore:
printk(KERN_INFO "Powerdomain (%s) didn't enter "
"target state %d\n",
pwrst->pwrdm->name, pwrst->next_state);
+ trace_power_domain_missed(pwrst->pwrdm->name,
+ state, smp_processor_id());
ret = -1;
}
set_pwrdm_state(pwrst->pwrdm, pwrst->saved_state);
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 6527ec3..73cbe9a 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -23,6 +23,7 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <asm/atomic.h>
@@ -440,6 +441,8 @@ int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
pr_debug("powerdomain: setting next powerstate for %s to %0x\n",
pwrdm->name, pwrst);
+ trace_power_domain_target(pwrdm->name, pwrst, smp_processor_id());
+
prm_rmw_mod_reg_bits(OMAP_POWERSTATE_MASK,
(pwrst << OMAP_POWERSTATE_SHIFT),
pwrdm->prcm_offs, pwrstctrl_reg_offs);
diff --git a/arch/arm/plat-omap/clock.c b/arch/arm/plat-omap/clock.c
index 7190cbd..d6518f5 100644
--- a/arch/arm/plat-omap/clock.c
+++ b/arch/arm/plat-omap/clock.c
@@ -21,6 +21,7 @@
#include <linux/cpufreq.h>
#include <linux/debugfs.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <plat/clock.h>
@@ -43,8 +44,10 @@ int clk_enable(struct clk *clk)
return -EINVAL;
spin_lock_irqsave(&clockfw_lock, flags);
- if (arch_clock->clk_enable)
+ if (arch_clock->clk_enable) {
+ trace_clock_enable(clk->name, 1, smp_processor_id());
ret = arch_clock->clk_enable(clk);
+ }
spin_unlock_irqrestore(&clockfw_lock, flags);
return ret;
@@ -66,8 +69,10 @@ void clk_disable(struct clk *clk)
goto out;
}
- if (arch_clock->clk_disable)
+ if (arch_clock->clk_disable) {
+ trace_clock_disable(clk->name, 0, smp_processor_id());
arch_clock->clk_disable(clk);
+ }
out:
spin_unlock_irqrestore(&clockfw_lock, flags);
@@ -120,8 +125,10 @@ int clk_set_rate(struct clk *clk, unsigned long rate)
return ret;
spin_lock_irqsave(&clockfw_lock, flags);
- if (arch_clock->clk_set_rate)
+ if (arch_clock->clk_set_rate) {
+ trace_clock_set_rate(clk->name, rate, smp_processor_id());
ret = arch_clock->clk_set_rate(clk, rate);
+ }
if (ret == 0) {
if (clk->recalc)
clk->rate = clk->recalc(clk);
diff --git a/arch/arm/plat-omap/cpu-omap.c b/arch/arm/plat-omap/cpu-omap.c
index 6d3d333..19d5a6c 100644
--- a/arch/arm/plat-omap/cpu-omap.c
+++ b/arch/arm/plat-omap/cpu-omap.c
@@ -21,6 +21,7 @@
#include <linux/err.h>
#include <linux/clk.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <mach/hardware.h>
#include <plat/clock.h>
@@ -95,6 +96,7 @@ static int omap_target(struct cpufreq_policy *policy,
printk(KERN_DEBUG "cpufreq-omap: transition: %u --> %u\n",
freqs.old, freqs.new);
#endif
+ trace_processor_frequency(freqs.new, freqs.cpu);
ret = clk_set_rate(mpu_clk, freqs.new * 1000);
cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
--
1.7.1
^ permalink raw reply related [flat|nested] 47+ messages in thread* Re: [PATCH 2/4] perf: add OMAP support for the new power events
2010-10-04 15:20 ` [PATCH 2/4] perf: add OMAP support for the new power events Jean Pihet
@ 2010-10-04 19:45 ` Thomas Renninger
0 siblings, 0 replies; 47+ messages in thread
From: Thomas Renninger @ 2010-10-04 19:45 UTC (permalink / raw)
To: Jean Pihet
Cc: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Steven Rostedt,
Frank Eigler, mathieu.desnoyers, Jean Pihet
On Monday 04 October 2010 05:20:59 pm Jean Pihet wrote:
> The patch adds the new power management trace points for
> the OMAP architecture.
>
> The trace points are for cpuidle, cpufreq (DVFS), the clocks
> changes (enable, disable, set_rate) and the power domains
> transitions.
...
> diff --git a/arch/arm/plat-omap/cpu-omap.c b/arch/arm/plat-omap/cpu-omap.c
> index 6d3d333..19d5a6c 100644
> --- a/arch/arm/plat-omap/cpu-omap.c
> +++ b/arch/arm/plat-omap/cpu-omap.c
> @@ -21,6 +21,7 @@
> #include <linux/err.h>
> #include <linux/clk.h>
> #include <linux/io.h>
> +#include <trace/events/power.h>
>
> #include <mach/hardware.h>
> #include <plat/clock.h>
> @@ -95,6 +96,7 @@ static int omap_target(struct cpufreq_policy *policy,
> printk(KERN_DEBUG "cpufreq-omap: transition: %u --> %u\n",
> freqs.old, freqs.new);
> #endif
> + trace_processor_frequency(freqs.new, freqs.cpu);
> ret = clk_set_rate(mpu_clk, freqs.new * 1000);
> cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
This one is not needed, frequency power events are all fired in
drivers/cpufreq/cpufreq.c
for cpufreq drivers:
static struct cpufreq_driver omap_driver = {
.target = omap_target,
}
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 3/4] perf: add calls to suspend trace point
2010-10-04 15:20 PATCH [0/4] perf: clean-up of power events API Jean Pihet
2010-10-04 15:20 ` [PATCH 1/4] perf: cleanup " Jean Pihet
2010-10-04 15:20 ` [PATCH 2/4] perf: add OMAP support for the new power events Jean Pihet
@ 2010-10-04 15:21 ` Jean Pihet
2010-10-04 22:52 ` Rafael J. Wysocki
2010-10-04 15:21 ` [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space Jean Pihet
2010-10-06 21:34 ` PATCH [0/4] perf: clean-up of power events API Thomas Renninger
4 siblings, 1 reply; 47+ messages in thread
From: Jean Pihet @ 2010-10-04 15:21 UTC (permalink / raw)
To: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap
Cc: Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
mathieu.desnoyers, Jean Pihet, Jean Pihet, Thomas Renninger
Uses the machine_suspend trace point, from the
generic kernel suspend_enter function.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
CC: Thomas Renninger <trenn@suse.de>
---
kernel/power/suspend.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index 7335952..10cad5c 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -22,6 +22,7 @@
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/suspend.h>
+#include <trace/events/power.h>
#include "power.h"
@@ -164,7 +165,9 @@ static int suspend_enter(suspend_state_t state)
error = sysdev_suspend(PMSG_SUSPEND);
if (!error) {
if (!suspend_test(TEST_CORE) && pm_check_wakeup_events()) {
+ trace_machine_suspend(state);
error = suspend_ops->enter(state);
+ trace_machine_suspend(0);
events_check_enabled = false;
}
sysdev_resume();
--
1.7.1
^ permalink raw reply related [flat|nested] 47+ messages in thread* Re: [PATCH 3/4] perf: add calls to suspend trace point
2010-10-04 15:21 ` [PATCH 3/4] perf: add calls to suspend trace point Jean Pihet
@ 2010-10-04 22:52 ` Rafael J. Wysocki
0 siblings, 0 replies; 47+ messages in thread
From: Rafael J. Wysocki @ 2010-10-04 22:52 UTC (permalink / raw)
To: Jean Pihet
Cc: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan,
linux-omap, Thomas Renninger, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, mathieu.desnoyers, Jean Pihet
On Monday, October 04, 2010, Jean Pihet wrote:
> Uses the machine_suspend trace point, from the
> generic kernel suspend_enter function.
>
> Signed-off-by: Jean Pihet <j-pihet@ti.com>
> CC: Thomas Renninger <trenn@suse.de>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
> ---
> kernel/power/suspend.c | 3 +++
> 1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
> index 7335952..10cad5c 100644
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -22,6 +22,7 @@
> #include <linux/mm.h>
> #include <linux/slab.h>
> #include <linux/suspend.h>
> +#include <trace/events/power.h>
>
> #include "power.h"
>
> @@ -164,7 +165,9 @@ static int suspend_enter(suspend_state_t state)
> error = sysdev_suspend(PMSG_SUSPEND);
> if (!error) {
> if (!suspend_test(TEST_CORE) && pm_check_wakeup_events()) {
> + trace_machine_suspend(state);
> error = suspend_ops->enter(state);
> + trace_machine_suspend(0);
> events_check_enabled = false;
> }
> sysdev_resume();
>
^ permalink raw reply [flat|nested] 47+ messages in thread
* [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space
2010-10-04 15:20 PATCH [0/4] perf: clean-up of power events API Jean Pihet
` (2 preceding siblings ...)
2010-10-04 15:21 ` [PATCH 3/4] perf: add calls to suspend trace point Jean Pihet
@ 2010-10-04 15:21 ` Jean Pihet
2010-10-04 16:10 ` Frank Ch. Eigler
2010-10-06 21:34 ` PATCH [0/4] perf: clean-up of power events API Thomas Renninger
4 siblings, 1 reply; 47+ messages in thread
From: Jean Pihet @ 2010-10-04 15:21 UTC (permalink / raw)
To: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap
Cc: Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
mathieu.desnoyers, Jean Pihet, Jean Pihet, Thomas Renninger
Provide a CONFIG_DEPRECATED_POWER_EVENT_TRACING option in order
to provide backward compatibility with the user space tracing tools.
To be removed as soon as the tools are in sync with the new API.
Note: only the old events are mapped to.
Signed-off-by: Jean Pihet <j-pihet@ti.com>
CC: Thomas Renninger <trenn@suse.de>
---
include/trace/events/power.h | 98 ++++++++++++++++++++++++++++++++++++++++++
kernel/trace/Kconfig | 4 ++
kernel/trace/power-traces.c | 4 ++
3 files changed, 106 insertions(+), 0 deletions(-)
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 00bf175..dd00512 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -7,6 +7,103 @@
#include <linux/ktime.h>
#include <linux/tracepoint.h>
+
+#ifdef CONFIG_DEPRECATED_POWER_EVENT_TRACING
+/*
+ * Keep the old power API for user space tools backward compatibility.
+ * To be removed as soon as the tools are in sync with the new API.
+ *
+ * Note: only the old events are mapped to.
+ *
+ * The mapping is as follows:
+ * - trace_processor_idle(state != 0, cpu_id) =>
+ * trace_power_start(POWER_CSTATE, state, cpu_id)
+ * - trace_processor_idle(state == 0, cpu_id) =>
+ * trace_power_end(cpu_id)
+ * - trace_processor_frequency(frequency, cpu_id) =>
+ * trace_power_frequency(POWER_PSTATE, frequency, cpu_id)
+ *
+ */
+
+/* Define old API events */
+#ifndef _TRACE_POWER_OLD_API_
+#define _TRACE_POWER_OLD_API_
+enum {
+ POWER_NONE = 0,
+ POWER_CSTATE = 1,
+ POWER_PSTATE = 2,
+};
+#endif /* _TRACE_POWER_OLD_API_ */
+
+DECLARE_EVENT_CLASS(power,
+
+ TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(type, state, cpu_id),
+
+ TP_STRUCT__entry(
+ __field( u64, type )
+ __field( u64, state )
+ __field( u64, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __entry->type = type;
+ __entry->state = state;
+ __entry->cpu_id = cpu_id;
+ ),
+
+ TP_printk("type=%lu state=%lu cpu_id=%lu", (unsigned long)__entry->type,
+ (unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
+);
+
+DEFINE_EVENT(power, power_start,
+
+ TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(type, state, cpu_id)
+);
+
+DEFINE_EVENT(power, power_frequency,
+
+ TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(type, state, cpu_id)
+);
+
+TRACE_EVENT(power_end,
+
+ TP_PROTO(unsigned int cpu_id),
+
+ TP_ARGS(cpu_id),
+
+ TP_STRUCT__entry(
+ __field( u64, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __entry->cpu_id = cpu_id;
+ ),
+
+ TP_printk("cpu_id=%lu", (unsigned long)__entry->cpu_id)
+);
+
+/* Map new events trace points calls to old ones */
+#define trace_processor_idle(state, cpu_id) \
+ do { \
+ if (state != 0) \
+ trace_power_start(POWER_CSTATE, state, cpu_id); \
+ else \
+ trace_power_end(cpu_id); \
+ } while (0)
+
+#define trace_processor_frequency(frequency, cpu_id) \
+ do { \
+ trace_power_frequency(POWER_PSTATE, frequency, cpu_id); \
+ } while (0)
+
+#else /* CONFIG_DEPRECATED_POWER_EVENT_TRACING */
+
/*
* The processor events class is used for cpuidle (processor_idle) and
* for cpufreq (processor_frequency)
@@ -44,6 +141,7 @@ DEFINE_EVENT(processor, processor_frequency,
TP_ARGS(frequency, cpu_id)
);
+#endif /* CONFIG_DEPRECATED_POWER_EVENT_TRACING */
/*
* The machine_suspend event is used for system suspend (machine_suspend)
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 538501c..dab2bc1 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -64,6 +64,10 @@ config EVENT_TRACING
select CONTEXT_SWITCH_TRACER
bool
+config DEPRECATED_POWER_EVENT_TRACING
+ bool "Use the deprecated power trace API"
+ default n
+
config CONTEXT_SWITCH_TRACER
bool
diff --git a/kernel/trace/power-traces.c b/kernel/trace/power-traces.c
index a5555be..3a95670 100644
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,4 +13,8 @@
#define CREATE_TRACE_POINTS
#include <trace/events/power.h>
+#ifdef CONFIG_DEPRECATED_POWER_EVENT_TRACING
+EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
+#else
EXPORT_TRACEPOINT_SYMBOL_GPL(processor_idle);
+#endif
--
1.7.1
^ permalink raw reply related [flat|nested] 47+ messages in thread* Re: [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space
2010-10-04 15:21 ` [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space Jean Pihet
@ 2010-10-04 16:10 ` Frank Ch. Eigler
2010-10-04 16:47 ` Jean Pihet
0 siblings, 1 reply; 47+ messages in thread
From: Frank Ch. Eigler @ 2010-10-04 16:10 UTC (permalink / raw)
To: Jean Pihet
Cc: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Thomas Renninger, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, mathieu.desnoyers, Jean Pihet
Hi -
> Provide a CONFIG_DEPRECATED_POWER_EVENT_TRACING option in order
> to provide backward compatibility with the user space tracing tools.
This is clever:
> +/* Map new events trace points calls to old ones */
> +#define trace_processor_idle(state, cpu_id) \
> + do { \
> + if (state != 0) \
> + trace_power_start(POWER_CSTATE, state, cpu_id); \
> + else \
> + trace_power_end(cpu_id); \
> + } while (0)
but with this code, are the new tracepoints firing at all? In other
words, does this backward-compatibility config option effectively
disable the newer ones? If so, this is probably avoidable.
Instead of redirecting the new tracepoint calls to these macros, an
alternative may intercept (un)registration for the old tracepoint
names, and map them to (un)registration of the real tracepoints. See
how syscall tracepoints map to a special (un)registration callback in
include/trace/events/syscalls.h (syscall_*regfunc). Instead of
setting a TIF flag, the new functions can call
register_processor_idle() et al., and their respective callbacks can
call the backward-compatibility tracepoints.
The starting point would be to recast the power_start, power_end,
power_frequency tracepoints via TRACE_EVENT_FN() instead of
TRACE_EVENT().
- FChE
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space
2010-10-04 16:10 ` Frank Ch. Eigler
@ 2010-10-04 16:47 ` Jean Pihet
0 siblings, 0 replies; 47+ messages in thread
From: Jean Pihet @ 2010-10-04 16:47 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Thomas Renninger, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, mathieu.desnoyers, Jean Pihet
Hi,
On Mon, Oct 4, 2010 at 6:10 PM, Frank Ch. Eigler <fche@redhat.com> wrote:
> Hi -
>
>> Provide a CONFIG_DEPRECATED_POWER_EVENT_TRACING option in order
>> to provide backward compatibility with the user space tracing tools.
>
> This is clever:
>
>> +/* Map new events trace points calls to old ones */
>> +#define trace_processor_idle(state, cpu_id) \
>> + do { \
>> + if (state != 0) \
>> + trace_power_start(POWER_CSTATE, state, cpu_id); \
>> + else \
>> + trace_power_end(cpu_id); \
>> + } while (0)
>
> but with this code, are the new tracepoints firing at all? In other
> words, does this backward-compatibility config option effectively
> disable the newer ones? If so, this is probably avoidable.
Yes the events are fired. In fact with the CONFIG option defined the
trace_processor_* trace points are not defined, only the macro remaps
them (at compile time) to the old ones.
> Instead of redirecting the new tracepoint calls to these macros, an
> alternative may intercept (un)registration for the old tracepoint
> names, and map them to (un)registration of the real tracepoints. See
> how syscall tracepoints map to a special (un)registration callback in
> include/trace/events/syscalls.h (syscall_*regfunc). Instead of
> setting a TIF flag, the new functions can call
> register_processor_idle() et al., and their respective callbacks can
> call the backward-compatibility tracepoints.
>
> The starting point would be to recast the power_start, power_end,
> power_frequency tracepoints via TRACE_EVENT_FN() instead of
> TRACE_EVENT().
That could be done but it is not easy to do. In fact the power.h is
included and parsed multiple times in order to generate the trace
points prototype function _and_ the trace generation code. Please
check 'include/trace/ftrace.h' which is really smart but rather
complicated.
I am aiming at simple to read and maintainable code. Furthermore the
code is going to be removed soon or later.
Thanks for the suggestion!
>
> - FChE
>
Jean
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-04 15:20 PATCH [0/4] perf: clean-up of power events API Jean Pihet
` (3 preceding siblings ...)
2010-10-04 15:21 ` [PATCH 4/4] perf: provide a DEPRECTAED power trace API to user space Jean Pihet
@ 2010-10-06 21:34 ` Thomas Renninger
2010-10-07 15:08 ` Mathieu Desnoyers
4 siblings, 1 reply; 47+ messages in thread
From: Thomas Renninger @ 2010-10-06 21:34 UTC (permalink / raw)
To: Jean Pihet
Cc: linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Steven Rostedt,
Frank Eigler, mathieu.desnoyers
Hi,
On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
> Here is a re-spin of the patches after discussion.
what is going to happen here now?
Is this supposed to go through Ingo's tree?
Ingo: do you mind commenting on this.
I see 3 possibilities:
1) Power (or all) perf events are never going to change.
If they are going to change, then now is the right time and
2) Backward compatibility is provided in some way for some time.
3) The power events get cleaned up without compatibility to
former kernels versions.
There are patches for 2. and 3., for 1. there obviously are no
needed.
For 2., the patches (mine or Jeans), need some polishing. IMO
these double events inside of general code aren't that bad.
I trust Jean, that it's not that easy with all the include magic
and macros, partly realized that myself already and it's not worth
it to dig further for a temporary solution.
Votes so far:
1. Arjan
2. Myself, Jean
3. Peter Zijlstra and Mathieu Desnoyers
Jean's work got successfully blocked for weeks now.
If there would be a final decision by a maintainer who is going to
merge Jean's work, that would be great and it would finally be worth
to send updated patches again which hopefully some day find their way into
a linux-next kernel...
Thanks,
Thomas
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-06 21:34 ` PATCH [0/4] perf: clean-up of power events API Thomas Renninger
@ 2010-10-07 15:08 ` Mathieu Desnoyers
2010-10-07 15:23 ` Pierre Tardy
` (2 more replies)
0 siblings, 3 replies; 47+ messages in thread
From: Mathieu Desnoyers @ 2010-10-07 15:08 UTC (permalink / raw)
To: Thomas Renninger
Cc: Jean Pihet, linux-trace-users, linux-pm, linux-perf-users, mingo,
arjan, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Frederic Weisbecker, Andrew Morton, Linus Torvalds
[ Adding a few more CCs, since this discussion is about a tracepoint
userspace ABI policy, which is a topic of general interest. ]
* Thomas Renninger (trenn@suse.de) wrote:
> Hi,
>
> On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
> > Here is a re-spin of the patches after discussion.
>
> what is going to happen here now?
>
> Is this supposed to go through Ingo's tree?
>
> Ingo: do you mind commenting on this.
Meanwhile, here are some ideas...
> I see 3 possibilities:
> 1) Power (or all) perf events are never going to change.
Persisting with bad interfaces (which were never meant to be stable, and were
actually explicitely said to be non-stable) for the sake of poorly written
proprietary userspace apps does not seem like viable to me. Since when did we
start designing kernel code for broken proprierary apps ? (see below for
solutions on how to fix the apps)
The only reason we have these tracepoints in there is because they can follow
kernel code changes, thanks to their flexible nature. Being stucked with
badly named tracepoints because of some monolithic analyzer app is just insane.
> If they are going to change, then now is the right time and
> 2) Backward compatibility is provided in some way for some time.
I've looked at the resulting code, and, honestly, it's ugly and it complexifies
the test matrix. I would really prefer to move this compatibility crap out of
the kernel out into userspace libraries, where it belongs. It should have got
there in the first place when the developers of these propritary
tracepoint-consumers got the hint that those were going to change. Then you have
a sane design:
1) The kernel, providing a tracepoint ABI that *can change over time*, because
tracepoints are too tied to kernel code to afford not being changed.
2) Adapdation libraries, some which could be provided with the perf userspace
libraries, some which could be provided along with the tracepoint consumer
application, so the proprietary application can link on an open-source
library that can be upgraded when needed.
3) The trace analyzer. So if the analyzer is open source, then it's fine, it
could follow the rare ABI breakups that are needed by a simple upgrade.
Ideally we might want to keep backward compatibility code in there too, but
it's OK to require users to upgrade their tools if the kernel is upgraded.
If the analyzer is closed-source, then it should interact with an open source
library rather than with the kernel tracepoints ABI.
So, given that I don't want to uglify kernel code based on some badly written
proprietary userspace tools, and given we've given all possible warnings telling
that the tracepoint ABI might change, I really don't see why we should bother
bloating the kernel with this. The analyzers should be changed to use adaptation
libraries instead.
> 3) The power events get cleaned up without compatibility to
> former kernels versions.
>
> There are patches for 2. and 3., for 1. there obviously are no
> needed.
> For 2., the patches (mine or Jeans), need some polishing. IMO
> these double events inside of general code aren't that bad.
> I trust Jean, that it's not that easy with all the include magic
> and macros, partly realized that myself already and it's not worth
> it to dig further for a temporary solution.
>
> Votes so far:
> 1. Arjan
> 2. Myself, Jean
> 3. Peter Zijlstra and Mathieu Desnoyers
>
> Jean's work got successfully blocked for weeks now.
> If there would be a final decision by a maintainer who is going to
> merge Jean's work, that would be great and it would finally be worth
> to send updated patches again which hopefully some day find their way into
> a linux-next kernel...
Yes, sadly this debate running in circles hurts contributors.
Thanks for the summary!
Mathieu
>
> Thanks,
>
> Thomas
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:08 ` Mathieu Desnoyers
@ 2010-10-07 15:23 ` Pierre Tardy
2010-10-07 15:45 ` Steven Rostedt
2010-10-07 15:58 ` Frederic Weisbecker
2010-10-07 15:45 ` Jean Pihet
2010-10-07 15:49 ` Thomas Renninger
2 siblings, 2 replies; 47+ messages in thread
From: Pierre Tardy @ 2010-10-07 15:23 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, mingo, arjan, rjw, linux-omap, Peter Zijlstra,
Kevin Hilman, Steven Rostedt, Frank Eigler, Masami Hiramatsu,
Thomas Gleixner, Frederic Weisbecker, Andrew Morton,
Linus Torvalds
On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> [ Adding a few more CCs, since this discussion is about a tracepoint
> userspace ABI policy, which is a topic of general interest. ]
To add a little more comment, this is not the first time that
tracepoints ABI changes. You can look at pytimechart sourcecode:
http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py
from 2.6.31 which is the first kernel I support,
sched_switch: 'task %s:%d [%d] ==> %s:%d [%d]',
changed to:
sched_switch: 'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s
==> next_comm=%s next_pid=%d next_prio=%d',
workqueue_execution: 'thread=%s
func=%s\\+%s/%s','thread','func','func_offset','func_size'),
changed to:
workqueue_execution: 'thread=%s func=%s','thread','func'),
actually, over all the events pytimechart supports, only power traces
are stable...
Regards,
--
Pierre
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:23 ` Pierre Tardy
@ 2010-10-07 15:45 ` Steven Rostedt
2010-10-07 15:58 ` Frederic Weisbecker
1 sibling, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2010-10-07 15:45 UTC (permalink / raw)
To: Pierre Tardy
Cc: Mathieu Desnoyers, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Frederic Weisbecker,
Andrew Morton, Linus Torvalds
On Thu, 2010-10-07 at 17:23 +0200, Pierre Tardy wrote:
>
> actually, over all the events pytimechart supports, only power traces
> are stable...
Let me rephrase that for you...
"actually, over all the events pytimechart supports, only power traces
are inflexible..."
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:23 ` Pierre Tardy
2010-10-07 15:45 ` Steven Rostedt
@ 2010-10-07 15:58 ` Frederic Weisbecker
2010-10-07 16:10 ` Pierre Tardy
2010-10-08 8:14 ` Tejun Heo
1 sibling, 2 replies; 47+ messages in thread
From: Frederic Weisbecker @ 2010-10-07 15:58 UTC (permalink / raw)
To: Pierre Tardy
Cc: Mathieu Desnoyers, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, Tejun Heo
On Thu, Oct 07, 2010 at 05:23:43PM +0200, Pierre Tardy wrote:
> On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
> <mathieu.desnoyers@efficios.com> wrote:
> > [ Adding a few more CCs, since this discussion is about a tracepoint
> > userspace ABI policy, which is a topic of general interest. ]
>
> To add a little more comment, this is not the first time that
> tracepoints ABI changes. You can look at pytimechart sourcecode:
> http://gitorious.org/pytimechart/pytimechart/blobs/master/timechart/ftrace.py
>
> from 2.6.31 which is the first kernel I support,
>
> sched_switch: 'task %s:%d [%d] ==> %s:%d [%d]',
> changed to:
> sched_switch: 'prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s
> ==> next_comm=%s next_pid=%d next_prio=%d',
>
> workqueue_execution: 'thread=%s
> func=%s\\+%s/%s','thread','func','func_offset','func_size'),
> changed to:
> workqueue_execution: 'thread=%s func=%s','thread','func'),
>
Seems to be only formatting changes, but no field has been removed and
no tracepoint has been renamed, etc...
So these are no stable ABI changes because the formatting can be changed
anytime. We want that flexibility and it stands on top of the per event
"format" files.
Tools are not supposed to read ascii formatted traces from trace/trace_pipe
files. Instead they need to read binary traces from trace_pipe_raw files
and look at the format file to know how to format this.
This is why we have these format files: to let tools adapt with changes
like format change or fields added.
And we have a library in perf and trace-cmd that let you
- request a field value in a raw trace, by its name. So the field doesn't
need to have a stable offset in the trace.
- request ascii format info, so that if ascii format changes, the tool
adapt.
- record binary traces, much more leightweight for the writer (kernel)
and for the reader (user).
I did told you that it would be better you make PyTimeChart use the perf
scripting facilities, it handles all the above things + it would
avoid you to handle a lot of things.
Now it's up to you, but don't count on us to make the ascii formatting
a stable ABI.
> actually, over all the events pytimechart supports, only power traces
> are stable...
Now one problem is that we have really broken the workqueue tracepoints
in this release. I thought nobody was using them so we could
refactor this tracepoint subsystem, my bad.
workqueue_execution has become workqueue_execution_start and
workqueue_execution_end. workqueue_insertion is going to
suffer a similar split.
workqueue_creation and workqueue_destruction have disappear but
I can probably restore them, but for the rest, what should we do?
I really feel uncomfortable with this tracepoint/ABI problem....
Mathieu suggested we start a user library that could handle these
changes when they are really necessary.
Thoughts?
(Adding Tejun in Cc).
--
To unsubscribe from this list: send the line "unsubscribe linux-omap" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:58 ` Frederic Weisbecker
@ 2010-10-07 16:10 ` Pierre Tardy
2010-10-08 8:14 ` Tejun Heo
1 sibling, 0 replies; 47+ messages in thread
From: Pierre Tardy @ 2010-10-07 16:10 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Mathieu Desnoyers, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, Tejun Heo
> I did told you that it would be better you make PyTimeChart use the perf
> scripting facilities, it handles all the above things + it would
> avoid you to handle a lot of things.
Actually, perf scripting facility is already supported by pytimechart
but does not make it that easier to maintain.
event name changes => must update, event fields added/removed => must update
>
> Now it's up to you, but don't count on us to make the ascii formatting
> a stable ABI.
I'm not against adding 1 line in pytimechart each time there is some
change in ascii formatting
>
>
>> actually, over all the events pytimechart supports, only power traces
>> are stable...
>
>
> Now one problem is that we have really broken the workqueue tracepoints
> in this release. I thought nobody was using them so we could
> refactor this tracepoint subsystem, my bad.
No problem. I'll update pytimechart whenever someone sends me traces
that does not work (I'm okay with pre 2.6.31 traces too...)
Regards,
Pierre
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:58 ` Frederic Weisbecker
2010-10-07 16:10 ` Pierre Tardy
@ 2010-10-08 8:14 ` Tejun Heo
2010-10-08 8:38 ` Ingo Molnar
1 sibling, 1 reply; 47+ messages in thread
From: Tejun Heo @ 2010-10-08 8:14 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: Pierre Tardy, Mathieu Desnoyers, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, mingo, arjan, rjw,
linux-omap, Peter Zijlstra, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds
Hello,
On 10/07/2010 05:58 PM, Frederic Weisbecker wrote:
> I really feel uncomfortable with this tracepoint/ABI problem....
> Mathieu suggested we start a user library that could handle these
> changes when they are really necessary.
>
> Thoughts?
>
> (Adding Tejun in Cc).
Given that tracepoints are supposed to make internal operation
visible. I don't think it's a good idea to make it part of fixed ABI.
Maybe some core part can be put in stone but I think things like
internal workqueue implementation should be changeable without
worrying about ABI issues.
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 8:14 ` Tejun Heo
@ 2010-10-08 8:38 ` Ingo Molnar
2010-10-08 13:17 ` Arjan van de Ven
0 siblings, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2010-10-08 8:38 UTC (permalink / raw)
To: Tejun Heo
Cc: Frederic Weisbecker, Pierre Tardy, Mathieu Desnoyers,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, arjan, rjw, linux-omap, Peter Zijlstra,
Kevin Hilman, Steven Rostedt, Frank Eigler, Masami Hiramatsu,
Thomas Gleixner, Andrew Morton, Linus Torvalds
* Tejun Heo <tj@kernel.org> wrote:
> Hello,
>
> On 10/07/2010 05:58 PM, Frederic Weisbecker wrote:
> > I really feel uncomfortable with this tracepoint/ABI problem....
> > Mathieu suggested we start a user library that could handle these
> > changes when they are really necessary.
> >
> > Thoughts?
> >
> > (Adding Tejun in Cc).
>
> Given that tracepoints are supposed to make internal operation
> visible. I don't think it's a good idea to make it part of fixed ABI.
Yep, exactly.
OTOH since it exports information we can do disciplined versioning and
extensions only - i.e. leave the old power events around, add the new
ones with new distinct names, and phase out the old ones in a kernel
cycle or two. It's not hard to do.
That way apps can support old kernels too (if they want to), but new
events as well - and all in a controlled, non-disruptive manner.
More importantly, the kernel wont have cruft and will have no ABI
restrictions - the only 'restriction' is to treat information in an
append-only manner (i.e. change the event name if you change it
materially) - and that's not a big deal here.
The fundamental thing about tracing/instrumentation is that there are no
deep ABI needs: it's all about analyzing development kernels (and a few
select versions that get the enterprise treatment) but otherwise the
half-life of this kind of information is very short.
So we dont want to tie ourselves down with excessive ABIs.
> Maybe some core part can be put in stone but I think things like
> internal workqueue implementation should be changeable without
> worrying about ABI issues.
That's most definitely so! There is and will be zero back-coupling from
workqueue tracepoints to workqueue internals. Dont worry about this.
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 8:38 ` Ingo Molnar
@ 2010-10-08 13:17 ` Arjan van de Ven
2010-10-08 13:41 ` Mathieu Desnoyers
0 siblings, 1 reply; 47+ messages in thread
From: Arjan van de Ven @ 2010-10-08 13:17 UTC (permalink / raw)
To: Ingo Molnar
Cc: Tejun Heo, Frederic Weisbecker, Pierre Tardy, Mathieu Desnoyers,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On 10/8/2010 1:38 AM, Ingo Molnar wrote:
>
> The fundamental thing about tracing/instrumentation is that there are no
> deep ABI needs: it's all about analyzing development kernels (and a few
> select versions that get the enterprise treatment) but otherwise the
> half-life of this kind of information is very short.
>
> So we dont want to tie ourselves down with excessive ABIs.
>
ok I'll start working on a second mechanism then to export information
that applications need ;-(
it'll look a lot like tracing I suppose ;-(
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 13:17 ` Arjan van de Ven
@ 2010-10-08 13:41 ` Mathieu Desnoyers
2010-10-08 16:22 ` Arjan van de Ven
2010-10-09 6:28 ` Ingo Molnar
0 siblings, 2 replies; 47+ messages in thread
From: Mathieu Desnoyers @ 2010-10-08 13:41 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Ingo Molnar, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
* Arjan van de Ven (arjan@linux.intel.com) wrote:
> On 10/8/2010 1:38 AM, Ingo Molnar wrote:
>>
>> The fundamental thing about tracing/instrumentation is that there are no
>> deep ABI needs: it's all about analyzing development kernels (and a few
>> select versions that get the enterprise treatment) but otherwise the
>> half-life of this kind of information is very short.
>>
>> So we dont want to tie ourselves down with excessive ABIs.
>>
>
> ok I'll start working on a second mechanism then to export information
> that applications need ;-(
> it'll look a lot like tracing I suppose ;-(
What's wrong with doing the compatibility layer in a LGPL library shipped with
the kernel tree under tools/ ? Why does everything *have* to be done in
kernel-space ? Why are you so focused on making your application interact
directly with kernel ABIs ?
I'm being direct because there are trivial solutions to your problem that you
are rejecting without due consideration. (and also I just had one coffee too
many) ;-)
Regards,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 13:41 ` Mathieu Desnoyers
@ 2010-10-08 16:22 ` Arjan van de Ven
2010-10-08 17:21 ` Steven Rostedt
2010-10-08 17:32 ` Mathieu Desnoyers
2010-10-09 6:28 ` Ingo Molnar
1 sibling, 2 replies; 47+ messages in thread
From: Arjan van de Ven @ 2010-10-08 16:22 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Ingo Molnar, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:
> * Arjan van de Ven (arjan@linux.intel.com) wrote:
>> On 10/8/2010 1:38 AM, Ingo Molnar wrote:
>>> The fundamental thing about tracing/instrumentation is that there are no
>>> deep ABI needs: it's all about analyzing development kernels (and a few
>>> select versions that get the enterprise treatment) but otherwise the
>>> half-life of this kind of information is very short.
>>>
>>> So we dont want to tie ourselves down with excessive ABIs.
>>>
>> ok I'll start working on a second mechanism then to export information
>> that applications need ;-(
>> it'll look a lot like tracing I suppose ;-(
> What's wrong with doing the compatibility layer in a LGPL library shipped with
> the kernel tree under tools/ ?
because that is not workable... at least nobody has shown to be able to
make this work.
libraries (after compilation) live in /lib or /usr/lib (or lib64 I
suppose).....
what mechanism ensures that a user who compiles his kernel gets a
library compatible with that kernel in /usr/lib?
and can said library deal with older kernels too? And distro kernels?
> Why does everything *have* to be done in
> kernel-space
it doesn't. but the alternative must be workable.
> Why are you so focused on making your application interact
> directly with kernel ABIs ?
>
> I'm being direct because there are trivial solutions to your problem that you
> are rejecting without due consideration. (and also I just had one coffee too
> many) ;-)
since you seem to think that dealing with such a library is trivial...
how about you do it for one function even, to
show that the deployment/use-in-an-app is workable.
I'd be more than happy to use it if it's workable and the API is at
least halfway sane.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 16:22 ` Arjan van de Ven
@ 2010-10-08 17:21 ` Steven Rostedt
2010-10-08 17:49 ` Frank Ch. Eigler
2010-10-08 17:32 ` Mathieu Desnoyers
1 sibling, 1 reply; 47+ messages in thread
From: Steven Rostedt @ 2010-10-08 17:21 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Mathieu Desnoyers, Ingo Molnar, Tejun Heo, Frederic Weisbecker,
Pierre Tardy, Thomas Renninger, Jean Pihet, linux-trace-users,
linux-pm, linux-perf-users, rjw, linux-omap, Peter Zijlstra,
Kevin Hilman, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On Fri, 2010-10-08 at 09:22 -0700, Arjan van de Ven wrote:
> On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:
> because that is not workable... at least nobody has shown to be able to
> make this work.
> libraries (after compilation) live in /lib or /usr/lib (or lib64 I
> suppose).....
> what mechanism ensures that a user who compiles his kernel gets a
> library compatible with that kernel in /usr/lib?
> and can said library deal with older kernels too? And distro kernels?
Perhaps we should have "make install" of a kernel also install this
library?
Have two libraries? One that is linked to the app, the other that can
search for another library to link on load too (like a kernel.ld.so)
Then we could see the kernel version, and search for a library that is
compatible, and load that one.
The app only needs to worry about loading the generic library. The
generic library can test for compatible libraries for the kernel.
Could just be..
libkernel.ld.so which then loads..
/lib/modules/2.6.36/libkernel.so
Just a little brain storming.
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 17:21 ` Steven Rostedt
@ 2010-10-08 17:49 ` Frank Ch. Eigler
2010-10-08 18:11 ` Steven Rostedt
0 siblings, 1 reply; 47+ messages in thread
From: Frank Ch. Eigler @ 2010-10-08 17:49 UTC (permalink / raw)
To: Steven Rostedt
Cc: Arjan van de Ven, Mathieu Desnoyers, Ingo Molnar, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
Hi -
On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote:
> [...]
> Perhaps we should have "make install" of a kernel also install this
> library?
> [...]
> The app only needs to worry about loading the generic library. The
> generic library can test for compatible libraries for the kernel.
> [...]
If this library were to be distributed with the kernel, what would
make the generic side of the interface any less permanent than a
kernel ABI? That is, if there is a libkernel-internals.so built from
kernel sources, wouldn't its ABI become necessarily as fixed as any
old syscall or procfs file?
One can have some backward compatibility with symbol versioning et
al., but would that be sufficiently powerful to avoid handcuffing
kernel developers' inclinations to make random future changes?
- FChE
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 17:49 ` Frank Ch. Eigler
@ 2010-10-08 18:11 ` Steven Rostedt
0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2010-10-08 18:11 UTC (permalink / raw)
To: Frank Ch. Eigler
Cc: Arjan van de Ven, Mathieu Desnoyers, Ingo Molnar, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On Fri, 2010-10-08 at 13:49 -0400, Frank Ch. Eigler wrote:
> Hi -
>
> On Fri, Oct 08, 2010 at 01:21:35PM -0400, Steven Rostedt wrote:
> > [...]
> > Perhaps we should have "make install" of a kernel also install this
> > library?
> > [...]
> > The app only needs to worry about loading the generic library. The
> > generic library can test for compatible libraries for the kernel.
> > [...]
>
> If this library were to be distributed with the kernel, what would
> make the generic side of the interface any less permanent than a
> kernel ABI? That is, if there is a libkernel-internals.so built from
> kernel sources, wouldn't its ABI become necessarily as fixed as any
> old syscall or procfs file?
One thing, the backwards compatibility would reside in user space. The
big advantage to that than for this to be in kernel space is that it is
only there when used. When we have backward compatibility in the kernel,
it's there in memory for everyone, whether you want it or not.
>
> One can have some backward compatibility with symbol versioning et
> al., but would that be sufficiently powerful to avoid handcuffing
> kernel developers' inclinations to make random future changes?
>
Sure, also note, that this is a two lib design. We still have
a /usr/lib/libkernel.so that the apps will interface with. This will
need to load in the other kernel versions. When we change interfaces, we
can make the /usr/lib/libkernel.so.1 .2 etc.
Also doing this dynamically from a library, we can check if the kernel
versions work. It can test if the used function is compatible or not,
use an older version, or just tell the user "sorry, please update your
libkernel.so for this kernel." Doing this in userspace will allow a lot
more flexibility. We just need to think hard how these transactions will
work, and make it flexible for future enhancements.
But the kernel is free to do whatever it wants. The libraries will need
to worry about keeping the applications happy ;-)
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 16:22 ` Arjan van de Ven
2010-10-08 17:21 ` Steven Rostedt
@ 2010-10-08 17:32 ` Mathieu Desnoyers
1 sibling, 0 replies; 47+ messages in thread
From: Mathieu Desnoyers @ 2010-10-08 17:32 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Ingo Molnar, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
* Arjan van de Ven (arjan@linux.intel.com) wrote:
> On 10/8/2010 6:41 AM, Mathieu Desnoyers wrote:
>> * Arjan van de Ven (arjan@linux.intel.com) wrote:
>>> On 10/8/2010 1:38 AM, Ingo Molnar wrote:
>>>> The fundamental thing about tracing/instrumentation is that there are no
>>>> deep ABI needs: it's all about analyzing development kernels (and a few
>>>> select versions that get the enterprise treatment) but otherwise the
>>>> half-life of this kind of information is very short.
>>>>
>>>> So we dont want to tie ourselves down with excessive ABIs.
>>>>
>>> ok I'll start working on a second mechanism then to export information
>>> that applications need ;-(
>>> it'll look a lot like tracing I suppose ;-(
>> What's wrong with doing the compatibility layer in a LGPL library shipped with
>> the kernel tree under tools/ ?
>
> because that is not workable... at least nobody has shown to be able to
> make this work.
> libraries (after compilation) live in /lib or /usr/lib (or lib64 I
> suppose).....
> what mechanism ensures that a user who compiles his kernel gets a
> library compatible with that kernel in /usr/lib?
I don't think the perf tools/ do it right at the moment, but here is my
proposal:
Currently, we need to do make install from the tools/ directory. Since kernel
developers are lazy, I would propose a CONFIG_INSTALL_TOOLS default N config
option that would let make install from the root of the kernel tree install the
tools too. (looking at my inbox..) As Steven just beat me to it, see his lib
versioning proposal. ;)
The library would present an API to the application that would let apps consume
specific events of interest. Translation of fixed event names/fields into the
current kernel version tracepoint names/fields would be performed by the lib,
and the library would also deal with reading the perf events through the perf
ABI and would act as a middle-man to make sure they are always perceived by the
application in the same way.
> and can said library deal with older kernels too? And distro kernels?
Steven's proposal should work.
>
>> Why does everything *have* to be done in
>> kernel-space
> it doesn't. but the alternative must be workable.
>> Why are you so focused on making your application interact
>> directly with kernel ABIs ?
>>
>> I'm being direct because there are trivial solutions to your problem that you
>> are rejecting without due consideration. (and also I just had one coffee too
>> many) ;-)
>
> since you seem to think that dealing with such a library is trivial...
> how about you do it for one function even, to
> show that the deployment/use-in-an-app is workable.
> I'd be more than happy to use it if it's workable and the API is at
> least halfway sane.
I currently have a lot on my plate with trace format and ring buffer, but if
anyone is interested in trying to implement this, I can look at it and provide
feedback/hints.
Thanks,
Mathieu
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-08 13:41 ` Mathieu Desnoyers
2010-10-08 16:22 ` Arjan van de Ven
@ 2010-10-09 6:28 ` Ingo Molnar
2010-10-09 8:14 ` Pierre Tardy
2010-10-09 16:19 ` Arjan van de Ven
1 sibling, 2 replies; 47+ messages in thread
From: Ingo Molnar @ 2010-10-09 6:28 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Arjan van de Ven, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
* Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> * Arjan van de Ven (arjan@linux.intel.com) wrote:
> > On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> >>
> >> The fundamental thing about tracing/instrumentation is that there
> >> are no deep ABI needs: it's all about analyzing development kernels
> >> (and a few select versions that get the enterprise treatment) but
> >> otherwise the half-life of this kind of information is very short.
> >>
> >> So we dont want to tie ourselves down with excessive ABIs.
> >
> > ok I'll start working on a second mechanism then to export
> > information that applications need ;-( it'll look a lot like tracing
> > I suppose ;-(
>
> What's wrong with doing the compatibility layer in a LGPL library
> shipped with the kernel tree under tools/ ? Why does everything *have*
> to be done in kernel-space ? Why are you so focused on making your
> application interact directly with kernel ABIs ?
The thing is, Arjan is 100% right that a library for this is not a
'solution', it's an unnecessary complication.
What i suggested in my mail was to _keep existing events_. I.e. do not
break powertop. We are 100% happy that we _have_ such apps, and we
should do reasonable things to not break them.
If we need to change events, we can add a new event. The old events will
lose their relevance without us having to do much - and without us
having to break powertop, pytimechart, etc. We can even have periods of
overlap when both events are available - to give instrumentation apps
time to learn the new events.
I.e. it's not an ABI in the classic sense - we do not (because we
cannot) guarantee the infinite availability of these events. But we can
guarantee that the fields do not change in some stupid, avoidable way.
Changing an existing event in some non-append way is just sloppy and we
can do better.
Arjan, Pierre, does that sound OK to you?
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 6:28 ` Ingo Molnar
@ 2010-10-09 8:14 ` Pierre Tardy
2010-10-09 18:36 ` Linus Torvalds
2010-10-09 16:19 ` Arjan van de Ven
1 sibling, 1 reply; 47+ messages in thread
From: Pierre Tardy @ 2010-10-09 8:14 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, Arjan van de Ven, Tejun Heo,
Frederic Weisbecker, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Andrew Morton, Linus Torvalds
On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> The thing is, Arjan is 100% right that a library for this is not a
> 'solution', it's an unnecessary complication.
Yes. sounds like overengineering.
> If we need to change events, we can add a new event. The old events will
> lose their relevance without us having to do much - and without us
> having to break powertop, pytimechart, etc. We can even have periods of
> overlap when both events are available - to give instrumentation apps
> time to learn the new events.
>
> I.e. it's not an ABI in the classic sense - we do not (because we
> cannot) guarantee the infinite availability of these events. But we can
> guarantee that the fields do not change in some stupid, avoidable way.
>
> Changing an existing event in some non-append way is just sloppy and we
> can do better.
>
> Arjan, Pierre, does that sound OK to you?
Yes. Its reasonable.
Note that pytimechart patch for new power API is already ready and
just waiting for this issue to be decided.
Regards,
Pierre
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 8:14 ` Pierre Tardy
@ 2010-10-09 18:36 ` Linus Torvalds
2010-10-09 21:15 ` Steven Rostedt
2010-10-18 12:15 ` Jean Pihet
0 siblings, 2 replies; 47+ messages in thread
From: Linus Torvalds @ 2010-10-09 18:36 UTC (permalink / raw)
To: Pierre Tardy
Cc: Ingo Molnar, Mathieu Desnoyers, Arjan van de Ven, Tejun Heo,
Frederic Weisbecker, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Andrew Morton
On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy <tardyp@gmail.com> wrote:
> On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar <mingo@elte.hu> wrote:
>>
>
>> The thing is, Arjan is 100% right that a library for this is not a
>> 'solution', it's an unnecessary complication.
> Yes. sounds like overengineering.
I also want to remind people that backwards compatibility should
always absolutely be the #1 priority. Using libraries to "hide"
differences is a totally moronic thing to do, because if you can do a
compatibility library with good interfaces, then damn it, the kernel
interface should already _be_ that good interface.
And no, even if you interact purely with open source programs, the
backwards compatibility requirement doesn't go away. It's a damn pain
in the ass to have to recompile, and it means that you have a much
harder time mixing and matching, and just updating the kernel on top
of a standard distribution.
So changing kernel interfaces that get exported to user space is
always a disaster. Anybody who _designs_ for that kind of disaster
shouldn't be participating in kernel development, because they've
shown themselves to be unable to understand the pain and suffering.
Yes, we do it. Sometimes we change interfaces because not changing
them is too damn painful. But it should absolutely not be the design
model.
Linus
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 18:36 ` Linus Torvalds
@ 2010-10-09 21:15 ` Steven Rostedt
2010-10-09 23:20 ` Linus Torvalds
2010-10-18 12:15 ` Jean Pihet
1 sibling, 1 reply; 47+ messages in thread
From: Steven Rostedt @ 2010-10-09 21:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Pierre Tardy, Ingo Molnar, Mathieu Desnoyers, Arjan van de Ven,
Tejun Heo, Frederic Weisbecker, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Frank Eigler, Masami Hiramatsu,
Thomas Gleixner, Andrew Morton
On Sat, 2010-10-09 at 11:36 -0700, Linus Torvalds wrote:
> On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy <tardyp@gmail.com> wrote:
> > On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >>
> >
> >> The thing is, Arjan is 100% right that a library for this is not a
> >> 'solution', it's an unnecessary complication.
> > Yes. sounds like overengineering.
>
> I also want to remind people that backwards compatibility should
> always absolutely be the #1 priority. Using libraries to "hide"
> differences is a totally moronic thing to do, because if you can do a
> compatibility library with good interfaces, then damn it, the kernel
> interface should already _be_ that good interface.
>
> And no, even if you interact purely with open source programs, the
> backwards compatibility requirement doesn't go away. It's a damn pain
> in the ass to have to recompile, and it means that you have a much
> harder time mixing and matching, and just updating the kernel on top
> of a standard distribution.
>
> So changing kernel interfaces that get exported to user space is
> always a disaster. Anybody who _designs_ for that kind of disaster
> shouldn't be participating in kernel development, because they've
> shown themselves to be unable to understand the pain and suffering.
>
> Yes, we do it. Sometimes we change interfaces because not changing
> them is too damn painful. But it should absolutely not be the design
> model.
The difference here compared to all other user interfaces, is that this
interface has the sole purpose of showing what is happening inside the
kernel. By saying that "we expose this to userspace, it must too be
stable" is saying that all kernel internals that use trace events must
never change.
The big push against tracepoints/trace-markers/trace-events in the
beginning was the fear that they will hinder kernel development because
they become interfaces for users to see what is happening inside the
kernel. When I wrote the interface, I put it in the debugfs system so
people will know that this is a debug interface and can change without
notice.
Trace-events, unlike syscalls, may change depending on how you compiled
the kernel. There's no guarantee that they will even exist on a system.
If all trace-events are now stable ABI, then I suggest we stop adding
any more events, and only add new ones to places that we do not expect
to develop the kernel on anymore.
Not sure what other solution there is. Trace points have been added way
too freely, because any maintainer could add them to their system any
way they felt like it. Now if they are frozen in stone, then the code
that they expose must also be frozen.
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 21:15 ` Steven Rostedt
@ 2010-10-09 23:20 ` Linus Torvalds
2010-10-10 1:39 ` Steven Rostedt
0 siblings, 1 reply; 47+ messages in thread
From: Linus Torvalds @ 2010-10-09 23:20 UTC (permalink / raw)
To: Steven Rostedt
Cc: Thomas Gleixner, Andrew Morton, Pierre Tardy, Peter Zijlstra,
Frederic Weisbecker, Masami Hiramatsu, Jean Pihet,
linux-perf-users, linux-trace-users, Frank Eigler,
Mathieu Desnoyers, linux-pm, Tejun Heo, Ingo Molnar, linux-omap,
Arjan van de Ven
On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> The difference here compared to all other user interfaces, is that this
> interface has the sole purpose of showing what is happening inside the
> kernel.
Bogus and dishonest argument.
Listen to yourself, and read this thread again.
The thread was about doing some kind of open-source library to allow
non-open-source access to these events, and keeping backwards
compatibility in user space. In fact, that is what you yourself said.
So you claimed it could be backwards-compatible. If that's the case,
then there is no excuse for not being so in the kernel.
You can't have it both ways. Stop the f*cking waffling.
Linus
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 23:20 ` Linus Torvalds
@ 2010-10-10 1:39 ` Steven Rostedt
2010-10-10 6:41 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Steven Rostedt @ 2010-10-10 1:39 UTC (permalink / raw)
To: Linus Torvalds
Cc: Pierre Tardy, Ingo Molnar, Mathieu Desnoyers, Arjan van de Ven,
Tejun Heo, Frederic Weisbecker, Thomas Renninger, Jean Pihet,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Frank Eigler, Masami Hiramatsu,
Thomas Gleixner, Andrew Morton
On Sat, 2010-10-09 at 16:20 -0700, Linus Torvalds wrote:
> On Sat, Oct 9, 2010 at 2:15 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > The difference here compared to all other user interfaces, is that this
> > interface has the sole purpose of showing what is happening inside the
> > kernel.
>
> Bogus and dishonest argument.
>
> Listen to yourself, and read this thread again.
>
> The thread was about doing some kind of open-source library to allow
> non-open-source access to these events, and keeping backwards
> compatibility in user space. In fact, that is what you yourself said.
>
> So you claimed it could be backwards-compatible. If that's the case,
> then there is no excuse for not being so in the kernel.
>
> You can't have it both ways. Stop the f*cking waffling.
Let me rephrase it then, and lets forget about the library. I was just
brain storming ideas.
I'm all for labeling specific trace points as "ABI", such that, these
trace points have had sufficient thought and are not expected to change
in the near future. But I'm against the idea that any tracepoint that
has been shown to userspace can be considered stable.
With or without libraries, I'm for two kinds of interfaces: One that is
stable and has been thoroughly thought through, and one that is free for
the maintainers to have an interface to let them see what is happening
in the kernel, even on a production system, but be able to change them
whenever they feel the need.
That's the basis of my idea. A stable backward-compatible interface, and
an interface that is unstable for developers. Whether we put the stable
interface into a library (to keep the ugliness from developers, which
you obviously do not like), or two, distinctly label the tracepoint as
ABI, to let the developers and everyone else know what tracepoints an
application can count on and what ones they should not.
Thus my waffling is really wanting both, a stable ABI and an unstable
one. I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
before, because Peter Zijlstra (who is currently MIA) has been strongly
against it.
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-10 1:39 ` Steven Rostedt
@ 2010-10-10 6:41 ` Peter Zijlstra
2010-10-10 15:11 ` Steven Rostedt
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2010-10-10 6:41 UTC (permalink / raw)
To: Steven Rostedt
Cc: Linus Torvalds, Pierre Tardy, Ingo Molnar, Mathieu Desnoyers,
Arjan van de Ven, Tejun Heo, Frederic Weisbecker,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Kevin Hilman, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Andrew Morton
On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
> I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
> before, because Peter Zijlstra (who is currently MIA) has been strongly
> against it.
I see no point in the TRACE_EVENT_ABI() because if I need to change such
a tracepoint to reflect changes in the kernel then I will freely do so.
Even seemingly stable points like sched_switch(), which we all agree
will stay around forever (gotta have context switches on a multi-tasking
OS) will not stay stable when we add/change scheduling policies.
Sure, the prev and next task thing will stay the same, but the meaning
and interpretation of things like the prio field will not, esp when we
go add something like a deadline scheduler that isn't priority based.
So one possibility is to simply remove all that information from the
tracepoints, remove the prio and state fields, but how useful is that?
I guess what I'm saying is that even if we were to provide _ABI I see us
getting into this very same argument over and over again, making me want
to remove all this trace event muck right now before it gets worse.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-10 6:41 ` Peter Zijlstra
@ 2010-10-10 15:11 ` Steven Rostedt
0 siblings, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2010-10-10 15:11 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Linus Torvalds, Pierre Tardy, Ingo Molnar, Mathieu Desnoyers,
Arjan van de Ven, Tejun Heo, Frederic Weisbecker,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Kevin Hilman, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Andrew Morton
On Sun, 2010-10-10 at 08:41 +0200, Peter Zijlstra wrote:
> On Sat, 2010-10-09 at 21:39 -0400, Steven Rostedt wrote:
> > I've been hesitant in the pass from doing the TRACE_EVENT_ABI()
> > before, because Peter Zijlstra (who is currently MIA) has been strongly
> > against it.
>
> I see no point in the TRACE_EVENT_ABI() because if I need to change such
> a tracepoint to reflect changes in the kernel then I will freely do so.
>
> Even seemingly stable points like sched_switch(), which we all agree
> will stay around forever (gotta have context switches on a multi-tasking
> OS) will not stay stable when we add/change scheduling policies.
>
> Sure, the prev and next task thing will stay the same, but the meaning
> and interpretation of things like the prio field will not, esp when we
> go add something like a deadline scheduler that isn't priority based.
>
> So one possibility is to simply remove all that information from the
> tracepoints, remove the prio and state fields, but how useful is that?
>
> I guess what I'm saying is that even if we were to provide _ABI I see us
> getting into this very same argument over and over again, making me want
> to remove all this trace event muck right now before it gets worse.
Then how's this as a compromise. We do not add a TRACE_EVENT_ABI(), but
instead manually add the ABI interface to existing tracepoints. Let's
use the sched example you shown above.
We can connect to the sched_switch() tracepoint manually in something
perhaps called kernel/abi_trace.c or trace_abi.c (whatever).
Here we create the directories manually there:
/sys/kernel/event/sched/sched_switch/
But this sched_switch will only include the prev and next pids, comms,
and perhaps even run state. But not the prio (since we see that
changing).
It would then need the code to enable the trace point with:
register_trace_sched_switch(sched_switch_abi_probe, NULL);
Where we have
static void
sched_switch_abi_probe(void *ignore,
struct task_switch *prev,
struct task_struct *next)
{
/* code to grab just the ABI stuff */
}
And this code can then record to what ever hooked to it.
Making this a manual effort will make it easier to control what becomes
an ABI. We can have long discussions and flames over what goes here. But
that's good since debates before an ABI is created is much better than
debates after one is created.
I'm afraid that a easy macro called TRACE_EVENT_ABI() would have the
same issue. ABIs may be created too quickly before they are thought
through.
Thoughts?
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 18:36 ` Linus Torvalds
2010-10-09 21:15 ` Steven Rostedt
@ 2010-10-18 12:15 ` Jean Pihet
1 sibling, 0 replies; 47+ messages in thread
From: Jean Pihet @ 2010-10-18 12:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Pierre Tardy, Ingo Molnar, Mathieu Desnoyers, Arjan van de Ven,
Tejun Heo, Frederic Weisbecker, Thomas Renninger,
linux-trace-users, linux-pm, linux-perf-users, rjw, linux-omap,
Peter Zijlstra, Kevin Hilman, Steven Rostedt, Frank Eigler,
Masami Hiramatsu, Thomas Gleixner, Andrew Morton
On Sat, Oct 9, 2010 at 8:36 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Sat, Oct 9, 2010 at 1:14 AM, Pierre Tardy <tardyp@gmail.com> wrote:
>> On Sat, Oct 9, 2010 at 8:28 AM, Ingo Molnar <mingo@elte.hu> wrote:
>>>
>>
>>> The thing is, Arjan is 100% right that a library for this is not a
>>> 'solution', it's an unnecessary complication.
>> Yes. sounds like overengineering.
>
> I also want to remind people that backwards compatibility should
> always absolutely be the #1 priority. Using libraries to "hide"
> differences is a totally moronic thing to do, because if you can do a
> compatibility library with good interfaces, then damn it, the kernel
> interface should already _be_ that good interface.
Agree on that. The idea is to have the kernel interfaces cleaned up
and so have better user space apps in the end.
> And no, even if you interact purely with open source programs, the
> backwards compatibility requirement doesn't go away. It's a damn pain
> in the ass to have to recompile, and it means that you have a much
> harder time mixing and matching, and just updating the kernel on top
> of a standard distribution.
>
> So changing kernel interfaces that get exported to user space is
> always a disaster. Anybody who _designs_ for that kind of disaster
> shouldn't be participating in kernel development, because they've
> shown themselves to be unable to understand the pain and suffering.
>
> Yes, we do it. Sometimes we change interfaces because not changing
> them is too damn painful. But it should absolutely not be the design
> model.
So what is the best way to have the power tracing events elegantly cleaned up?
The proposed patch 4/4 [1] introduces a new Kconfig option
CONFIG_DEPRECATED_POWER_EVENT_TRACING which allows to select to map
the trace points to the old _OR_ to the new events API, only for the
already existing events. This gives some time for the adaptation of
the user space apps.
I understand this could require a kernel re-compilation in order to
use the old events API but I really want to avoid to duplicate the
trace points in the code to instrument, e.g.:
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 199dcb9..013c274 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -354,5 +354,7 @@ void cpufreq_notify_transition(struct
cpufreq_freqs *freqs, unsigned int state)
adjust_jiffies(CPUFREQ_POSTCHANGE, freqs);
dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
(unsigned long)freqs->cpu);
trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
+ trace_power_switch_state(POWER_PSTATE, freqs->new, freqs->cpu,
+ smp_processor_id());
The proposed patch only changes the power tracing events API
definition files, not the code to instrument.
I am OK to re-spin the patches, only if a compromise is agreed on.
As said before the kernel Documentation and pytimechart user space
tools patches will be provided as well.
>
> Linus
>
Thanks,
Jean
[1] http://marc.info/?l=linux-omap&m=128620575900689&w=2
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 6:28 ` Ingo Molnar
2010-10-09 8:14 ` Pierre Tardy
@ 2010-10-09 16:19 ` Arjan van de Ven
2010-10-09 21:34 ` Steven Rostedt
2010-10-10 12:19 ` Ingo Molnar
1 sibling, 2 replies; 47+ messages in thread
From: Arjan van de Ven @ 2010-10-09 16:19 UTC (permalink / raw)
To: Ingo Molnar
Cc: Mathieu Desnoyers, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On 10/8/2010 11:28 PM, Ingo Molnar wrote:
> * Mathieu Desnoyers<mathieu.desnoyers@efficios.com> wrote:
>
>> * Arjan van de Ven (arjan@linux.intel.com) wrote:
>>> On 10/8/2010 1:38 AM, Ingo Molnar wrote:
>>>> The fundamental thing about tracing/instrumentation is that there
>>>> are no deep ABI needs: it's all about analyzing development kernels
>>>> (and a few select versions that get the enterprise treatment) but
>>>> otherwise the half-life of this kind of information is very short.
>>>>
>>>> So we dont want to tie ourselves down with excessive ABIs.
>>> ok I'll start working on a second mechanism then to export
>>> information that applications need ;-( it'll look a lot like tracing
>>> I suppose ;-(
>> What's wrong with doing the compatibility layer in a LGPL library
>> shipped with the kernel tree under tools/ ? Why does everything *have*
>> to be done in kernel-space ? Why are you so focused on making your
>> application interact directly with kernel ABIs ?
> The thing is, Arjan is 100% right that a library for this is not a
> 'solution', it's an unnecessary complication.
>
> What i suggested in my mail was to _keep existing events_. I.e. do not
> break powertop. We are 100% happy that we _have_ such apps, and we
> should do reasonable things to not break them.
>
> If we need to change events, we can add a new event. The old events will
> lose their relevance without us having to do much - and without us
> having to break powertop, pytimechart, etc. We can even have periods of
> overlap when both events are available - to give instrumentation apps
> time to learn the new events.
>
> I.e. it's not an ABI in the classic sense - we do not (because we
> cannot) guarantee the infinite availability of these events. But we can
> guarantee that the fields do not change in some stupid, avoidable way.
also I have to say that some events are more likely to change than others
"function foo in the kernel called" is more likely to change than "the
processor went to THIS frequency".
The concept of CPU frequencies has been with us fora long time and is
going to be there for a long time as well ......
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 16:19 ` Arjan van de Ven
@ 2010-10-09 21:34 ` Steven Rostedt
2010-10-10 12:19 ` Ingo Molnar
1 sibling, 0 replies; 47+ messages in thread
From: Steven Rostedt @ 2010-10-09 21:34 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Ingo Molnar, Mathieu Desnoyers, Tejun Heo, Frederic Weisbecker,
Pierre Tardy, Thomas Renninger, Jean Pihet, linux-trace-users,
linux-pm, linux-perf-users, rjw, linux-omap, Peter Zijlstra,
Kevin Hilman, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On Sat, 2010-10-09 at 09:19 -0700, Arjan van de Ven wrote:
> > I.e. it's not an ABI in the classic sense - we do not (because we
> > cannot) guarantee the infinite availability of these events. But we can
> > guarantee that the fields do not change in some stupid, avoidable way.
>
> also I have to say that some events are more likely to change than others
>
> "function foo in the kernel called" is more likely to change than "the
> processor went to THIS frequency".
> The concept of CPU frequencies has been with us fora long time and is
> going to be there for a long time as well ......
Perhaps for basic concepts, we need a standard trace-event. Are people
willing to have a TRACE_EVENT_ABI() (it's trivial to write), and we can
mark those events with that macro that we know tools depend on.
These events can be exposed in a /sys/kernel/events/... directory, to
let tools know what what events they can rely on.
We've talked about doing this before, I've just been waiting to hear a
consensus on if we should. I know Peter Zijlstra was against the idea,
and too bad he's off gallivanting to share his input now.
-- Steve
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-09 16:19 ` Arjan van de Ven
2010-10-09 21:34 ` Steven Rostedt
@ 2010-10-10 12:19 ` Ingo Molnar
2010-10-19 11:31 ` Thomas Renninger
1 sibling, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2010-10-10 12:19 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Mathieu Desnoyers, Tejun Heo, Frederic Weisbecker, Pierre Tardy,
Thomas Renninger, Jean Pihet, linux-trace-users, linux-pm,
linux-perf-users, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
* Arjan van de Ven <arjan@linux.intel.com> wrote:
> On 10/8/2010 11:28 PM, Ingo Molnar wrote:
> >* Mathieu Desnoyers<mathieu.desnoyers@efficios.com> wrote:
> >
> >>* Arjan van de Ven (arjan@linux.intel.com) wrote:
> >>> On 10/8/2010 1:38 AM, Ingo Molnar wrote:
> >>>>The fundamental thing about tracing/instrumentation is that there
> >>>>are no deep ABI needs: it's all about analyzing development kernels
> >>>>(and a few select versions that get the enterprise treatment) but
> >>>>otherwise the half-life of this kind of information is very short.
> >>>>
> >>>>So we dont want to tie ourselves down with excessive ABIs.
> >>>ok I'll start working on a second mechanism then to export
> >>>information that applications need ;-( it'll look a lot like tracing
> >>>I suppose ;-(
> >>What's wrong with doing the compatibility layer in a LGPL library
> >>shipped with the kernel tree under tools/ ? Why does everything *have*
> >>to be done in kernel-space ? Why are you so focused on making your
> >>application interact directly with kernel ABIs ?
> >The thing is, Arjan is 100% right that a library for this is not a
> >'solution', it's an unnecessary complication.
> >
> >What i suggested in my mail was to _keep existing events_. I.e. do not
> >break powertop. We are 100% happy that we _have_ such apps, and we
> >should do reasonable things to not break them.
> >
> >If we need to change events, we can add a new event. The old events will
> >lose their relevance without us having to do much - and without us
> >having to break powertop, pytimechart, etc. We can even have periods of
> >overlap when both events are available - to give instrumentation apps
> >time to learn the new events.
> >
> >I.e. it's not an ABI in the classic sense - we do not (because we
> >cannot) guarantee the infinite availability of these events. But we can
> >guarantee that the fields do not change in some stupid, avoidable way.
>
> also I have to say that some events are more likely to change than others
>
> "function foo in the kernel called" is more likely to change than "the
> processor went to THIS frequency". The concept of CPU frequencies has
> been with us fora long time and is going to be there for a long time
> as well ......
Most definitely. It's no accident that it took such a long time for this
issue to be raised in the first place. It's a rare occurance - and then
we can deal with it intelligently, without breaking stuff unnecessarily.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-10 12:19 ` Ingo Molnar
@ 2010-10-19 11:31 ` Thomas Renninger
2010-10-19 11:45 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: Thomas Renninger @ 2010-10-19 11:31 UTC (permalink / raw)
To: Ingo Molnar
Cc: Arjan van de Ven, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds
On Sunday 10 October 2010 14:19:28 Ingo Molnar wrote:
>
> * Arjan van de Ven <arjan@linux.intel.com> wrote:
>
...
> > also I have to say that some events are more likely to change than others
> >
> > "function foo in the kernel called" is more likely to change than "the
> > processor went to THIS frequency". The concept of CPU frequencies has
> > been with us fora long time and is going to be there for a long time
> > as well ......
Right, it's a frequency and a CPU that should get passed along with the
event. The X86/ACPI specific X-state data (even there unused and never will
get used) should vanish before ARM starts to make use of it.
The idle (power_start/power_end) state definition is worse...
> Most definitely. It's no accident that it took such a long time for this
> issue to be raised in the first place.
> It's a rare occurance -
Do you agree that this occurance happened now and these events
should get cleaned up before ARM and other archs make use of the broken
interface?
If not, discussing this further, is a big waste of time... and Jean
would have to try to adapt his ARM code on the broken ABI...
> and then
> we can deal with it intelligently, without breaking stuff unnecessarily.
Can we get this defined a bit clearer so that a patch can be created?
Compatibility can only be achieved by still firing the old events for
some kernel rounds.
I'll send some patches in a new thread with these people in CC.
It would be great to see a decision (in a way that a patch can be created)
how an event change can/should look like if there is urgent need.
Thanks,
Thomas
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 11:31 ` Thomas Renninger
@ 2010-10-19 11:45 ` Ingo Molnar
2010-10-19 11:47 ` Peter Zijlstra
0 siblings, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2010-10-19 11:45 UTC (permalink / raw)
To: Thomas Renninger
Cc: Arjan van de Ven, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Andrew Morton, Linus Torvalds, linux-kernel
* Thomas Renninger <trenn@suse.de> wrote:
> > Most definitely. It's no accident that it took such a long time for this issue
> > to be raised in the first place. It's a rare occurance -
>
> Do you agree that this occurance happened now and these events should get cleaned
> up before ARM and other archs make use of the broken interface?
>
> If not, discussing this further, is a big waste of time... and Jean would have to
> try to adapt his ARM code on the broken ABI...
The discussion seems to have died down somewhat. Please re-send to lkml the latest
patches you have to remind everyone of the latest state of things - the merge window
is getting near.
My only compatibility/ABI point is basically that it shouldnt break _existing_
tracepoints (and users thereof). If your latest bits meet that then it ought to be a
good first step. You are free to (and encouraged to) introduce more complete sets of
events.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 11:45 ` Ingo Molnar
@ 2010-10-19 11:47 ` Peter Zijlstra
2010-10-19 11:52 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: Peter Zijlstra @ 2010-10-19 11:47 UTC (permalink / raw)
To: Ingo Molnar
Cc: Thomas Renninger, Arjan van de Ven, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, linux-kernel
On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
>
> * Thomas Renninger <trenn@suse.de> wrote:
>
> > > Most definitely. It's no accident that it took such a long time for this issue
> > > to be raised in the first place. It's a rare occurance -
> >
> > Do you agree that this occurance happened now and these events should get cleaned
> > up before ARM and other archs make use of the broken interface?
> >
> > If not, discussing this further, is a big waste of time... and Jean would have to
> > try to adapt his ARM code on the broken ABI...
>
> The discussion seems to have died down somewhat. Please re-send to lkml the latest
> patches you have to remind everyone of the latest state of things - the merge window
> is getting near.
>
> My only compatibility/ABI point is basically that it shouldnt break _existing_
> tracepoints (and users thereof). If your latest bits meet that then it ought to be a
> good first step. You are free to (and encouraged to) introduce more complete sets of
> events.
Can we deprecate and eventually remove the old ones, or will we be
forever obliged to carry the old ones too?
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 11:47 ` Peter Zijlstra
@ 2010-10-19 11:52 ` Ingo Molnar
2010-10-19 13:27 ` Arjan van de Ven
0 siblings, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2010-10-19 11:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Thomas Gleixner, Andrew Morton, Pierre Tardy, Frederic Weisbecker,
Masami Hiramatsu, Linus Torvalds, Jean Pihet, Steven Rostedt,
linux-kernel, linux-trace-users, Frank Eigler, Mathieu Desnoyers,
Tejun Heo, linux-pm, linux-omap, Arjan van de Ven
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> >
> > * Thomas Renninger <trenn@suse.de> wrote:
> >
> > > > Most definitely. It's no accident that it took such a long time for this issue
> > > > to be raised in the first place. It's a rare occurance -
> > >
> > > Do you agree that this occurance happened now and these events should get cleaned
> > > up before ARM and other archs make use of the broken interface?
> > >
> > > If not, discussing this further, is a big waste of time... and Jean would have to
> > > try to adapt his ARM code on the broken ABI...
> >
> > The discussion seems to have died down somewhat. Please re-send to lkml the latest
> > patches you have to remind everyone of the latest state of things - the merge window
> > is getting near.
> >
> > My only compatibility/ABI point is basically that it shouldnt break _existing_
> > tracepoints (and users thereof). If your latest bits meet that then it ought to be a
> > good first step. You are free to (and encouraged to) introduce more complete sets of
> > events.
>
> Can we deprecate and eventually remove the old ones, or will we be forever obliged
> to carry the old ones too?
We most definitely want to deprecate and remove the old ones - but we want to give
instrumentation software some migration time for that.
Jean, Arjan, what would be a feasible and practical deprecation period for that? One
kernel cycle?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 11:52 ` Ingo Molnar
@ 2010-10-19 13:27 ` Arjan van de Ven
2010-10-19 13:50 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: Arjan van de Ven @ 2010-10-19 13:27 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Thomas Renninger, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, linux-kernel
On 10/19/2010 4:52 AM, Ingo Molnar wrote:
> * Peter Zijlstra<peterz@infradead.org> wrote:
>
>> On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
>>> * Thomas Renninger<trenn@suse.de> wrote:
>>>
>>>>> Most definitely. It's no accident that it took such a long time for this issue
>>>>> to be raised in the first place. It's a rare occurance -
>>>> Do you agree that this occurance happened now and these events should get cleaned
>>>> up before ARM and other archs make use of the broken interface?
>>>>
>>>> If not, discussing this further, is a big waste of time... and Jean would have to
>>>> try to adapt his ARM code on the broken ABI...
>>> The discussion seems to have died down somewhat. Please re-send to lkml the latest
>>> patches you have to remind everyone of the latest state of things - the merge window
>>> is getting near.
>>>
>>> My only compatibility/ABI point is basically that it shouldnt break _existing_
>>> tracepoints (and users thereof). If your latest bits meet that then it ought to be a
>>> good first step. You are free to (and encouraged to) introduce more complete sets of
>>> events.
>> Can we deprecate and eventually remove the old ones, or will we be forever obliged
>> to carry the old ones too?
> We most definitely want to deprecate and remove the old ones - but we want to give
> instrumentation software some migration time for that.
>
> Jean, Arjan, what would be a feasible and practical deprecation period for that? One
> kernel cycle?
more like a year
for some time software needs to support both, especially if popular
distros stick to an older kernel like *cough* RHEL6
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 13:27 ` Arjan van de Ven
@ 2010-10-19 13:50 ` Ingo Molnar
2010-10-19 13:52 ` Arjan van de Ven
0 siblings, 1 reply; 47+ messages in thread
From: Ingo Molnar @ 2010-10-19 13:50 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Peter Zijlstra, Thomas Renninger, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, linux-kernel
* Arjan van de Ven <arjan@linux.intel.com> wrote:
> On 10/19/2010 4:52 AM, Ingo Molnar wrote:
> >* Peter Zijlstra<peterz@infradead.org> wrote:
> >
> >>On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
> >>>* Thomas Renninger<trenn@suse.de> wrote:
> >>>
> >>>>>Most definitely. It's no accident that it took such a long time for this issue
> >>>>>to be raised in the first place. It's a rare occurance -
> >>>>Do you agree that this occurance happened now and these events should get cleaned
> >>>>up before ARM and other archs make use of the broken interface?
> >>>>
> >>>>If not, discussing this further, is a big waste of time... and Jean would have to
> >>>>try to adapt his ARM code on the broken ABI...
> >>>The discussion seems to have died down somewhat. Please re-send to lkml the latest
> >>>patches you have to remind everyone of the latest state of things - the merge window
> >>>is getting near.
> >>>
> >>>My only compatibility/ABI point is basically that it shouldnt break _existing_
> >>>tracepoints (and users thereof). If your latest bits meet that then it ought to be a
> >>>good first step. You are free to (and encouraged to) introduce more complete sets of
> >>>events.
> >>Can we deprecate and eventually remove the old ones, or will we be forever obliged
> >>to carry the old ones too?
> >We most definitely want to deprecate and remove the old ones - but we want to give
> >instrumentation software some migration time for that.
> >
> >Jean, Arjan, what would be a feasible and practical deprecation period for that? One
> >kernel cycle?
>
> more like a year
>
> for some time software needs to support both, especially if popular distros stick
> to an older kernel like *cough* RHEL6
Sure, you can support both. But as long as support for the _new_ events is included
in PowerTop there's no need to keep the duality upstream. Running ancient PowerTop
on fresh kernels is not common.
An old RHEL kernel will still keep on working as you can keep support for old events
in PowerTop as long as you wish to.
The new kernel also wont 'overwrite' old events with new definitions in the future,
so PowerTop will keep working for as long as you want to support older kernels.
Does that sound good?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 13:50 ` Ingo Molnar
@ 2010-10-19 13:52 ` Arjan van de Ven
2010-10-19 14:51 ` Ingo Molnar
0 siblings, 1 reply; 47+ messages in thread
From: Arjan van de Ven @ 2010-10-19 13:52 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Thomas Renninger, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, linux-kernel
On 10/19/2010 6:50 AM, Ingo Molnar wrote:
> * Arjan van de Ven<arjan@linux.intel.com> wrote:
>
>> On 10/19/2010 4:52 AM, Ingo Molnar wrote:
>>> * Peter Zijlstra<peterz@infradead.org> wrote:
>>>
>>>> On Tue, 2010-10-19 at 13:45 +0200, Ingo Molnar wrote:
>>>>> * Thomas Renninger<trenn@suse.de> wrote:
>>>>>
>>>>>>> Most definitely. It's no accident that it took such a long time for this issue
>>>>>>> to be raised in the first place. It's a rare occurance -
>>>>>> Do you agree that this occurance happened now and these events should get cleaned
>>>>>> up before ARM and other archs make use of the broken interface?
>>>>>>
>>>>>> If not, discussing this further, is a big waste of time... and Jean would have to
>>>>>> try to adapt his ARM code on the broken ABI...
>>>>> The discussion seems to have died down somewhat. Please re-send to lkml the latest
>>>>> patches you have to remind everyone of the latest state of things - the merge window
>>>>> is getting near.
>>>>>
>>>>> My only compatibility/ABI point is basically that it shouldnt break _existing_
>>>>> tracepoints (and users thereof). If your latest bits meet that then it ought to be a
>>>>> good first step. You are free to (and encouraged to) introduce more complete sets of
>>>>> events.
>>>> Can we deprecate and eventually remove the old ones, or will we be forever obliged
>>>> to carry the old ones too?
>>> We most definitely want to deprecate and remove the old ones - but we want to give
>>> instrumentation software some migration time for that.
>>>
>>> Jean, Arjan, what would be a feasible and practical deprecation period for that? One
>>> kernel cycle?
>> more like a year
>>
>> for some time software needs to support both, especially if popular distros stick
>> to an older kernel like *cough* RHEL6
> Sure, you can support both. But as long as support for the _new_ events is included
> in PowerTop there's no need to keep the duality upstream. Running ancient PowerTop
> on fresh kernels is not common.
>
> An old RHEL kernel will still keep on working as you can keep support for old events
> in PowerTop as long as you wish to.
>
> The new kernel also wont 'overwrite' old events with new definitions in the future,
> so PowerTop will keep working for as long as you want to support older kernels.
>
> Does that sound good?
this does not scale much long term, eg this only works if this is only
done once, and these points are stable afterwards.
otherwise we get 25 of those different "workarounds for kernel ABI
breakage" into all different projects, and it becomes
untestable for all the poor software writers...
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-19 13:52 ` Arjan van de Ven
@ 2010-10-19 14:51 ` Ingo Molnar
0 siblings, 0 replies; 47+ messages in thread
From: Ingo Molnar @ 2010-10-19 14:51 UTC (permalink / raw)
To: Arjan van de Ven
Cc: Peter Zijlstra, Thomas Renninger, Mathieu Desnoyers, Tejun Heo,
Frederic Weisbecker, Pierre Tardy, Jean Pihet, linux-trace-users,
linux-pm, rjw, linux-omap, Kevin Hilman, Steven Rostedt,
Frank Eigler, Masami Hiramatsu, Thomas Gleixner, Andrew Morton,
Linus Torvalds, linux-kernel
* Arjan van de Ven <arjan@linux.intel.com> wrote:
> >> for some time software needs to support both, especially if popular distros
> >> stick to an older kernel like *cough* RHEL6
> >
> > Sure, you can support both. But as long as support for the _new_ events is
> > included in PowerTop there's no need to keep the duality upstream. Running
> > ancient PowerTop on fresh kernels is not common.
> >
> > An old RHEL kernel will still keep on working as you can keep support for old
> > events in PowerTop as long as you wish to.
> >
> > The new kernel also wont 'overwrite' old events with new definitions in the
> > future, so PowerTop will keep working for as long as you want to support older
> > kernels.
> >
> > Does that sound good?
>
> this does not scale much long term, eg this only works if this is only done once,
> and these points are stable afterwards. otherwise we get 25 of those different
> "workarounds for kernel ABI breakage" into all different projects, and it becomes
> untestable for all the poor software writers...
I have no intention for this to become common. For the 2+ years tracepoints have
been upstream this is the first time it has come up. It's a rare occurance, and as
long as we keep it rare and as long as we have a smooth transition process in place
it should be good. If it becomes common we are doing something wrong ...
Alternatively you might want to review the new power events and suggest ways to add
that extra information to existing events that suits your purposes as well. You
added the old power tracepoints so you sure must have an opinion about it all?
Thanks,
Ingo
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:08 ` Mathieu Desnoyers
2010-10-07 15:23 ` Pierre Tardy
@ 2010-10-07 15:45 ` Jean Pihet
2010-10-07 15:49 ` Thomas Renninger
2 siblings, 0 replies; 47+ messages in thread
From: Jean Pihet @ 2010-10-07 15:45 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Thomas Renninger, linux-trace-users, linux-pm, linux-perf-users,
mingo, arjan, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Frederic Weisbecker, Andrew Morton, Linus Torvalds
Hi,
On Thu, Oct 7, 2010 at 5:08 PM, Mathieu Desnoyers
<mathieu.desnoyers@efficios.com> wrote:
> [ Adding a few more CCs, since this discussion is about a tracepoint
> userspace ABI policy, which is a topic of general interest. ]
>
> * Thomas Renninger (trenn@suse.de) wrote:
>> Hi,
>>
>> On Monday 04 October 2010 17:20:57 Jean Pihet wrote:
>> > Here is a re-spin of the patches after discussion.
>>
>> what is going to happen here now?
>>
>> Is this supposed to go through Ingo's tree?
>>
>> Ingo: do you mind commenting on this.
>
> Meanwhile, here are some ideas...
>
>
>> I see 3 possibilities:
>> 1) Power (or all) perf events are never going to change.
>
> Persisting with bad interfaces (which were never meant to be stable, and were
> actually explicitely said to be non-stable) for the sake of poorly written
> proprietary userspace apps does not seem like viable to me. Since when did we
> start designing kernel code for broken proprierary apps ? (see below for
> solutions on how to fix the apps)
>
> The only reason we have these tracepoints in there is because they can follow
> kernel code changes, thanks to their flexible nature. Being stucked with
> badly named tracepoints because of some monolithic analyzer app is just insane.
>
>> If they are going to change, then now is the right time and
>> 2) Backward compatibility is provided in some way for some time.
>
> I've looked at the resulting code, and, honestly, it's ugly and it complexifies
> the test matrix. I would really prefer to move this compatibility crap out of
> the kernel out into userspace libraries, where it belongs. It should have got
> there in the first place when the developers of these propritary
> tracepoint-consumers got the hint that those were going to change. Then you have
> a sane design:
>
> 1) The kernel, providing a tracepoint ABI that *can change over time*, because
> tracepoints are too tied to kernel code to afford not being changed.
> 2) Adapdation libraries, some which could be provided with the perf userspace
> libraries, some which could be provided along with the tracepoint consumer
> application, so the proprietary application can link on an open-source
> library that can be upgraded when needed.
> 3) The trace analyzer. So if the analyzer is open source, then it's fine, it
> could follow the rare ABI breakups that are needed by a simple upgrade.
> Ideally we might want to keep backward compatibility code in there too, but
> it's OK to require users to upgrade their tools if the kernel is upgraded.
> If the analyzer is closed-source, then it should interact with an open source
> library rather than with the kernel tracepoints ABI.
Totally agree here! The real solution is to provide such a library.
Anyone interested?
> So, given that I don't want to uglify kernel code based on some badly written
> proprietary userspace tools, and given we've given all possible warnings telling
> that the tracepoint ABI might change, I really don't see why we should bother
> bloating the kernel with this. The analyzers should be changed to use adaptation
> libraries instead.
>
>> 3) The power events get cleaned up without compatibility to
>> former kernels versions.
>>
>> There are patches for 2. and 3., for 1. there obviously are no
>> needed.
>> For 2., the patches (mine or Jeans), need some polishing. IMO
>> these double events inside of general code aren't that bad.
>> I trust Jean, that it's not that easy with all the include magic
>> and macros, partly realized that myself already and it's not worth
>> it to dig further for a temporary solution.
>>
>> Votes so far:
>> 1. Arjan
>> 2. Myself, Jean
>> 3. Peter Zijlstra and Mathieu Desnoyers
I am for 3 but I do not mind to provide the code for 2.
>>
>> Jean's work got successfully blocked for weeks now.
>> If there would be a final decision by a maintainer who is going to
>> merge Jean's work, that would be great and it would finally be worth
>> to send updated patches again which hopefully some day find their way into
>> a linux-next kernel...
>
> Yes, sadly this debate running in circles hurts contributors.
Indeed.
Honestly I do not know what to do next. Here are some facts:
- Thomas and myself did completely rework the patches a couple of
times now and most of them got acked
- A transition API has been provided along with a Kconfig option.
Special care has been taken to provide an easy to maintain solution
(just remove the option from Kconfig and a few lines of code in
kernel/trace)
- I am willing to provide the according patches to pytimechart, for
the new API only. Some pytimechart fixes from myself are already in
the tree thanks to the maintainer (Pierre).
Please let us move this forward. Some more on-going work is depending
on those changes.
Thanks,
Jean
>
> Thanks for the summary!
>
> Mathieu
>
>>
>> Thanks,
>>
>> Thomas
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
>
--
To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:08 ` Mathieu Desnoyers
2010-10-07 15:23 ` Pierre Tardy
2010-10-07 15:45 ` Jean Pihet
@ 2010-10-07 15:49 ` Thomas Renninger
2010-10-07 15:56 ` Jean Pihet
2 siblings, 1 reply; 47+ messages in thread
From: Thomas Renninger @ 2010-10-07 15:49 UTC (permalink / raw)
To: Mathieu Desnoyers
Cc: Jean Pihet, linux-trace-users, linux-pm, linux-perf-users, mingo,
arjan, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Frederic Weisbecker, Andrew Morton, Linus Torvalds
On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote:
> [ Adding a few more CCs, since this discussion is about a tracepoint
> userspace ABI policy, which is a topic of general interest. ]
>
...
> Yes, sadly this debate running in circles hurts contributors.
>
> Thanks for the summary!
Thanks for yours!
So you (and Peter Zijlstra and some others) prefer the solution I
posted with these two patches:
[PATCH 1/2] PERF(kernel): Cleanup power events
[PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events
Those should be fine.
Ingo, can you merge them, so that Jean can finally put his ARM
specific implementation on top.
Thanks,
Thomas
^ permalink raw reply [flat|nested] 47+ messages in thread* Re: PATCH [0/4] perf: clean-up of power events API
2010-10-07 15:49 ` Thomas Renninger
@ 2010-10-07 15:56 ` Jean Pihet
0 siblings, 0 replies; 47+ messages in thread
From: Jean Pihet @ 2010-10-07 15:56 UTC (permalink / raw)
To: Thomas Renninger
Cc: Mathieu Desnoyers, linux-trace-users, linux-pm, linux-perf-users,
mingo, arjan, rjw, linux-omap, Peter Zijlstra, Kevin Hilman,
Steven Rostedt, Frank Eigler, Masami Hiramatsu, Thomas Gleixner,
Frederic Weisbecker, Andrew Morton, Linus Torvalds
Thomas,
On Thu, Oct 7, 2010 at 5:49 PM, Thomas Renninger <trenn@suse.de> wrote:
> On Thursday 07 October 2010 17:08:25 Mathieu Desnoyers wrote:
>> [ Adding a few more CCs, since this discussion is about a tracepoint
>> userspace ABI policy, which is a topic of general interest. ]
>>
> ...
>> Yes, sadly this debate running in circles hurts contributors.
>>
>> Thanks for the summary!
> Thanks for yours!
>
> So you (and Peter Zijlstra and some others) prefer the solution I
> posted with these two patches:
> [PATCH 1/2] PERF(kernel): Cleanup power events
My latest patch [1/4] is a rework of your patch. It adds:
- adaptation to linux-2.6-tip
- change of a wrong permission intel_idle.c
- light changes in the API (correction of trace printks ...)
I would prefer to use that version if you are ok.
> [PATCH 2/2] PERF(userspace): Adjust perf timechart to the new power events
>
> Those should be fine.
> Ingo, can you merge them, so that Jean can finally put his ARM
> specific implementation on top.
>
> Thanks,
>
> Thomas
>
Jean
--
To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 47+ messages in thread