* [PATCH 0/3] even more perf counter patches
@ 2009-05-25 15:39 Peter Zijlstra
2009-05-25 15:39 ` [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Peter Zijlstra @ 2009-05-25 15:39 UTC (permalink / raw)
To: Ingo Molnar
Cc: Paul Mackerras, Corey Ashford, linux-kernel, Peter Zijlstra,
Arnaldo Carvalho de Melo, John Kacur
--
^ permalink raw reply [flat|nested] 9+ messages in thread* [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits 2009-05-25 15:39 [PATCH 0/3] even more perf counter patches Peter Zijlstra @ 2009-05-25 15:39 ` Peter Zijlstra 2009-05-25 19:51 ` [tip:perfcounters/core] perf_counter: x86: Expose " tip-bot for Peter Zijlstra 2009-05-25 15:39 ` [PATCH 2/3] perf_counter: x86: remove interrupt throttle Peter Zijlstra 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra 2 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2009-05-25 15:39 UTC (permalink / raw) To: Ingo Molnar Cc: Paul Mackerras, Corey Ashford, linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo, John Kacur [-- Attachment #1: perf_counter-x86-masks.patch --] [-- Type: text/plain, Size: 1493 bytes --] Expose the INV and EDGE bits of the PMU to raw configs. LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- arch/x86/kernel/cpu/perf_counter.c | 8 ++++++++ 1 file changed, 8 insertions(+) Index: linux-2.6/arch/x86/kernel/cpu/perf_counter.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/cpu/perf_counter.c +++ linux-2.6/arch/x86/kernel/cpu/perf_counter.c @@ -87,11 +87,15 @@ static u64 intel_pmu_raw_event(u64 event { #define CORE_EVNTSEL_EVENT_MASK 0x000000FFULL #define CORE_EVNTSEL_UNIT_MASK 0x0000FF00ULL +#define CORE_EVNTSEL_EDGE_MASK 0x00040000ULL +#define CORE_EVNTSEL_INV_MASK 0x00800000ULL #define CORE_EVNTSEL_COUNTER_MASK 0xFF000000ULL #define CORE_EVNTSEL_MASK \ (CORE_EVNTSEL_EVENT_MASK | \ CORE_EVNTSEL_UNIT_MASK | \ + CORE_EVNTSEL_EDGE_MASK | \ + CORE_EVNTSEL_INV_MASK | \ CORE_EVNTSEL_COUNTER_MASK) return event & CORE_EVNTSEL_MASK; @@ -119,11 +123,15 @@ static u64 amd_pmu_raw_event(u64 event) { #define K7_EVNTSEL_EVENT_MASK 0x7000000FFULL #define K7_EVNTSEL_UNIT_MASK 0x00000FF00ULL +#define K7_EVNTSEL_EDGE_MASK 0x000040000ULL +#define K7_EVNTSEL_INV_MASK 0x000800000ULL #define K7_EVNTSEL_COUNTER_MASK 0x0FF000000ULL #define K7_EVNTSEL_MASK \ (K7_EVNTSEL_EVENT_MASK | \ K7_EVNTSEL_UNIT_MASK | \ + K7_EVNTSEL_EDGE_MASK | \ + K7_EVNTSEL_INV_MASK | \ K7_EVNTSEL_COUNTER_MASK) return event & K7_EVNTSEL_MASK; -- ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:perfcounters/core] perf_counter: x86: Expose INV and EDGE bits 2009-05-25 15:39 ` [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits Peter Zijlstra @ 2009-05-25 19:51 ` tip-bot for Peter Zijlstra 0 siblings, 0 replies; 9+ messages in thread From: tip-bot for Peter Zijlstra @ 2009-05-25 19:51 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, acme, paulus, hpa, mingo, jkacur, a.p.zijlstra, tglx, cjashfor, mingo Commit-ID: ff99be573e02e9f7edc23b472c7f9a5ddba12795 Gitweb: http://git.kernel.org/tip/ff99be573e02e9f7edc23b472c7f9a5ddba12795 Author: Peter Zijlstra <a.p.zijlstra@chello.nl> AuthorDate: Mon, 25 May 2009 17:39:03 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 25 May 2009 21:41:11 +0200 perf_counter: x86: Expose INV and EDGE bits Expose the INV and EDGE bits of the PMU to raw configs. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.494709027@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/cpu/perf_counter.c | 8 ++++++++ 1 files changed, 8 insertions(+), 0 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_counter.c b/arch/x86/kernel/cpu/perf_counter.c index 6cc1660..c14437f 100644 --- a/arch/x86/kernel/cpu/perf_counter.c +++ b/arch/x86/kernel/cpu/perf_counter.c @@ -87,11 +87,15 @@ static u64 intel_pmu_raw_event(u64 event) { #define CORE_EVNTSEL_EVENT_MASK 0x000000FFULL #define CORE_EVNTSEL_UNIT_MASK 0x0000FF00ULL +#define CORE_EVNTSEL_EDGE_MASK 0x00040000ULL +#define CORE_EVNTSEL_INV_MASK 0x00800000ULL #define CORE_EVNTSEL_COUNTER_MASK 0xFF000000ULL #define CORE_EVNTSEL_MASK \ (CORE_EVNTSEL_EVENT_MASK | \ CORE_EVNTSEL_UNIT_MASK | \ + CORE_EVNTSEL_EDGE_MASK | \ + CORE_EVNTSEL_INV_MASK | \ CORE_EVNTSEL_COUNTER_MASK) return event & CORE_EVNTSEL_MASK; @@ -119,11 +123,15 @@ static u64 amd_pmu_raw_event(u64 event) { #define K7_EVNTSEL_EVENT_MASK 0x7000000FFULL #define K7_EVNTSEL_UNIT_MASK 0x00000FF00ULL +#define K7_EVNTSEL_EDGE_MASK 0x000040000ULL +#define K7_EVNTSEL_INV_MASK 0x000800000ULL #define K7_EVNTSEL_COUNTER_MASK 0x0FF000000ULL #define K7_EVNTSEL_MASK \ (K7_EVNTSEL_EVENT_MASK | \ K7_EVNTSEL_UNIT_MASK | \ + K7_EVNTSEL_EDGE_MASK | \ + K7_EVNTSEL_INV_MASK | \ K7_EVNTSEL_COUNTER_MASK) return event & K7_EVNTSEL_MASK; ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/3] perf_counter: x86: remove interrupt throttle 2009-05-25 15:39 [PATCH 0/3] even more perf counter patches Peter Zijlstra 2009-05-25 15:39 ` [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits Peter Zijlstra @ 2009-05-25 15:39 ` Peter Zijlstra 2009-05-25 19:51 ` [tip:perfcounters/core] perf_counter: x86: Remove " tip-bot for Peter Zijlstra 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra 2 siblings, 1 reply; 9+ messages in thread From: Peter Zijlstra @ 2009-05-25 15:39 UTC (permalink / raw) To: Ingo Molnar Cc: Paul Mackerras, Corey Ashford, linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo, John Kacur [-- Attachment #1: perf_counter-x86-remove-throttle.patch --] [-- Type: text/plain, Size: 4127 bytes --] remove the x86 specific interrupt throttle LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- arch/x86/kernel/apic/apic.c | 2 - arch/x86/kernel/cpu/perf_counter.c | 47 +++---------------------------------- include/linux/perf_counter.h | 2 - 3 files changed, 5 insertions(+), 46 deletions(-) Index: linux-2.6/arch/x86/kernel/apic/apic.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/apic/apic.c +++ linux-2.6/arch/x86/kernel/apic/apic.c @@ -817,8 +817,6 @@ static void local_apic_timer_interrupt(v inc_irq_stat(apic_timer_irqs); evt->event_handler(evt); - - perf_counter_unthrottle(); } /* Index: linux-2.6/arch/x86/kernel/cpu/perf_counter.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/cpu/perf_counter.c +++ linux-2.6/arch/x86/kernel/cpu/perf_counter.c @@ -719,11 +719,6 @@ static void intel_pmu_save_and_restart(s } /* - * Maximum interrupt frequency of 100KHz per CPU - */ -#define PERFMON_MAX_INTERRUPTS (100000/HZ) - -/* * This handler is triggered by the local APIC, so the APIC IRQ handling * rules apply: */ @@ -775,15 +770,14 @@ again: if (status) goto again; - if (++cpuc->interrupts != PERFMON_MAX_INTERRUPTS) - perf_enable(); + perf_enable(); return 1; } static int amd_pmu_handle_irq(struct pt_regs *regs, int nmi) { - int cpu, idx, throttle = 0, handled = 0; + int cpu, idx, handled = 0; struct cpu_hw_counters *cpuc; struct perf_counter *counter; struct hw_perf_counter *hwc; @@ -792,16 +786,7 @@ static int amd_pmu_handle_irq(struct pt_ cpu = smp_processor_id(); cpuc = &per_cpu(cpu_hw_counters, cpu); - if (++cpuc->interrupts == PERFMON_MAX_INTERRUPTS) { - throttle = 1; - __perf_disable(); - cpuc->enabled = 0; - barrier(); - } - for (idx = 0; idx < x86_pmu.num_counters; idx++) { - int disable = 0; - if (!test_bit(idx, cpuc->active_mask)) continue; @@ -809,45 +794,23 @@ static int amd_pmu_handle_irq(struct pt_ hwc = &counter->hw; if (counter->hw_event.nmi != nmi) - goto next; + continue; val = x86_perf_counter_update(counter, hwc, idx); if (val & (1ULL << (x86_pmu.counter_bits - 1))) - goto next; + continue; /* counter overflow */ x86_perf_counter_set_period(counter, hwc, idx); handled = 1; inc_irq_stat(apic_perf_irqs); - disable = perf_counter_overflow(counter, nmi, regs, 0); - -next: - if (disable || throttle) + if (perf_counter_overflow(counter, nmi, regs, 0)) amd_pmu_disable_counter(hwc, idx); } return handled; } -void perf_counter_unthrottle(void) -{ - struct cpu_hw_counters *cpuc; - - if (!x86_pmu_initialized()) - return; - - cpuc = &__get_cpu_var(cpu_hw_counters); - if (cpuc->interrupts >= PERFMON_MAX_INTERRUPTS) { - /* - * Clear them before re-enabling irqs/NMIs again: - */ - cpuc->interrupts = 0; - perf_enable(); - } else { - cpuc->interrupts = 0; - } -} - void smp_perf_counter_interrupt(struct pt_regs *regs) { irq_enter(); Index: linux-2.6/include/linux/perf_counter.h =================================================================== --- linux-2.6.orig/include/linux/perf_counter.h +++ linux-2.6/include/linux/perf_counter.h @@ -570,7 +570,6 @@ extern void perf_counter_init_task(struc extern void perf_counter_exit_task(struct task_struct *child); extern void perf_counter_do_pending(void); extern void perf_counter_print_debug(void); -extern void perf_counter_unthrottle(void); extern void __perf_disable(void); extern bool __perf_enable(void); extern void perf_disable(void); @@ -635,7 +634,6 @@ static inline void perf_counter_init_tas static inline void perf_counter_exit_task(struct task_struct *child) { } static inline void perf_counter_do_pending(void) { } static inline void perf_counter_print_debug(void) { } -static inline void perf_counter_unthrottle(void) { } static inline void perf_disable(void) { } static inline void perf_enable(void) { } static inline int perf_counter_task_disable(void) { return -EINVAL; } -- ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:perfcounters/core] perf_counter: x86: Remove interrupt throttle 2009-05-25 15:39 ` [PATCH 2/3] perf_counter: x86: remove interrupt throttle Peter Zijlstra @ 2009-05-25 19:51 ` tip-bot for Peter Zijlstra 0 siblings, 0 replies; 9+ messages in thread From: tip-bot for Peter Zijlstra @ 2009-05-25 19:51 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, acme, paulus, hpa, mingo, jkacur, a.p.zijlstra, tglx, cjashfor, mingo Commit-ID: 48e22d56ecdeddd1ffb42a02fccba5c6ef42b133 Gitweb: http://git.kernel.org/tip/48e22d56ecdeddd1ffb42a02fccba5c6ef42b133 Author: Peter Zijlstra <a.p.zijlstra@chello.nl> AuthorDate: Mon, 25 May 2009 17:39:04 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 25 May 2009 21:41:12 +0200 perf_counter: x86: Remove interrupt throttle remove the x86 specific interrupt throttle Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.616671838@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/apic/apic.c | 2 - arch/x86/kernel/cpu/perf_counter.c | 47 ++++-------------------------------- include/linux/perf_counter.h | 2 - 3 files changed, 5 insertions(+), 46 deletions(-) diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index b4f6440..89b63b5 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -763,8 +763,6 @@ static void local_apic_timer_interrupt(void) inc_irq_stat(apic_timer_irqs); evt->event_handler(evt); - - perf_counter_unthrottle(); } /* diff --git a/arch/x86/kernel/cpu/perf_counter.c b/arch/x86/kernel/cpu/perf_counter.c index c14437f..8c8177f 100644 --- a/arch/x86/kernel/cpu/perf_counter.c +++ b/arch/x86/kernel/cpu/perf_counter.c @@ -719,11 +719,6 @@ static void intel_pmu_save_and_restart(struct perf_counter *counter) } /* - * Maximum interrupt frequency of 100KHz per CPU - */ -#define PERFMON_MAX_INTERRUPTS (100000/HZ) - -/* * This handler is triggered by the local APIC, so the APIC IRQ handling * rules apply: */ @@ -775,15 +770,14 @@ again: if (status) goto again; - if (++cpuc->interrupts != PERFMON_MAX_INTERRUPTS) - perf_enable(); + perf_enable(); return 1; } static int amd_pmu_handle_irq(struct pt_regs *regs, int nmi) { - int cpu, idx, throttle = 0, handled = 0; + int cpu, idx, handled = 0; struct cpu_hw_counters *cpuc; struct perf_counter *counter; struct hw_perf_counter *hwc; @@ -792,16 +786,7 @@ static int amd_pmu_handle_irq(struct pt_regs *regs, int nmi) cpu = smp_processor_id(); cpuc = &per_cpu(cpu_hw_counters, cpu); - if (++cpuc->interrupts == PERFMON_MAX_INTERRUPTS) { - throttle = 1; - __perf_disable(); - cpuc->enabled = 0; - barrier(); - } - for (idx = 0; idx < x86_pmu.num_counters; idx++) { - int disable = 0; - if (!test_bit(idx, cpuc->active_mask)) continue; @@ -809,45 +794,23 @@ static int amd_pmu_handle_irq(struct pt_regs *regs, int nmi) hwc = &counter->hw; if (counter->hw_event.nmi != nmi) - goto next; + continue; val = x86_perf_counter_update(counter, hwc, idx); if (val & (1ULL << (x86_pmu.counter_bits - 1))) - goto next; + continue; /* counter overflow */ x86_perf_counter_set_period(counter, hwc, idx); handled = 1; inc_irq_stat(apic_perf_irqs); - disable = perf_counter_overflow(counter, nmi, regs, 0); - -next: - if (disable || throttle) + if (perf_counter_overflow(counter, nmi, regs, 0)) amd_pmu_disable_counter(hwc, idx); } return handled; } -void perf_counter_unthrottle(void) -{ - struct cpu_hw_counters *cpuc; - - if (!x86_pmu_initialized()) - return; - - cpuc = &__get_cpu_var(cpu_hw_counters); - if (cpuc->interrupts >= PERFMON_MAX_INTERRUPTS) { - /* - * Clear them before re-enabling irqs/NMIs again: - */ - cpuc->interrupts = 0; - perf_enable(); - } else { - cpuc->interrupts = 0; - } -} - void smp_perf_counter_interrupt(struct pt_regs *regs) { irq_enter(); diff --git a/include/linux/perf_counter.h b/include/linux/perf_counter.h index d3e85de..0c160be 100644 --- a/include/linux/perf_counter.h +++ b/include/linux/perf_counter.h @@ -570,7 +570,6 @@ extern int perf_counter_init_task(struct task_struct *child); extern void perf_counter_exit_task(struct task_struct *child); extern void perf_counter_do_pending(void); extern void perf_counter_print_debug(void); -extern void perf_counter_unthrottle(void); extern void __perf_disable(void); extern bool __perf_enable(void); extern void perf_disable(void); @@ -635,7 +634,6 @@ static inline int perf_counter_init_task(struct task_struct *child) { } static inline void perf_counter_exit_task(struct task_struct *child) { } static inline void perf_counter_do_pending(void) { } static inline void perf_counter_print_debug(void) { } -static inline void perf_counter_unthrottle(void) { } static inline void perf_disable(void) { } static inline void perf_enable(void) { } static inline int perf_counter_task_disable(void) { return -EINVAL; } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 3/3] perf_counter: generic per counter interrupt throttle 2009-05-25 15:39 [PATCH 0/3] even more perf counter patches Peter Zijlstra 2009-05-25 15:39 ` [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits Peter Zijlstra 2009-05-25 15:39 ` [PATCH 2/3] perf_counter: x86: remove interrupt throttle Peter Zijlstra @ 2009-05-25 15:39 ` Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] perf_counter: Generic " tip-bot for Peter Zijlstra ` (2 more replies) 2 siblings, 3 replies; 9+ messages in thread From: Peter Zijlstra @ 2009-05-25 15:39 UTC (permalink / raw) To: Ingo Molnar Cc: Paul Mackerras, Corey Ashford, linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo, John Kacur [-- Attachment #1: perf_counter-generic-throttle.patch --] [-- Type: text/plain, Size: 6429 bytes --] Introduce a generic per counter interrupt throttle. This uses the perf_counter_overflow() quick disable to throttle a specific counter when its going too fast when a pmu->unthrottle() method is provided which can undo the quick disable. Power needs to implement both the quick disable and the unthrottle method. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> --- arch/x86/kernel/cpu/perf_counter.c | 13 ++++++++ include/linux/perf_counter.h | 11 ++++++ kernel/perf_counter.c | 59 ++++++++++++++++++++++++++++++++++--- kernel/sysctl.c | 8 +++++ 4 files changed, 87 insertions(+), 4 deletions(-) Index: linux-2.6/include/linux/perf_counter.h =================================================================== --- linux-2.6.orig/include/linux/perf_counter.h +++ linux-2.6/include/linux/perf_counter.h @@ -267,6 +267,15 @@ enum perf_event_type { PERF_EVENT_PERIOD = 4, /* + * struct { + * struct perf_event_header header; + * u64 time; + * }; + */ + PERF_EVENT_THROTTLE = 5, + PERF_EVENT_UNTHROTTLE = 6, + + /* * When header.misc & PERF_EVENT_MISC_OVERFLOW the event_type field * will be PERF_RECORD_* * @@ -367,6 +376,7 @@ struct pmu { int (*enable) (struct perf_counter *counter); void (*disable) (struct perf_counter *counter); void (*read) (struct perf_counter *counter); + void (*unthrottle) (struct perf_counter *counter); }; /** @@ -613,6 +623,7 @@ extern struct perf_callchain_entry *perf extern int sysctl_perf_counter_priv; extern int sysctl_perf_counter_mlock; +extern int sysctl_perf_counter_limit; extern void perf_counter_init(void); Index: linux-2.6/kernel/perf_counter.c =================================================================== --- linux-2.6.orig/kernel/perf_counter.c +++ linux-2.6/kernel/perf_counter.c @@ -46,6 +46,7 @@ static atomic_t nr_comm_tracking __read_ int sysctl_perf_counter_priv __read_mostly; /* do we need to be privileged */ int sysctl_perf_counter_mlock __read_mostly = 512; /* 'free' kb per user */ +int sysctl_perf_counter_limit __read_mostly = 100000; /* max NMIs per second */ /* * Lock for (sysadmin-configurable) counter reservations: @@ -1091,12 +1092,15 @@ int perf_counter_task_disable(void) return 0; } +#define MAX_INTERRUPTS (~0ULL) + +static void perf_log_throttle(struct perf_counter *counter, int enable); static void perf_log_period(struct perf_counter *counter, u64 period); static void perf_adjust_freq(struct perf_counter_context *ctx) { struct perf_counter *counter; - u64 irq_period; + u64 interrupts, irq_period; u64 events, period; s64 delta; @@ -1105,10 +1109,19 @@ static void perf_adjust_freq(struct perf if (counter->state != PERF_COUNTER_STATE_ACTIVE) continue; + interrupts = counter->hw.interrupts; + counter->hw.interrupts = 0; + + if (interrupts == MAX_INTERRUPTS) { + perf_log_throttle(counter, 1); + counter->pmu->unthrottle(counter); + interrupts = 2*sysctl_perf_counter_limit/HZ; + } + if (!counter->hw_event.freq || !counter->hw_event.irq_freq) continue; - events = HZ * counter->hw.interrupts * counter->hw.irq_period; + events = HZ * interrupts * counter->hw.irq_period; period = div64_u64(events, counter->hw_event.irq_freq); delta = (s64)(1 + period - counter->hw.irq_period); @@ -1122,7 +1135,6 @@ static void perf_adjust_freq(struct perf perf_log_period(counter, irq_period); counter->hw.irq_period = irq_period; - counter->hw.interrupts = 0; } spin_unlock(&ctx->lock); } @@ -2545,6 +2557,35 @@ static void perf_log_period(struct perf_ } /* + * IRQ throttle logging + */ + +static void perf_log_throttle(struct perf_counter *counter, int enable) +{ + struct perf_output_handle handle; + int ret; + + struct { + struct perf_event_header header; + u64 time; + } throttle_event = { + .header = { + .type = PERF_EVENT_THROTTLE + 1, + .misc = 0, + .size = sizeof(throttle_event), + }, + .time = sched_clock(), + }; + + ret = perf_output_begin(&handle, counter, sizeof(throttle_event), 0, 0); + if (ret) + return; + + perf_output_put(&handle, throttle_event); + perf_output_end(&handle); +} + +/* * Generic counter overflow handling. */ @@ -2552,9 +2593,19 @@ int perf_counter_overflow(struct perf_co int nmi, struct pt_regs *regs, u64 addr) { int events = atomic_read(&counter->event_limit); + int throttle = counter->pmu->unthrottle != NULL; int ret = 0; - counter->hw.interrupts++; + if (!throttle) { + counter->hw.interrupts++; + } else if (counter->hw.interrupts != MAX_INTERRUPTS) { + counter->hw.interrupts++; + if (HZ*counter->hw.interrupts > (u64)sysctl_perf_counter_limit) { + counter->hw.interrupts = MAX_INTERRUPTS; + perf_log_throttle(counter, 0); + ret = 1; + } + } /* * XXX event_limit might not quite work as expected on inherited Index: linux-2.6/kernel/sysctl.c =================================================================== --- linux-2.6.orig/kernel/sysctl.c +++ linux-2.6/kernel/sysctl.c @@ -939,6 +939,14 @@ static struct ctl_table kern_table[] = { .mode = 0644, .proc_handler = &proc_dointvec, }, + { + .ctl_name = CTL_UNNUMBERED, + .procname = "perf_counter_int_limit", + .data = &sysctl_perf_counter_limit, + .maxlen = sizeof(sysctl_perf_counter_limit), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, #endif /* * NOTE: do not add new entries to this table unless you have read Index: linux-2.6/arch/x86/kernel/cpu/perf_counter.c =================================================================== --- linux-2.6.orig/arch/x86/kernel/cpu/perf_counter.c +++ linux-2.6/arch/x86/kernel/cpu/perf_counter.c @@ -623,6 +623,18 @@ try_generic: return 0; } +static void x86_pmu_unthrottle(struct perf_counter *counter) +{ + struct cpu_hw_counters *cpuc = &__get_cpu_var(cpu_hw_counters); + struct hw_perf_counter *hwc = &counter->hw; + + if (WARN_ON_ONCE(hwc->idx >= X86_PMC_IDX_MAX || + cpuc->counters[hwc->idx] != counter)) + return; + + x86_pmu.enable(hwc, hwc->idx); +} + void perf_counter_print_debug(void) { u64 ctrl, status, overflow, pmc_ctrl, pmc_count, prev_left, fixed; @@ -1038,6 +1050,7 @@ static const struct pmu pmu = { .enable = x86_pmu_enable, .disable = x86_pmu_disable, .read = x86_pmu_read, + .unthrottle = x86_pmu_unthrottle, }; const struct pmu *hw_perf_counter_init(struct perf_counter *counter) -- ^ permalink raw reply [flat|nested] 9+ messages in thread
* [tip:perfcounters/core] perf_counter: Generic per counter interrupt throttle 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra @ 2009-05-25 19:52 ` tip-bot for Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] Revert "perf_counter, x86: speed up the scheduling fast-path" tip-bot for Ingo Molnar 2009-05-25 20:06 ` [tip:perfcounters/core] perf_counter: fix warning & lockup tip-bot for Ingo Molnar 2 siblings, 0 replies; 9+ messages in thread From: tip-bot for Peter Zijlstra @ 2009-05-25 19:52 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, acme, paulus, hpa, mingo, jkacur, a.p.zijlstra, tglx, cjashfor, mingo Commit-ID: a78ac3258782f3e64cb40beb5990808e1febcc0c Gitweb: http://git.kernel.org/tip/a78ac3258782f3e64cb40beb5990808e1febcc0c Author: Peter Zijlstra <a.p.zijlstra@chello.nl> AuthorDate: Mon, 25 May 2009 17:39:05 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 25 May 2009 21:41:12 +0200 perf_counter: Generic per counter interrupt throttle Introduce a generic per counter interrupt throttle. This uses the perf_counter_overflow() quick disable to throttle a specific counter when its going too fast when a pmu->unthrottle() method is provided which can undo the quick disable. Power needs to implement both the quick disable and the unthrottle method. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.703093461@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/cpu/perf_counter.c | 13 ++++++++ include/linux/perf_counter.h | 11 +++++++ kernel/perf_counter.c | 59 +++++++++++++++++++++++++++++++++-- kernel/sysctl.c | 8 +++++ 4 files changed, 87 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_counter.c b/arch/x86/kernel/cpu/perf_counter.c index 8c8177f..c4b543d 100644 --- a/arch/x86/kernel/cpu/perf_counter.c +++ b/arch/x86/kernel/cpu/perf_counter.c @@ -623,6 +623,18 @@ try_generic: return 0; } +static void x86_pmu_unthrottle(struct perf_counter *counter) +{ + struct cpu_hw_counters *cpuc = &__get_cpu_var(cpu_hw_counters); + struct hw_perf_counter *hwc = &counter->hw; + + if (WARN_ON_ONCE(hwc->idx >= X86_PMC_IDX_MAX || + cpuc->counters[hwc->idx] != counter)) + return; + + x86_pmu.enable(hwc, hwc->idx); +} + void perf_counter_print_debug(void) { u64 ctrl, status, overflow, pmc_ctrl, pmc_count, prev_left, fixed; @@ -1038,6 +1050,7 @@ static const struct pmu pmu = { .enable = x86_pmu_enable, .disable = x86_pmu_disable, .read = x86_pmu_read, + .unthrottle = x86_pmu_unthrottle, }; const struct pmu *hw_perf_counter_init(struct perf_counter *counter) diff --git a/include/linux/perf_counter.h b/include/linux/perf_counter.h index 0c160be..e3a7585 100644 --- a/include/linux/perf_counter.h +++ b/include/linux/perf_counter.h @@ -267,6 +267,15 @@ enum perf_event_type { PERF_EVENT_PERIOD = 4, /* + * struct { + * struct perf_event_header header; + * u64 time; + * }; + */ + PERF_EVENT_THROTTLE = 5, + PERF_EVENT_UNTHROTTLE = 6, + + /* * When header.misc & PERF_EVENT_MISC_OVERFLOW the event_type field * will be PERF_RECORD_* * @@ -367,6 +376,7 @@ struct pmu { int (*enable) (struct perf_counter *counter); void (*disable) (struct perf_counter *counter); void (*read) (struct perf_counter *counter); + void (*unthrottle) (struct perf_counter *counter); }; /** @@ -613,6 +623,7 @@ extern struct perf_callchain_entry *perf_callchain(struct pt_regs *regs); extern int sysctl_perf_counter_priv; extern int sysctl_perf_counter_mlock; +extern int sysctl_perf_counter_limit; extern void perf_counter_init(void); diff --git a/kernel/perf_counter.c b/kernel/perf_counter.c index 14b1fe9..ec9c400 100644 --- a/kernel/perf_counter.c +++ b/kernel/perf_counter.c @@ -46,6 +46,7 @@ static atomic_t nr_comm_tracking __read_mostly; int sysctl_perf_counter_priv __read_mostly; /* do we need to be privileged */ int sysctl_perf_counter_mlock __read_mostly = 512; /* 'free' kb per user */ +int sysctl_perf_counter_limit __read_mostly = 100000; /* max NMIs per second */ /* * Lock for (sysadmin-configurable) counter reservations: @@ -1066,12 +1067,15 @@ static void perf_counter_cpu_sched_in(struct perf_cpu_context *cpuctx, int cpu) __perf_counter_sched_in(ctx, cpuctx, cpu); } +#define MAX_INTERRUPTS (~0ULL) + +static void perf_log_throttle(struct perf_counter *counter, int enable); static void perf_log_period(struct perf_counter *counter, u64 period); static void perf_adjust_freq(struct perf_counter_context *ctx) { struct perf_counter *counter; - u64 irq_period; + u64 interrupts, irq_period; u64 events, period; s64 delta; @@ -1080,10 +1084,19 @@ static void perf_adjust_freq(struct perf_counter_context *ctx) if (counter->state != PERF_COUNTER_STATE_ACTIVE) continue; + interrupts = counter->hw.interrupts; + counter->hw.interrupts = 0; + + if (interrupts == MAX_INTERRUPTS) { + perf_log_throttle(counter, 1); + counter->pmu->unthrottle(counter); + interrupts = 2*sysctl_perf_counter_limit/HZ; + } + if (!counter->hw_event.freq || !counter->hw_event.irq_freq) continue; - events = HZ * counter->hw.interrupts * counter->hw.irq_period; + events = HZ * interrupts * counter->hw.irq_period; period = div64_u64(events, counter->hw_event.irq_freq); delta = (s64)(1 + period - counter->hw.irq_period); @@ -1097,7 +1110,6 @@ static void perf_adjust_freq(struct perf_counter_context *ctx) perf_log_period(counter, irq_period); counter->hw.irq_period = irq_period; - counter->hw.interrupts = 0; } spin_unlock(&ctx->lock); } @@ -2544,6 +2556,35 @@ static void perf_log_period(struct perf_counter *counter, u64 period) } /* + * IRQ throttle logging + */ + +static void perf_log_throttle(struct perf_counter *counter, int enable) +{ + struct perf_output_handle handle; + int ret; + + struct { + struct perf_event_header header; + u64 time; + } throttle_event = { + .header = { + .type = PERF_EVENT_THROTTLE + 1, + .misc = 0, + .size = sizeof(throttle_event), + }, + .time = sched_clock(), + }; + + ret = perf_output_begin(&handle, counter, sizeof(throttle_event), 0, 0); + if (ret) + return; + + perf_output_put(&handle, throttle_event); + perf_output_end(&handle); +} + +/* * Generic counter overflow handling. */ @@ -2551,9 +2592,19 @@ int perf_counter_overflow(struct perf_counter *counter, int nmi, struct pt_regs *regs, u64 addr) { int events = atomic_read(&counter->event_limit); + int throttle = counter->pmu->unthrottle != NULL; int ret = 0; - counter->hw.interrupts++; + if (!throttle) { + counter->hw.interrupts++; + } else if (counter->hw.interrupts != MAX_INTERRUPTS) { + counter->hw.interrupts++; + if (HZ*counter->hw.interrupts > (u64)sysctl_perf_counter_limit) { + counter->hw.interrupts = MAX_INTERRUPTS; + perf_log_throttle(counter, 0); + ret = 1; + } + } /* * XXX event_limit might not quite work as expected on inherited diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 3cb1849..0c4bf86 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -930,6 +930,14 @@ static struct ctl_table kern_table[] = { .mode = 0644, .proc_handler = &proc_dointvec, }, + { + .ctl_name = CTL_UNNUMBERED, + .procname = "perf_counter_int_limit", + .data = &sysctl_perf_counter_limit, + .maxlen = sizeof(sysctl_perf_counter_limit), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, #endif /* * NOTE: do not add new entries to this table unless you have read ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [tip:perfcounters/core] Revert "perf_counter, x86: speed up the scheduling fast-path" 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] perf_counter: Generic " tip-bot for Peter Zijlstra @ 2009-05-25 19:52 ` tip-bot for Ingo Molnar 2009-05-25 20:06 ` [tip:perfcounters/core] perf_counter: fix warning & lockup tip-bot for Ingo Molnar 2 siblings, 0 replies; 9+ messages in thread From: tip-bot for Ingo Molnar @ 2009-05-25 19:52 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, acme, paulus, hpa, mingo, jkacur, a.p.zijlstra, tglx, cjashfor, mingo Commit-ID: 53b441a565bf4036ab49c8ea04c5ad06ace7dd6b Gitweb: http://git.kernel.org/tip/53b441a565bf4036ab49c8ea04c5ad06ace7dd6b Author: Ingo Molnar <mingo@elte.hu> AuthorDate: Mon, 25 May 2009 21:41:28 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 25 May 2009 21:41:28 +0200 Revert "perf_counter, x86: speed up the scheduling fast-path" This reverts commit b68f1d2e7aa21029d73c7d453a8046e95d351740. It is causing problems (stuck/stuttering profiling) - when mixed NMI and non-NMI counters are used. Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.703093461@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- arch/x86/kernel/cpu/perf_counter.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/cpu/perf_counter.c b/arch/x86/kernel/cpu/perf_counter.c index c4b543d..189bf9d 100644 --- a/arch/x86/kernel/cpu/perf_counter.c +++ b/arch/x86/kernel/cpu/perf_counter.c @@ -293,7 +293,6 @@ static int __hw_perf_counter_init(struct perf_counter *counter) return -EACCES; hwc->nmi = 1; } - perf_counters_lapic_init(hwc->nmi); if (!hwc->irq_period) hwc->irq_period = x86_pmu.max_period; @@ -612,6 +611,8 @@ try_generic: hwc->counter_base = x86_pmu.perfctr; } + perf_counters_lapic_init(hwc->nmi); + x86_pmu.disable(hwc, idx); cpuc->counters[idx] = counter; @@ -1037,7 +1038,7 @@ void __init init_hw_perf_counters(void) pr_info("... counter mask: %016Lx\n", perf_counter_mask); - perf_counters_lapic_init(1); + perf_counters_lapic_init(0); register_die_notifier(&perf_counter_nmi_notifier); } ^ permalink raw reply related [flat|nested] 9+ messages in thread
* [tip:perfcounters/core] perf_counter: fix warning & lockup 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] perf_counter: Generic " tip-bot for Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] Revert "perf_counter, x86: speed up the scheduling fast-path" tip-bot for Ingo Molnar @ 2009-05-25 20:06 ` tip-bot for Ingo Molnar 2 siblings, 0 replies; 9+ messages in thread From: tip-bot for Ingo Molnar @ 2009-05-25 20:06 UTC (permalink / raw) To: linux-tip-commits Cc: linux-kernel, acme, paulus, hpa, mingo, jkacur, a.p.zijlstra, tglx, cjashfor, mingo Commit-ID: 0127c3ea082ee9f1034789b978dfc7fd83254617 Gitweb: http://git.kernel.org/tip/0127c3ea082ee9f1034789b978dfc7fd83254617 Author: Ingo Molnar <mingo@elte.hu> AuthorDate: Mon, 25 May 2009 22:03:26 +0200 Committer: Ingo Molnar <mingo@elte.hu> CommitDate: Mon, 25 May 2009 22:02:23 +0200 perf_counter: fix warning & lockup - remove bogus warning - fix wakeup from NMI path lockup - also fix up whitespace noise in perf_counter.h Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: John Kacur <jkacur@redhat.com> LKML-Reference: <20090525153931.703093461@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/linux/perf_counter.h | 78 +++++++++++++++++++++--------------------- kernel/perf_counter.c | 4 +-- 2 files changed, 40 insertions(+), 42 deletions(-) diff --git a/include/linux/perf_counter.h b/include/linux/perf_counter.h index e3a7585..2b16ed3 100644 --- a/include/linux/perf_counter.h +++ b/include/linux/perf_counter.h @@ -73,7 +73,7 @@ enum sw_event_ids { PERF_SW_EVENTS_MAX = 7, }; -#define __PERF_COUNTER_MASK(name) \ +#define __PERF_COUNTER_MASK(name) \ (((1ULL << PERF_COUNTER_##name##_BITS) - 1) << \ PERF_COUNTER_##name##_SHIFT) @@ -98,14 +98,14 @@ enum sw_event_ids { * in the overflow packets. */ enum perf_counter_record_format { - PERF_RECORD_IP = 1U << 0, - PERF_RECORD_TID = 1U << 1, - PERF_RECORD_TIME = 1U << 2, - PERF_RECORD_ADDR = 1U << 3, - PERF_RECORD_GROUP = 1U << 4, - PERF_RECORD_CALLCHAIN = 1U << 5, - PERF_RECORD_CONFIG = 1U << 6, - PERF_RECORD_CPU = 1U << 7, + PERF_RECORD_IP = 1U << 0, + PERF_RECORD_TID = 1U << 1, + PERF_RECORD_TIME = 1U << 2, + PERF_RECORD_ADDR = 1U << 3, + PERF_RECORD_GROUP = 1U << 4, + PERF_RECORD_CALLCHAIN = 1U << 5, + PERF_RECORD_CONFIG = 1U << 6, + PERF_RECORD_CPU = 1U << 7, }; /* @@ -235,13 +235,13 @@ enum perf_event_type { * correlate userspace IPs to code. They have the following structure: * * struct { - * struct perf_event_header header; + * struct perf_event_header header; * - * u32 pid, tid; - * u64 addr; - * u64 len; - * u64 pgoff; - * char filename[]; + * u32 pid, tid; + * u64 addr; + * u64 len; + * u64 pgoff; + * char filename[]; * }; */ PERF_EVENT_MMAP = 1, @@ -249,27 +249,27 @@ enum perf_event_type { /* * struct { - * struct perf_event_header header; + * struct perf_event_header header; * - * u32 pid, tid; - * char comm[]; + * u32 pid, tid; + * char comm[]; * }; */ PERF_EVENT_COMM = 3, /* * struct { - * struct perf_event_header header; - * u64 time; - * u64 irq_period; + * struct perf_event_header header; + * u64 time; + * u64 irq_period; * }; */ PERF_EVENT_PERIOD = 4, /* * struct { - * struct perf_event_header header; - * u64 time; + * struct perf_event_header header; + * u64 time; * }; */ PERF_EVENT_THROTTLE = 5, @@ -280,23 +280,23 @@ enum perf_event_type { * will be PERF_RECORD_* * * struct { - * struct perf_event_header header; + * struct perf_event_header header; * - * { u64 ip; } && PERF_RECORD_IP - * { u32 pid, tid; } && PERF_RECORD_TID - * { u64 time; } && PERF_RECORD_TIME - * { u64 addr; } && PERF_RECORD_ADDR - * { u64 config; } && PERF_RECORD_CONFIG - * { u32 cpu, res; } && PERF_RECORD_CPU + * { u64 ip; } && PERF_RECORD_IP + * { u32 pid, tid; } && PERF_RECORD_TID + * { u64 time; } && PERF_RECORD_TIME + * { u64 addr; } && PERF_RECORD_ADDR + * { u64 config; } && PERF_RECORD_CONFIG + * { u32 cpu, res; } && PERF_RECORD_CPU * - * { u64 nr; - * { u64 event, val; } cnt[nr]; } && PERF_RECORD_GROUP + * { u64 nr; + * { u64 event, val; } cnt[nr]; } && PERF_RECORD_GROUP * - * { u16 nr, - * hv, - * kernel, - * user; - * u64 ips[nr]; } && PERF_RECORD_CALLCHAIN + * { u16 nr, + * hv, + * kernel, + * user; + * u64 ips[nr]; } && PERF_RECORD_CALLCHAIN * }; */ }; @@ -406,7 +406,7 @@ struct perf_mmap_data { atomic_t wakeup; /* needs a wakeup */ struct perf_counter_mmap_page *user_page; - void *data_pages[0]; + void *data_pages[0]; }; struct perf_pending_entry { @@ -422,7 +422,7 @@ struct perf_counter { struct list_head list_entry; struct list_head event_entry; struct list_head sibling_list; - int nr_siblings; + int nr_siblings; struct perf_counter *group_leader; const struct pmu *pmu; diff --git a/kernel/perf_counter.c b/kernel/perf_counter.c index ec9c400..070f92d 100644 --- a/kernel/perf_counter.c +++ b/kernel/perf_counter.c @@ -2576,7 +2576,7 @@ static void perf_log_throttle(struct perf_counter *counter, int enable) .time = sched_clock(), }; - ret = perf_output_begin(&handle, counter, sizeof(throttle_event), 0, 0); + ret = perf_output_begin(&handle, counter, sizeof(throttle_event), 1, 0); if (ret) return; @@ -3449,8 +3449,6 @@ void perf_counter_exit_task(struct task_struct *child) struct perf_counter_context *child_ctx; unsigned long flags; - WARN_ON_ONCE(child != current); - child_ctx = child->perf_counter_ctxp; if (likely(!child_ctx)) ^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2009-05-25 20:08 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-05-25 15:39 [PATCH 0/3] even more perf counter patches Peter Zijlstra 2009-05-25 15:39 ` [PATCH 1/3] perf_counter: x86: expose INV and EDGE bits Peter Zijlstra 2009-05-25 19:51 ` [tip:perfcounters/core] perf_counter: x86: Expose " tip-bot for Peter Zijlstra 2009-05-25 15:39 ` [PATCH 2/3] perf_counter: x86: remove interrupt throttle Peter Zijlstra 2009-05-25 19:51 ` [tip:perfcounters/core] perf_counter: x86: Remove " tip-bot for Peter Zijlstra 2009-05-25 15:39 ` [PATCH 3/3] perf_counter: generic per counter " Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] perf_counter: Generic " tip-bot for Peter Zijlstra 2009-05-25 19:52 ` [tip:perfcounters/core] Revert "perf_counter, x86: speed up the scheduling fast-path" tip-bot for Ingo Molnar 2009-05-25 20:06 ` [tip:perfcounters/core] perf_counter: fix warning & lockup tip-bot for Ingo Molnar
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox