* [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
@ 2011-03-04 19:48 Cyrill Gorcunov
2011-03-04 20:05 ` Joe Perches
2011-03-05 14:14 ` Lin Ming
0 siblings, 2 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2011-03-04 19:48 UTC (permalink / raw)
To: Lin Ming; +Cc: Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
Errata N15 and N17 of 249199-071 should be taken
into account. They are hard to hit at the moment,
I believe, but are still better fixed.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
Ming, mind giving it a shot? It seems I will be able to test it myself
only on Wednesday. And I would like to get a review first to eliminate
some silly mistakes ;)
arch/x86/kernel/cpu/perf_event_p4.c | 54 ++++++++++++++++++++++++++++++++++--
1 file changed, 51 insertions(+), 3 deletions(-)
Index: linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
=====================================================================
--- linux-2.6.git.orig/arch/x86/kernel/cpu/perf_event_p4.c
+++ linux-2.6.git/arch/x86/kernel/cpu/perf_event_p4.c
@@ -809,15 +809,51 @@ static void p4_pmu_disable_pebs(void)
static inline void p4_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
+ unsigned int cntr_msr;
+ u32 prev_lo, prev_hi, new_lo, new_hi;
+
+ cntr_msr = hwc->event_base + hwc->idx;
+
+ /*
+ * Erratum N17 of 249199-071 says if a performance counter is
+ * stopped on the precise internal clock cycle where the intermediate
+ * carry from the lower 32 bits of the counter to the upper eight
+ * bits occurs, the intermediate carry is lost.
+ *
+ * As a workaround we read the counter before stopping it and check
+ * for a lost carry bit; if it was lost we simply write the previous
+ * value back. This might of course introduce a delta in precise
+ * counting, but it is still far better than losing 2^32 counts.
+ */
+ rdmsr(cntr_msr, prev_lo, prev_hi);
/*
* If event gets disabled while counter is in overflowed
* state we need to clear P4_CCCR_OVF, otherwise interrupt get
* asserted again and again
+ *
+ * Erratum N15 of 249199-071 says we must clear P4_CCCR_COMPARE,
+ * otherwise writing a performance counter may result in an incorrect
+ * value (to be sure we do a double write later), but since
+ * we're modifying the CCCR anyway it is better to take this bit into
+ * account just to be doubly safe. Note that we don't touch the stored
+ * config, so there is no effect on user-supplied data.
*/
(void)checking_wrmsrl(hwc->config_base,
(u64)(p4_config_unpack_cccr(hwc->config)) &
- ~P4_CCCR_ENABLE & ~P4_CCCR_OVF & ~P4_CCCR_RESERVED);
+ ~P4_CCCR_COMPARE & ~P4_CCCR_ENABLE &
+ ~P4_CCCR_OVF & ~P4_CCCR_RESERVED);
+
+ /*
+ * Let's try to recover from the error if it happened
+ */
+ if (prev_lo == -1U) {
+ rdmsr(cntr_msr, new_lo, new_hi);
+ if (new_lo == 0 && (new_hi - prev_hi) == 0) {
+ wrmsr_safe(cntr_msr, prev_lo, prev_hi);
+ printk_once("P4 PMU: Recover lost carry bit\n");
+ }
+ }
}
static void p4_pmu_disable_all(void)
@@ -841,6 +877,16 @@ static void p4_pmu_enable_pebs(u64 confi
struct p4_pebs_bind *bind;
unsigned int idx;
+ /*
+ * NOTE: There is an erratum which says that full PEBS support
+ * requires checking that the associated counting logic is properly
+ * configured. In short, if an event requires some
+ * additional uops tagging and friends, it *must* be guaranteed
+ * that the tagging is done properly, otherwise the results are
+ * unknown. For now there is no classic PEBS support, but it is better
+ * to keep this (potential) problem explicitly marked.
+ */
+
BUILD_BUG_ON(P4_PEBS_METRIC__max > P4_PEBS_CONFIG_METRIC_MASK);
idx = p4_config_unpack_metric(config);
@@ -866,8 +912,10 @@ static void p4_pmu_enable_event(struct p
escr_addr = (u64)bind->escr_msr[thread];
/*
- * - we dont support cascaded counters yet
- * - and counter 1 is broken (erratum)
+ * For the sake of errata:
+ * - cascaded counters do not work properly with the
+ * force overflow flag set, but we treat it more broadly
+ * - counter 1 is broken
*/
WARN_ON_ONCE(p4_is_event_cascaded(hwc->config));
WARN_ON_ONCE(hwc->idx == 1);
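
For review purposes, the lost-carry check in the p4_pmu_disable_event() hunk can be modelled in user space. What follows is only a minimal sketch, not kernel code: carry_was_lost(), counter_after_stop() and lost_carry_selftest() are hypothetical names introduced here, and the counter is assumed to be 40 bits wide (32 low bits plus 8 high bits), as the comment in the patch describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the condition used in the patch:
 * the low half was all ones before the stop, reads back as zero
 * after it, and the high half did not advance.
 */
static bool carry_was_lost(uint32_t prev_lo, uint32_t prev_hi,
                           uint32_t new_lo, uint32_t new_hi)
{
    return prev_lo == UINT32_MAX && new_lo == 0 && new_hi == prev_hi;
}

/*
 * Value the counter should hold after the stop: the previous
 * reading if the carry was lost, the fresh reading otherwise.
 */
static uint64_t counter_after_stop(uint32_t prev_lo, uint32_t prev_hi,
                                   uint32_t new_lo, uint32_t new_hi)
{
    if (carry_was_lost(prev_lo, prev_hi, new_lo, new_hi))
        return ((uint64_t)prev_hi << 32) | prev_lo;
    return ((uint64_t)new_hi << 32) | new_lo;
}

/* Small self-check covering the healthy and the erratum case. */
static void lost_carry_selftest(void)
{
    /* Healthy wrap: 0x05ffffffff -> 0x0600000000, carry propagated. */
    assert(!carry_was_lost(UINT32_MAX, 0x05, 0, 0x06));
    assert(counter_after_stop(UINT32_MAX, 0x05, 0, 0x06) == 0x0600000000ULL);

    /* Erratum: low half wrapped to zero, high half unchanged. */
    assert(carry_was_lost(UINT32_MAX, 0x05, 0, 0x05));
    assert(counter_after_stop(UINT32_MAX, 0x05, 0, 0x05) == 0x05ffffffffULL);
}
```

Note that the check only fires when the low half reads exactly zero after the stop, which matches the condition in the patch; whether the hardware can advance further between the read and the stop is not covered by this sketch.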
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-04 19:48 [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17 Cyrill Gorcunov
@ 2011-03-04 20:05 ` Joe Perches
2011-03-04 20:34 ` Cyrill Gorcunov
2011-03-05 14:14 ` Lin Ming
1 sibling, 1 reply; 8+ messages in thread
From: Joe Perches @ 2011-03-04 20:05 UTC (permalink / raw)
To: Cyrill Gorcunov
Cc: Lin Ming, Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On Fri, 2011-03-04 at 22:48 +0300, Cyrill Gorcunov wrote:
> Errata N15 and N17 of 249199-071 should be taken
> into account. They are hard to hit at the moment,
> I believe, but are still better fixed.
Trivia:
> + printk_once("P4 PMU: Recover lost carry bit\n");
pr_info_once(etc...)
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-04 20:05 ` Joe Perches
@ 2011-03-04 20:34 ` Cyrill Gorcunov
2011-03-04 20:35 ` Cyrill Gorcunov
0 siblings, 1 reply; 8+ messages in thread
From: Cyrill Gorcunov @ 2011-03-04 20:34 UTC (permalink / raw)
To: Joe Perches; +Cc: Lin Ming, Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On 03/04/2011 11:05 PM, Joe Perches wrote:
> On Fri, 2011-03-04 at 22:48 +0300, Cyrill Gorcunov wrote:
>> Errata N15 and N17 of 249199-071 should be taken
>> into account. They are hard to hit at the moment,
>> I believe, but are still better fixed.
>
> Trivia:
>
>> + printk_once("P4 PMU: Recover lost carry bit\n");
>
D0h. Joe, you're pointing at the missing KERN_WARNING, right?
--
Cyrill
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-04 20:34 ` Cyrill Gorcunov
@ 2011-03-04 20:35 ` Cyrill Gorcunov
0 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2011-03-04 20:35 UTC (permalink / raw)
To: Joe Perches; +Cc: Lin Ming, Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On 03/04/2011 11:34 PM, Cyrill Gorcunov wrote:
> On 03/04/2011 11:05 PM, Joe Perches wrote:
>> On Fri, 2011-03-04 at 22:48 +0300, Cyrill Gorcunov wrote:
>>> Errata N15 and N17 of 249199-071 should be taken
>>> into account. They are hard to hit at the moment,
>>> I believe, but are still better fixed.
>>
>> Trivia:
>>
>>> + printk_once("P4 PMU: Recover lost carry bit\n");
>>
>
> D0h. Joe, you're pointing at the missing KERN_WARNING, right?
>
Ah, sorry, found what you mean ;) Thanks!
I'll update as soon as the rest is reviewed.
--
Cyrill
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-04 19:48 [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17 Cyrill Gorcunov
2011-03-04 20:05 ` Joe Perches
@ 2011-03-05 14:14 ` Lin Ming
2011-03-05 14:49 ` Cyrill Gorcunov
1 sibling, 1 reply; 8+ messages in thread
From: Lin Ming @ 2011-03-05 14:14 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On Sat, 2011-03-05 at 03:48 +0800, Cyrill Gorcunov wrote:
> Errata N15 and N17 of 249199-071 should be taken
> into account. They are hard to hit at the moment,
> I believe, but are still better fixed.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> ---
>
> Ming, mind giving it a shot? It seems I will be able to test it myself
> only on Wednesday. And I would like to get a review first to eliminate
> some silly mistakes ;)
I'll test this next Monday when I'm back in the office.
Thanks.
>
> arch/x86/kernel/cpu/perf_event_p4.c | 54 ++++++++++++++++++++++++++++++++++--
> 1 file changed, 51 insertions(+), 3 deletions(-)
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-05 14:14 ` Lin Ming
@ 2011-03-05 14:49 ` Cyrill Gorcunov
2011-03-07 8:27 ` Lin Ming
0 siblings, 1 reply; 8+ messages in thread
From: Cyrill Gorcunov @ 2011-03-05 14:49 UTC (permalink / raw)
To: Lin Ming; +Cc: Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On 03/05/2011 05:14 PM, Lin Ming wrote:
> On Sat, 2011-03-05 at 03:48 +0800, Cyrill Gorcunov wrote:
>> Errata N15 and N17 of 249199-071 should be taken
>> into account. They are hard to hit at the moment,
>> I believe, but are still better fixed.
>>
>> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
>> ---
>>
>> Ming, mind giving it a shot? It seems I will be able to test it myself
>> only on Wednesday. And I would like to get a review first to eliminate
>> some silly mistakes ;)
>
> I'll test this next Monday when I'm back in the office.
>
> Thanks.
>
Great, no hurry.
--
Cyrill
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-05 14:49 ` Cyrill Gorcunov
@ 2011-03-07 8:27 ` Lin Ming
2011-03-07 8:40 ` Cyrill Gorcunov
0 siblings, 1 reply; 8+ messages in thread
From: Lin Ming @ 2011-03-07 8:27 UTC (permalink / raw)
To: Cyrill Gorcunov; +Cc: Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On Sat, 2011-03-05 at 22:49 +0800, Cyrill Gorcunov wrote:
> On 03/05/2011 05:14 PM, Lin Ming wrote:
> > On Sat, 2011-03-05 at 03:48 +0800, Cyrill Gorcunov wrote:
> >> Errata N15 and N17 of 249199-071 should be taken
> >> into account. They are hard to hit at the moment,
> >> I believe, but are still better fixed.
> >>
> >> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
> >> ---
> >>
> >> Ming, mind giving it a shot? It seems I will be able to test it myself
> >> only on Wednesday. And I would like to get a review first to eliminate
> >> some silly mistakes ;)
> >
> > I'll test this next Monday when I'm back in the office.
> >
> > Thanks.
> >
>
> Great, no hurry.
This patch causes a problem.
# perf top -e instructions
Pid: 1963, comm: perf Not tainted 2.6.38-rc7-tip-mlin+ #88 Dell Computer Corporation Dimension 4600 /02Y832
EIP: 0060:[<c010ee9e>] EFLAGS: 00210016 CPU: 0
EIP is at p4_pmu_disable_all+0x54/0xfa
EAX: ffffffff EBX: 0000000c ECX: 00000318 EDX: f6627c00
ESI: f6627cc8 EDI: 00000318 EBP: f5b99ccc ESP: f5b99ca0
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process perf (pid: 1963, ti=f5b98000 task=f548d7c0 task.ti=f5b98000)
Stack:
00000000 00000000 f6627c00 c01265bd 0000000c f68014e0 8061b54b 000000db
00000000 f6807abc 00000000 f5b99cd4 c010e499 f5b99cdc c0189993 f5b99d08
c018ec40 f6803600 f68036e4 f6807ba0 00000000 003d09fa 00000000 c0710380
Call Trace:
[<c01265bd>] ? cpuacct_charge+0x72/0x7a
[<c010e499>] x86_pmu_disable+0x3b/0x3d
[<c0189993>] perf_pmu_disable+0x1c/0x1e
[<c018ec40>] perf_event_task_tick+0xd6/0x1ef
[<c012a485>] scheduler_tick+0x9c/0x1dd
[<c013dc6a>] update_process_times+0x4c/0x58
[<c0156b46>] tick_sched_timer+0x6e/0x9c
[<c014cfd0>] __run_hrtimer+0x97/0x10e
[<c070641b>] ? _raw_spin_lock+0x22/0x2a
[<c0156ad8>] ? tick_sched_timer+0x0/0x9c
[<c014d111>] hrtimer_interrupt+0xca/0x1e0
[<c0118a20>] smp_apic_timer_interrupt+0x69/0x7c
[<c0706fbb>] apic_timer_interrupt+0x2f/0x34
[<c0341e0a>] ? memset+0xf/0x19
[<c0197706>] get_page_from_freelist+0x307/0x38a
[<c01978d6>] __alloc_pages_nodemask+0x137/0x5a3
[<c01597d7>] ? trace_hardirqs_on+0xb/0xd
[<c0190908>] ? perf_mmap+0x1a8/0x2cf
[<c018c3f2>] perf_mmap_alloc_page+0x16/0x28
[<c0190930>] perf_mmap+0x1d0/0x2cf
[<c01acf20>] mmap_region+0x1e5/0x398
[<c01ad304>] do_mmap_pgoff+0x231/0x293
[<c01ad4cf>] sys_mmap_pgoff+0x169/0x18a
[<c01027f0>] sysenter_do_call+0x12/0x36
Code: e8 8b 5d e4 8b 14 91 89 55 dc 0f a3 99 00 01 00 00 19 c0 85 c0 0f 84 9c 00 00 00 89 d6 81 c6 c8 00 00 00 8b 7e 18 03 7e 14 89 f9 <0f> 32 89 55 f0 8b 55 dc 89 45 ec 8b 8a c8 00 00 00 c7 45 d8 00
EIP: [<c010ee9e>] p4_pmu_disable_all+0x54/0xfa SS:ESP 0068:f5b99ca0
---[ end trace cd06b2a54d3444fb ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 1963, comm: perf Tainted: G D 2.6.38-rc7-tip-mlin+ #88
Call Trace:
[<c0133c6d>] ? panic+0x58/0x15b
[<c0707ad9>] ? oops_end+0x8b/0x9a
[<c0104b91>] ? die+0x53/0x59
[<c07077d5>] ? do_general_protection+0x10a/0x112
[<c07076cb>] ? do_general_protection+0x0/0x112
[<c07071f7>] ? error_code+0x5f/0x64
[<c01200d8>] ? change_page_attr_set_clr+0x1a4/0x2a6
[<c07076cb>] ? do_general_protection+0x0/0x112
[<c010ee9e>] ? p4_pmu_disable_all+0x54/0xfa
[<c01265bd>] ? cpuacct_charge+0x72/0x7a
[<c010e499>] ? x86_pmu_disable+0x3b/0x3d
[<c0189993>] ? perf_pmu_disable+0x1c/0x1e
[<c018ec40>] ? perf_event_task_tick+0xd6/0x1ef
[<c012a485>] ? scheduler_tick+0x9c/0x1dd
[<c013dc6a>] ? update_process_times+0x4c/0x58
[<c0156b46>] ? tick_sched_timer+0x6e/0x9c
[<c014cfd0>] ? __run_hrtimer+0x97/0x10e
[<c070641b>] ? _raw_spin_lock+0x22/0x2a
[<c0156ad8>] ? tick_sched_timer+0x0/0x9c
[<c014d111>] ? hrtimer_interrupt+0xca/0x1e0
[<c0118a20>] ? smp_apic_timer_interrupt+0x69/0x7c
[<c0706fbb>] ? apic_timer_interrupt+0x2f/0x34
[<c0341e0a>] ? memset+0xf/0x19
[<c0197706>] ? get_page_from_freelist+0x307/0x38a
[<c01978d6>] ? __alloc_pages_nodemask+0x137/0x5a3
[<c01597d7>] ? trace_hardirqs_on+0xb/0xd
[<c0190908>] ? perf_mmap+0x1a8/0x2cf
[<c018c3f2>] ? perf_mmap_alloc_page+0x16/0x28
[<c0190930>] ? perf_mmap+0x1d0/0x2cf
[<c01acf20>] ? mmap_region+0x1e5/0x398
[<c01ad304>] ? do_mmap_pgoff+0x231/0x293
[<c01ad4cf>] ? sys_mmap_pgoff+0x169/0x18a
[<c01027f0>] ? sysenter_do_call+0x12/0x36
* Re: [RFC -tip] perf, x86: P4 PMU -- Address erratum 15 and 17
2011-03-07 8:27 ` Lin Ming
@ 2011-03-07 8:40 ` Cyrill Gorcunov
0 siblings, 0 replies; 8+ messages in thread
From: Cyrill Gorcunov @ 2011-03-07 8:40 UTC (permalink / raw)
To: Lin Ming; +Cc: Ingo Molnar, Peter Zijlstra, Robert Richter, lkml
On 03/07/2011 11:27 AM, Lin Ming wrote:
> On Sat, 2011-03-05 at 22:49 +0800, Cyrill Gorcunov wrote:
>> On 03/05/2011 05:14 PM, Lin Ming wrote:
>>> On Sat, 2011-03-05 at 03:48 +0800, Cyrill Gorcunov wrote:
>>>> Errata N15 and N17 of 249199-071 should be taken
>>>> into account. They are hard to hit at the moment,
>>>> I believe, but are still better fixed.
>>>>
>>>> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
>>>> ---
>>>>
>>>> Ming, mind giving it a shot? It seems I will be able to test it myself
>>>> only on Wednesday. And I would like to get a review first to eliminate
>>>> some silly mistakes ;)
>>>
>>> I'll test this next Monday when I'm back in the office.
>>>
>>> Thanks.
>>>
>>
>> Great, no hurry.
>
> This patch causes a problem.
>
Thanks Ming, I'll investigate.
--
Cyrill
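
As a starting point for the investigation, the register dump in the oops can be checked against the range of P4 counter MSRs. In the dump the faulting bytes are `<0f> 32`, which is the rdmsr opcode, ECX (the MSR index for rdmsr) is 0x318, and the trace goes through do_general_protection. The sketch below rests on two assumptions that this thread does not confirm: that the P4 performance counters occupy MSRs 0x300 through 0x311 (18 counters), and that EBX (0x0c) holds the counter index. All names in it are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed P4 counter MSR range; verify against the kernel's MSR index header. */
#define P4_PERFCTR_FIRST 0x300u
#define P4_PERFCTR_COUNT 18u

/* True if msr falls inside the assumed counter range 0x300..0x311. */
static bool p4_perfctr_msr_valid(uint32_t msr)
{
    return msr >= P4_PERFCTR_FIRST &&
           msr < P4_PERFCTR_FIRST + P4_PERFCTR_COUNT;
}

/* Check the values from the oops against that range. */
static void p4_msr_selftest(void)
{
    const uint32_t idx = 0x0c;        /* EBX in the oops (assumed index) */
    const uint32_t faulting = 0x318;  /* ECX in the oops */

    /* The MSR that was read lies outside the assumed counter range. */
    assert(!p4_perfctr_msr_valid(faulting));

    /* Base plus index on its own would have been a valid counter. */
    assert(p4_perfctr_msr_valid(P4_PERFCTR_FIRST + idx));

    /* The faulting value equals base plus the index added twice. */
    assert(P4_PERFCTR_FIRST + idx + idx == faulting);
}
```

If those assumptions hold, one reading consistent with the dump is that hwc->event_base already includes the counter index in this tree, so `hwc->event_base + hwc->idx` in the patch adds it a second time. That is only a hypothesis to check against the source, not a confirmed diagnosis.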