From: Robert Richter <robert.richter@amd.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Don Zickus <dzickus@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Cyrill Gorcunov <gorcunov@gmail.com>,
Lin Ming <ming.m.lin@intel.com>,
"fweisbec@gmail.com" <fweisbec@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"Huang, Ying" <ying.huang@intel.com>,
Yinghai Lu <yinghai@kernel.org>, Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH -v3] perf, x86: try to handle unknown nmis with running perfctrs
Date: Wed, 25 Aug 2010 11:48:19 +0200
Message-ID: <20100825094819.GB3198@erda.amd.com>
In-Reply-To: <20100820152510.GA4167@elte.hu>
On 20.08.10 11:25:10, Ingo Molnar wrote:
> > Ingo Molnar <mingo@elte.hu> wrote:
> >
> > >
> > >it's not working so well, i'm getting:
> > >
> > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > Do you have a strange power saving mode enabled?
> > > Dazed and confused, but trying to continue
> > >
> > >on a nehalem box, after a perf top and perf stat run.
>
> FYI, it does not trigger on an AMD box.
Ingo,

do you mean it does not trigger false positives on AMD? Both patches
applied on top of current tip/perf/urgent (c6db67c) are working on the
systems I have.

You might use the debug patch below for diagnostics.

-Robert
--
From 1bbb5aa64e96360529c34a593a072e1a84114f04 Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Wed, 11 Aug 2010 18:14:00 +0200
Subject: [PATCH] debug
Signed-off-by: Robert Richter <robert.richter@amd.com>
---
arch/x86/kernel/cpu/perf_event.c | 54 ++++++++++++++++++++++++++++++++++++-
1 files changed, 52 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index dd2fceb..059ef09 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1203,10 +1203,43 @@ void perf_events_lapic_init(void)
 struct pmu_nmi_state {
 	unsigned int marked;
 	int handled;
+	u64 timestamp;
 };
 
 static DEFINE_PER_CPU(struct pmu_nmi_state, nmi);
 
+struct nmi_debug {
+	int cpu;
+	unsigned int this_nmi;
+	unsigned int marked;
+	int handled;
+	u64 timestamp;
+	u64 delta;
+};
+
+static DEFINE_PER_CPU(struct nmi_debug[16], nmi_debug);
+
+static void nmi_handler_debug(void)
+{
+	struct nmi_debug *debug;
+	int i;
+
+	if (!printk_ratelimit())
+		return;
+
+	for (i = 0; i < 16; i++) {
+		debug = &__get_cpu_var(nmi_debug)[i];
+		printk(KERN_EMERG
+		       "cpu #%d, nmi #%d, marked #%d, handled = %d, time = %llu, delta = %llu\n",
+		       debug->cpu,
+		       debug->this_nmi,
+		       debug->marked,
+		       debug->handled,
+		       debug->timestamp,
+		       debug->delta);
+	}
+}
+
 static int __kprobes
 perf_event_nmi_handler(struct notifier_block *self,
 			 unsigned long cmd, void *__args)
@@ -1214,6 +1247,8 @@ perf_event_nmi_handler(struct notifier_block *self,
 	struct die_args *args = __args;
 	unsigned int this_nmi;
 	int handled;
+	struct nmi_debug *debug;
+	u64 timestamp;
 
 	if (!atomic_read(&active_events))
 		return NOTIFY_DONE;
@@ -1224,9 +1259,11 @@ perf_event_nmi_handler(struct notifier_block *self,
 		break;
 	case DIE_NMIUNKNOWN:
 		this_nmi = percpu_read(irq_stat.__nmi_count);
-		if (this_nmi != __get_cpu_var(nmi).marked)
+		if (this_nmi != __get_cpu_var(nmi).marked) {
+			nmi_handler_debug();
 			/* let the kernel handle the unknown nmi */
 			return NOTIFY_DONE;
+		}
 		/*
 		 * This one is a PMU back-to-back nmi. Two events
 		 * trigger 'simultaneously' raising two back-to-back
@@ -1242,10 +1279,21 @@ perf_event_nmi_handler(struct notifier_block *self,
 	apic_write(APIC_LVTPC, APIC_DM_NMI);
 
 	handled = x86_pmu.handle_irq(args->regs);
+	this_nmi = percpu_read(irq_stat.__nmi_count);
+
+	debug = &__get_cpu_var(nmi_debug)[0xf & this_nmi];
+	debug->cpu = smp_processor_id();
+	debug->this_nmi = this_nmi;
+	debug->marked = __get_cpu_var(nmi).marked;
+	debug->handled = handled;
+	rdtscll(timestamp);
+	debug->delta = timestamp - __get_cpu_var(nmi).timestamp;
+	__get_cpu_var(nmi).timestamp = timestamp;
+	debug->timestamp = timestamp;
+
 	if (!handled)
 		return NOTIFY_DONE;
 
-	this_nmi = percpu_read(irq_stat.__nmi_count);
 	if ((handled > 1) ||
 		/* the next nmi could be a back-to-back nmi */
 	    ((__get_cpu_var(nmi).marked == this_nmi) &&
@@ -1262,6 +1310,8 @@ perf_event_nmi_handler(struct notifier_block *self,
 		 */
 		__get_cpu_var(nmi).marked = this_nmi + 1;
 		__get_cpu_var(nmi).handled = handled;
+		debug->marked = __get_cpu_var(nmi).marked;
+		debug->handled = handled;
 	}
 
 	return NOTIFY_STOP;
--
1.7.1.1
--
Advanced Micro Devices, Inc.
Operating System Research Center
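
For readers who want to try the bookkeeping outside the kernel, here is a minimal user-space sketch of what the debug patch records: a 16-entry ring indexed by the low four bits of the NMI count, with the timestamp delta to the previous NMI. This is an illustration only, not the patch itself. The names ring, marked, last_timestamp, record_nmi() and dump_ring() are hypothetical stand-ins for the per-CPU variables and handler code, and the caller passes the timestamp in instead of reading the TSC with rdtscll().

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NMI_DEBUG_SLOTS 16	/* same ring size as in the patch */

/* The index mask below only works for power-of-two ring sizes. */
static_assert((NMI_DEBUG_SLOTS & (NMI_DEBUG_SLOTS - 1)) == 0,
	      "ring size must be a power of two");

/* Mirrors struct nmi_debug from the patch, minus the cpu field. */
struct nmi_debug {
	unsigned int this_nmi;
	unsigned int marked;
	int handled;
	uint64_t timestamp;
	uint64_t delta;
};

/* Hypothetical stand-ins for the per-CPU variables (assumption). */
static struct nmi_debug ring[NMI_DEBUG_SLOTS];
static unsigned int marked;
static uint64_t last_timestamp;

/*
 * Record one NMI in the slot selected by the low bits of its count.
 * 'now' stands in for the TSC value the kernel code reads itself.
 */
static struct nmi_debug *record_nmi(unsigned int this_nmi, int handled,
				    uint64_t now)
{
	struct nmi_debug *d = &ring[this_nmi & (NMI_DEBUG_SLOTS - 1)];

	d->this_nmi = this_nmi;
	d->marked = marked;
	d->handled = handled;
	d->delta = now - last_timestamp;	/* time since previous NMI */
	d->timestamp = now;
	last_timestamp = now;
	return d;
}

/* Print all slots, as the patch does on an unknown NMI. */
static void dump_ring(void)
{
	for (int i = 0; i < NMI_DEBUG_SLOTS; i++)
		printf("nmi #%u, marked #%u, handled = %d, time = %llu, delta = %llu\n",
		       ring[i].this_nmi, ring[i].marked, ring[i].handled,
		       (unsigned long long)ring[i].timestamp,
		       (unsigned long long)ring[i].delta);
}
```

Calling record_nmi() once per simulated NMI and dump_ring() when an unexpected one arrives gives the history of the last 16 NMIs. A small delta between two entries suggests a back-to-back NMI.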