* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
@ 2011-04-13 3:19 George Spelvin
[not found] ` <BANLkTi=AOvVELYKA=BSTsOLbzfZyDg8UaQ@mail.gmail.com>
0 siblings, 1 reply; 4+ messages in thread
From: George Spelvin @ 2011-04-13 3:19 UTC
To: dzickus, gorcunov, linux-kernel
Cc: a.p.zijlstra, airlied, eranian, linux, ming.m.lin
This problem, which I thought was fixed by commit 7d44ec193d, appears to be
still very much present in 2.6.39-rc3:
Apr 12 22:16:00 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:16:00 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:16:00 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:17:22 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:17:22 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:17:22 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:21:17 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:21:17 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:21:17 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:25:02 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:25:02 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:25:02 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:28:24 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:28:24 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:28:24 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:31:28 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:31:28 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:31:28 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:34:20 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:34:20 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:34:20 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:37:01 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:37:01 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:37:01 horizon kernel: Dazed and confused, but trying to continue
Same system as before: 1.6 GHz P4, ASUS P4B266LA, Intel 845 chipset.
I suspect commit 242214f9c1 wasn't quite as effective as hoped.
* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
[not found] ` <BANLkTi=AOvVELYKA=BSTsOLbzfZyDg8UaQ@mail.gmail.com>
@ 2011-04-13 5:06 ` Cyrill Gorcunov
2011-04-14 15:16 ` George Spelvin
0 siblings, 1 reply; 4+ messages in thread
From: Cyrill Gorcunov @ 2011-04-13 5:06 UTC
To: George Spelvin
Cc: airlied, dzickus, linux-kernel, a.p.zijlstra, ming.m.lin, eranian,
sruffell
[-- Attachment #1: Type: text/plain, Size: 256 bytes --]
On Wed, Apr 13, 2011 at 8:34 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> There is a patch from Don we are testing now, I will resend it in a couple of
> hours.
>
George, mind to give these patches a shot please? (Sorry for sending
it as attachment).
[-- Attachment #2: don-apic-nmi-unmask-fix.patch --]
[-- Type: application/octet-stream, Size: 1687 bytes --]
From: Don Zickus <dzickus@redhat.com>
Date: Tue, 12 Apr 2011 12:30:31 -0400
Subject: [PATCH] perf, x86: fix unknown NMIs on a Pentium4 box
When using perf on a Pentium4 box, lots of unknown NMIs would be generated.
This is the result of a subtle P4 quirk. The P4 generates an NMI
when the counter overflows and, unlike other arches where the NMI is a one-time
event, the P4 continues to assert its NMI until cleared by the OS.
As a side effect of this quirk, the NMI on the apic is masked off to prevent
a stream of NMIs until the overflow flag is cleared. During the perf
re-design, this subtlety was overlooked and the apic was unmasked _before_
the overflow flag was cleared. As a result, this generated an extra NMI on
the P4 machines.
The fix is trivial: wait until the NMI is properly handled before un-masking
the apic.
Sadly, in the old nmi watchdog there was a note that explained this exact
behaviour.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Tested-by: Shaun Ruffell <sruffell@digium.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
arch/x86/kernel/cpu/perf_event.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index eed3673a..e108ef8 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1370,9 +1370,9 @@ perf_event_nmi_handler(struct notifier_block *self,
 		return NOTIFY_DONE;
 	}
-	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	handled = x86_pmu.handle_irq(args->regs);
+	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	if (!handled)
 		return NOTIFY_DONE;
--
1.7.4.2
* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
2011-04-13 5:06 ` Cyrill Gorcunov
@ 2011-04-14 15:16 ` George Spelvin
2011-04-14 15:22 ` Cyrill Gorcunov
0 siblings, 1 reply; 4+ messages in thread
From: George Spelvin @ 2011-04-14 15:16 UTC
To: gorcunov
Cc: a.p.zijlstra, airlied, dzickus, eranian, linux, linux-kernel,
ming.m.lin, sruffell
> George, mind to give these patches a shot please? (Sorry for sending
> it as attachment).
It ran overnight with no problems. (Well, I'm having a problem with udev
using 100% of CPU, but that appears to be unrelated.)
Thank you very much!
Acked-By: George Spelvin <linux@horizon.com>
* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
2011-04-14 15:16 ` George Spelvin
@ 2011-04-14 15:22 ` Cyrill Gorcunov
0 siblings, 0 replies; 4+ messages in thread
From: Cyrill Gorcunov @ 2011-04-14 15:22 UTC
To: George Spelvin
Cc: a.p.zijlstra, airlied, dzickus, eranian, linux-kernel, ming.m.lin,
sruffell
On 04/14/2011 07:16 PM, George Spelvin wrote:
>> George, mind to give these patches a shot please? (Sorry for sending
>> it as attachment).
>
> It ran overnight with no problems. (Well, I'm having a problem with udev
> using 100% of CPU, but that appears to be unrelated.)
>
> Thank you very much!
>
> Acked-By: George Spelvin <linux@horizon.com>
Thanks a lot for testing! Kudos go to Don :)
--
Cyrill