linux-kernel.vger.kernel.org archive mirror
* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
@ 2011-04-13  3:19 George Spelvin
       [not found] ` <BANLkTi=AOvVELYKA=BSTsOLbzfZyDg8UaQ@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: George Spelvin @ 2011-04-13  3:19 UTC (permalink / raw)
  To: dzickus, gorcunov, linux-kernel
  Cc: a.p.zijlstra, airlied, eranian, linux, ming.m.lin

This problem, which I thought was fixed by commit 7d44ec193d, appears to be
still very much present in 2.6.39-rc3:

Apr 12 22:16:00 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:16:00 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:16:00 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:17:22 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:17:22 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:17:22 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:21:17 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:21:17 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:21:17 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:25:02 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:25:02 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:25:02 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:28:24 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:28:24 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:28:24 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:31:28 horizon kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Apr 12 22:31:28 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:31:28 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:34:20 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:34:20 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:34:20 horizon kernel: Dazed and confused, but trying to continue
Apr 12 22:37:01 horizon kernel: Uhhuh. NMI received for unknown reason 3d on CPU 0.
Apr 12 22:37:01 horizon kernel: Do you have a strange power saving mode enabled?
Apr 12 22:37:01 horizon kernel: Dazed and confused, but trying to continue
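For readers wondering why the reason alternates between 2d and 3d, here is a small sketch in plain C. It assumes the usual PC convention that the reason byte is read from port 0x61, where bit 7 reports a memory parity/SERR error and bit 6 reports an I/O check error; an NMI with neither bit set is the "unknown reason" case. The macro and function names are hypothetical and are not the kernel's own.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout of the NMI status byte (port 0x61 on PC hardware). */
#define TOY_NMI_REASON_SERR  0x80u  /* bit 7: memory parity / SERR */
#define TOY_NMI_REASON_IOCHK 0x40u  /* bit 6: I/O channel check */

/* True when neither known NMI source is flagged in the reason byte. */
static bool toy_nmi_reason_unknown(unsigned int reason)
{
    return (reason & (TOY_NMI_REASON_SERR | TOY_NMI_REASON_IOCHK)) == 0;
}

/* Bits that differ between two reason bytes. */
static unsigned int toy_reason_diff(unsigned int a, unsigned int b)
{
    return a ^ b;
}

/* Sanity check against the two values seen in the log above. */
static void toy_check_logged_reasons(void)
{
    assert(toy_nmi_reason_unknown(0x2d));
    assert(toy_nmi_reason_unknown(0x3d));
    /* The two values differ only in bit 4. */
    assert(toy_reason_diff(0x2d, 0x3d) == 0x10);
}
```

Under that assumption both logged values are "unknown" for the same reason, and the only difference between them is bit 4, which on PC-compatible hardware is conventionally a free-running refresh toggle rather than an error indicator.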

Same system as before: 1.6 GHz P4, ASUS P4B266LA, Intel 845 chipset.

I suspect commit 242214f9c1 wasn't quite as effective as hoped.

* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
       [not found] ` <BANLkTi=AOvVELYKA=BSTsOLbzfZyDg8UaQ@mail.gmail.com>
@ 2011-04-13  5:06   ` Cyrill Gorcunov
  2011-04-14 15:16     ` George Spelvin
  0 siblings, 1 reply; 4+ messages in thread
From: Cyrill Gorcunov @ 2011-04-13  5:06 UTC (permalink / raw)
  To: George Spelvin
  Cc: airlied, dzickus, linux-kernel, a.p.zijlstra, ming.m.lin, eranian,
	sruffell

[-- Attachment #1: Type: text/plain, Size: 256 bytes --]

On Wed, Apr 13, 2011 at 8:34 AM, Cyrill Gorcunov <gorcunov@gmail.com> wrote:
> There is a patch from Don we are testing now, I will resend it in a couple of
> hours.
>

George, mind to give these patches a shot please? (Sorry for sending
it as attachment).

[-- Attachment #2: don-apic-nmi-unmask-fix.patch --]
[-- Type: application/octet-stream, Size: 1687 bytes --]

From: Don Zickus <dzickus@redhat.com>
Date: Tue, 12 Apr 2011 12:30:31 -0400
Subject: [PATCH] perf, x86: fix unknown NMIs on a Pentium4 box

When using perf on a Pentium4 box, lots of unknown NMIs would be generated.
This is the result of a subtle P4 quirk.  The P4 generates an NMI when the
counter overflows and, unlike other arches where the NMI is a one-time
event, the P4 continues to assert its NMI until it is cleared by the OS.

As a side effect of this quirk, the NMI on the apic is masked off to prevent
a stream of NMIs until the overflow flag is cleared.  During the perf
re-design, this subtlety was overlooked and the apic was unmasked _before_
the overflow flag was cleared.  As a result, this generated an extra NMI on
the P4 machines.

The fix is trivial: wait until the NMI is properly handled before un-masking
the apic.

Sadly, in the old nmi watchdog there was a note that explained this exact
behaviour.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Tested-by: Shaun Ruffell <sruffell@digium.com>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
---
 arch/x86/kernel/cpu/perf_event.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index eed3673a..e108ef8 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1370,9 +1370,9 @@ perf_event_nmi_handler(struct notifier_block *self,
 		return NOTIFY_DONE;
 	}
 
-	apic_write(APIC_LVTPC, APIC_DM_NMI);
 
 	handled = x86_pmu.handle_irq(args->regs);
+	apic_write(APIC_LVTPC, APIC_DM_NMI);
 	if (!handled)
 		return NOTIFY_DONE;
 
-- 
1.7.4.2
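
The ordering argument in the commit message above can be illustrated with a toy model in plain C. This is only a sketch, not kernel code: struct toy_pmu, toy_deliver() and toy_handle_overflow() are hypothetical names, and the model assumes, per the commit message, that the P4 keeps asserting the NMI while the overflow flag is set and that the LVTPC entry is masked once an NMI has been delivered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the P4 behaviour described in the commit message. */
struct toy_pmu {
    bool overflow;      /* overflow flag; stays set until software clears it */
    bool lvtpc_masked;  /* is the APIC performance-counter entry masked? */
    int  nmis;          /* number of NMIs delivered so far */
};

/* Deliver an NMI if the source is asserted and the entry is unmasked.
 * Delivery masks the entry again (modelling assumption). */
static void toy_deliver(struct toy_pmu *p)
{
    if (p->overflow && !p->lvtpc_masked) {
        p->nmis++;
        p->lvtpc_masked = true;
    }
}

/* Simulate one counter overflow and return how many NMIs it produced.
 * unmask_first == true  models the pre-patch ordering,
 * unmask_first == false models the patched ordering. */
static int toy_handle_overflow(struct toy_pmu *p, bool unmask_first)
{
    assert(p != NULL);

    p->nmis = 0;
    p->lvtpc_masked = false;
    p->overflow = true;
    toy_deliver(p);               /* the legitimate NMI */

    if (unmask_first) {
        p->lvtpc_masked = false;  /* unmask while overflow is still set */
        toy_deliver(p);           /* source still asserted: extra NMI */
        p->overflow = false;      /* handler clears the flag too late */
    } else {
        p->overflow = false;      /* handler clears the flag first */
        p->lvtpc_masked = false;  /* then unmask */
        toy_deliver(p);           /* nothing asserted: no extra NMI */
    }
    return p->nmis;
}
```

In this model, calling toy_handle_overflow() with unmask_first set to true yields two NMIs for a single overflow, the second of which has no overflow left to explain it by the time it is examined; with unmask_first set to false only one NMI is produced. Real hardware and the real handler are considerably more involved, so treat this purely as an illustration of why the order of the two steps matters.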


* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
  2011-04-13  5:06   ` Cyrill Gorcunov
@ 2011-04-14 15:16     ` George Spelvin
  2011-04-14 15:22       ` Cyrill Gorcunov
  0 siblings, 1 reply; 4+ messages in thread
From: George Spelvin @ 2011-04-14 15:16 UTC (permalink / raw)
  To: gorcunov
  Cc: a.p.zijlstra, airlied, dzickus, eranian, linux, linux-kernel,
	ming.m.lin, sruffell

> George, mind to give these patches a shot please? (Sorry for sending
> it as attachment).

It ran overnight with no problems.  (Well, I'm having a problem with udev
using 100% of CPU, but that appears to be unrelated.)

Thank you very much!

Acked-By: George Spelvin <linux@horizon.com>

* Re: Uhhuh. NMI received for unknown reason 2d on CPU 0.
  2011-04-14 15:16     ` George Spelvin
@ 2011-04-14 15:22       ` Cyrill Gorcunov
  0 siblings, 0 replies; 4+ messages in thread
From: Cyrill Gorcunov @ 2011-04-14 15:22 UTC (permalink / raw)
  To: George Spelvin
  Cc: a.p.zijlstra, airlied, dzickus, eranian, linux-kernel, ming.m.lin,
	sruffell

On 04/14/2011 07:16 PM, George Spelvin wrote:
>> George, mind to give these patches a shot please? (Sorry for sending
>> it as attachment).
> 
> It ran overnight with no problems.  (Well, I'm having a problem with udev
> using 100% of CPU, but that appears to be unrelated.)
> 
> Thank you very much!
> 
> Acked-By: George Spelvin <linux@horizon.com>

Thanks a lot for testing! Kudos go to Don :)

-- 
    Cyrill
