From: Robert Richter <robert.richter@amd.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Don Zickus <dzickus@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	"gorcunov@gmail.com" <gorcunov@gmail.com>,
	"fweisbec@gmail.com" <fweisbec@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"ying.huang@intel.com" <ying.huang@intel.com>,
	"ming.m.lin@intel.com" <ming.m.lin@intel.com>,
	"yinghai@kernel.org" <yinghai@kernel.org>,
	"andi@firstfloor.org" <andi@firstfloor.org>,
	"eranian@google.com" <eranian@google.com>
Subject: Re: [PATCH 0/3 v2] nmi perf fixes
Date: Fri, 10 Sep 2010 17:17:35 +0200
Message-ID: <20100910151735.GC13563@erda.amd.com>
In-Reply-To: <20100910144634.GA1060@elte.hu>

On 10.09.10 10:46:34, Ingo Molnar wrote:
> 
> * Don Zickus <dzickus@redhat.com> wrote:
> 
> > On Fri, Sep 10, 2010 at 01:41:40PM +0200, Peter Zijlstra wrote:
> > > On Thu, 2010-09-02 at 15:07 -0400, Don Zickus wrote:
> > > > Fixes to allow unknown nmis to pass through the perf nmi handler instead
> > > > of being swallowed.  Contains patches that are already in Ingo's tree.  Added
> > > > here for completeness.  Based on ingo/tip
> > > > 
> > > > Tested on intel/amd
> > > > 
> > > > v2: patch cleanups and consolidation, no code changes
> > > > 
> > > > Don Zickus (1):
> > > >   perf, x86: Fix accidentally ack'ing a second event on intel perf
> > > >     counter
> > > > 
> > > > Peter Zijlstra (1):
> > > >   perf, x86: Fix handle_irq return values
> > > > 
> > > > Robert Richter (1):
> > > >   perf, x86: Try to handle unknown nmis with an enabled PMU
> > > > 
> > > >  arch/x86/kernel/cpu/perf_event.c       |   59 +++++++++++++++++++++++++-------
> > > >  arch/x86/kernel/cpu/perf_event_intel.c |   15 +++++---
> > > >  arch/x86/kernel/cpu/perf_event_p4.c    |    2 +-
> > > >  3 files changed, 56 insertions(+), 20 deletions(-)
> > > 
> > > Both Ingo and I are getting Dazed and confused on our AMD machines, it
> > > started before yesterday (that is, after backing out all my recent
> > > changes it still gets dazed), so I suspect this set.
> > > 
> > > I'll look at getting a trace of the thing, but if any of you has a
> > > bright idea...
> > 
> > What are you running to create the problem?  I can try and duplicate 
> > it here.
> 
> It happens easily here - just running something like:
> 
>    perf record -g ./hackbench 10

I am trying to reproduce it. Which systems are affected?

> 
> a couple of times triggers it. Note, unlike with the earlier bug, the 
> NMIs are not permanently 'stuck' - and everything continues working. 
> Obviously the messages are nasty looking so this is a regression we need 
> to fix.

The patch below rate-limits the unknown NMI messages.

-Robert

--

From 1747710d684302b806b145e5acb590ab2088e5ca Mon Sep 17 00:00:00 2001
From: Robert Richter <robert.richter@amd.com>
Date: Thu, 5 Aug 2010 18:04:31 +0200
Subject: [PATCH] x86: ratelimit NMI messages

In case of a storm of unknown NMIs the CPU gets stuck. This patch
adds ratelimits to the NMI messages to avoid this.

Signed-off-by: Robert Richter <robert.richter@amd.com>
---
 arch/x86/kernel/traps.c |   16 +++++++++-------
 1 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 60788de..97a492d 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -366,15 +366,17 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
 		return;
 	}
 #endif
-	printk(KERN_EMERG
-		"Uhhuh. NMI received for unknown reason %02x on CPU %d.\n",
-			reason, smp_processor_id());
-
-	printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
+	if (printk_ratelimit()) {
+		printk(KERN_EMERG "Uhhuh. NMI received for unknown reason"
+		       " %02x on CPU %d.\n", reason, smp_processor_id());
+		printk(KERN_EMERG
+		       "Do you have a strange power saving mode enabled?\n");
+	}
 	if (panic_on_unrecovered_nmi)
 		panic("NMI: Not continuing");
-
-	printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
+	if (printk_ratelimit())
+		printk(KERN_EMERG
+		       "Dazed and confused, but trying to continue\n");
 }
 
 static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
-- 
1.7.1.1
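
For illustration only, here is a minimal userspace sketch of the
interval/burst idea behind printk_ratelimit(). It is not the kernel
implementation: struct ratelimit, ratelimit_ok() and simulate_storm()
are hypothetical names, time is a plain tick counter, and the values
10 messages per 5000 ticks are an assumption chosen to resemble what I
believe the kernel defaults to (a burst of 10 per 5 seconds).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit {
	long interval;	/* window length, in ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages let through in this window */
	int missed;	/* messages suppressed in this window */
};

/* Return true if the caller may print a message at tick 'now'. */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* The window has expired: open a new one. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	rs->missed++;
	return false;
}

/*
 * Simulate a storm of one unknown NMI per tick for 'ticks' ticks.
 * Returns the number of messages that were actually emitted.
 */
static int simulate_storm(struct ratelimit *rs, long ticks)
{
	int emitted = 0;

	for (long t = 0; t < ticks; t++) {
		if (ratelimit_ok(rs, t)) {
			printf("tick %ld: NMI received for unknown reason\n", t);
			emitted++;
		}
	}
	return emitted;
}
```

Calling simulate_storm() on a struct initialized with interval 5000
and burst 10 emits ten messages at the start of each window and drops
the rest. Note that the real printk_ratelimit() works on one shared
global state, so the two calls in the patch above each consume a slot
from the same budget.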



-- 
Advanced Micro Devices, Inc.
Operating System Research Center

