linux-kernel.vger.kernel.org archive mirror
From: Yinghai Lu <yinghai@kernel.org>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	hpa@zytor.com, linux-kernel@vger.kernel.org, tglx@linutronix.de,
	linux-tip-commits@vger.kernel.org
Subject: Re: [tip:perf/urgent] perf, x86: Fix accidentally ack'ing a second event on intel perf counter
Date: Mon, 04 Oct 2010 16:24:14 -0700
Message-ID: <4CAA621E.7040403@kernel.org>
In-Reply-To: <20100903183559.GB4879@redhat.com>

On 09/03/2010 11:35 AM, Don Zickus wrote:
> On Fri, Sep 03, 2010 at 10:15:16AM -0700, Yinghai Lu wrote:
>> On 09/03/2010 08:00 AM, Don Zickus wrote:
>>>>
>>>> [PATCH] x86,nmi: move unknown_nmi_panic to traps.c
>>>
>>> This patch duplicates a bunch of stuff we already have in
>>> unknown_nmi_error.  The only thing I think you are interested in is using
>>> the 'unknown_nmi_panic' flag.  I am putting together a smaller patch that
>>> uses that flag in traps.c (though it would be nice to combine that flag
>>> with panic_on_unrecovered_nmi).
>>
>> Please make sure we keep using unknown_nmi_panic on the boot command line
>> and in sysctl when LOCKUP_DETECTOR is defined.
>>
>> That did work until the hw nmi watchdog was merged with the software lockup detector.
>> I assume that since then the hw nmi watchdog relies on perf nmi, and perf nmi would eat all unknown nmis.
>> It is good to have Robert/Peter/Don's patches to make perf nmi not eat all unknown nmis.
> 
> Hi Yinghai,
> 
> Here is the simpler patch I came up with.  It piggybacks on the
> unknown_nmi_error code already available.  I compiled it with the old and
> new nmi watchdog and tested it with sysctl and the kernel parameter.
> Everything seems to panic properly.
> 
> Let me know if this meets your needs.


...

> 
> diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
> index cefd694..e66b16d 100644
> --- a/arch/x86/kernel/apic/hw_nmi.c
> +++ b/arch/x86/kernel/apic/hw_nmi.c
> @@ -100,7 +100,6 @@ void acpi_nmi_disable(void) { return; }
>  #endif
>  atomic_t nmi_active = ATOMIC_INIT(0);           /* oprofile uses this */
>  EXPORT_SYMBOL(nmi_active);
> -int unknown_nmi_panic;
>  void cpu_nmi_set_wd_enabled(void) { return; }
>  void stop_apic_nmi_watchdog(void *unused) { return; }
>  void setup_apic_nmi_watchdog(void *unused) { return; }
> diff --git a/arch/x86/kernel/apic/nmi.c b/arch/x86/kernel/apic/nmi.c
> index a43f71c..dc35af4 100644
> --- a/arch/x86/kernel/apic/nmi.c
> +++ b/arch/x86/kernel/apic/nmi.c
> @@ -37,7 +37,6 @@
>  
>  #include <asm/mach_traps.h>
>  
> -int unknown_nmi_panic;
>  int nmi_watchdog_enabled;
>  
>  /* For reliability, we're prepared to waste bits here. */
> @@ -483,13 +482,6 @@ static void disable_ioapic_nmi_watchdog(void)
>  	on_each_cpu(stop_apic_nmi_watchdog, NULL, 1);
>  }
>  
> -static int __init setup_unknown_nmi_panic(char *str)
> -{
> -	unknown_nmi_panic = 1;
> -	return 1;
> -}
> -__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
> -
>  static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu)
>  {
>  	unsigned char reason = get_nmi_reason();
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index 60788de..095eea8 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -300,6 +300,16 @@ gp_in_kernel:
>  	die("general protection fault", regs, error_code);
>  }
>  
> +#if defined(CONFIG_SYSCTL) && defined(CONFIG_X86_LOCAL_APIC)
> +int unknown_nmi_panic;
> +static int __init setup_unknown_nmi_panic(char *str)
> +{
> +	unknown_nmi_panic = 1;
> +	return 1;
> +}
> +__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
> +#endif
> +
>  static notrace __kprobes void
>  mem_parity_error(unsigned char reason, struct pt_regs *regs)
>  {
> @@ -371,6 +381,10 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
>  			reason, smp_processor_id());
>  
>  	printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
> +#if defined(CONFIG_SYSCTL) && defined(CONFIG_X86_LOCAL_APIC)
> +	if (unknown_nmi_panic)
> +		die_nmi("", regs, 1);
> +#endif
>  	if (panic_on_unrecovered_nmi)
>  		panic("NMI: Not continuing");

Please consider moving the
	printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
line down.
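
A rough userspace sketch of that reordering, to make the suggestion concrete. It assumes "down" means below both panic checks, so the power-saving hint is only printed when the kernel keeps running; that reading is an assumption, not something the thread states. The flags mirror the names in the patch, while trace, record() and unknown_nmi_tail() are hypothetical stand-ins for the printk/die_nmi/panic calls.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Userspace stand-ins for the two kernel flags named in the patch. */
static int unknown_nmi_panic;
static int panic_on_unrecovered_nmi;

/* Hypothetical trace buffer so the order of actions can be inspected. */
static char trace[64];

static void record(const char *step)
{
	/* Guard against overflowing the small trace buffer. */
	assert(strlen(trace) + strlen(step) < sizeof(trace));
	strcat(trace, step);
}

/*
 * Models the tail of unknown_nmi_error() with the hint printk moved
 * below both checks (assumed placement). Returns 1 where the real
 * die_nmi() or panic() would not return, 0 if execution continues.
 */
static int unknown_nmi_tail(void)
{
	trace[0] = '\0';
	record("report;");		/* the "unknown reason" printk */
	if (unknown_nmi_panic) {
		record("die_nmi;");	/* die_nmi("", regs, 1) in the patch */
		return 1;
	}
	if (panic_on_unrecovered_nmi) {
		record("panic;");	/* panic("NMI: Not continuing") */
		return 1;
	}
	record("hint;");		/* "strange power saving mode" printk */
	return 0;
}
```

With this ordering the hint only shows up on the path that continues, so a panic log is not preceded by a line suggesting the NMI was harmless.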

I tested it, and it works on my test setups.

Please submit the complete patch to Ingo.

Thanks

Yinghai


>  
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index ca38e8e..71516a4 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -739,7 +739,7 @@ static struct ctl_table kern_table[] = {
>  		.extra2		= &one,
>  	},
>  #endif
> -#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
> +#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
>  	{
>  		.procname       = "unknown_nmi_panic",
>  		.data           = &unknown_nmi_panic,
> @@ -747,6 +747,8 @@ static struct ctl_table kern_table[] = {
>  		.mode           = 0644,
>  		.proc_handler   = proc_dointvec,
>  	},
> +#endif
> +#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
>  	{
>  		.procname       = "nmi_watchdog",
>  		.data           = &nmi_watchdog_enabled,
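
For reference, the kern_table entry in the sysctl.c hunk above is what exposes the flag at runtime. A minimal userspace sketch of toggling it follows, assuming the usual /proc/sys/kernel/ location for kern_table entries; write_int_flag() and read_int_flag() are hypothetical helper names, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Path implied by the "unknown_nmi_panic" procname in kern_table. */
#define UNKNOWN_NMI_PANIC_PATH "/proc/sys/kernel/unknown_nmi_panic"

/* Write an integer to a sysctl-style file. Returns 0 on success, -1 on failure. */
static int write_int_flag(const char *path, int value)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	ok = fprintf(f, "%d\n", value) > 0;
	/* Buffered write errors only surface when the stream is closed. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Read an integer from a sysctl-style file. Returns 0 on success, -1 on failure. */
static int read_int_flag(const char *path, int *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%d", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling write_int_flag(UNKNOWN_NMI_PANIC_PATH, 1) as root should have the same effect on the flag as booting with unknown_nmi_panic on the kernel command line, since both paths set the same int.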


