From: Don Zickus <dzickus@redhat.com>
To: Huang Ying <ying.huang@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org, Andi Kleen <andi@firstfloor.org>
Subject: Re: [RFC 1/6] x86, NMI, Add symbol definition for NMI magic constants
Date: Tue, 21 Sep 2010 17:48:47 -0400
Message-ID: <20100921214847.GF26290@redhat.com>
In-Reply-To: <1284087065-32722-1-git-send-email-ying.huang@intel.com>

On Fri, Sep 10, 2010 at 10:51:00AM +0800, Huang Ying wrote:
> Replace the NMI related magic numbers with symbol constants.

Hi Huang,

Sorry for disappearing for a week..

Ingo asked me to shepherd these patches.  I finally got around to doing
some testing on them.  I'll do some more tomorrow.

Anyway, I don't have a problem with patches 1-3 and 6 (I guess the rename
and rename again doesn't really bother me and it kinda makes some logical
sense).

I am ok with most of patch 4, but I was wondering if you could split out
the part that lets other cpus access the reason register.  To me it seems
like the nmi handler rewrite and allowing !bsp cpus to access the reason
register are two different ideas.  For bisecting reasons it would be
easier to separate them in case we have problems with lost NMIs later.
That way we could tell whether the lost NMIs came from the rewrite or from
the migration of the reason register to other cpus.

I still have a stupid hangup about the raw_spin_lock, but if no one else
has any issues, then I'll just shut up about it. :-)

As for patch 5, I am worried about breaking existing user systems.  I went
through the Fedora buglist and noticed a couple dozen bugzillas
complaining about unknown NMIs.  The people complaining still seemed to
have functioning systems (at least they seemed to think so).  Adding in
the panic gets me worried that we might break a user's setup and cause
them regressions.

I understand what Andi is saying: an unknown NMI is bad and the system
should panic.  On the other hand, unless we have a way of analyzing it and
giving the user an option to either fix it or override it, just panicking
may not be the best approach right now IMO.

I guess adding either another knob to override the hardware error option
or tying it in with the panic_on_unknown_error option might make me more
comfortable.  That way enterprise customers can always just enable it by
default and desktop users (for now) could have it off.
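
To make the knob idea concrete, here is a minimal user-space sketch.  It
is not code from the patch series: the flag unknown_nmi_panic_enabled, the
enum, and unknown_nmi_policy() are all hypothetical names chosen for
illustration, and the real wiring (sysctl, boot parameter, or an existing
panic option) is left open.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical knob (not from the patch series): defaults to off so
 * that desktop systems keep the "log and continue" behaviour.
 */
bool unknown_nmi_panic_enabled = false;

enum nmi_action {
	NMI_ACTION_CONTINUE,	/* log the event and keep running */
	NMI_ACTION_PANIC,	/* treat the unknown NMI as fatal */
};

/* Decide what to do with an NMI that no handler claimed. */
enum nmi_action unknown_nmi_policy(unsigned char reason)
{
	fprintf(stderr, "unknown NMI, reason %02x\n", reason);

	if (unknown_nmi_panic_enabled)
		return NMI_ACTION_PANIC;

	fprintf(stderr, "knob is off, trying to continue\n");
	return NMI_ACTION_CONTINUE;
}
```

With the flag left at its default, unknown_nmi_policy() returns
NMI_ACTION_CONTINUE; a distribution that wants the stricter behaviour
would set the flag and get NMI_ACTION_PANIC instead.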

Thoughts?

Cheers,
Don
> 
> Signed-off-by: Huang Ying <ying.huang@intel.com>
> ---
>  arch/x86/include/asm/mach_traps.h |   12 +++++++++++-
>  arch/x86/kernel/traps.c           |   18 +++++++++---------
>  2 files changed, 20 insertions(+), 10 deletions(-)
> 
> --- a/arch/x86/include/asm/mach_traps.h
> +++ b/arch/x86/include/asm/mach_traps.h
> @@ -7,9 +7,19 @@
>  
>  #include <asm/mc146818rtc.h>
>  
> +#define NMI_REASON_PORT		0x61
> +
> +#define NMI_REASON_MEMPAR	0x80
> +#define NMI_REASON_IOCHK	0x40
> +#define NMI_REASON_MASK		(NMI_REASON_MEMPAR | NMI_REASON_IOCHK)
> +
> +#define NMI_REASON_CLEAR_MEMPAR	0x04
> +#define NMI_REASON_CLEAR_IOCHK	0x08
> +#define NMI_REASON_CLEAR_MASK	0x0f
> +
>  static inline unsigned char get_nmi_reason(void)
>  {
> -	return inb(0x61);
> +	return inb(NMI_REASON_PORT);
>  }
>  
>  static inline void reassert_nmi(void)
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -323,8 +323,8 @@ mem_parity_error(unsigned char reason, s
>  	printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
>  
>  	/* Clear and disable the memory parity error line. */
> -	reason = (reason & 0xf) | 4;
> -	outb(reason, 0x61);
> +	reason = (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_MEMPAR;
> +	outb(reason, NMI_REASON_PORT);
>  }
>  
>  static notrace __kprobes void
> @@ -339,15 +339,15 @@ io_check_error(unsigned char reason, str
>  		panic("NMI IOCK error: Not continuing");
>  
>  	/* Re-enable the IOCK line, wait for a few seconds */
> -	reason = (reason & 0xf) | 8;
> -	outb(reason, 0x61);
> +	reason = (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_IOCHK;
> +	outb(reason, NMI_REASON_PORT);
>  
>  	i = 2000;
>  	while (--i)
>  		udelay(1000);
>  
> -	reason &= ~8;
> -	outb(reason, 0x61);
> +	reason &= ~NMI_REASON_CLEAR_IOCHK;
> +	outb(reason, NMI_REASON_PORT);
>  }
>  
>  static notrace __kprobes void
> @@ -388,7 +388,7 @@ static notrace __kprobes void default_do
>  	if (!cpu)
>  		reason = get_nmi_reason();
>  
> -	if (!(reason & 0xc0)) {
> +	if (!(reason & NMI_REASON_MASK)) {
>  		if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT)
>  								== NOTIFY_STOP)
>  			return;
> @@ -418,9 +418,9 @@ static notrace __kprobes void default_do
>  		return;
>  
>  	/* AK: following checks seem to be broken on modern chipsets. FIXME */
> -	if (reason & 0x80)
> +	if (reason & NMI_REASON_MEMPAR)
>  		mem_parity_error(reason, regs);
> -	if (reason & 0x40)
> +	if (reason & NMI_REASON_IOCHK)
>  		io_check_error(reason, regs);
>  #ifdef CONFIG_X86_32
>  	/*
> --
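
For readers who want to check the bit arithmetic in the quoted patch, the
following user-space sketch reuses the constant values from the diff.  The
helper functions are hypothetical and exist only for illustration; they
compute the byte values the patched code would pass to outb() and do not
touch port 0x61.

```c
#include <stdbool.h>

/* Constant values copied from the quoted patch. */
#define NMI_REASON_MEMPAR	0x80
#define NMI_REASON_IOCHK	0x40
#define NMI_REASON_MASK		(NMI_REASON_MEMPAR | NMI_REASON_IOCHK)

#define NMI_REASON_CLEAR_MEMPAR	0x04
#define NMI_REASON_CLEAR_IOCHK	0x08
#define NMI_REASON_CLEAR_MASK	0x0f

/* Hypothetical helper: true if either reason bit (0x80 or 0x40) is set. */
bool nmi_reason_is_known(unsigned char reason)
{
	return (reason & NMI_REASON_MASK) != 0;
}

/* Byte written back in mem_parity_error(): low nibble plus the 0x04 bit. */
unsigned char nmi_mempar_clear_value(unsigned char reason)
{
	return (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_MEMPAR;
}

/* Byte written first in io_check_error(): low nibble plus the 0x08 bit. */
unsigned char nmi_iochk_clear_value(unsigned char reason)
{
	return (reason & NMI_REASON_CLEAR_MASK) | NMI_REASON_CLEAR_IOCHK;
}

/* Byte written after the delay in io_check_error(): 0x08 bit dropped. */
unsigned char nmi_iochk_reenable_value(unsigned char reason)
{
	return (unsigned char)(reason & ~NMI_REASON_CLEAR_IOCHK);
}
```

For example, a reason byte of 0x81 has the memory parity bit set, and the
value written back is (0x81 & 0x0f) | 0x04 = 0x05, the same result the
original magic-number expression (reason & 0xf) | 4 produces.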
