public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Robert Richter <robert.richter@amd.com>
To: huang ying <huang.ying.caritas@gmail.com>
Cc: Huang Ying <ying.huang@intel.com>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH -v2 4/7] x86, NMI, Rewrite NMI handler
Date: Mon, 27 Sep 2010 15:25:38 +0200	[thread overview]
Message-ID: <20100927132537.GO13563@erda.amd.com> (raw)
In-Reply-To: <AANLkTimTg1uqXdAJ+GyZ3_B6n2xKWdqFL96jvv18JBM+@mail.gmail.com>

On 27.09.10 08:39:24, huang ying wrote:

Looking at all you comments below I would vote for the following:

We implement all handlers using DIE_NMI and set its priority
accordingly in struct notifier_block when registering the the nmi
handler. We define NMI priorities as macros such as
NMI_PRIORITY_LOCAL, NMI_PRIORITY_WATCHDOG, NMI_PRIORITY_IO, etc. and
require all handlers to set the priority. register_die_notifier() with
(!nb->priority) should return -EINVAL. DIE_NMI_UNKNOWN should only be
used if there is a handler for the case when all others fail such as
implemented in the perf nmi handler or when reporting an unknown nmi.

This will avoid all the confusion below and also makes the code much
cleaner.

-Robert

> Hi, Robert,
> 
> On Mon, Sep 27, 2010 at 5:41 PM, Robert Richter <robert.richter@amd.com> wrote:
> > On 26.09.10 20:57:03, Huang Ying wrote:
> >> The original NMI handler is quite outdated in many aspects. This patch
> >> try to fix it.
> >>
> >> The order to process the NMI sources are changed as follow:
> >>
> >> notify_die(DIE_NMI_IPI);
> >> notify_die(DIE_NMI);
> >> /* process io port 0x61 */
> >> nmi_watchdog_touch();
> >> notify_die(DIE_NMIUNKNOWN);
> >> unknown_nmi();
> >>
> >> DIE_NMI_IPI is used to process CPU specific NMI sources, such as perf
> >> event, oprofile, crash IPI, etc. While DIE_NMI is used to process
> >> non-CPU-specific NMI sources, such as APEI (ACPI Platform Error
> >> Interface) GHES (Generic Hardware Error Source), etc. Non-CPU-specific
> >> NMI sources can be processed on any CPU,
> >>
> >> DIE_NMI_IPI must be processed before DIE_NMI. For example, perf event
> >> trigger a NMI on CPU 1, at the same time, APEI GHES trigger another
> >> NMI on CPU 0. If DIE_NMI is processed before DIE_NMI_IPI, it is
> >> possible that APEI GHES is processed on CPU 1, while unknown NMI is
> >> gotten on CPU 0.
> >
> > I think macro names DIE_NMI_IPI and DIE_NMI should be swapped as
> > e.g. the perf nmi is actually local and non-IPI.
> 
> DIE_NMI_IPI may be not a good name for perf, but DIE_NMI is a even
> worse name for perf! DIE_NMI is originally used for IOCHK and PCI SERR
> NMI.
> 
> > We might consider to rework the IPI thing completly, but may be in a
> > follow-on patch.
> >
> >>
> >> In this new order of processing, performance sensitive NMI sources
> >> such as oprofile or perf event will have better performance because
> >> the time consuming IO port reading is done after them.
> >>
> >> Only one NMI is eaten for each NMI handler call, even for PCI SERR and
> >> IOCHK NMIs. Because one NMI should be raised for each of them, eating
> >> too many NMI will cause unnecessary unknown NMI.
> >>
> >> The die value used in NMI sources are fixed accordingly.
> >>
> >> The NMI handler in the patch is designed by Andi Kleen.
> >>
> >>
> >> v2:
> >>
> >> - Split process NMI reason (0x61) on non-BSP into another patch
> >>
> >> Signed-off-by: Huang Ying <ying.huang@intel.com>
> >> ---
> >>  arch/x86/kernel/cpu/perf_event.c  |    1
> >>  arch/x86/kernel/traps.c           |   80 +++++++++++++++++++-------------------
> >>  arch/x86/oprofile/nmi_int.c       |    1
> >>  arch/x86/oprofile/nmi_timer_int.c |    2
> >>  drivers/char/ipmi/ipmi_watchdog.c |    2
> >>  drivers/watchdog/hpwdt.c          |    2
> >>  6 files changed, 43 insertions(+), 45 deletions(-)
> >>
> >> --- a/arch/x86/kernel/cpu/perf_event.c
> >> +++ b/arch/x86/kernel/cpu/perf_event.c
> >> @@ -1247,7 +1247,6 @@ perf_event_nmi_handler(struct notifier_b
> >>               return NOTIFY_DONE;
> >>
> >>       switch (cmd) {
> >> -     case DIE_NMI:
> >>       case DIE_NMI_IPI:
> >
> > See my comment above. Same is true for oprofile and some other
> > handlers below. It isn't an IPI and should be case DIE_NMI: instead.
> >
> >>               break;
> >>       case DIE_NMIUNKNOWN:
> >> --- a/arch/x86/kernel/traps.c
> >> +++ b/arch/x86/kernel/traps.c
> >> @@ -354,9 +354,6 @@ io_check_error(unsigned char reason, str
> >>  static notrace __kprobes void
> >>  unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
> >>  {
> >> -     if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) ==
> >> -                     NOTIFY_STOP)
> >> -             return;
> >>  #ifdef CONFIG_MCA
> >>       /*
> >>        * Might actually be able to figure out what the guilty party
> >> @@ -385,51 +382,54 @@ static notrace __kprobes void default_do
> >>
> >>       cpu = smp_processor_id();
> >
> > This should go to if (!cpu) and maybe we drop variable cpu completly.
> 
> The variable cpu is dropped in 5/7.
> 
> >>
> >> -     /* Only the BSP gets external NMIs from the system. */
> >> -     if (!cpu)
> >> -             reason = get_nmi_reason();
> >> +     /*
> >> +      * CPU-specific NMI must be processed before non-CPU-specific
> >> +      * NMI, otherwise we may lose it, because the CPU-specific
> >> +      * NMI can not be detected/processed on other CPUs.
> >> +      */
> >>
> >> -     if (!(reason & NMI_REASON_MASK)) {
> >> -             if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, reason, 2, SIGINT)
> >> -                                                             == NOTIFY_STOP)
> >> -                     return;
> >> +     /*
> >> +      * CPU-specific NMI: send to specific CPU or NMI sources must
> >> +      * be processed on specific CPU
> >> +      */
> >> +     if (notify_die(DIE_NMI_IPI, "nmi_ipi", regs, 0, 2, SIGINT)
> >> +         == NOTIFY_STOP)
> >> +             return;
> >>
> >> -#ifdef CONFIG_X86_LOCAL_APIC
> >
> > Are you sure we may drop this option?
> 
> Yes. DIE_NMI is used for non-CPU-specific NMI sources now.
> 
> >> -             if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
> >> -                                                     == NOTIFY_STOP)
> >> -                     return;
> >> +     /* Non-CPU-specific NMI: NMI sources can be processed on any CPU */
> >> +     if (notify_die(DIE_NMI, "nmi", regs, 0, 2, SIGINT) == NOTIFY_STOP)
> >> +             return;
> >
> > As said, IPI and non-IPI are mixed up.
> 
> They are processed one after the other.
> 
> >>
> >> -#ifndef CONFIG_LOCKUP_DETECTOR
> >> -             /*
> >> -              * Ok, so this is none of the documented NMI sources,
> >> -              * so it must be the NMI watchdog.
> >> -              */
> >> -             if (nmi_watchdog_tick(regs, reason))
> >> -                     return;
> >> -             if (!do_nmi_callback(regs, cpu))
> >> -#endif /* !CONFIG_LOCKUP_DETECTOR */
> >> -                     unknown_nmi_error(reason, regs);
> >> -#else
> >> -             unknown_nmi_error(reason, regs);
> >> +     if (!cpu) {
> >> +             reason = get_nmi_reason();
> >> +             if (reason & NMI_REASON_MASK) {
> >> +                     if (reason & NMI_REASON_SERR)
> >> +                             pci_serr_error(reason, regs);
> >> +                     else if (reason & NMI_REASON_IOCHK)
> >> +                             io_check_error(reason, regs);
> >> +#ifdef CONFIG_X86_32
> >> +                     /*
> >> +                      * Reassert NMI in case it became active
> >> +                      * meanwhile as it's edge-triggered:
> >> +                      */
> >> +                     reassert_nmi();
> >>  #endif
> >> +                     return;
> >> +             }
> >> +     }
> >>
> >> +#if defined(CONFIG_X86_LOCAL_APIC) && !defined(CONFIG_LOCKUP_DETECTOR)
> >> +     if (nmi_watchdog_tick(regs, reason))
> >>               return;
> >> -     }
> >> -     if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT) == NOTIFY_STOP)
> >> +     if (do_nmi_callback(regs, smp_processor_id()))
> >>               return;
> >> -
> >> -     /* AK: following checks seem to be broken on modern chipsets. FIXME */
> >> -     if (reason & NMI_REASON_SERR)
> >> -             pci_serr_error(reason, regs);
> >> -     if (reason & NMI_REASON_IOCHK)
> >> -             io_check_error(reason, regs);
> >> -#ifdef CONFIG_X86_32
> >> -     /*
> >> -      * Reassert NMI in case it became active meanwhile
> >> -      * as it's edge-triggered:
> >> -      */
> >> -     reassert_nmi();
> >>  #endif
> >> +
> >> +     if (notify_die(DIE_NMIUNKNOWN, "nmi_unknown", regs, reason, 2, SIGINT)
> >> +         == NOTIFY_STOP)
> >> +             return;
> >> +
> >> +     unknown_nmi_error(reason, regs);
> >>  }
> >>
> >>  dotraplinkage notrace __kprobes void
> >> --- a/arch/x86/oprofile/nmi_int.c
> >> +++ b/arch/x86/oprofile/nmi_int.c
> >> @@ -64,7 +64,6 @@ static int profile_exceptions_notify(str
> >>       int ret = NOTIFY_DONE;
> >>
> >>       switch (val) {
> >> -     case DIE_NMI:
> >>       case DIE_NMI_IPI:
> >>               if (ctr_running)
> >>                       model->check_ctrs(args->regs, &__get_cpu_var(cpu_msrs));
> >> --- a/arch/x86/oprofile/nmi_timer_int.c
> >> +++ b/arch/x86/oprofile/nmi_timer_int.c
> >> @@ -25,7 +25,7 @@ static int profile_timer_exceptions_noti
> >>       int ret = NOTIFY_DONE;
> >>
> >>       switch (val) {
> >> -     case DIE_NMI:
> >> +     case DIE_NMI_IPI:
> >>               oprofile_add_sample(args->regs, 0);
> >>               ret = NOTIFY_STOP;
> >>               break;
> >> --- a/drivers/char/ipmi/ipmi_watchdog.c
> >> +++ b/drivers/char/ipmi/ipmi_watchdog.c
> >> @@ -1080,7 +1080,7 @@ ipmi_nmi(struct notifier_block *self, un
> >>  {
> >>       struct die_args *args = data;
> >>
> >> -     if (val != DIE_NMI)
> >> +     if (val != DIE_NMIUNKNOWN)
> 
> All watchdogs use DIE_NMIUNKNOWN in this patch. Because they should be
> processed after CPU specific and non-CPU-specific NMIs. Or we define a
> special DIE_NMI_XX for it? like DIE_NMI_WATCHDOG?
> 
> Best Regards,
> Huang Ying
> 

-- 
Advanced Micro Devices, Inc.
Operating System Research Center


  reply	other threads:[~2010-09-27 13:26 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-09-27  0:57 [PATCH -v2 1/7] x86, NMI, Add symbol definition for NMI magic constants Huang Ying
2010-09-27  0:57 ` [PATCH -v2 2/7] x86, NMI, Add touch_nmi_watchdog to io_check_error delay Huang Ying
2010-09-27  0:57 ` [PATCH -v2 3/7] x86, NMI, Rename memory parity error to PCI SERR error Huang Ying
2010-09-27  8:01   ` Robert Richter
2010-09-27  8:39     ` Huang Ying
2010-09-27  9:00       ` Robert Richter
2010-09-27 15:33         ` Don Zickus
2010-09-27 16:45           ` Robert Richter
2010-09-27 17:50             ` Don Zickus
2010-09-28  1:33             ` Huang Ying
2010-09-28 14:29               ` Robert Richter
2010-09-29  7:56                 ` huang ying
2010-09-28 15:38               ` Don Zickus
2010-09-28  1:22           ` Huang Ying
2010-09-27  0:57 ` [PATCH -v2 4/7] x86, NMI, Rewrite NMI handler Huang Ying
2010-09-27  9:41   ` Robert Richter
2010-09-27 12:39     ` huang ying
2010-09-27 13:25       ` Robert Richter [this message]
2010-09-27 15:29         ` Don Zickus
2010-09-27 17:40           ` Robert Richter
2010-09-27 19:14             ` Don Zickus
2010-09-27 22:35               ` Robert Richter
2010-09-28  1:03         ` Huang Ying
2010-09-28 14:59           ` Robert Richter
2010-09-29  7:54             ` huang ying
2010-09-27  0:57 ` [PATCH -v2 5/7] Make NMI reason io port (0x61) can be processed on any CPU Huang Ying
2010-09-27  0:57 ` [PATCH -v2 6/7] x86, NMI, Add support to notify hardware error with unknown NMI Huang Ying
2010-09-27 10:09   ` Robert Richter
2010-09-27 12:47     ` huang ying
2010-09-27 13:38       ` Robert Richter
2010-09-27 15:20         ` Don Zickus
2010-09-28  0:36           ` Huang Ying
2010-09-28 15:32             ` Don Zickus
2010-09-29  8:17               ` huang ying
2010-09-30  4:36                 ` Don Zickus
2010-09-30  4:57                   ` Huang Ying
2010-09-30  8:38                     ` Robert Richter
2010-09-30  9:36                       ` huang ying
2010-09-30  9:51                         ` Andi Kleen
2010-10-01 20:00                     ` Maciej W. Rozycki
2010-09-30  8:25                   ` Andi Kleen
2010-09-28  1:19         ` Huang Ying
2010-09-28 15:27           ` Robert Richter
2010-09-29  8:07             ` huang ying
2010-09-27 15:38   ` Don Zickus
2010-09-28  1:54     ` Huang Ying
2010-09-27  0:57 ` [PATCH -v2 7/7] x86, NMI, Remove do_nmi_callback logic Huang Ying
2010-09-27 10:44   ` Robert Richter
2010-09-27 12:56     ` huang ying
2010-09-27 13:43       ` Robert Richter
2010-09-27 15:16         ` Don Zickus
2010-09-27 16:58           ` Robert Richter
2010-09-28  1:41             ` Huang Ying
2010-09-28 15:16               ` Robert Richter
2010-09-28 15:21               ` Don Zickus
2010-09-28  0:28           ` Huang Ying
2010-09-28 15:19             ` Don Zickus
2010-09-29  6:55               ` huang ying
2010-09-30  4:04                 ` Don Zickus
2010-09-30  5:21                   ` Huang Ying
2010-09-30  8:24                     ` Andi Kleen
2010-09-30  8:23                   ` Robert Richter
2010-09-27 10:50 ` [PATCH -v2 1/7] x86, NMI, Add symbol definition for NMI magic constants Robert Richter
2010-09-27 15:29   ` Don Zickus

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100927132537.GO13563@erda.amd.com \
    --to=robert.richter@amd.com \
    --cc=andi@firstfloor.org \
    --cc=dzickus@redhat.com \
    --cc=hpa@zytor.com \
    --cc=huang.ying.caritas@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox