From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Don Zickus <dzickus@redhat.com>
Cc: x86@kernel.org, Andi Kleen <andi@firstfloor.org>,
Robert Richter <robert.richter@amd.com>,
Peter Zijlstra <peterz@infradead.org>,
ying.huang@intel.com, LKML <linux-kernel@vger.kernel.org>,
jason.wessel@windriver.com
Subject: Re: [RFC][PATCH 2/6] x86, nmi: create new NMI handler routines
Date: Wed, 24 Aug 2011 10:04:11 -0700
Message-ID: <20110824170411.GI2417@linux.vnet.ibm.com>
In-Reply-To: <1313786266-9585-3-git-send-email-dzickus@redhat.com>
On Fri, Aug 19, 2011 at 04:37:42PM -0400, Don Zickus wrote:
> The NMI handlers used to rely on the notifier infrastructure. This worked
> great until we wanted to support handling multiple events better.
>
> One of the key ideas to the nmi handling is to process _all_ the handlers for
> each NMI. The reason for this switch is that NMIs are edge-triggered.
> If enough NMIs are triggered, then they could be lost because the cpu can
> only latch at most one NMI (besides the one currently being processed).
>
> In order to deal with this we have decided to process all the NMI handlers
> for each NMI. This allows the handlers to determine if they received an
> event or not (the ones that cannot determine this will be left to fend
> for themselves on the unknown NMI list).
>
> As a result of this change it is now possible to have an extra NMI that
> was destined to be received for an already processed event. Because the
> event was processed in the previous NMI, this NMI gets dropped and becomes
> an 'unknown' NMI. This of course will cause printks that scare people.
>
> However, we prefer to have extra NMIs as opposed to losing NMIs and as such
> have developed a basic mechanism to catch most of them. That will be
> a later patch.
>
> To accomplish this idea, I unhooked the nmi handlers from the notifier
> routines and created a new mechanism loosely based on doIRQ. The reason
> for this is the notifier routines have a couple of shortcomings. First, we
> couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
> NOTIFY_STOP. Second, we couldn't keep track of the number of events being
> handled in each routine (most only handle one, perf can handle more than one).
> Third, I wanted to eventually display which nmi handlers are registered in
> the system in /proc/interrupts to help see who is generating NMIs.
>
> The patch below just implements the new infrastructure but doesn't wire it up
> yet (that is the next patch). Its design is based on doIRQ structs and the
> atomic notifier routines. So the rcu stuff in the patch isn't entirely untested
> (as the notifier routines have soaked it) but it should be double checked in
> case I copied the code wrong.

One comment below.

Thanx, Paul

> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
> arch/x86/include/asm/nmi.h | 19 ++++++
> arch/x86/kernel/nmi.c | 139 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 158 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> index 4886a68..6d04b28 100644
> --- a/arch/x86/include/asm/nmi.h
> +++ b/arch/x86/include/asm/nmi.h
> @@ -42,6 +42,25 @@ void arch_trigger_all_cpu_backtrace(void);
> #define NMI_LOCAL_NORMAL_PRIOR (NMI_LOCAL_BIT | NMI_NORMAL_PRIOR)
> #define NMI_LOCAL_LOW_PRIOR (NMI_LOCAL_BIT | NMI_LOW_PRIOR)
>
> +#define NMI_FLAG_FIRST 1
> +
> +enum {
> + NMI_LOCAL=0,
> + NMI_UNKNOWN,
> + NMI_EXTERNAL,
> + NMI_MAX
> +};
> +
> +#define NMI_DONE 0
> +#define NMI_HANDLED 1
> +
> +typedef int (*nmi_handler_t)(unsigned int, struct pt_regs *);
> +
> +int register_nmi_handler(unsigned int, nmi_handler_t, unsigned long,
> + const char *);
> +
> +void unregister_nmi_handler(unsigned int, const char *);
> +
> void stop_nmi(void);
> void restart_nmi(void);
>
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 68d758a..dfc46a8 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -13,6 +13,9 @@
> #include <linux/kprobes.h>
> #include <linux/kdebug.h>
> #include <linux/nmi.h>
> +#include <linux/delay.h>
> +#include <linux/hardirq.h>
> +#include <linux/slab.h>
>
> #if defined(CONFIG_EDAC)
> #include <linux/edac.h>
> @@ -21,6 +24,27 @@
> #include <linux/atomic.h>
> #include <asm/traps.h>
> #include <asm/mach_traps.h>
> +#include <asm/nmi.h>
> +
> +struct nmiaction {
> + struct nmiaction __rcu *next;
> + nmi_handler_t handler;
> + unsigned int flags;
> + const char *name;
> +};
> +
> +struct nmi_desc {
> + spinlock_t lock;
> + struct nmiaction __rcu *head;
> +};
> +
> +static struct nmi_desc nmi_desc[NMI_MAX] =
> +{
> + { .lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[0].lock), },
> + { .lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[1].lock), },
> + { .lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[2].lock), },
> +
> +};
>
> static int ignore_nmis;
>
> @@ -38,6 +62,121 @@ static int __init setup_unknown_nmi_panic(char *str)
> }
> __setup("unknown_nmi_panic", setup_unknown_nmi_panic);
>
> +#define nmi_to_desc(type) (&nmi_desc[type])
> +
> +static int notrace __kprobes nmi_handle(unsigned int type, struct pt_regs *regs)
> +{
> + struct nmi_desc *desc = nmi_to_desc(type);
> + struct nmiaction *next_a, *a, **ap = &desc->head;
> + int handled=0;
> +
> + rcu_read_lock();
> + a = rcu_dereference_raw(*ap);

The reason for rcu_dereference_raw() is to prevent lockdep from choking
due to being called from an NMI handler, correct? If so, please add a
comment to this effect on this and similar uses.

> +
> + /*
> + * NMIs are edge-triggered, which means if you have enough
> + * of them concurrently, you can lose some because only one
> + * can be latched at any given time. Walk the whole list
> + * to handle those situations.
> + */
> + while (a) {
> + next_a = rcu_dereference_raw(a->next);
> +
> + handled += a->handler(type, regs);
> +
> + a = next_a;
> + }
> + rcu_read_unlock();
> +
> + /* return total number of NMI events handled */
> + return handled;
> +}
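
For readers following along, the "walk the whole list" behavior of nmi_handle() above can be tried in user space. The following is only a minimal sketch under stated assumptions: the demo_* names are hypothetical, regs is an opaque pointer, and the RCU read-side protection that the patch uses is left out entirely. It shows that every handler runs and that the return values are summed, so no handler can cut the walk short.

```c
#include <stddef.h>

/* Return values as in the quoted patch: 0 means "not my event". */
#define NMI_DONE    0
#define NMI_HANDLED 1

/* Hypothetical stand-in for the patch's nmi_handler_t; regs is opaque here. */
typedef int (*demo_handler_t)(unsigned int type, void *regs);

struct demo_action {
	struct demo_action *next;
	demo_handler_t handler;
	const char *name;
};

/*
 * Call every handler on the chain and return the total number of
 * events they report. The next pointer is read before the handler
 * runs, mirroring the loop in the quoted nmi_handle().
 */
int demo_handle(struct demo_action *head, unsigned int type, void *regs)
{
	struct demo_action *a = head, *next_a;
	int handled = 0;

	while (a) {
		next_a = a->next;
		handled += a->handler(type, regs);
		a = next_a;
	}
	return handled;
}

/* Illustrative handlers only; real handlers would inspect hardware state. */
int demo_not_mine(unsigned int type, void *regs)
{
	(void)type; (void)regs;
	return NMI_DONE;
}

int demo_one_event(unsigned int type, void *regs)
{
	(void)type; (void)regs;
	return NMI_HANDLED;
}

int demo_two_events(unsigned int type, void *regs)
{
	(void)type; (void)regs;
	return 2 * NMI_HANDLED;	/* e.g. a perf-like handler seeing two events */
}
```

Chaining demo_not_mine, demo_one_event and demo_two_events and calling demo_handle() on the head reports three events in total, even though the first handler claimed none.
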
> +
> +static int __setup_nmi(unsigned int type, struct nmiaction *action)
> +{
> + struct nmi_desc *desc = nmi_to_desc(type);
> + struct nmiaction **a = &(desc->head);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&desc->lock, flags);
> +
> + /*
> + * some handlers need to be executed first otherwise a fake
> + * event confuses some handlers (kdump uses this flag)
> + */
> + if (!(action->flags & NMI_FLAG_FIRST))
> + while ((*a) != NULL)
> + a = &((*a)->next);
> +
> + action->next = *a;
> + rcu_assign_pointer(*a, action);
> +
> + spin_unlock_irqrestore(&desc->lock, flags);
> + return 0;
> +}
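
The ordering rule in __setup_nmi() above (NMI_FLAG_FIRST entries go to the head, everything else to the tail) relies on a pointer-to-pointer walk. Below is a small user-space sketch of just that insertion logic. The demo_* names are hypothetical, and the spinlock and rcu_assign_pointer() publication used by the patch are deliberately omitted, so this is an illustration of the list manipulation and nothing more.

```c
#include <stddef.h>

#define DEMO_FLAG_FIRST 1	/* plays the role of NMI_FLAG_FIRST */

struct demo_entry {
	struct demo_entry *next;
	unsigned int flags;
	const char *name;
};

/*
 * Insert at the head when DEMO_FLAG_FIRST is set, otherwise append.
 * "a" always points at the link that will be rewritten, so the head
 * and tail cases share the final two assignments.
 */
void demo_insert(struct demo_entry **head, struct demo_entry *action)
{
	struct demo_entry **a = head;

	if (!(action->flags & DEMO_FLAG_FIRST))
		while (*a != NULL)
			a = &(*a)->next;

	action->next = *a;
	*a = action;	/* the patch publishes this store with rcu_assign_pointer() */
}
```

Inserting two ordinary entries and then one flagged entry leaves the flagged entry at the head with the ordinary ones behind it in registration order.
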
> +
> +static struct nmiaction *__free_nmi(unsigned int type, const char *name)
> +{
> + struct nmi_desc *desc = nmi_to_desc(type);
> + struct nmiaction *n, **np = &(desc->head);
> + unsigned long flags;
> +
> + spin_lock_irqsave(&desc->lock, flags);
> +
> + while ((*np) != NULL) {
> + n = *np;
> +
> + /*
> + * the name passed in to describe the nmi handler
> + * is used as the lookup key
> + */
> + if (!strcmp(n->name, name)) {
> + WARN(in_nmi(),
> + "Trying to free NMI (%s) from NMI context!\n", n->name);
> + rcu_assign_pointer(*np, n->next);
> + break;
> + }
> + np = &(n->next);
> + }
> +
> + spin_unlock_irqrestore(&desc->lock, flags);
> + synchronize_rcu();
> + return *np;
> +}
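
A note on the return value of __free_nmi() above: as far as I can tell, once the matching entry has been unlinked, *np holds n->next, so "return *np" hands the successor (or NULL) to the caller rather than the entry that was removed, and unregister_nmi_handler() would then kfree() the wrong object. The user-space sketch below shows one way to return the removed node instead. The demo_* names are hypothetical, and the locking and the synchronize_rcu() call that the kernel code would still need before freeing are left out.

```c
#include <stddef.h>
#include <string.h>

struct demo_node {
	struct demo_node *next;
	const char *name;
};

/*
 * Unlink the first node whose name matches and return it, or return
 * NULL when nothing matches. The removed node is remembered in "n"
 * before the link is rewritten, so the caller gets the right pointer.
 */
struct demo_node *demo_unlink(struct demo_node **head, const char *name)
{
	struct demo_node **np = head;
	struct demo_node *n;

	while ((n = *np) != NULL) {
		if (strcmp(n->name, name) == 0) {
			*np = n->next;	/* the patch uses rcu_assign_pointer() here */
			return n;
		}
		np = &n->next;
	}
	return NULL;
}
```

With this shape the caller can wait for a grace period and then free exactly the node that was taken off the list, and a lookup miss yields NULL.
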
> +
> +int register_nmi_handler(unsigned int type, nmi_handler_t handler,
> + unsigned long nmiflags, const char *devname)
> +{
> + struct nmiaction *action;
> + int retval;
> +
> + if (!handler)
> + return -EINVAL;
> +
> + action = kzalloc(sizeof(struct nmiaction), GFP_KERNEL);
> + if (!action)
> + return -ENOMEM;
> +
> + action->handler = handler;
> + action->flags = nmiflags;
> + action->name = devname;
> +
> + retval = __setup_nmi(type, action);
> +
> + if (retval)
> + kfree(action);
> +
> + return retval;
> +}
> +EXPORT_SYMBOL(register_nmi_handler);
> +
> +void unregister_nmi_handler(unsigned int type, const char *name)
> +{
> + kfree(__free_nmi(type, name));
> +}
> +
> +EXPORT_SYMBOL_GPL(unregister_nmi_handler);
> +
> static notrace __kprobes void
> pci_serr_error(unsigned char reason, struct pt_regs *regs)
> {
> --
> 1.7.6
>