From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Don Zickus <dzickus@redhat.com>
Cc: x86@kernel.org, Andi Kleen <andi@firstfloor.org>,
	Robert Richter <robert.richter@amd.com>,
	Peter Zijlstra <peterz@infradead.org>,
	ying.huang@intel.com, LKML <linux-kernel@vger.kernel.org>,
	jason.wessel@windriver.com
Subject: Re: [RFC][PATCH 2/6] x86, nmi: create new NMI handler routines
Date: Wed, 24 Aug 2011 10:04:11 -0700
Message-ID: <20110824170411.GI2417@linux.vnet.ibm.com>
In-Reply-To: <1313786266-9585-3-git-send-email-dzickus@redhat.com>

On Fri, Aug 19, 2011 at 04:37:42PM -0400, Don Zickus wrote:
> The NMI handlers used to rely on the notifier infrastructure.  This worked
> great until we wanted to support handling multiple events better.
> 
> One of the key ideas to the nmi handling is to process _all_ the handlers for
> each NMI.  The reason behind this switch is because NMIs are edge triggered.
> If enough NMIs are triggered, then they could be lost because the cpu can
> only latch at most one NMI (besides the one currently being processed).
> 
> In order to deal with this we have decided to process all the NMI handlers
> for each NMI.  This allows the handlers to determine if they received an
> event or not (the ones that can not determine this will be left to fend
> for themselves on the unknown NMI list).
> 
> As a result of this change it is now possible to have an extra NMI that
> was destined to be received for an already processed event.  Because the
> event was processed in the previous NMI, this NMI gets dropped and becomes
> an 'unknown' NMI.  This of course will cause printks that scare people.
> 
> However, we prefer to have extra NMIs as opposed to losing NMIs and as such
> have developed a basic mechanism to catch most of them.  That will be
> a later patch.
> 
> To accomplish this idea, I unhooked the nmi handlers from the notifier
> routines and created a new mechanism loosely based on doIRQ.  The reason
> for this is the notifier routines have a couple of shortcomings.  One we
> couldn't guarantee all future NMI handlers used NOTIFY_OK instead of
> NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
> handled in each routine (most only handle one, perf can handle more than one).
> Third, I wanted to eventually display which nmi handlers are registered in
> the system in /proc/interrupts to help see who is generating NMIs.
> 
> The patch below just implements the new infrastructure but doesn't wire it up
> yet (that is the next patch).  Its design is based on doIRQ structs and the
> atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
> (as the notifier routines have soaked it) but it should be double checked in
> case I copied the code wrong.

One comment below.

							Thanx, Paul

> Signed-off-by: Don Zickus <dzickus@redhat.com>
> ---
>  arch/x86/include/asm/nmi.h |   19 ++++++
>  arch/x86/kernel/nmi.c      |  139 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 158 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/include/asm/nmi.h b/arch/x86/include/asm/nmi.h
> index 4886a68..6d04b28 100644
> --- a/arch/x86/include/asm/nmi.h
> +++ b/arch/x86/include/asm/nmi.h
> @@ -42,6 +42,25 @@ void arch_trigger_all_cpu_backtrace(void);
>  #define NMI_LOCAL_NORMAL_PRIOR	(NMI_LOCAL_BIT | NMI_NORMAL_PRIOR)
>  #define NMI_LOCAL_LOW_PRIOR	(NMI_LOCAL_BIT | NMI_LOW_PRIOR)
> 
> +#define NMI_FLAG_FIRST	1
> +
> +enum {
> +	NMI_LOCAL=0,
> +	NMI_UNKNOWN,
> +	NMI_EXTERNAL,
> +	NMI_MAX
> +};
> +
> +#define NMI_DONE	0
> +#define NMI_HANDLED	1
> +
> +typedef int (*nmi_handler_t)(unsigned int, struct pt_regs *);
> +
> +int register_nmi_handler(unsigned int, nmi_handler_t, unsigned long,
> +			 const char *);
> +
> +void unregister_nmi_handler(unsigned int, const char *);
> +
>  void stop_nmi(void);
>  void restart_nmi(void);
> 
> diff --git a/arch/x86/kernel/nmi.c b/arch/x86/kernel/nmi.c
> index 68d758a..dfc46a8 100644
> --- a/arch/x86/kernel/nmi.c
> +++ b/arch/x86/kernel/nmi.c
> @@ -13,6 +13,9 @@
>  #include <linux/kprobes.h>
>  #include <linux/kdebug.h>
>  #include <linux/nmi.h>
> +#include <linux/delay.h>
> +#include <linux/hardirq.h>
> +#include <linux/slab.h>
> 
>  #if defined(CONFIG_EDAC)
>  #include <linux/edac.h>
> @@ -21,6 +24,27 @@
>  #include <linux/atomic.h>
>  #include <asm/traps.h>
>  #include <asm/mach_traps.h>
> +#include <asm/nmi.h>
> +
> +struct nmiaction {
> +	struct nmiaction __rcu *next;
> +	nmi_handler_t handler;
> +	unsigned int flags;
> +	const char *name;
> +};
> +
> +struct nmi_desc {
> +	spinlock_t lock;
> +	struct nmiaction __rcu *head;
> +};
> +
> +static struct nmi_desc nmi_desc[NMI_MAX] = 
> +{
> +	{	.lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[0].lock), },
> +	{	.lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[1].lock), },
> +	{	.lock = __SPIN_LOCK_UNLOCKED(&nmi_desc[2].lock), },
> +
> +};
> 
>  static int ignore_nmis;
> 
> @@ -38,6 +62,121 @@ static int __init setup_unknown_nmi_panic(char *str)
>  }
>  __setup("unknown_nmi_panic", setup_unknown_nmi_panic);
> 
> +#define nmi_to_desc(type) (&nmi_desc[type])
> +
> +static int notrace __kprobes nmi_handle(unsigned int type, struct pt_regs *regs)
> +{
> +	struct nmi_desc *desc = nmi_to_desc(type);
> +	struct nmiaction *next_a, *a, **ap = &desc->head;
> +	int handled=0;
> +
> +	rcu_read_lock();
> +	a = rcu_dereference_raw(*ap);

The reason for rcu_dereference_raw() is to prevent lockdep from choking
due to being called from an NMI handler, correct?  If so, please add a
comment to this effect on this and similar uses.
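
A rough illustration of what such a comment might look like, written as
a userspace mock-up rather than kernel code.  The rcu_dereference_raw()
macro below is a stand-in stub (the real kernel macro does more), and
nmi_handle_sketch(), claims_one() and claims_none() are hypothetical
names used only for this sketch.  The wording of the comment assumes
the lockdep explanation above is in fact the reason:

```c
#include <stddef.h>

/* Userspace stand-in (assumption): NOT the kernel's definition. */
#define rcu_dereference_raw(p) (p)

struct pt_regs;	/* opaque in this sketch */

typedef int (*nmi_handler_t)(unsigned int, struct pt_regs *);

struct nmiaction {
	struct nmiaction *next;
	nmi_handler_t handler;
};

/* Walk every handler on the list and sum the events each one claims. */
static int nmi_handle_sketch(struct nmiaction *head, unsigned int type,
			     struct pt_regs *regs)
{
	struct nmiaction *a;
	int handled = 0;

	/*
	 * We are called from NMI context.  Use rcu_dereference_raw()
	 * instead of rcu_dereference() so that lockdep does not choke
	 * here (assuming that is indeed the reason).
	 */
	for (a = rcu_dereference_raw(head); a;
	     a = rcu_dereference_raw(a->next))
		handled += a->handler(type, regs);

	return handled;
}

/* Hypothetical handler that claims exactly one event. */
static int claims_one(unsigned int type, struct pt_regs *regs)
{
	(void)type;
	(void)regs;
	return 1;
}

/* Hypothetical handler that claims no events. */
static int claims_none(unsigned int type, struct pt_regs *regs)
{
	(void)type;
	(void)regs;
	return 0;
}
```

The same comment would apply to the second rcu_dereference_raw() use
inside the loop in the patch.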

> +
> +	/*
> +	 * NMIs are edge-triggered, which means if you have enough
> +	 * of them concurrently, you can lose some because only one
> +	 * can be latched at any given time.  Walk the whole list
> +	 * to handle those situations.
> +	 */
> +	while (a) {
> +		next_a = rcu_dereference_raw(a->next);
> +
> +		handled += a->handler(type, regs);
> +
> +		a = next_a;
> +	}
> +	rcu_read_unlock();
> +
> +	/* return total number of NMI events handled */
> +	return handled;
> +}
> +
> +static int __setup_nmi(unsigned int type, struct nmiaction *action)
> +{
> +	struct nmi_desc *desc = nmi_to_desc(type);
> +	struct nmiaction **a = &(desc->head);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&desc->lock, flags);
> +
> +	/*
> +	 * some handlers need to be executed first otherwise a fake
> +	 * event confuses some handlers (kdump uses this flag)
> +	 */
> +	if (!(action->flags & NMI_FLAG_FIRST))
> +		while ((*a) != NULL)
> +			a = &((*a)->next);
> +	
> +	action->next = *a;
> +	rcu_assign_pointer(*a, action);
> +
> +	spin_unlock_irqrestore(&desc->lock, flags);
> +	return 0;
> +}
> +
> +static struct nmiaction *__free_nmi(unsigned int type, const char *name)
> +{
> +	struct nmi_desc *desc = nmi_to_desc(type);
> +	struct nmiaction *n, **np = &(desc->head);
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&desc->lock, flags);
> +
> +	while ((*np) != NULL) {
> +		n = *np;
> +
> +		/*
> +		 * the name passed in to describe the nmi handler
> +		 * is used as the lookup key
> +		 */
> +		if (!strcmp(n->name, name)) {
> +			WARN(in_nmi(),
> +				"Trying to free NMI (%s) from NMI context!\n", n->name);
> +			rcu_assign_pointer(*np, n->next);
> +			break;
> +		}
> +		np = &(n->next);
> +	}
> +
> +	spin_unlock_irqrestore(&desc->lock, flags);
> +	synchronize_rcu();
> +	return *np;
> +}
> +
> +int register_nmi_handler(unsigned int type, nmi_handler_t handler,
> +			unsigned long nmiflags, const char *devname)
> +{
> +	struct nmiaction *action;
> +	int retval;
> +
> +	if (!handler)
> +		return -EINVAL;
> +
> +	action = kzalloc(sizeof(struct nmiaction), GFP_KERNEL);
> +	if (!action)
> +		return -ENOMEM;
> +
> +	action->handler = handler;
> +	action->flags = nmiflags;
> +	action->name = devname;
> +
> +	retval = __setup_nmi(type, action);
> +
> +	if (retval)
> +		kfree(action);
> +	
> +	return retval;
> +}
> +EXPORT_SYMBOL(register_nmi_handler);
> +
> +void unregister_nmi_handler(unsigned int type, const char *name)
> +{
> +	kfree(__free_nmi(type, name));
> +}
> +
> +EXPORT_SYMBOL_GPL(unregister_nmi_handler);
> +
>  static notrace __kprobes void
>  pci_serr_error(unsigned char reason, struct pt_regs *regs)
>  {
> -- 
> 1.7.6
