All of lore.kernel.org
 help / color / mirror / Atom feed
From: Don Zickus <dzickus@redhat.com>
To: Jack Steiner <steiner@sgi.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>, Ingo Molnar <mingo@elte.hu>,
	tglx@linutronix.de, hpa@zytor.com, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH] x86, UV: Fix NMI handler for UV platforms
Date: Thu, 24 Mar 2011 14:43:28 -0400	[thread overview]
Message-ID: <20110324184328.GG1239@redhat.com> (raw)
In-Reply-To: <20110324170944.GA28825@sgi.com>

On Thu, Mar 24, 2011 at 12:09:44PM -0500, Jack Steiner wrote:
> 
> I added tracing to see if I could get more clues on the cause
> of the "dazed" message. Unfortunately, I don't see anything - maybe
> you do.

There goes my other theory where the back-to-back nmi logic broke down
because the UV nmi jumped in the middle of the chain but continued with
the back-to-back nmis.

> 
> I used a tracing module that I've used for other things. I'm sure
> there are other facilities available, but I've used this for a long time & it's
> easy to update for specific purposes.
> 	rtc = usec clock
> 	rtc-delta = usec since previous trace entry
> 	id  = trace identifier (not particularily useful here)
> 	p1, p2 = tracepoint specific data. See patch below
> 	   For hw_perf
> 		p1 [63:32] this_nmi
> 		   [31:0]  handled
> 		p2 [63:32] pmu_nmi.marked
> 		   [31:0]  pmu_nmi.handled

I have done similar stuff using trace_printk around all the wrmsrl and
rdmsrls.  I have noticed that the counter is shutdown in prep to sched
out a task (it calls x86_pmu_del, which calls x86_pmu_stop).  This is in a
non-nmi context.  Shortly after the pmu is stopped an unknown nmi comes in
and causes the 'Dazed' messages.  I thought is was the x86_pmu.disable
call racing with the disabling of the active_mask, but that didn't fix my
problem. :-/

Unfortunately, I am very busy at work and was hoping to postpone further
debugging for a couple of weeks until things quiet down.  I know Russ has
a bug opened for it, we can track it (so I don't forget :-p ).

Cheers,
Don

> 
> 
> Here is a trace leading up to a failure. Times are in usec:
> 
>  cpu              rtc    rtc-delta   id               p1               p2 desc
>   10         80996952        44005    1                0                0 NMI handler
>   10         80996952            0   40                0                0 perf_event_nmi_handler
>   10         80996952            0   40                0                0 perf_event_nmi_handler NMI
>   10         80996955            3   40     343000000001     33bc00000002 perf_event_nmi_handler NMI handled - this/handled pmumarked/handled
>   10         80996955            0    1                0                0 NMI handler OK
> 
>   10         81036965        40010    1                0                0 NMI handler
>   10         81036965            0   40                0                0 perf_event_nmi_handler
>   10         81036966            1   40                0                0 perf_event_nmi_handler NMI
>   10         81036968            2   40     343100000001     33bc00000002 perf_event_nmi_handler NMI handled - this/handled pmumarked/handled
>   10         81036968            0    1                0                0 NMI handler OK
> 
>   10         81064135        27167    1                0                0 NMI handler
>   10         81064136            1   40                0                0 perf_event_nmi_handler
>   10         81064137            1   40                0                0 perf_event_nmi_handler NMI
>   10         81064138            1   40                0                0 perf_event_nmi_handler - not handled
>   10         81064138            0    3                0                0 NMI handler failed
>   10         81064146            8    4                0                0 Unknown NMI handler
>   10         81064147            1   20               95                0 UV NMI not received
>   10         81064147            0   40                0                0 perf_event_nmi_handler
>   10         81064148            1   40             3432             33bc perf_event_nmi_handler NMIUNKNOWN
>   10         81064148            0   99                0                0 Unknown NMI handler
> 
> 
> The last trace is just prior to a "dazed" failure.
> 
> I dont see anything unusual. Just looks like a spurious NMI with no cause. The PMU did not
> see an NMI cause. The previous couple of NMIs looked (at least to me) normal.
> NMIs are occuring every ~40msec. No UV NMIs were recently received. No multiple PMU
> events handled.
> 
> Here is a trace where a UV NMI was received:
> 
>    0        371742833         2453    1                0                0 NMI handler
>    0        371742834            1   40                0                0 perf_event_nmi_handler
>    0        371742834            0   40                0                0 perf_event_nmi_handler NMI
>    0        371742836            2   40                0                0 perf_event_nmi_handler - not handled
>    0        371742836            0    3                0                0 NMI handler failed
>    0        371742856           20    4                0                0 Unknown NMI handler
>    0        371742913           57   21               f1                0 UV NMI received
> 
> 
> 
> I've include the patch (latest x86 tree) so you can see exactly where the trace points
> were inserted.
> 
> 
> 
> Index: linux/arch/x86/kernel/apic/x2apic_uv_x.c
> ===================================================================
> --- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c	2011-03-23 10:30:35.000000000 -0500
> +++ linux/arch/x86/kernel/apic/x2apic_uv_x.c	2011-03-24 10:47:59.865562087 -0500
> @@ -23,6 +23,7 @@
>  #include <linux/io.h>
>  #include <linux/pci.h>
>  #include <linux/kdebug.h>
> +#include <linux/utrace.h>
>  
>  #include <asm/uv/uv_mmrs.h>
>  #include <asm/uv/uv_hub.h>
> @@ -54,6 +55,9 @@ unsigned int uv_apicid_hibits;
>  EXPORT_SYMBOL_GPL(uv_apicid_hibits);
>  static DEFINE_SPINLOCK(uv_nmi_lock);
>  
> +void (*utrace_func)(int id, unsigned long, unsigned long, const char*);
> +EXPORT_SYMBOL_GPL(utrace_func);
> +
>  /* Should be part of uv_hub_info but that breas the KABI */
>  static struct uv_nmi_info {
>  	spinlock_t	nmi_lock;
> @@ -692,11 +696,14 @@ int uv_handle_nmi(struct notifier_block
>  	 * if a hw_perf and BMC NMI are received at about the same time
>  	 * and both events are processed with the first NMI.
>  	 */
> -	if (__get_cpu_var(cpu_last_nmi_count) == uv_nmi_info[blade].nmi_count)
> +	if (__get_cpu_var(cpu_last_nmi_count) == uv_nmi_info[blade].nmi_count) {
> +		UTRACE(20, __get_cpu_var(cpu_last_nmi_count), 0, "UV NMI not received");
>  		return NOTIFY_DONE;
> +	}
>  
>  	printk("ZZZ:%d NMI %ld %ld\n", smp_processor_id(), __get_cpu_var(cpu_last_nmi_count), uv_nmi_info[blade].nmi_count);
>  	__get_cpu_var(cpu_last_nmi_count) = uv_nmi_info[blade].nmi_count;
> +	UTRACE(21, __get_cpu_var(cpu_last_nmi_count), 0, "UV NMI received");
>  
>  	/*
>  	 * Use a lock so only one cpu prints at a time.
> Index: linux/arch/x86/kernel/cpu/perf_event.c
> ===================================================================
> --- linux.orig/arch/x86/kernel/cpu/perf_event.c	2011-03-23 15:33:48.000000000 -0500
> +++ linux/arch/x86/kernel/cpu/perf_event.c	2011-03-24 10:47:20.101496911 -0500
> @@ -25,6 +25,7 @@
>  #include <linux/highmem.h>
>  #include <linux/cpu.h>
>  #include <linux/bitops.h>
> +#include <linux/utrace.h>
>  
>  #include <asm/apic.h>
>  #include <asm/stacktrace.h>
> @@ -1341,15 +1342,19 @@ perf_event_nmi_handler(struct notifier_b
>  	struct die_args *args = __args;
>  	unsigned int this_nmi;
>  	int handled;
> +	unsigned long tmp1, tmp2;
>  
>  	if (!atomic_read(&active_events))
>  		return NOTIFY_DONE;
>  
> +	UTRACE(40, 0, 0, "perf_event_nmi_handler");
>  	switch (cmd) {
>  	case DIE_NMI:
> +		UTRACE(40, 0, 0, "perf_event_nmi_handler NMI");
>  		break;
>  	case DIE_NMIUNKNOWN:
>  		this_nmi = percpu_read(irq_stat.__nmi_count);
> +		UTRACE(40, this_nmi, __this_cpu_read(pmu_nmi.marked), "perf_event_nmi_handler NMIUNKNOWN");
>  		if (this_nmi != __this_cpu_read(pmu_nmi.marked))
>  			/* let the kernel handle the unknown nmi */
>  			return NOTIFY_DONE;
> @@ -1368,10 +1373,15 @@ perf_event_nmi_handler(struct notifier_b
>  	apic_write(APIC_LVTPC, APIC_DM_NMI);
>  
>  	handled = x86_pmu.handle_irq(args->regs);
> -	if (!handled)
> +	if (!handled) {
> +		UTRACE(40, handled, 0, "perf_event_nmi_handler - not handled");
>  		return NOTIFY_DONE;
> +	}
>  
>  	this_nmi = percpu_read(irq_stat.__nmi_count);
> +	tmp1 = ((unsigned long)this_nmi << 32) | handled;
> +	tmp2 = ((unsigned long)__this_cpu_read(pmu_nmi.marked) << 32) | __this_cpu_read(pmu_nmi.handled);
> +	UTRACE(40, tmp1, tmp2, "perf_event_nmi_handler NMI handled - this/handled pmumarked/handled");
>  	if ((handled > 1) ||
>  		/* the next nmi could be a back-to-back nmi */
>  	    ((__this_cpu_read(pmu_nmi.marked) == this_nmi) &&
> Index: linux/arch/x86/kernel/traps.c
> ===================================================================
> --- linux.orig/arch/x86/kernel/traps.c	2011-03-22 15:10:36.000000000 -0500
> +++ linux/arch/x86/kernel/traps.c	2011-03-24 10:35:15.410168027 -0500
> @@ -31,6 +31,7 @@
>  #include <linux/mm.h>
>  #include <linux/smp.h>
>  #include <linux/io.h>
> +#include <linux/utrace.h>
>  
>  #ifdef CONFIG_EISA
>  #include <linux/ioport.h>
> @@ -371,9 +372,11 @@ io_check_error(unsigned char reason, str
>  static notrace __kprobes void
>  unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
>  {
> +	UTRACE(4, 0, 0, "Unknown NMI handler");
>  	if (notify_die(DIE_NMIUNKNOWN, "nmi", regs, reason, 2, SIGINT) ==
>  			NOTIFY_STOP)
>  		return;
> +	UTRACE(99, 0, 0, "Unknown NMI handler");
>  #ifdef CONFIG_MCA
>  	/*
>  	 * Might actually be able to figure out what the guilty party
> @@ -403,8 +406,12 @@ static notrace __kprobes void default_do
>  	 * NMI, otherwise we may lose it, because the CPU-specific
>  	 * NMI can not be detected/processed on other CPUs.
>  	 */
> -	if (notify_die(DIE_NMI, "nmi", regs, 0, 2, SIGINT) == NOTIFY_STOP)
> +	UTRACE(1, 0, 0, "NMI handler");
> +	if (notify_die(DIE_NMI, "nmi", regs, 0, 2, SIGINT) == NOTIFY_STOP) {
> +		UTRACE(1, 0, 0, "NMI handler OK");
>  		return;
> +	}
> +	UTRACE(3, 0, 0, "NMI handler failed");
>  
>  	/* Non-CPU-specific NMI: NMI sources can be processed on any CPU */
>  	raw_spin_lock(&nmi_reason_lock);
> Index: linux/include/linux/utrace.h
> ===================================================================
> --- /dev/null	1970-01-01 00:00:00.000000000 +0000
> +++ linux/include/linux/utrace.h	2011-03-24 10:30:52.438555195 -0500
> @@ -0,0 +1,14 @@
> +#ifndef _LINUX_UTRACE_H_
> +#define _LINUX_UTRACE_H_
> +
> +
> +extern void (*utrace_func)(int id, unsigned long, unsigned long, const char *);
> +
> +#define UTRACE(id, a, b, c)						\
> +       do {								\
> +               if (unlikely(utrace_func))				\
> +                       (*utrace_func)(id, a, b, c);			\
> +       } while (0)
> +
> +#endif         /* _LINUX_UTRACE_H_ */
> +
> 

  reply	other threads:[~2011-03-24 18:43 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-03-21 16:01 [PATCH] x86, UV: Fix NMI handler for UV platforms Jack Steiner
2011-03-21 16:14 ` Ingo Molnar
2011-03-21 16:26   ` Cyrill Gorcunov
2011-03-21 16:43     ` Cyrill Gorcunov
2011-03-21 17:00       ` Cyrill Gorcunov
2011-03-21 17:08         ` Jack Steiner
2011-03-21 17:19           ` Cyrill Gorcunov
2011-03-21 17:34             ` Jack Steiner
2011-03-21 17:48               ` Cyrill Gorcunov
2011-03-21 17:55                 ` Cyrill Gorcunov
2011-03-21 18:15           ` Cyrill Gorcunov
2011-03-21 18:24             ` Jack Steiner
2011-03-21 17:53       ` Don Zickus
2011-03-21 17:51     ` Don Zickus
2011-03-21 18:00       ` Cyrill Gorcunov
2011-03-21 18:22       ` Jack Steiner
2011-03-21 19:37         ` Don Zickus
2011-03-21 20:37           ` Jack Steiner
2011-03-22 17:11           ` Jack Steiner
2011-03-22 18:44             ` Don Zickus
2011-03-22 20:02               ` Jack Steiner
2011-03-22 21:25               ` Jack Steiner
2011-03-22 22:02                 ` Cyrill Gorcunov
2011-03-23 13:36                   ` Jack Steiner
2011-03-22 22:05                 ` Don Zickus
2011-03-23 16:32                   ` Jack Steiner
2011-03-23 17:53                     ` Don Zickus
2011-03-23 20:00                       ` Don Zickus
2011-03-23 20:41                         ` Cyrill Gorcunov
2011-03-23 20:45                         ` Cyrill Gorcunov
2011-03-23 21:22                           ` Don Zickus
2011-03-23 20:46                         ` Jack Steiner
2011-03-23 21:23                           ` Don Zickus
2011-03-24 17:09                             ` Jack Steiner
2011-03-24 18:43                               ` Don Zickus [this message]
2011-03-21 16:56   ` Jack Steiner
2011-03-21 18:05     ` Ingo Molnar
2011-03-21 19:23       ` [PATCH V2] " Jack Steiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110324184328.GG1239@redhat.com \
    --to=dzickus@redhat.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=gorcunov@gmail.com \
    --cc=hpa@zytor.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=steiner@sgi.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.