public inbox for linux-kernel@vger.kernel.org
From: Jack Steiner <steiner@sgi.com>
To: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	tglx@linutronix.de, hpa@zytor.com, x86@kernel.org,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [PATCH] x86, UV: Fix NMI handler for UV platforms
Date: Wed, 23 Mar 2011 08:36:04 -0500
Message-ID: <20110323133604.GA21288@sgi.com>
In-Reply-To: <4D891C93.8070502@gmail.com>

On Wed, Mar 23, 2011 at 01:02:59AM +0300, Cyrill Gorcunov wrote:
> On 03/23/2011 12:25 AM, Jack Steiner wrote:
> > On Tue, Mar 22, 2011 at 02:44:50PM -0400, Don Zickus wrote:
> >> On Tue, Mar 22, 2011 at 12:11:18PM -0500, Jack Steiner wrote:
> >>> How certain are you that multiple NMIs triggered at about the same time will
> >>> deliver discrete NMI events? I updated the patch so that I'm running with:
> >>
> >> I think as long as there isn't more than two (1 active, 1 latched), you
> >> would be ok.  A third one looks like it would get dropped.
> >>
> >>>
> >>> 	- no special code in traps.c (I removed the traps.c code that was
> >>> 	  in the patch I posted)
> >>> 	- used die_notifier for calling the UV nmi handler
> >>> 	- UV priority is higher than the hw_perf priority
> >>>
> >>> Both hw_perf (perf top) & UV NMIs work correctly under light loads. However, if I
> >>> run for 10 - 15 minutes injecting UV NMIs at a rate of about 30/min, "perf top"
> >>> stops generating output. Strace shows that it continues to poll() but no data
> >>> is received.
> >>
> >> That's a low frequency and it still gets stuck?
> >>
> >>>
> >>> While "perf top" is hung, if I inject an NMI into the system in a way that will NOT
> >>> be consumed by the UV nmi handler, "perf top" resumes output but will stop again after
> >>> a few minutes.
> >>
> >> So that means the PMU set its interrupt bit but the cpu failed to get the
> >> NMI.
> >>
> >>>
> >>>
> >>> AFAICT, the UV nmi handler is not consuming extra NMI interrupts. I can't
> >>> rule out that I'm missing something but I don't see it.
> >>
> >> What happens if you put the UV nmi handler below the hw_perf handler in
> >> priority?  I assume the DIE_NMIUNKNOWN snippet in the hw_perf handler will
> >> swallow some of the UV NMIs, but more importantly does it still generate
> >> the hang you see?
> > 
> > I verified that the failures ("perf top" stops) are the same on both RHEL6.1 & the
> > latest x86 2.6.38+ tree.
> > 
> > I switched priorities & as expected, "perf top" no longer hangs. I see an occasional
> > missed UV NMI - about 1 every minute. I also see a few "dazed" messages as
> > well - 3 in a 5 minute period. This testing was done on a 2.6.38+ kernel.
> > 
> > I'm running on a 48p system.
> > 
> > Ideas?
> > 
> 
>   I fear there is always a chance of an eaten nmi (due to the in-flight nmi
> logic we have) or a missed nmi (due to non-instant delivery of the nmi). Say,
> the following scenario may happen:
> 
> 1) perf-nmi-0 (from counter 0) issued
> 2) uv-nmi issued
> 3) perf-nmi-0 latched
> 4) perf-nmi-1 (from counter 1) not yet issued but counter overflowed
> 5) nmi-handler
> 6) uv-nmi-latched
> 7) nmi-handler eats both nmis from perf-nmi-0 and uv-nmi because of in-flight
>    nmi logic we have
> 8) finally perf-nmi-1 should appear on line but counter already pulled down so
>    no nmi
> 
> and here you miss the nmi you expect from uv. I *guess*; not sure if it's possible.

Makes sense.
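
The latching limit mentioned earlier in the thread (one NMI active, one
latched, a third dropped) can be illustrated with a toy model. This is only a
sketch under stated assumptions: the NmiLine class and its method names are
hypothetical and not kernel code, the one-deep latch is taken from Don's
description above, and the source names are borrowed from Cyrill's scenario.
It does not model the in-flight NMI logic in the perf handler.

```python
class NmiLine:
    """Toy model: one NMI in service plus a one-deep latch (assumption)."""

    def __init__(self):
        self.active = False   # an NMI handler is currently running
        self.latched = False  # one further NMI is pending delivery
        self.delivered = 0    # handler invocations so far
        self.dropped = 0      # NMIs that arrived while the latch was full

    def raise_nmi(self):
        if not self.active:
            # CPU is idle with respect to NMIs: deliver immediately.
            self.active = True
            self.delivered += 1
        elif not self.latched:
            # Handler is running: remember exactly one pending NMI.
            self.latched = True
        else:
            # Latch already full: this NMI is lost.
            self.dropped += 1

    def handler_return(self):
        if self.latched:
            # The latched NMI is delivered as soon as the handler returns.
            self.latched = False
            self.delivered += 1
        else:
            self.active = False


line = NmiLine()
# Three sources fire before the first handler invocation returns.
for source in ("perf-nmi-0", "uv-nmi", "perf-nmi-1"):
    line.raise_nmi()
line.handler_return()  # first handler finishes, latched NMI is delivered
line.handler_return()  # second handler finishes, nothing left pending
print(line.delivered, line.dropped)
```

In this model three sources produce only two handler invocations, so whichever
handler runs must check every source it owns instead of assuming one NMI per
event.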


> If you disable nmi-watchdog on the boot line, does it help?

Nmi_watchdog is disabled by default on our platforms.

