From: Jason Wessel <jason.wessel@windriver.com>
To: Don Zickus <dzickus@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <peterz@infradead.org>,
	Robert Richter <robert.richter@amd.com>,
	ying.huang@intel.com, Andi Kleen <andi@firstfloor.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [V2 PATCH 0/6] x86, NMI: give NMI handler a face-lift
Date: Fri, 12 Nov 2010 09:55:53 -0600
Message-ID: <4CDD6389.2080206@windriver.com>
In-Reply-To: <20101112154231.GN4823@redhat.com>

On 11/12/2010 09:42 AM, Don Zickus wrote:
> On Fri, Nov 12, 2010 at 09:05:03AM -0600, Jason Wessel wrote:
>
>> On 11/12/2010 08:43 AM, Don Zickus wrote:
>>
>>> Restructuring the nmi handler to be more readable and simpler.
>>>
>>> This is just laying the ground work for future improvements in this area.
>>>
>>> I also left out one of Huang's patches until we figure out how we are going
>>> to proceed with a new notifier.
>>>
>>> Tested 32-bit and 64-bit on AMD and Intel machines.
>>>
>>> V2:  add a patch to kill DIE_NMI_IPI and add in priorities
>>>
>> Had you tested this code with kgdb boot tests at all?
>>
>> CONFIG_LOCKUP_DETECTOR=y
>> CONFIG_HARDLOCKUP_DETECTOR=y
>> CONFIG_KGDB=y
>> CONFIG_KGDB_TESTS_ON_BOOT=y
>> CONFIG_KGDB_TESTS_BOOT_STRING="V1F100"
>>
>> There has been a regression in kgdb due to the use of perf/NMI in the
>> lockup detector ever since the new version was introduced.  The
>> perf callbacks in the lockup detector were consuming NMI events not
>> related to the callback, causing the kernel debugger not to work at
>> all on SMP systems configured with the lockup detector.
>>
>
> Well 2.6.36 should have fixed that.  Perf was blindly eating all NMI
> events if it had a user.  With the new lockup detector, that created a
> 'user' for perf and it happily ate everything.  But we spent a lot of time
> trying to fix that for 2.6.36.  If we missed something, we would like to
> know.
>
> To answer your question, I doubt this patch series will change that
> outcome if it is still broken.
>

It was most definitely broken in 2.6.36->2.6.37-rc1.  Randy Dunlap
pointed this out in a separate exchange that was not on LKML.

The symptom you would see looks like:

...kernel boot...
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
brd: module loaded
kgdb: Registered I/O driver kgdbts.
kgdbts:RUN plant and detach test
[...HARD HANG STARTS HERE...]

At that point the master kgdb CPU is looping, waiting for all the slave
CPUs to join the debugger.  That never happens, because the perf callback
chain used by the lockup detector eats the NMI IPI event: once the perf
callback has been processed, perf returns NOTIFY_STOP, so the notifier
that brings the slave CPU into the debugger never fires.
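
To make the failure mode concrete, here is a rough user-space sketch of
a notifier chain in C.  It is only a model under stated assumptions, not
the kernel's notifier code: the names model_handler_fn,
greedy_perf_handler, careful_perf_handler, debugger_handler and
deliver_nmi, as well as the MODEL_* return values, are made up for
illustration, and the real chain passes much richer arguments.

```c
/* Simplified user-space model of an NMI notifier chain (NOT kernel code). */
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the notifier return codes. */
#define MODEL_DONE 0	/* "not mine", keep walking the chain */
#define MODEL_STOP 1	/* "handled", stop walking the chain */

/* Who actually raised the NMI in this model. */
enum nmi_source { SRC_PERF_COUNTER, SRC_DEBUGGER_IPI };

typedef int (*model_handler_fn)(enum nmi_source src);

/* Models the broken behavior: claims every NMI once perf has a user. */
int greedy_perf_handler(enum nmi_source src)
{
	(void)src;
	return MODEL_STOP;
}

/* Models a handler that only claims NMIs it actually caused. */
int careful_perf_handler(enum nmi_source src)
{
	return src == SRC_PERF_COUNTER ? MODEL_STOP : MODEL_DONE;
}

/* Models the notifier that pulls a slave CPU into the debugger. */
int debugger_handler(enum nmi_source src)
{
	return src == SRC_DEBUGGER_IPI ? MODEL_STOP : MODEL_DONE;
}

/*
 * Walk the chain in order.  Returns the index of the handler that
 * claimed the event, or -1 if nobody did.  Handlers after the one
 * that returns MODEL_STOP are never called.
 */
int deliver_nmi(const model_handler_fn *chain, size_t n, enum nmi_source src)
{
	assert(chain != NULL);

	for (size_t i = 0; i < n; i++) {
		if (chain[i](src) == MODEL_STOP)
			return (int)i;
	}
	return -1;
}
```

With a chain of { greedy_perf_handler, debugger_handler }, delivering
SRC_DEBUGGER_IPI returns index 0: the perf stand-in claims the IPI and
the debugger handler is never reached, which is the hang described
above.  Swapping in careful_perf_handler lets the same event fall
through to index 1.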

You can even reproduce the behavior by booting a kernel with the kgdb
tests enabled under kvm with -smp 2.

I did build with your 6 part series, and the behavior is no different
(meaning it is still broken).

Jason.
