From: Don Zickus <dzickus@redhat.com>
To: Jason Wessel <jason.wessel@windriver.com>
Cc: Ingo Molnar <mingo@elte.hu>,
	Peter Zijlstra <peterz@infradead.org>,
	Robert Richter <robert.richter@amd.com>,
	ying.huang@intel.com, Andi Kleen <andi@firstfloor.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <fweisbec@gmail.com>
Subject: Re: [V2 PATCH 0/6] x86, NMI: give NMI handler a face-lift
Date: Tue, 16 Nov 2010 13:43:25 -0500
Message-ID: <20101116184325.GB4823@redhat.com>
In-Reply-To: <20101112172755.GR4823@redhat.com>

On Fri, Nov 12, 2010 at 12:27:55PM -0500, Don Zickus wrote:

Hi Jason,

> 
> > 
> > I tested 2.6.35 and it does not hard hang, but suffered from a different
> > problem with a perf API change.   The kgdb tests appear to loop and loop
> > emitting endless streams of output in 2.6.35 and I already have that
> > problem patched.

I keep getting the following stack trace, which is different from your
hang.  Is the looping I am seeing caused by the NMI or by kgdb?

Cheers,
Don

> 
> It doesn't look like this, does it?  This is the streaming output I see
> when I try to reproduce this using the config suggestions you gave me.
> 
> [    7.778578] ------------[ cut here ]------------
> [    7.778580] WARNING: at /ssd/dzickus/git/upstream/drivers/misc/kgdbts.c:702 run_simple_test+0x18d/0x2f0()
> [    7.778582] Hardware name: To be filled by O.E.M.
> [    7.778583] Modules linked in: ata_generic i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mod
> [    7.778589] Pid: 150, comm: udevd Tainted: G        W   2.6.36-killnmi+ #12
> [    7.778590] Call Trace:
> [    7.778591]  <#DB>  [<ffffffff810631cf>] warn_slowpath_common+0x7f/0xc0
> [    7.778595]  [<ffffffff8106322a>] warn_slowpath_null+0x1a/0x20
> [    7.778598]  [<ffffffff8132941d>] run_simple_test+0x18d/0x2f0
> [    7.778600]  [<ffffffff81328ded>] kgdbts_put_char+0x1d/0x20
> [    7.778603]  [<ffffffff810c6cbd>] put_packet+0x5d/0x120
> [    7.778605]  [<ffffffff810c7f44>] gdb_serial_stub+0xa24/0xc20
> [    7.778609]  [<ffffffff810c6558>] kgdb_cpu_enter+0x2c8/0x590
> [    7.778612]  [<ffffffff810c6a91>] kgdb_handle_exception+0x121/0x170
> [    7.778615]  [<ffffffff814cd7b8>] ?  hw_breakpoint_exceptions_notify+0xe8/0x1d0
> [    7.778617]  [<ffffffff81033472>] __kgdb_notify+0x82/0x1b0
> [    7.778620]  [<ffffffff810335c7>] kgdb_notify+0x27/0x40
> [    7.778623]  [<ffffffff814cf8e5>] notifier_call_chain+0x55/0x80
> [    7.778625]  [<ffffffff814cf958>] __atomic_notifier_call_chain+0x48/0x70
> [    7.778628]  [<ffffffff814cf996>] atomic_notifier_call_chain+0x16/0x20
> [    7.778631]  [<ffffffff814cf9ce>] notify_die+0x2e/0x30
> [    7.778633]  [<ffffffff814cc953>] do_debug+0xa3/0x170
> [    7.778636]  [<ffffffff814cc438>] debug+0x28/0x40
> [    7.778639]  [<ffffffff81062310>] ? do_fork+0x0/0x450
> [    7.778640]  <<EOE>>  [<ffffffff81014938>] ? sys_clone+0x28/0x30
> [    7.778644]  [<ffffffff8100c4d3>] stub_clone+0x13/0x20
> [    7.778647]  [<ffffffff8100c1b2>] ? system_call_fastpath+0x16/0x1b
> [    7.778649] ---[ end trace ecf07e0cd1846c34 ]---
> [    7.778650] kgdbts: ERROR: beyond end of test on 'do_fork_test' line 11
> [    7.778651] ------------[ cut here ]------------
> 
> > 
> > At this point we have to get back to a working baseline.  If you use
> > 2.6.37-rc1, the last remaining problem is the perf + lockup detector
> > callback eating the injected DIE_NMI event, which is meant to enter
> > the debugger.
> 
> This shouldn't be too hard to solve once we figure out which path it takes
> in the perf nmi handler.
> 
> Cheers,
> Don
> 
> > 
> > 
> > >> The symptom you would see looks like:
> > >>
> > >> ...kernel boot...
> > >> Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> > >> serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> > >> 00:06: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
> > >> brd: module loaded
> > >> kgdb: Registered I/O driver kgdbts.
> > >> kgdbts:RUN plant and detach test
> > >> [...HARD HANG STARTS HERE...]
> > >>
> > >> The kernel is looping at that point waiting for the master kgdb cpu to
> > >> have all the slaves join the debugger but it never happens because the
> > >> perf callback chain which is used by the lockup detector eats the NMI
> > >> IPI event.  After the perf callback is processed perf returns
> > >> NOTIFY_STOP so the notifier which brings the slave CPU into the debugger
> > >> never fires.
> > >>     
> > >
> > > Ok.  We have code to handle extra spurious NMIs, because it is hard to
> > > accurately determine whether an NMI was for perf or for someone else.
> > > This logic may still need tweaking.  What cpu are you running on?
> > > AMD/Intel?  If Intel, then core/core2/nehalem?
> > >
> > >   
> > 
> > In this case I just built a 32 bit kernel and ran it under kvm on a 64
> > bit host.  I can send you the .config separately.
> > 
> > kvm  -nographic -k en-us -kernel arch/x86/boot/bzImage -net user -net
> > nic,macaddr=52:54:00:12:34:56,model=i82557b -append
> > "console=ttyS0,115200 ip=dhcp root=/dev/nfs
> > nfsroot=10.0.2.2:/space/exp/x86 rw acpi=force UMA=1" -smp 2
> 
> Does that mean you hit the problem on the kvm guest or on the host?  I
> wasn't aware that perf worked inside the guest (well, at least the
> hardware pieces of it, like NMI).
> 
> Cheers,
> Don
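
For readers following the quoted explanation of the hang, here is a minimal
user-space sketch of how an early handler on a shared notifier chain can
swallow an event meant for a later one.  It is not kernel code: every name
in it (sketch_ret, sketch_event, perf_greedy, perf_selective, kgdb_like,
sketch_deliver) is hypothetical, SKETCH_DONE and SKETCH_STOP only stand in
for NOTIFY_DONE and NOTIFY_STOP, and the "selective" variant illustrates the
general idea rather than the fix discussed in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's NOTIFY_DONE / NOTIFY_STOP. */
enum sketch_ret { SKETCH_DONE, SKETCH_STOP };

/* Hypothetical event sources sharing the one NMI notifier chain. */
enum sketch_event { EV_PERF_OVERFLOW, EV_DEBUGGER_IPI };

/* Which handler ended up consuming the event. */
enum sketch_owner { OWNER_NONE = -1, OWNER_PERF = 0, OWNER_KGDB = 1 };

typedef enum sketch_ret (*sketch_fn)(enum sketch_event ev);

/* Perf-like handler that claims every NMI it is shown. */
static enum sketch_ret perf_greedy(enum sketch_event ev)
{
    (void)ev;
    return SKETCH_STOP;
}

/* Perf-like handler that claims only its own events. */
static enum sketch_ret perf_selective(enum sketch_event ev)
{
    return ev == EV_PERF_OVERFLOW ? SKETCH_STOP : SKETCH_DONE;
}

/* Debugger-like handler that wants the roundup IPI. */
static enum sketch_ret kgdb_like(enum sketch_event ev)
{
    return ev == EV_DEBUGGER_IPI ? SKETCH_STOP : SKETCH_DONE;
}

/* Walk the chain and report which handler stopped it, if any. */
static enum sketch_owner sketch_deliver(sketch_fn perf, enum sketch_event ev)
{
    /* Array order is call order: perf first, then the debugger. */
    sketch_fn chain[] = { perf, kgdb_like };
    size_t i;

    for (i = 0; i < sizeof(chain) / sizeof(chain[0]); i++) {
        if (chain[i](ev) == SKETCH_STOP)
            return (enum sketch_owner)i;
    }
    return OWNER_NONE;
}
```

Calling sketch_deliver(perf_greedy, EV_DEBUGGER_IPI) reports OWNER_PERF, so
the debugger-like handler never runs, which mirrors the symptom described
above.  With perf_selective the same event falls through to kgdb_like.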
