* how to debug a deadlock'ed kernel?
@ 2001-12-11 19:57 Brian Horton
2001-12-11 19:58 ` Bruce Harada
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Brian Horton @ 2001-12-11 19:57 UTC (permalink / raw)
To: linux kernel
Anyone got any good tips on how to debug a SMP system that is locked up
in a deadlock situation in the kernel? I'm working on a kernel module,
and after some number of hours of stress testing, the box locks up. None
of the sysrq options show anything on the display, though the reBoot
option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
hang for me on 2.4, so I need to debug it here...
Any hints?
thx.bri.
* Re: how to debug a deadlock'ed kernel?
2001-12-11 19:57 how to debug a deadlock'ed kernel? Brian Horton
@ 2001-12-11 19:58 ` Bruce Harada
2001-12-11 21:15 ` Oliver Xymoron
2001-12-11 21:41 ` george anzinger
2 siblings, 0 replies; 7+ messages in thread
From: Bruce Harada @ 2001-12-11 19:58 UTC (permalink / raw)
To: Brian Horton; +Cc: linux-kernel
On Tue, 11 Dec 2001 13:57:52 -0600
Brian Horton <go_gators@mail.com> wrote:
> Anyone got any good tips on how to debug a SMP system that is locked up
> in a deadlock situation in the kernel? I'm working on a kernel module,
> and after some number of hours of stress testing, the box locks up. None
> of the sysrq options show anything on the display, though the reBoot
> option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> hang for me on 2.4, so I need to debug it here...
>
> Any hints?
Try using a serial console (activate it in your kernel config and hook up
another PC to the serial port - if it's oopsing, you should see the oops over
the serial line.)
Also, I believe you can use kdb via serial as well (although I've never tried).
Bruce
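
A minimal sketch of the capture side of the serial-console setup suggested above, assuming the second PC reads the serial line as a byte stream. The function name, the device path and the log file name are hypothetical, and the two marker strings are just common openings of an i386 oops report. On the target, the console is typically directed to the serial port with something like console=ttyS0,9600 on the kernel command line; check Documentation/serial-console.txt for your kernel.

```python
import io
import time


def capture_console(stream, log, clock=time.time):
    """Copy console lines from `stream` to `log`, one timestamp per line.

    Returns the lines that look like the start of an oops report.
    """
    oops_lines = []
    for raw in stream:
        line = raw.decode("ascii", errors="replace").rstrip("\r\n")
        log.write("%.3f %s\n" % (clock(), line))
        log.flush()  # keep the log current if the capture is interrupted
        if "Oops" in line or "Unable to handle kernel" in line:
            oops_lines.append(line)
    return oops_lines


# Hypothetical usage on the capturing host (the port must already be set
# to the same speed as the target's console, e.g. with stty):
#
#   with open("/dev/ttyS0", "rb", buffering=0) as port, \
#        open("console.log", "a") as log:
#       capture_console(port, log)
```

The timestamps are there so the last line written before the lockup can be matched against the stress-test log.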
* Re: how to debug a deadlock'ed kernel?
2001-12-11 19:57 how to debug a deadlock'ed kernel? Brian Horton
2001-12-11 19:58 ` Bruce Harada
@ 2001-12-11 21:15 ` Oliver Xymoron
2001-12-11 21:41 ` george anzinger
2 siblings, 0 replies; 7+ messages in thread
From: Oliver Xymoron @ 2001-12-11 21:15 UTC (permalink / raw)
To: Brian Horton; +Cc: linux kernel
On Tue, 11 Dec 2001, Brian Horton wrote:
> Anyone got any good tips on how to debug a SMP system that is locked up
> in a deadlock situation in the kernel? I'm working on a kernel module,
> and after some number of hours of stress testing, the box locks up. None
> of the sysrq options show anything on the display, though the reBoot
> option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> hang for me on 2.4, so I need to debug it here...
You might try Keith Owens' kdb. When the box locks up, hit <pause>, which
brings up a kdb prompt. From there you can do backtraces, memory
examination, and disassembly on either processor.
It's often quite helpful to modify your test to narrow down what is making
it crash and/or make it happen faster: reads vs. writes, short vs. long
packets, etc.
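
One way to act on the narrowing advice is to run each workload variant on its own and record it durably before it starts, so the last entry in the log after a lockup names the suspect. This is only a sketch: run_case stands in for whatever exercises the module, and the operation names and packet sizes are made-up placeholders.

```python
import itertools
import os


def run_matrix(run_case, log_path, ops=("read", "write"), sizes=(64, 1500)):
    """Run each (op, size) combination alone, logging it to disk first."""
    with open(log_path, "a") as log:
        for op, size in itertools.product(ops, sizes):
            log.write("START op=%s size=%d\n" % (op, size))
            log.flush()
            os.fsync(log.fileno())  # the box may lock up during this case
            run_case(op, size)
            log.write("DONE  op=%s size=%d\n" % (op, size))
            log.flush()
            os.fsync(log.fileno())


def last_unfinished(log_path):
    """Return the last START entry with no matching DONE, or None."""
    pending = None
    with open(log_path) as log:
        for line in log:
            kind, _, rest = line.strip().partition(" ")
            if kind == "START":
                pending = rest.strip()
            elif kind == "DONE" and pending == rest.strip():
                pending = None
    return pending
```

After a reboot, last_unfinished() on the same log gives the combination that was running when the box hung. Logging to another machine (over the serial line, for instance) is safer than trusting the local disk of a box that is about to lock up.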
--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
* Re: how to debug a deadlock'ed kernel?
2001-12-11 19:57 how to debug a deadlock'ed kernel? Brian Horton
2001-12-11 19:58 ` Bruce Harada
2001-12-11 21:15 ` Oliver Xymoron
@ 2001-12-11 21:41 ` george anzinger
2001-12-11 23:41 ` Brian Horton
2 siblings, 1 reply; 7+ messages in thread
From: george anzinger @ 2001-12-11 21:41 UTC (permalink / raw)
To: Brian Horton; +Cc: linux kernel
Brian Horton wrote:
>
> Anyone got any good tips on how to debug a SMP system that is locked up
> in a deadlock situation in the kernel? I'm working on a kernel module,
> and after some number of hours of stress testing, the box locks up. None
> of the sysrq options show anything on the display, though the reBoot
> option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> hang for me on 2.4, so I need to debug it here...
>
> Any hints?
First read about the NMI boot option in Documentation/nmi_watchdog.txt.
If you have this turned on and are not oopsing, then the timer (at
least) is interrupting. The next step I would take would be to use
either kdb (no experience) or kgdb. I have my own version of this if
you are interested. It does, however, require an RS232 (serial)
connection to a host machine.
I don't know about kdb, but kgdb (my version) uses the NMI to trap the
other cpus and also traps NMIs on the way to oopsing.
--
George george@mvista.com
High-res-timers: http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
* Re: how to debug a deadlock'ed kernel?
2001-12-11 21:41 ` george anzinger
@ 2001-12-11 23:41 ` Brian Horton
2001-12-12 0:25 ` Oliver Xymoron
0 siblings, 1 reply; 7+ messages in thread
From: Brian Horton @ 2001-12-11 23:41 UTC (permalink / raw)
To: george anzinger; +Cc: linux kernel
Thanks, I'll try the nmi_watchdog option out. It appears to not be in
the 2.2.14 kernel, but is in a 2.2.19 kernel that I have from RedHat.
Is there a version of kgdb that works with 2.2.x kernels? I only see 2.4
kernels on the web page (http://kgdb.sourceforge.net)
thx.bri.
george anzinger wrote:
>
> Brian Horton wrote:
> >
> > Anyone got any good tips on how to debug a SMP system that is locked up
> > in a deadlock situation in the kernel? I'm working on a kernel module,
> > and after some number of hours of stress testing, the box locks up. None
> > of the sysrq options show anything on the display, though the reBoot
> > option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> > hang for me on 2.4, so I need to debug it here...
> >
> > Any hints?
>
> First read about the NMI boot option in Documentation/nmi_watchdog.txt.
> If you have this turned on and are not oopsing, then the timer (at
> least) is interrupting. The next step I would take would be to use
> either kdb (no experience) or kgdb. I have my own version of this if
> you are interested. It does, however, require an RS232 (serial)
> connection to a host machine.
>
> I don't know about kdb, but kgdb (my version) uses the NMI to trap the
> other cpus and also traps NMIs on the way to oopsing.
> --
> George george@mvista.com
> High-res-timers: http://sourceforge.net/projects/high-res-timers/
> Real time sched: http://sourceforge.net/projects/rtsched/
* Re: how to debug a deadlock'ed kernel?
2001-12-11 23:41 ` Brian Horton
@ 2001-12-12 0:25 ` Oliver Xymoron
2001-12-14 21:35 ` Brian Horton
0 siblings, 1 reply; 7+ messages in thread
From: Oliver Xymoron @ 2001-12-12 0:25 UTC (permalink / raw)
To: Brian Horton; +Cc: george anzinger, linux kernel
On Tue, 11 Dec 2001, Brian Horton wrote:
> Thanks, I'll try the nmi_watchdog option out. It appears to not be in
> the 2.2.14 kernel, but is in a 2.2.19 kernel that I have from RedHat.
It's there by default in SMP; you just have to enable it with
nmi_watchdog=1 at boot (or something). I didn't mention it because it
probably won't help your problem: since you can use the magic sysrq keys to
reboot, you are not deadlocked with interrupts off, so the timer
interrupt will keep the NMI watchdog from ever firing. NMI watchdog is
mostly of use when you can't even get the capslock light to toggle.
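
The reasoning above can be illustrated with a toy model, which is not the kernel's code: the watchdog NMI arrives periodically, checks whether the CPU has handled any timer interrupts since the last NMI, and complains only after several checks in a row see none. The function name and the threshold of 5 are assumptions for illustration.

```python
def watchdog_fires(timer_ticks_per_nmi, threshold=5):
    """Toy model of a timer-based NMI watchdog on one CPU.

    `timer_ticks_per_nmi` lists how many timer interrupts the CPU handled
    between consecutive watchdog NMIs. The watchdog fires only after
    `threshold` NMIs in a row see no timer progress.
    """
    stalled = 0
    for ticks in timer_ticks_per_nmi:
        if ticks == 0:
            stalled += 1
            if stalled >= threshold:
                return True
        else:
            stalled = 0  # timer is still being serviced, so reset
    return False


# Deadlock with interrupts enabled: timer ticks keep arriving.
print(watchdog_fires([100] * 20))
# Lockup with interrupts off: no ticks for several NMIs in a row.
print(watchdog_fires([100, 100] + [0] * 5))
```

In the first case the function returns False however long the deadlock lasts, which matches the situation where sysrq still works.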
> > > Anyone got any good tips on how to debug a SMP system that is locked up
> > > in a deadlock situation in the kernel? I'm working on a kernel module,
> > > and after some number of hours of stress testing, the box locks up. None
> > > of the sysrq options show anything on the display, though the reBoot
> > > option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> > > hang for me on 2.4, so I need to debug it here...
--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."
* Re: how to debug a deadlock'ed kernel?
2001-12-12 0:25 ` Oliver Xymoron
@ 2001-12-14 21:35 ` Brian Horton
0 siblings, 0 replies; 7+ messages in thread
From: Brian Horton @ 2001-12-14 21:35 UTC (permalink / raw)
To: Oliver Xymoron; +Cc: linux kernel
Yup, the nmi_watchdog option didn't work, but I was able to get
information I needed with kdb!
thx! .bri.
Oliver Xymoron wrote:
>
> On Tue, 11 Dec 2001, Brian Horton wrote:
>
> > Thanks, I'll try the nmi_watchdog option out. It appears to not be in
> > the 2.2.14 kernel, but is in a 2.2.19 kernel that I have from RedHat.
>
> It's there by default in SMP; you just have to enable it with
> nmi_watchdog=1 at boot (or something). I didn't mention it because it
> probably won't help your problem: since you can use the magic sysrq keys to
> reboot, you are not deadlocked with interrupts off, so the timer
> interrupt will keep the NMI watchdog from ever firing. NMI watchdog is
> mostly of use when you can't even get the capslock light to toggle.
>
> > > > Anyone got any good tips on how to debug a SMP system that is locked up
> > > > in a deadlock situation in the kernel? I'm working on a kernel module,
> > > > and after some number of hours of stress testing, the box locks up. None
> > > > of the sysrq options show anything on the display, though the reBoot
> > > > option does reboot the system. RedHat 6.2 and its 2.2.14 kernel. Doesn't
> > > > hang for me on 2.4, so I need to debug it here...
>
> --
> "Love the dolphins," she advised him. "Write by W.A.S.T.E.."