From: "Gregory K. Ruiz-Ade" <gregory@castandcrew.com>
To: Keith Owens <kaos@ocs.com.au>
Cc: linux-kernel@vger.kernel.org
Subject: Re: system lockup issues w/ 2.4.19
Date: Mon, 10 Mar 2003 10:12:01 -0800
Message-ID: <200303101012.02294.gregory@castandcrew.com>
In-Reply-To: <13694.1047106361@ocs3.intra.ocs.com.au>
On Friday 07 March 2003 22:52, Keith Owens wrote:
> Those symptoms do not necessarily mean a full process table. You get
> exactly those symptoms if some code has grabbed a spin lock related to
> process creation and not released it.
Hmm... Well, it happened again on Friday night. Poring over the syslogs,
I see that sendmail started refusing mail due to a load average of 18 and
then 19... This system, even under its heaviest use, never breaks a
system load average of 4-5. Would a "stuck" spinlock result in an
artificial inflation of system load averages (as a symptom)?
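
To make the load-average question concrete, here is a minimal sketch of how the 1-minute figure would climb if a fixed number of tasks stayed permanently counted as active (for example, spinning on a lock or stuck in uninterruptible sleep). It assumes the usual fixed-point exponential average updated every 5 seconds; the constants are what I recall from include/linux/sched.h in 2.4 and should be checked against the tree. The names calc_load() and simulate() are illustrative only, not kernel code.

```python
FSHIFT = 11              # bits of fixed-point precision (assumed, per sched.h)
FIXED_1 = 1 << FSHIFT    # 1.0 in fixed point
EXP_1 = 1884             # 1/exp(5 sec / 1 min) in fixed point (assumed)
SAMPLES_PER_MINUTE = 12  # one sample every 5 seconds


def calc_load(load, exp, active):
    """One 5-second update of the damped average; all values fixed point."""
    load = load * exp + active * (FIXED_1 - exp)
    return load >> FSHIFT


def simulate(stuck_tasks, minutes, start=0.0):
    """Return the 1-minute load after `minutes` with `stuck_tasks` always active."""
    load = int(start * FIXED_1)
    active = stuck_tasks * FIXED_1
    for _ in range(minutes * SAMPLES_PER_MINUTE):
        load = calc_load(load, EXP_1, active)
    return load / FIXED_1


if __name__ == "__main__":
    for minutes in (1, 5, 10):
        print(minutes, "min:", round(simulate(18, minutes), 2))
```

Under these assumptions the average converges toward the number of stuck tasks within a few minutes, so a load of 18-19 would be consistent with roughly that many tasks piling up behind one held lock rather than with real work being done.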
> You need kernel debugging features to find out which lock is the
> problem. Booting with nmi_watchdog and a serial console (see
> linux/Documentation) will often tell you what has hung.
I'm building a 2.4.20 kernel using sources from kernel.org, and turning on
the following options:
-->8--[Cut Here (.config)]-->8--
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_HIGHMEM=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_IOVIRT is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_FRAME_POINTER=y
-->8--[Cut Here (.config)]-->8--
Should I enable the other two, as well?
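
For reference, "the other two" are the options still marked "is not set" in the fragment above. A small sketch like the following can list them from a .config, assuming only the standard "CONFIG_X=value" and "# CONFIG_X is not set" line formats; parse_config() and unset_options() are hypothetical helper names.

```python
import re

# The "Kernel hacking" fragment quoted in this message.
FRAGMENT = """\
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_DEBUG_HIGHMEM=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_IOVIRT is not set
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_FRAME_POINTER=y
"""


def parse_config(text):
    """Map each option name to its value, or to None when it is not set."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def unset_options(text):
    """Return the sorted names of options that are explicitly not set."""
    return sorted(name for name, value in parse_config(text).items()
                  if value is None)


if __name__ == "__main__":
    print(unset_options(FRAGMENT))
```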
Also, I've wired up the serial console on this machine to another machine
(that's much more stable) so that I can access it remotely. Should I try
to set something up that simply monitors the serial console constantly and
logs it to a file, or will I be able to get the info I need via a program
like minicom by poking the kernel after the fact?
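
A minimal sketch of the "monitor constantly and log to a file" option, to run on the stable machine. It assumes the serial port has already been configured (speed, raw mode) with something like stty and that the device path and log file name shown in the comment are placeholders; log_console() is a hypothetical helper, not an existing tool. The appeal of this approach is that anything the kernel prints just before a hang is already on disk, whether or not the box still responds afterwards.

```python
import time


def log_console(stream, out, clock=time.time):
    """Copy lines from a binary stream to a text log, prefixing a timestamp.

    Returns when the stream reports end-of-file.
    """
    for raw in iter(stream.readline, b""):
        text = raw.decode("ascii", errors="replace").rstrip("\r\n")
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(clock()))
        out.write(f"{stamp} {text}\n")
        out.flush()  # keep the log current in case the logger is interrupted


# Example use (the paths are assumptions, adjust to the local setup):
#
#   with open("/dev/ttyS0", "rb", buffering=0) as tty, \
#           open("console.log", "a") as log:
#       log_console(tty, log)
```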
> The kdb patch (ftp://oss.sgi.com/projects/kdb/download/v3.0) will let
> you print the state of each process and find out where they are
> spinning. Note: kdb patches are against standard kernels, ask your
> distributor about how to patch the distributor's kernel with kdb.
I'll add this in to the kernel as well. Hopefully I'll be able to get some
more useful information out of the system the next time this happens.
Thanks for all the pointers!
Gregory
--
Gregory K. Ruiz-Ade <gregory@castandcrew.com>
Sr. Systems Administrator
Cast & Crew Entertainment Services, Inc.