From: David Golombek <daveg@permabit.com>
To: Benjamin LaHaise <bcrl@kvack.org>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.4.31 hangs, no information on console or serial port
Date: 27 Feb 2006 11:24:10 -0500
Message-ID: <7yfym4lqhh.fsf@questionably-configured.permabit.com>
In-Reply-To: <20060221152949.GA31273@kvack.org>
> On Tue, Feb 21, 2006 at 10:23:56AM -0500, David Golombek wrote:
> > I have a box running a modified Debian/woody system and 2.4.31. It is
> > intermittently hanging such that:
> >
> > * All logging to /var/log ceases.
> > * Machine is still pingable.
> > * Machine can be telneted to on time port, but no time is echoed.
> > * After attaching a console+keyboard, console would not unblank.
> > * Nothing responded when attaching a serial console.
> > * Machine does not respond to Ctrl-Alt-Del
> > * No DMI messages are logged.
> > * Hang is persistent until physical reboot.
> >
> > This has happened 4 times, on 2 separate machines (under roughly
> > similar conditions). Machines are up variable amounts of time before
> > crashing, between many weeks and less than 1 day. Nothing unusual is
> > logged in /var/log/{daemon.log,kern.log,messages,syslog} prior to the
> > hang, except that /var/log/messages includes the "TCP: Treason
> > uncloaked!" warnings that are fixed in 2.4.32. No users were logged
> > on at the time of 3 of the 4 crashes, and no local user activity was
> > present at the time of the 4th.
> >
> > The machines are Intel P4s with 2GB of memory.
> >
> > The machine is under relatively high load and has a custom userspace
> > nfs server running on it (which is potentially to blame, but we've
> > been unable to determine how). The custom userspace nfs server and
> > tomcat4 are the primary applications running.
> >
> > Any suggestions as to how we might debug this or possible causes would
> > be greatly appreciated.
>
> Benjamin LaHaise <bcrl@kvack.org> writes:
> Have you tried turning on the NMI watchdog (nmi_watchdog=1)? It
> should be able to kick the machine out of the locked state, as these
> symptoms would hint at a spinlock deadlock with interrupts disabled.
> Also, try to reproduce on the latest 2.4.33pre. That said, for an
> I/O-intensive workload like you're running, 2.6 is much better,
> especially for systems using highmem.
After a week of intensive testing, we were finally able to reproduce
this hang. Sadly, the NMI watchdog did not appear to trigger (I'm
pretty sure it was configured correctly; I did see NMIs occurring).
No information appeared on the serial port or console (although this
time they weren't blanked). We're building a 2.4.33pre kernel now to
see whether we can still reproduce the hang on it.
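For anyone wanting to check that NMIs are being delivered, here is a
minimal sketch of one way to do it, not necessarily how it was done on
these machines: sample the NMI line of /proc/interrupts twice and see
whether any per-CPU counter has advanced. The function names are
hypothetical, and the parser assumes the usual "NMI:" row followed by
one counter per CPU.

```python
def nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters from /proc/interrupts-style text.

    Assumes a row that starts with "NMI:" followed by one integer per CPU.
    Returns an empty list if no such row is present.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at the first non-numeric column; some kernels
                # append a textual description after the counters.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []


def nmis_advancing(before, after):
    """Return True if any CPU's NMI counter increased between two samples."""
    return any(b < a for b, a in zip(before, after))
```

To use it, read /proc/interrupts, wait a few seconds, read it again,
and pass the two nmi_counts() results to nmis_advancing(). Advancing
counters only show that NMIs are arriving; they do not prove that the
watchdog would fire on a lockup.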
We're beginning to suspect that a hung loopback NFS mount might be to
blame, although we can't reproduce this trivially. Is there any way in
which a badly behaving mount could affect the kernel in this
manner?
Dave