From: Andrew Morton <akpm@osdl.org>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: sim@netnation.com, linux-kernel@vger.kernel.org
Subject: Re: 2.4.24 SMP lockups
Date: Sat, 10 Jan 2004 14:40:49 -0800
Message-ID: <20040110144049.5e195ebd.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.58L.0401101719400.1310@logos.cnet>
Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
>
>
> On Fri, 9 Jan 2004, Simon Kirby wrote:
>
> > 'lo all,
>
> Hi Simon,
>
> > We've had about 6 cases of this now, across 4 separate boxes. Since
> > upgrading to 2.4.24, our SMP web server boxes (both Intel and AMD
> > hardware) are randomly blowing up. This may have happened on 2.4.23 as
> > well, but they weren't really running long enough to tell. 2.4.22 was
> > fine. GCC 3.3.3.
> >
> > These boxes are all dual CPU, and the failure case shows up suddenly with
> > no warning. Sysreq-P works, but only reports from one CPU no matter how
> > many times I try. In normal operation, every machine distributes all
> > IRQs across both CPUs, and Sysreq-P reports from both CPUs.
> >
> > Mapping the EIP reported by Sysreq-P to symbols shows that the responding
> > CPU is spinning on a spinlock (so far I have seen .text.lock.fcntl,
> > .text.lock.sched, .text.lock.locks, and .text.lock.inode), which I assume
> > is being held by the other (dead) CPU.
>
> This sounds like a deadlock. I wonder why the NMI watchdog is not
> triggering.
Presumably it's spinning on the lock with interrupts enabled. Make sure that
the `NMI' counters in /proc/interrupts are incrementing for all CPUs.
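
One way to do that check without eyeballing the file is sketched below. It
is only an illustration, not something from this thread: the function names
are made up, and it assumes the `NMI:' row holds one counter column per CPU,
in CPU order, possibly followed by a text description on newer kernels.

```python
import time


def parse_nmi_counts(text):
    """Return the per-CPU NMI counters found in /proc/interrupts-style text."""
    for line in text.splitlines():
        label, sep, rest = line.strip().partition(":")
        if sep and label == "NMI":
            counts = []
            for token in rest.split():
                if not token.isdigit():
                    break  # assumed: trailing description text, not a counter
                counts.append(int(token))
            return counts
    raise ValueError("no NMI line found")


def stalled_cpus(before, after):
    """Return the column indices whose NMI counter did not advance."""
    return [cpu for cpu, (old, new) in enumerate(zip(before, after))
            if new <= old]


def check_once(interval=5.0, path="/proc/interrupts"):
    """Sample the counters twice, `interval` seconds apart (hypothetical helper)."""
    with open(path) as f:
        first = parse_nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_nmi_counts(f.read())
    return stalled_cpus(first, second)
```

Calling check_once() on a healthy box should return an empty list; any index
it does return is a column whose counter stayed flat over the interval, which
would suggest the watchdog NMI is not being delivered to that CPU.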
> > Even on boxes with nmi_watchdog=1, nothing is reported from the NMI
> > watchdog.
>
> Can you share all available SysRQ-P output for the locked CPU ? SysRQ-T if
> possible, too.
sysrq-T would be best.
We don't have an each-CPU backtrace facility - it could be handy. There's
one in the low-latency patch for some reason.
Thread overview: 19+ messages
2004-01-09 21:04 2.4.24 SMP lockups Simon Kirby
2004-01-09 22:20 ` Arkadiusz Miskiewicz
2004-01-10 15:51 ` Thomas Zehetbauer
[not found] ` <Pine.LNX.4.58L.0401101719400.1310@logos.cnet>
2004-01-10 22:40 ` Andrew Morton [this message]
2004-01-11 4:12 ` Rik van Riel
2004-01-11 13:16 ` Marcelo Tosatti
2004-01-12 12:18 ` Marcelo Tosatti
2004-01-12 12:43 ` Thomas Zehetbauer
2004-01-11 8:55 ` Simon Kirby
2004-01-11 9:30 ` Willy Tarreau
2004-01-14 17:07 ` Simon Kirby
2004-01-14 17:56 ` Marcelo Tosatti
2004-01-16 2:34 ` Philippe Troin
2004-01-14 18:28 ` David Woodhouse
2004-01-14 21:01 ` David Woodhouse
-- strict thread matches above, loose matches on Subject: below --
2004-01-10 19:58 Marcelo Tosatti
2004-01-11 9:01 ` Simon Kirby
2004-01-14 16:23 ` Marcelo Tosatti
2004-01-15 14:35 ` Thomas Zehetbauer