public inbox for linux-kernel@vger.kernel.org
From: Andrew Morton <akpm@osdl.org>
To: Marcelo Tosatti <marcelo.tosatti@cyclades.com>
Cc: sim@netnation.com, linux-kernel@vger.kernel.org
Subject: Re: 2.4.24 SMP lockups
Date: Sat, 10 Jan 2004 14:40:49 -0800
Message-ID: <20040110144049.5e195ebd.akpm@osdl.org>
In-Reply-To: <Pine.LNX.4.58L.0401101719400.1310@logos.cnet>

Marcelo Tosatti <marcelo.tosatti@cyclades.com> wrote:
>
> On Fri, 9 Jan 2004, Simon Kirby wrote:
> 
> > 'lo all,
> 
> Hi Simon,
> 
> > We've had about 6 cases of this now, across 4 separate boxes.  Since
> > upgrading to 2.4.24, our SMP web server boxes (both Intel and AMD
> > hardware) are randomly blowing up.  This may have happened on 2.4.23 as
> > well, but they weren't really running long enough to tell.  2.4.22 was
> > fine.  GCC 3.3.3.
> >
> > These boxes are all dual CPU, and the failure case shows up suddenly with
> > no warning.  Sysreq-P works, but only reports from one CPU no matter how
> > many times I try.  In normal operation, every machine distributes all
> > IRQs across both CPUs, and Sysreq-P reports from both CPUs.
> >
> > Mapping the EIP reported by Sysreq-P to symbols shows that the responding
> > CPU is spinning on a spinlock (so far I have seen .text.lock.fcntl,
> > .text.lock.sched, .text.lock.locks, and .text.lock.inode), which I assume
> > is being held by the other (dead) CPU.
> 
> This sounds like a deadlock. I wonder why the NMI watchdog is not
> triggering.

Presumably it's spinning on the lock with interrupts enabled.  Make sure that
the `NMI' counters in /proc/interrupts are incrementing for all CPUs.
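
One way to run that check is to sample the NMI line twice and compare the
per-CPU counts.  The following is only a rough sketch: the function names
and the five-second interval are my own assumptions, and it assumes the
usual layout in which the line starts with "NMI:" followed by one count
per CPU.

```python
import time


def parse_nmi_counts(text):
    """Return the per-CPU NMI counts found in /proc/interrupts-style text."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # stop at any trailing description text
                counts.append(int(field))
            return counts
    return []  # no NMI line present


def stalled_cpus(before, after):
    """Return the indices of CPUs whose NMI count did not increase."""
    return [cpu for cpu, (old, new) in enumerate(zip(before, after))
            if new <= old]


def check_nmi_counters(path="/proc/interrupts", interval=5.0):
    """Sample the NMI counters twice; return (first, second, stalled)."""
    with open(path) as f:
        first = parse_nmi_counts(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_nmi_counts(f.read())
    return first, second, stalled_cpus(first, second)
```

Calling check_nmi_counters() on an affected box returns both samples plus
the list of CPUs whose count stayed flat.  An empty list means NMIs are
being delivered to every CPU; a non-empty list suggests the watchdog is not
ticking on those CPUs.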


> > Even on boxes with nmi_watchdog=1, nothing is reported from the NMI
> > watchdog.
> 
> Can you share all available SysRQ-P output for the locked CPU ? SysRQ-T if
> possible, too.

sysrq-T would be best.

We don't have an each-CPU backtrace facility - it could be handy.  There's
one in the low-latency patch for some reason.


