public inbox for linux-kernel@vger.kernel.org
From: Bill Davidsen <davidsen@tmr.com>
To: linux-kernel@vger.kernel.org
Subject: Re: 2.6.7 SMP trouble?
Date: Tue, 20 Jul 2004 15:57:23 -0400
Message-ID: <cdjt08$c70$1@gatekeeper.tmr.com>
In-Reply-To: <Pine.LNX.4.53.0407191557590.3740@chaos>

Richard B. Johnson wrote:
> On Mon, 19 Jul 2004, Jason Gauthier wrote:
> 
> 
>>I've found an IBM netfinity (5600) box that was shelved a few years ago.  I
>>spent $80 and got two processors for it. (P3-667).
>>
>>I put them in the box, installed Linux (slackware) and upgraded the kernel
>>to 2.6.7.  I then started installing my software on it.  Nagios, MRTG,
>>samba, and some other tools we use for network monitoring.  This is going to
>>be an upgrade to a monitoring server we have.  Well, I went home, came in
>>the next day and the box was locked hard.  No messages, no console output.
>>Just dead.
>>
>>Thinking it was a fluke, I fired it up.  Again, after several hours running;
>>total death.  So, I figured I have two options.  Software or hardware is
>>making it die.  I removed each processor in turn, and ran the box for over
>>24 hours under HIGH stress. (5+ load average). The system is running the
>>above mentioned software.  But, just to make sure this processor gets a
>>workout I am compiling code over and over.  Both processors have been rock
>>solid for the duration of the test.
>>
>>I then placed both processors in the box and started the same test.  It was
>>dead within 8 hours.  I am now very suspicious of the kernel.
>>
>>So, I installed 2.4.22 and ran the same tests.  It went over 48 hours with
>>no issues.  Now I'm certain it's the kernel.  Can anyone confirm any SMP
>>issues that might cause this?
>>
>>Thanks,
>>
>>Jason
> 
> 
> Another data-point. I haven't been able to run any new (2.6+) kernel
> reliably in a SMP machine. They stop. Just like you noted. That's
> why all my SMP machines still run 2.4.26. It's rock solid and has
> the latest-and-greatest updates (there's a -pre-27 coming out).
> Anyway, for production machines, you probably need to run 2.4.26.
> 
> If you don't really need anything reliable, you might try to enable
> Sys Req and see if you can find out where it's stopped. When my
> machines stop, the CPUs get cold, just like their clocks were
> shut off! -- another data-point --

I suspect it's a config thing rather than some overall evil; I have
some production machines up 72+ days. These are production news servers
with hundreds of users all day long.

The exact version is 2.6.5aa5, but I had a 2.6.7 up for 30 days or so 
until AS3.0 got a hotfix for my applications.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me
