From: Bill Davidsen <davidsen@tmr.com>
To: linux-kernel@vger.kernel.org
Subject: Re: 2.6.7 SMP trouble?
Date: Tue, 20 Jul 2004 15:57:23 -0400
Message-ID: <cdjt08$c70$1@gatekeeper.tmr.com>
In-Reply-To: <Pine.LNX.4.53.0407191557590.3740@chaos>
Richard B. Johnson wrote:
> On Mon, 19 Jul 2004, Jason Gauthier wrote:
>
>
>>I've found an IBM netfinity (5600) box that was shelved a few years ago. I
>>spent $80 and got two processors for it. (P3-667).
>>
>>I put them in the box, installed Linux (slackware) and upgraded the kernel
>>to 2.6.7. I then started installing my software on it. Nagios, MRTG,
>>samba, and some other tools we use for network monitoring. This is going to
>>be an upgrade to a monitoring server we have. Well, I went home, came in
>>the next day and the box was locked hard. No messages, no console output.
>>Just dead.
>>
>>Thinking it was a fluke, I fired it up. Again, after several hours running;
>>total death. So, I figured I have two options. Software or hardware is
>>making it die. I removed each processor in turn, and ran the box for over
>>24 hours under HIGH stress. (5+ load average). The system is running the
>>above mentioned software. But, just to make sure this processor gets a
>>workout I am compiling code over and over. Both processors have been rock
>>solid for the duration of the test.
>>
>>I then placed both processors in the box and started the same test. It was
>>dead within 8 hours. I am now very suspicious of the kernel.
>>
>>So, I installed 2.4.22 and ran the same tests. It went over 48 hours with
>>no issues. Now I'm certain it's the kernel. Can anyone confirm any SMP
>>issues that might cause this?
>>
>>Thanks,
>>
>>Jason
>
>
> Another data-point. I haven't been able to run any new (2.6+) kernel
> reliably in a SMP machine. They stop. Just like you noted. That's
> why all my SMP machines still run 2.4.26. It's rock solid and has
> the latest-and-greatest updates (there's a -pre-27 coming out).
> Anyway, for production machines, you probably need to run 2.4.26.
>
> If you don't really need anything reliable, you might try to enable
> Sys Req and see if you can find out where it's stopped. When my
> machines stop, the CPUs get cold, just like their clocks were
> shut off! -- another data-point --
I suspect it's a config thing rather than some overall evil. I have
some production machines up 72+ days. These are production news servers
with hundreds of users all day long.
The exact version is 2.6.5aa5, but I had a 2.6.7 up for 30 days or so
until AS3.0 got a hotfix for my applications.
--
-bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me
Thread overview: 7+ messages
2004-07-19 19:16 2.6.7 SMP trouble? Jason Gauthier
2004-07-19 20:04 ` Richard B. Johnson
2004-07-20 14:22 ` Zwane Mwaikambo
2004-07-20 19:57 ` Bill Davidsen [this message]
-- strict thread matches above, loose matches on Subject: below --
2004-07-20 14:21 Jason Gauthier
2004-07-20 14:34 ` Zwane Mwaikambo
2004-08-02 13:14 Jason Gauthier