Symmetric Multiprocessing (SMP) development
From: cerise@armory.com
To: linux-smp@vger.kernel.org
Subject: Re: FC4 crashes repeatedly on Supermicro AS1020A-T dual-core Opterons, SMP
Date: Fri, 5 May 2006 08:28:47 -0700
Message-ID: <20060505152847.GB8408@boogeyman>
In-Reply-To: <Pine.LNX.4.63.0605051017180.19547@crafty.cis.uab.edu>

Hi Robert:

That might be because SuSE's compiled kernel doesn't use MCE.  If you can look
in the .config for the compiled kernel (or ask one of the SuSE maintainers,
or you're fortunate enough to have a /proc/config.gz), I'd be curious whether
it has MCE enabled (you'd be looking for "CONFIG_X86_MCE=y").  That would
nicely explain the discrepancy. 8)
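
For what it's worth, a minimal Python sketch of that check might look like
the following.  The two locations it tries are assumptions about the install:
/proc/config.gz only exists if the kernel was built with CONFIG_IKCONFIG_PROC,
and /boot/config-<release> is a common distribution convention, not a
guarantee.  The helper names are made up for illustration.

```python
import gzip
import os
import platform


def read_kernel_config(paths=None):
    """Return the text of the first kernel config file found, or None."""
    if paths is None:
        # Assumed locations; neither is guaranteed to exist on a given box.
        paths = ["/proc/config.gz", "/boot/config-" + platform.release()]
    for path in paths:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        with opener(path, "rt") as f:
            return f.read()
    return None


def option_state(config_text, option):
    """Return the value of `option` ('y', 'm', ...) or None if it is unset.

    Disabled options appear as '# CONFIG_FOO is not set', which starts with
    '#' and therefore never matches the 'CONFIG_FOO=' prefix.
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("no kernel config found in the assumed locations")
    else:
        print("CONFIG_X86_MCE =", option_state(text, "CONFIG_X86_MCE"))
```

Running it on both the SuSE and the FC4 machine and comparing the reported
value would show whether MCE is built into one kernel but not the other.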

-Phil/CERisE

On Fri, May 05, 2006 at 10:18:36AM -0500, Robert M. Hyatt wrote:
> 
> One note.  I am running on a quad 875 system, but am using SuSE rather 
> than FC4.  It is running perfectly reliably (this is a 4-CPU, dual-core, 
> 2.2 GHz box, 8 processors total).  I had problems with FC4 myself, 
> although it runs perfectly on my normal dual Xeon boxes...
> 
> 
> Robert M. Hyatt, Ph.D.          Computer and Information Sciences
> hyatt@uab.edu                   University of Alabama at Birmingham
> (205) 934-2213                  136A Campbell Hall
> (205) 934-5473 FAX              Birmingham, AL 35294-1170
> 
> On Fri, 5 May 2006, Bill Davidsen wrote:
> 
> >Michal Szymanski wrote:
> >
> >>Hi all,
> >>
> >>I have recently purchased three Supermicro AS1020A-T servers equipped
> >>with two dual-core Opteron 280s each. H8DAR-T motherboards, 8 or 12 GB
> >>RAM. The systems carry FC4 x86_64 with a proprietary driver (made by
> >>Adaptec) for the onboard Marvell 88SX6041 SATA controller. Original
> >>(install) kernel 2.6.11-1.1369_FC4smp - unfortunately not upgradable due
> >>to the lack of the SATA driver for other kernel versions.
> >>
> >>All systems crash (either hang with some "machine check exception"
> >>kernel messages or reset) when loaded with repeated runs of 1.3 GB,
> >>CPU-intensive jobs with some I/O. I run 2 or 4 jobs simultaneously and
> >>they have never survived more than a few hours.
> >>
> >>Suspecting it may be the SATA driver problem I mounted /tmp as "tmpfs"
> >>and repeated the tests entirely in /tmp (with plenty of RAM this means
> >>(IMHO) doing I/O in memory). No success.
> >>
> >>It is somewhat better when I run no-I/O jobs of similar size, but these
> >>also crash, although less frequently.
> >>
> >>I tried installing the i386 version; it also crashes. Same (or even
> >>worse) with FC3.
> >>
> >>Memtest does not show any RAM errors.
> >>
> >>Finally I did two tests which seem to have excluded the SATA
> >>controller/driver as the reason for the crashes:
> >>
> >>1. I installed an additional IDE hard disk and put FC4/x86_64 system on
> >>it (without the Adaptec driver, so the system does not even see the SATA
> >>disks), updated the kernel to the latest (2.6.16) - also crashed.
> >>
> >>2. I ran a non-SMP 2.6.11 kernel (with the Adaptec driver) on another
> >>machine. Two repeating 1.3 GB test jobs have been running on it (each
> >>getting 50% of the single CPU used by the system) for over 50 hours
> >>now, no crashes. Also, a single test job running on the SMP kernel gave
> >>no crashes in 24 hours.
> >>
> >>It seems there is a problem with the SMP kernel and dual-core Opterons,
> >>at least on this hardware. I am stuck with three top-level machines
> >>which can work only at 25% of nominal CPU power. Any hints would be
> >>appreciated.
> >>
> >>
> >What happens if you use only one CPU? Either with a uni kernel (you should 
> >have gotten one) or "maxcpus=1" in the boot commands. You are running a 
> >custom kernel with custom drivers, so you really should be asking the 
> >supplier; all we can do is suggest things which might provide extra 
> >information.
> >
> >-- 
> >bill davidsen <davidsen@tmr.com>
> >CTO TMR Associates, Inc
> >Doing interesting things with small computers since 1979
> >
> >-
> >To unsubscribe from this list: send the line "unsubscribe linux-smp" in
> >the body of a message to majordomo@vger.kernel.org
> >More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
