From: "Pasi Kärkkäinen" <pasik@iki.fi>
To: "dwight at supercomputer.org" <dwight@supercomputer.org>
Cc: xen-devel@lists.xensource.com
Subject: Re: XCP: Crashes on dual Xeon HP ProLiant systems
Date: Fri, 30 Apr 2010 21:20:07 +0300
Message-ID: <20100430182007.GA17817@reaktio.net>
In-Reply-To: <201004300932.37495.dwight@supercomputer.org>

On Fri, Apr 30, 2010 at 09:32:37AM -0700, dwight at supercomputer.org wrote:
> Is anyone else running the latest XCP on HP ProLiant DL380
> systems? Or a similar dual Xeon 8-core system? I'm seeing
> spontaneous reboots when under a load.
>
> Specifically, when 4 Windows HVMs are loaded, I haven't noticed
> any reboots yet. But when running 7 or 8, the system will
> reboot within minutes. Very little information appears on
> the console.
>
> I built a debugging version of the hypervisor, which changed
> the behavior; the system managed to stay up for 2-3 hours
> with 7 VMs running. However, it again spontaneously rebooted,
> with no real messages on the console as to why.
>
> I can send out the console log messages this evening, along
> with the system information if there's interest. Alas, I
> don't have access to these items at the moment.
>
> I have also been running memtest86 overnight. As of 1.5 hours into
> the test, there were no errors. But there are 48 GB of RAM
> on the system, so the testing wasn't complete when I left.
>
> Any suggestions here? I was going to build a 32-bit kernel
> from the latest patches, but it appears Centos 5.4 Xen is
> also not stable on these systems. I had trouble getting
> the kernel to build here, with various errors. The most
> notable of which was:
>
> ----------------------
> CC arch/x86/kernel/acpi/processor.o
> In file included from arch/x86/kernel/acpi/processor.c:8:
> include/linux/kernel.h:185: internal compiler error: Segmentation
> fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugzilla.redhat.com/bugzilla> for instructions.
> The bug is not reproducible, so it is likely a hardware or OS
> problem.
> make[2]: *** [arch/x86/kernel/acpi/processor.o] Error 1
> make[1]: *** [arch/x86/kernel/acpi] Error 2
> make: *** [arch/x86/kernel] Error 2
> ----------------------
>

Uhm... the compiler really shouldn't crash.

Are you sure your hardware is OK? If the stock EL5.4 Xen also crashes,
broken hardware is a likely cause.

Did you try running memtest86+?

Is bare-metal Linux stable if you run, for example, a
"make -j8 bzImage && make -j8 modules && make clean" kernel build in a loop?

-- Pasi
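
For reference, the kernel-build loop suggested above could be scripted roughly as follows. This is only a sketch: the function name stress_loop, the iteration count, and the messages are illustrative assumptions, not something from the original message. It reruns a command until it fails once or the iteration limit is reached, so that an intermittent failure (such as a compiler segfault) stops the loop and reports which iteration failed.

```shell
#!/bin/sh
# Hypothetical helper (name and messages are illustrative): run a command
# repeatedly, stopping at the first failure or after max_iters runs.
stress_loop() {
    max_iters=$1
    shift
    i=1
    while [ "$i" -le "$max_iters" ]; do
        # "$@" is the command under test; any non-zero exit ends the loop.
        if ! "$@"; then
            echo "iteration $i failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $max_iters iterations without failure"
}

# Example use from the top of a kernel source tree (not run automatically;
# the count of 50 is an arbitrary assumption):
# stress_loop 50 sh -c 'make -j8 bzImage && make -j8 modules && make clean'
```

If the build fails at different files on different iterations while running on bare metal, that points toward hardware (RAM, CPU, cooling, power) rather than Xen.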