From: "Pasi Kärkkäinen" <pasik@iki.fi>
To: "dwight at supercomputer.org" <dwight@supercomputer.org>
Cc: xen-devel@lists.xensource.com
Subject: Re: XCP: Crashes on dual Xeon HP ProLiant systems
Date: Fri, 30 Apr 2010 21:20:07 +0300
Message-ID: <20100430182007.GA17817@reaktio.net>
In-Reply-To: <201004300932.37495.dwight@supercomputer.org>
On Fri, Apr 30, 2010 at 09:32:37AM -0700, dwight at supercomputer.org wrote:
> Is anyone else running the latest XCP on HP ProLiant DL380
> systems? Or a similar dual Xeon 8-core system? I'm seeing
> spontaneous reboots when under a load.
>
> Specifically, when 4 Windows HVMs are loaded, I haven't noticed
> any reboots yet. But when running 7 or 8, the system will
> reboot within minutes. Very little information appears on
> the console.
>
> I built a debugging version of the hypervisor, which changed
> the behavior; the system managed to stay up for 2-3 hours
> with 7 VMs running. However, it again spontaneously rebooted,
> with no real messages on the console as to why.
>
> I can send out the console log messages this evening, along
> with the system information if there's interest. Alas, I
> don't have access to these items at the moment.
>
> I have also been running memtest86 overnight. As of 1.5 hours into
> the test, there were no errors. But there are 48 GB of RAM
> on the system, so the testing wasn't complete when I left.
>
> Any suggestions here? I was going to build a 32-bit kernel
> from the latest patches, but it appears Centos 5.4 Xen is
> also not stable on these systems. I had trouble getting
> the kernel to build here, with various errors. The most
> notable of which was:
>
> ----------------------
> CC arch/x86/kernel/acpi/processor.o
> In file included from arch/x86/kernel/acpi/processor.c:8:
> include/linux/kernel.h:185: internal compiler error: Segmentation
> fault
> Please submit a full bug report,
> with preprocessed source if appropriate.
> See <http://bugzilla.redhat.com/bugzilla> for instructions.
> The bug is not reproducible, so it is likely a hardware or OS
> problem.
> make[2]: *** [arch/x86/kernel/acpi/processor.o] Error 1
> make[1]: *** [arch/x86/kernel/acpi] Error 2
> make: *** [arch/x86/kernel] Error 2
> ----------------------
>
Uhm.. the compiler really shouldn't crash.
Are you sure your hardware is OK? If the stock EL5.4 Xen also crashes,
it could be broken hardware.
Did you try running memtest86+?
Is bare-metal Linux stable if you run, for example, a kernel build
("make -j8 bzImage && make -j8 modules && make clean") in a loop?
-- Pasi
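
The kernel build loop suggested above could be scripted roughly as follows. This is only a sketch: the function names stress_loop and kernel_build and the iteration count in the usage comment are illustrative assumptions, not anything from the thread. It assumes a POSIX shell and an already configured kernel source tree on bare-metal Linux. The loop stops at the first failing build, which is the point where an intermittent compiler crash like the one quoted above would show up.

```shell
#!/bin/sh
# Repeat a command until it fails or a maximum iteration count is reached.
# usage: stress_loop MAX_ITERATIONS COMMAND [ARGS...]
# (stress_loop is a hypothetical helper name used for illustration.)
stress_loop() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            # Report which pass failed and stop immediately.
            echo "iteration $i: FAILED" >&2
            return 1
        fi
        echo "iteration $i: ok"
        i=$((i + 1))
    done
    return 0
}

# The build sequence quoted in the message. Run it from the top of a
# configured kernel source tree.
kernel_build() {
    make -j8 bzImage && make -j8 modules && make clean
}

# Example invocation (the count of 50 is an arbitrary assumption):
#   stress_loop 50 kernel_build
```

If the loop fails at a different file or a different iteration on each run, that would fit the compiler's own hint that a non-reproducible crash is likely a hardware or OS problem.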
Thread overview: 6+ messages
2010-04-30 16:32 XCP: Crashes on dual Xeon HP ProLiant systems dwight at supercomputer.org
2010-04-30 18:20 ` Pasi Kärkkäinen [this message]
2010-05-01 21:06 ` dwight at supercomputer.org
2010-04-30 19:15 ` Ian Campbell
2010-05-01 21:07 ` dwight at supercomputer.org
2010-05-24 16:35 ` XCP: Epilog - " dwight at supercomputer.org